New Mexico High School Sports

Background for Rating Methodology

Results of varsity soccer and basketball games, as reported to the NMAA website, are used to calculate the relative strength of each varsity team in New Mexico. Since the fall of 2012, NMAA has used the site for official game score results. In some cases (soccer, for example) other sites also report scores reliably, and when reasonable, data is extracted from those sites as well. However, for games that have been reported to more than one site, the NMAA website is considered official.

The strength index is defined so that the difference in strength between two teams predicts the score differential expected if those teams play each other. For basketball, a strength difference of 10 points predicts that the stronger team will win by 10 points, a difference of 20 points predicts a 20-point differential, and so on. For soccer, the strength index is scaled so that a difference of 100 strength points corresponds to a one-goal differential. The scale for each sport is chosen so that strength ratings are easy to interpret at a glance. The greater the strength differential, the more likely it is that the stronger team will win any given contest.
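As a minimal sketch of the scaling described above, the conversion from a strength difference to a predicted margin can be written in a few lines of Python. The function name, the dictionary, and the example strength values are hypothetical and chosen only for illustration; only the two scale factors (1 point per basketball point, 100 points per soccer goal) come from the text.

```python
# Strength points per unit of score, as stated in the text above.
POINTS_PER_SCORE_UNIT = {"basketball": 1, "soccer": 100}

def predicted_margin(strength_a, strength_b, sport):
    """Expected score margin for team A over team B.

    A negative result means team B is the favored team.
    """
    return (strength_a - strength_b) / POINTS_PER_SCORE_UNIT[sport]

# Hypothetical strength values, for illustration only.
print(predicted_margin(1250, 1050, "soccer"))   # 2.0 (goals)
print(predicted_margin(62, 52, "basketball"))   # 10.0 (points)
```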

The strength index is calculated using a Bayesian statistical approach (Markov Chain Monte Carlo), using all the games reported during the season as raw data. In soccer, for example, if team A wins by 2 goals over team B, and team B then wins by 3 goals over team C, we have a piece of evidence that if team A were to play team C, we might expect a 5-goal differential in favor of team A. That is, the observed soccer data suggest team A is 200 points stronger than team B and 500 points stronger than team C.
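The transitive reasoning in the A, B, C example can be sketched as follows. This is not the Markov Chain Monte Carlo calculation itself, which weighs all games together; it is a deliberately simple stand-in that chains observed margins outward from a reference team. The function name and data layout are assumptions made for this illustration.

```python
from collections import deque

def chain_strengths(games, reference, points_per_goal=100):
    """Propagate observed margins outward from a reference team.

    games: list of (winner, loser, margin_in_goals) tuples.
    Returns strengths relative to `reference`, which is fixed at 0.
    Each team takes the value implied by the first path that reaches it,
    so conflicting evidence is ignored (unlike a full statistical fit).
    """
    neighbors = {}
    for winner, loser, margin in games:
        offset = margin * points_per_goal
        neighbors.setdefault(winner, []).append((loser, -offset))
        neighbors.setdefault(loser, []).append((winner, offset))

    strengths = {reference: 0}
    queue = deque([reference])
    while queue:
        team = queue.popleft()
        for other, offset in neighbors.get(team, []):
            if other not in strengths:
                strengths[other] = strengths[team] + offset
                queue.append(other)
    return strengths

games = [("A", "B", 2), ("B", "C", 3)]
print(chain_strengths(games, reference="A"))  # {'A': 0, 'B': -200, 'C': -500}
```

With real data, different chains of games usually imply different values for the same team, which is one reason a statistical fit over all games is needed rather than simple chaining.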

Only current season data is used, and every rating calculation starts fresh, making no prior assumptions about the strengths of each team determined earlier in the season. For this reason, we need to wait several weeks into the season before enough data is accumulated (i.e., games are played) to make the calculations meaningful.

Furthermore, the model reveals what we already know intuitively: teams do not always play at the same strength from day to day. In fact, the calculation method provides not just a single strength value for each team, but a statistical distribution of strength values that represents how the team varies in strength from one day to the next. This strength variation allows us to calculate an error bar that characterizes the uncertainty of the team's playing strength on any given day. The (single) strength value we quote for each team is simply the median value of the strength distribution.
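To illustrate how a single quoted value and an error bar might be read off a strength distribution, here is a minimal sketch. The sample values are made up, and using the sample standard deviation as the error bar is an assumption of this example; the text does not say how the error bar is defined.

```python
import statistics

def summarize_strength(samples):
    """Return (median, error_bar) for a list of sampled strength values.

    The median is the single value quoted for a team. The error bar here is
    the sample standard deviation, which is an assumption of this sketch.
    """
    return statistics.median(samples), statistics.stdev(samples)

# Hypothetical draws from one team's strength distribution.
samples = [900, 950, 1000, 1050, 1100]
median, error_bar = summarize_strength(samples)
print(median)               # 1000
print(round(error_bar, 1))  # 79.1
```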

Over all soccer teams in New Mexico, the average uncertainty (error bar) in determining the strength of a team corresponds to approximately 3 goals, implying that on any given day, two teams within 3 goals (300 points) of each other should be expected to have a close, competitive match. A related way of thinking about a game between two evenly matched teams is that we find either team will have about a 45% chance of winning the game outright, leaving a 10% chance the game will go to a shootout (tie).
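One way to turn a strength difference into win and tie probabilities is to treat the game margin as normally distributed around the strength difference. The sketch below does that under loudly stated assumptions: the normal model, the 300-point spread (taken from the 3-goal error bar), and the "tie band" calibrated so that an even match ties 10% of the time are all choices made for this example, not details confirmed by the text.

```python
from statistics import NormalDist

# Assumed spread of the game margin in strength points (about 3 goals).
SIGMA = 300.0
# Assumed half-width of the "tie" band, calibrated so that two evenly
# matched teams tie 10% of the time, matching the figure quoted above.
TIE_BAND = NormalDist().inv_cdf(0.55) * SIGMA

def outcome_probabilities(strength_diff, sigma=SIGMA, tie_band=TIE_BAND):
    """Return (win_a, tie, win_b) for team A versus team B.

    strength_diff is A's strength minus B's strength, in strength points.
    """
    margin = NormalDist(mu=strength_diff, sigma=sigma)
    win_b = margin.cdf(-tie_band)
    win_a = 1.0 - margin.cdf(tie_band)
    return win_a, 1.0 - win_a - win_b, win_b

win_a, tie, win_b = outcome_probabilities(0)
print(round(win_a, 2), round(tie, 2), round(win_b, 2))  # 0.45 0.1 0.45
```

The even-match result reproduces the 45% / 10% / 45% split by construction; probabilities for unequal teams depend entirely on the assumed spread and should be read as illustrative.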

The actual average error bar depends on the sport, and does not become representative until well into the season.

The model used makes some limiting assumptions:

  • It assumes that the strength distribution for each team does not change with time throughout the season -- if a star player gets hurt, or sits out an important game, then that team may well play weaker one day than another. We have no information that allows us to model that situation, and errors in this assumption increase the uncertainty in a team's strength estimate (the strength distribution gets "wider").
  • It assumes each game is independent of every other game.
  • The original method assumes each game has the same weight as all others. However, games that are blowouts are probably not as indicative of the relative strengths of teams as are games that have small score differentials. Once one team gets comfortably ahead of another, the playing strategy and players participating on both sides are often different than if the game were close. We have recently addressed these concerns, somewhat heuristically, and so the most recent method applies weighting factors to games -- games are weighted according to both the score differential and to a time factor that discounts games played a long time ago.
  • Nevertheless, the model is reasonably useful -- for boys' and girls' varsity soccer and basketball games, the strengths we calculate correctly predict the winner of approximately 80% to 85% of the games played.
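The game weighting mentioned in the list above (by score differential and by how long ago the game was played) might look something like the following sketch. The text does not give the actual weighting formulas, so the functional forms, the parameter names, and the default values here are all hypothetical.

```python
def game_weight(margin, days_ago, blowout_margin=3.0, half_life_days=30.0):
    """Hypothetical weight in (0, 1] for one game.

    margin: final score differential (sign is ignored).
    days_ago: how many days before the rating date the game was played.
    blowout_margin and half_life_days are illustrative tuning values only.
    """
    # Down-weight only the part of the margin beyond the blowout threshold.
    excess = max(0.0, abs(margin) - blowout_margin)
    margin_factor = 1.0 / (1.0 + excess)
    # Exponential decay: a game half_life_days old counts half as much.
    time_factor = 0.5 ** (days_ago / half_life_days)
    return margin_factor * time_factor

print(game_weight(margin=1, days_ago=0))   # 1.0
print(game_weight(margin=7, days_ago=30))  # 0.1
```

Under these assumed forms, a recent close game keeps full weight, while a month-old seven-goal blowout counts about one tenth as much.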



    Questions or comments: send email to Bob Walker