Recent Changes and/or Additions to the Way Ratings are Calculated/Reported
I have included additional pages to rank teams not only by their strength ratings, but also by their RPI scores.
The strength ratings that I have always posted here include several factors designed to evaluate how well teams
are playing at this point in the season. While that approach is useful for predicting winners of upcoming games,
it is not the best system for evaluating which teams have had the best overall seasons. For that reason, I have
included an option to rank teams by their RPI scores as well as by the strength ratings. You will see this option
on the main pages of the site, where I show which ranking system is being used, and provide a button to allow
you to quickly switch between rankings based on strength ratings and rankings based on RPI scores.
Today I will upload the first draft of ratings for the 2014 season. There have not been any significant changes to the ratings methodology. For the first few weeks, I will use only data from the MaxPreps website. Other changes -- state classifications have changed, so what was reported last year as 1A-3A is now 1A-4A, 4A is now 5A, 5A is now 6A, etc. Current district and class alignments are being used.
The software that reads game results and schedules is getting that information from the
MaxPreps.com website for New Mexico high school athletics.
Some available data is not used yet -- which games go to kicks from the mark (penalty-kick
shootouts), and which games are district games, tournament games, and so on. These have only a minor effect
on the team strength calculations.
I have reported a probability for winning on team schedule pages and on state tournament game predictions.
While there was nothing "wrong" with the calculation, the probabilities did not mean what most people would
expect them to mean. The reported probabilities correspond to the winning percentage that would result if
the two teams played a lot of games against each other -- like in the hundreds. That never happens, of course.
Games that go to kicks from the mark (KFTM, or "PK's") are still counted as ties in terms of calculating the playing strength, but the winner of the kicks is credited with a win in terms of standings and for calculating the RPI score. A team that wins KFTM after a 1-1 tie at the end of overtimes is listed as "W 2-1" instead of "T 1-1".
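A minimal Python sketch of the two views of a KFTM game described above. The function names are hypothetical, and the "L 1-2" listing for the team that loses the kicks is an assumption (only the winner's "W 2-1" listing is stated here).

```python
def strength_margin(goals_for, goals_against):
    """Margin used for playing strength: the kicks are ignored, so a KFTM game stays a tie (0)."""
    return goals_for - goals_against


def listed_result(goals_for, goals_against, won_kicks=None):
    """Result shown for standings and RPI purposes.

    won_kicks is None when the game did not go to kicks, otherwise True or False.
    """
    if goals_for != goals_against or won_kicks is None:
        if goals_for > goals_against:
            letter = "W"
        elif goals_for < goals_against:
            letter = "L"
        else:
            letter = "T"
        return f"{letter} {goals_for}-{goals_against}"
    if won_kicks:
        # The winner of the kicks is credited with one extra goal in the listing.
        return f"W {goals_for + 1}-{goals_against}"
    # Assumption: the other team is listed with the mirror-image score.
    return f"L {goals_for}-{goals_against + 1}"
```

For example, a 1-1 game won on kicks has `strength_margin(1, 1)` equal to 0 but is listed as `listed_result(1, 1, won_kicks=True)`.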
I have entered notations to tell the ratings program when a game between two teams in the same district is NOT an official district game. It is a manual process, so if I miss finding one by hand, you'll need to send me email to fix it.
I have included a way to display comments about particular games on the team schedule pages. If a game entry has a comment, the background color of the cell displaying the game result will be shown in green. Mouse over the score, and the comment will display.
I included links so that this site provides an RSS feed -- just click on the "XML" button at the top of the home page if you want to subscribe, and then you will get notifications when I update the site with new results.
I have listed results of JV games for this update. I may take them out at the next round of calculations.
Comparing JV results to Varsity results is unreliable for several reasons:
So, maybe you should take the JV ratings with a grain of salt. Encourage your coaches to report their JV scores.
I have been trying to add results of reported JV games to the calculations, but there are not enough games between JV teams and varsity teams to establish a reliable placement of the JV pool of teams relative to the varsity pool of teams. A JV game can be included only if the JV team played a rated varsity opponent and the result was not a "blowout" (which in this case means the final score difference has to be smaller than 5 goals). This requirement eliminates most JV contests. I will keep working on it, and perhaps toward the end of the season, we can get some interesting JV information. It would help if more JV games were reported.
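The inclusion rule for JV games can be sketched as a simple filter. This is only an illustration; the function and argument names are hypothetical and are not taken from the ratings program.

```python
def usable_jv_game(opponent, jv_score, opp_score, rated_varsity, blowout_margin=5):
    """Return True if a JV game may be used to place the JV pool relative to the varsity pool.

    rated_varsity is the set of varsity teams that currently have a rating.
    """
    faces_rated_varsity = opponent in rated_varsity
    # "Not a blowout" means the score difference is smaller than 5 goals.
    not_blowout = abs(jv_score - opp_score) < blowout_margin
    return faces_rated_varsity and not_blowout
```

With `rated_varsity = {"Team X"}`, a 1-3 JV loss to Team X would be kept, while a 0-5 loss to Team X or any game against an unrated opponent would be skipped.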
I added a web page for BV and GV teams to highlight the closest games coming up for the next week, based on how closely matched the team strengths are.
We are far enough into the season that I have set a requirement that teams must play at least 4 games against other rated teams in order to be included in the calculation. This will prevent wild results that can happen when a team is too weakly coupled to the rest of the field.
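Because the requirement counts only games against other rated teams, dropping one team can push another team below the minimum. A minimal sketch of one way to handle that is to repeat the check until nothing changes. The repeated pass is an assumption about how such a rule could be applied, not a description of the actual program, and the names are hypothetical.

```python
def rated_teams(games, min_games=4):
    """Return the set of teams with at least min_games games against other rated teams.

    games is a sequence of (team_a, team_b) pairs, one pair per game played.
    """
    games = list(games)
    rated = {team for game in games for team in game}
    while True:
        counts = {team: 0 for team in rated}
        for a, b in games:
            # A game counts only if both teams are still in the rated pool.
            if a in rated and b in rated:
                counts[a] += 1
                counts[b] += 1
        dropped = {team for team, n in counts.items() if n < min_games}
        if not dropped:
            return rated
        rated -= dropped
```

In the small example `[("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E")]` with `min_games=2`, team E is dropped first, which then leaves D with only one counted game, so D is dropped as well.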
Results of a preliminary calculation for games played from the beginning of the fall season until 8/29/2009 have been posted. There are so few games that some of the results will be unrealistic, and you can expect large changes in the strength ratings for the first several weeks of the season. To be considered for a rating, a team must have played at least two games against a rated opponent.
An update including games reported by Thursday at 9:30 AM has been posted. Updated results now include games through Wednesday 2/18/2009. Strength calculations themselves are not very sensitive to just a few additional reported games, but since we are nearing the end of the season, it is nice to have the team schedule pages updated. Approximately 60 additional games have been reported for each of BV and GV.
Any games listed as a win by a score of 2-0 are considered to be forfeits. Forfeits have been included on the team schedule pages, but are not used in strength calculations. Forfeits are reflected in the team's win-loss records (overall, in class, and in district). They do not contribute to strength of schedule calculations, or to the RPI scores that are shown on the team schedule pages.
Note that the custom seems to be that only games that were won by a potentially forfeiting team are re-scored to show forfeits. As a result, any games that a forfeiting team lost anyway are still given the normal weighting factor. The objective of these calculations is to assign a strength value to a team's playing ability, so a two-point loss (for example) to a strong opponent (in a game that would have been a forfeit had the team won instead) still provides a valid data point that tends to place the forfeiting team's playing ability close to that of the other team. As a result, forfeits do not strongly change the forfeiting team's rated strength, even though they certainly do affect the win-loss record, standing in district (if applicable), and so on.
Feb 13, 2009: Bug fix in RPI scores
A small error in calculating RPI scores was fixed.
On the merit pages, we show additional results that characterize how successful the team strength ratings are in predicting the winners of games. A pie chart displays the relative fraction of games won or lost as expected, of games won over stronger opponents, and of games lost to weaker opponents.
Now that games are weighted based on how recently the game was played, and how close the winning margin was, we show two plots that characterize the winner prediction success rate based on these factors. The first plot shows the winner prediction success rate for each week of the season, and the second shows the success rate as a function of how close the winning margin is predicted to be.
Beginning with results posted on Feb 2, 2009, the basketball ratings are based on calculations that weight some games more than others. After watching the ratings over several weeks, I have come to think that these weighted-game ratings are more indicative of a team's actual current playing strength.
The rationale for weighting games is to address two of the most obvious shortcomings inherent in the assumptions that are made when modeling the game score data for teams.
The first shortcoming is that teams do not play with the same strength value over the entire season. Key players may be unavailable for periods of time, or as the season progresses, a team may learn how to play together more effectively. For reasons like these, we apply a time-based weighting factor so that the most recently played games are considered with twice the weight of games played at the beginning of the season.
The second shortcoming is that coaching strategy often changes when a team gets comfortably ahead of (or behind) its opponent in a given game. Coaches may choose to give additional playing experience to players who may not be the strongest players available. So, we apply a score-based weighting factor to games so that close games count more than games with a large difference in score.
The details of how these weighting factors are chosen are described on a separate web page. It turns out that we are actually more successful overall in predicting the winners of recently played games when using the new weighted-game ratings, because the software tries harder to find team strengths that are consistent with a recent 50-45 win than it does with a 50-25 early-season win.
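To make the idea concrete, here is a rough Python sketch of the two weighting factors. Only the "most recent games count twice as much as the earliest ones" and "close games count more" behavior comes from the description above. The linear ramp in time, the shape of the margin weight, the `scale` parameter, and multiplying the two factors together are all assumptions made for illustration.

```python
def time_weight(game_day, first_day, last_day):
    """Ramp linearly from 1.0 at the start of the season to 2.0 for the most recent game."""
    if last_day == first_day:
        return 2.0
    return 1.0 + (game_day - first_day) / (last_day - first_day)


def margin_weight(margin, scale=10.0):
    """Hypothetical score-based weight: 1.0 for a tie, smaller as the margin grows."""
    return 1.0 / (1.0 + abs(margin) / scale)


def game_weight(game_day, first_day, last_day, margin):
    """Combine the two factors (assumed here to be a simple product)."""
    return time_weight(game_day, first_day, last_day) * margin_weight(margin)
```

Under these assumptions, a 50-45 win on the last day of a 60-day span gets a much larger weight than a 50-25 win on the first day, which is the behavior described above.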
Changes you may notice -- (1) The team strength ratings reported for any particular team may have changed quite a bit, especially for the strongest and weakest teams. This effect arises because the program does not have to work as hard to account for the large score differences that occur when strong teams play weaker opponents, which helps to rate these teams relative to each other more properly. Also, (2) the rankings reported this week may move more than they would have otherwise. In the columns that show the previous week's strength or previous week's ranking, those numbers may be different from what was posted, because they reflect a recalculation of last week's results based on games weighted in the same fashion as this week.
Jan 29, 2009: RPI Scores added to Team Schedule Pages
The format of team schedule pages has changed a little, so that more information is presented. The team summary information that was at the top of the page has been replaced by a table that shows:
Team strength is the same as the median strength rating reported on the summary pages. Team momentum is calculated by halving the number of games played, and computing the mean playing strength (column 6) for the first half and second half of games played. The difference between these mean playing strengths is reported as the team's "momentum."
The SOS score, strength of schedule, is the median strength rating for all of the team's rated opponents.
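A short Python sketch of the momentum and SOS calculations described above. The function names are hypothetical. Two details are assumptions: momentum is taken as the second-half mean minus the first-half mean (so an improving team has positive momentum), and with an odd number of games the extra game falls in the second half.

```python
from statistics import mean, median


def momentum(playing_strengths):
    """Difference between mean playing strength in the second and first halves of the season.

    playing_strengths holds the per-game playing strength values in date order.
    """
    half = len(playing_strengths) // 2
    if half == 0:
        return 0.0  # fewer than two games: no meaningful momentum
    first, second = playing_strengths[:half], playing_strengths[half:]
    return mean(second) - mean(first)


def strength_of_schedule(opponent_ratings):
    """Median strength rating over all of the team's rated opponents."""
    return median(opponent_ratings)
```

For example, playing strengths of 10, 12, 14, 16 give a momentum of 15 - 11 = 4, and opponents rated 20, 5, and 11 give an SOS of 11.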
The RPI score is another strength-of-schedule score. It considers only whether the team in question won or lost its games (without regard to the margin of victory) and the extent to which those wins and losses came against strong or weak opponents. So it considers not only the team's winning percentage, but also that of its opponents. This RPI score is the "Ratings Percentage Index," as implemented by Jerry P. Palm at the CollegeRPI.com web site, and used as a component of the computer ranking index for BCS football and for college basketball teams. You can get a description of the formula used by following his FAQ link.
In short, the RPI is a weighted average of (a) the team's weighted winning percentage (weighted 1.4 for away wins and home losses, and 0.6 for home wins and away losses), (b) the team's opponents' unweighted winning percentage (discarding the opponents' games with the team whose RPI is being calculated), and (c) the team's opponents' opponents' unweighted winning percentage (again discarding games with the team and its opponents). The RPI ranking is the team's position when its RPI is compared with that of all other rated teams.
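The three components can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for this site: the component weights of 0.25, 0.50, and 0.25 are the commonly quoted RPI values and are not given in the text above, component (c) is computed as the average of each opponent's own component (b), ties are left out, and all names are hypothetical.

```python
# Each game is (winner, loser, winner_was_home). Sample data, made up for illustration.
GAMES = [
    ("A", "B", True),
    ("A", "C", False),
    ("B", "C", True),
    ("C", "D", True),
    ("D", "B", False),
]


def weighted_wp(team, games):
    """(a) Winning percentage with 1.4 for away wins/home losses, 0.6 for home wins/away losses."""
    wins = losses = 0.0
    for winner, loser, winner_home in games:
        if winner == team:
            wins += 0.6 if winner_home else 1.4
        elif loser == team:
            # The loser was away exactly when the winner was at home.
            losses += 0.6 if winner_home else 1.4
    total = wins + losses
    return wins / total if total else 0.0


def plain_wp(team, games, exclude=None):
    """Unweighted winning percentage, skipping any games that involve `exclude`."""
    wins = played = 0
    for winner, loser, _ in games:
        if team not in (winner, loser) or exclude in (winner, loser):
            continue
        played += 1
        wins += winner == team
    return wins / played if played else 0.0


def opponents(team, games):
    """One entry per game played, so a repeat opponent counts once per meeting."""
    return [l if w == team else w for w, l, _ in games if team in (w, l)]


def owp(team, games):
    """(b) Average winning percentage of opponents, discarding their games with `team`."""
    opps = opponents(team, games)
    return sum(plain_wp(o, games, exclude=team) for o in opps) / len(opps) if opps else 0.0


def oowp(team, games):
    """(c) Assumed here to be the average of each opponent's own component (b)."""
    opps = opponents(team, games)
    return sum(owp(o, games) for o in opps) / len(opps) if opps else 0.0


def rpi(team, games, weights=(0.25, 0.50, 0.25)):
    """Weighted average of the three components; the default weights are an assumption."""
    a, b, c = weights
    return a * weighted_wp(team, games) + b * owp(team, games) + c * oowp(team, games)
```

An RPI ranking would then follow from sorting the rated teams by `rpi(team, GAMES)` in descending order.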
Under the Playing Strength column on team schedule pages, the way playing strengths are calculated has changed to improve their self-consistency.
Before, the adjustment was done for the scheduled team assuming its opponent's strength was "correct." This approach leads to an inconsistency for the predicted game result obtained from the difference of the adjusted playing strengths. Now, the difference between the actual game margin and the predicted one is split between the two teams, in a way to make the actual game result consistent with the adjusted playing strengths.
This change has also the desirable effect of reducing the large variation from game to game in a team's effective playing strength, and makes it easier to identify games that produced surprising results.
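A minimal sketch of the split adjustment, assuming the predicted margin is simply the difference of the two strength ratings and that the surprise is divided evenly between the teams. The even split and the function name are assumptions; the text above says only that the difference is split in a way that makes the result self-consistent.

```python
def adjusted_playing_strengths(strength_a, strength_b, actual_margin):
    """Return per-game playing strengths for teams A and B.

    actual_margin is A's score minus B's score in the game.
    """
    predicted_margin = strength_a - strength_b
    surprise = actual_margin - predicted_margin
    # Each team absorbs half of the surprise, in opposite directions.
    playing_a = strength_a + surprise / 2
    playing_b = strength_b - surprise / 2
    return playing_a, playing_b
```

For ratings of 20 and 14 and an actual 10-point win by the first team, this gives playing strengths of 22 and 12, whose difference reproduces the actual margin of 10.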
Several new pieces of information are shown on the team schedule pages.
Under the Opponent column, some notation marks have been added to selected games. Games marked with "!!" after the opponent name are "very good wins." A very good win is a game where the team's playing strength was among the top quartile (blowout games not counted), ideally with a win against one of the stronger opponents the team has faced. Games marked with a single "!" are "good wins" -- the runner-up game found when looking for the best win. Similarly, games marked with "??" are considered the "very worst losses", where the scheduled team played weakly against one of its weakest opponents. Games marked with "?" identify the runner-up worst game.
Under the Expectation column, a number is inserted in parentheses for games that have been played, and the number shows the difference between the actual score difference and the score difference predicted by the strength ratings. In other words, the number shows how many points the team scored higher or lower than what was expected by the ratings. For games that have not been played yet, the number in this column shows the predicted winning margin if positive, or losing margin if negative.
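The Expectation column entry can be sketched like this, assuming the predicted score difference is the difference of the two strength ratings. The function name and the exact text formatting are hypothetical.

```python
def expectation_cell(team_strength, opp_strength, team_score=None, opp_score=None):
    """Text for the Expectation column of one schedule row."""
    predicted_margin = team_strength - opp_strength
    if team_score is None or opp_score is None:
        # Game not played yet: show the predicted winning (+) or losing (-) margin.
        return f"{predicted_margin:+.0f}"
    # Game played: show actual margin minus predicted margin, in parentheses.
    surprise = (team_score - opp_score) - predicted_margin
    return f"({surprise:+.0f})"
```

A team rated 20 facing an opponent rated 14 would show "+6" before the game, and "(+4)" after winning 50-40.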
There have been enough games played in New Mexico now that it makes sense to enforce a restriction that we rate only teams that have played a minimum of four games against other rated opponents. This has the effect of tossing games that involve out-of-state teams that have played only a few games in NM. In such a case, the strength rating (and distribution) for that team reflects only a few games, and contributes nothing meaningful to the overall picture within the state.
Enforcing this constraint has another side effect -- the success rate in predicting game winners drops slightly, from about 87% to 84%, because the games we throw out tend to give the expected result every time. It also changes the in-state rankings slightly, but in a statistically insignificant way.
The team schedule pages now show games that have not yet been played or have not yet been reported. This has the useful effect of allowing us to calculate a team's strength of schedule (SOS) based on the median strength of all opponents on the schedule, not just the opponents that have already been played. So, any given team's SOS will change during the season only to the extent that its opponents' ratings go up and down, and not because some games have not yet been played or reported.
Putting future games on the schedule also allows us to report (on the team schedule pages) (a) the expected result of future games, and (b) the win/loss/tie probability of that future contest.
Questions or comments: send email to Bob Walker