I am only missing three games: the ACC Championship Game, the Atlantic 10 Championship Game, and the Big 10 Championship Game. I don’t have a good explanation for why today’s SEC game was available and the other three were not, but this is the best I can do at the moment.
Obviously, take the data with a grain of salt. I have done some basic cleaning and quality checks against RPI data, but to be fair, it is also 1 am, so my chances of making errors have probably increased. 🙂
Joking aside, I am pretty sure that all the basic information that you might need should be here. And for those of you who are not familiar with this tradition, I will give you some more details.
As many of you know, one of my insane features is that I try to provide people with data about the teams in case they want to do research on them. Each year, we get several people who demonstrate the power of statistics by building models to predict the games. Some of them have been extremely successful with this, especially Bill Kahn with his Bradley-Terry models, showing that even something as unpredictable as sports can be forecast through good statistical techniques. But the part that has made me happiest, and the reason I do this, is that a few people who were not statisticians but were taking a stats training course at work used this data for their class project and ended up having some success, including our 2006 champion, David Shaddick.
So, since then, I have provided the scores to everyone, to give people as much of a chance as possible to leverage data in making their decisions. I realize that most of you will probably spend three to five minutes just looking at the teams and figuring out who will do best – I probably don’t need a model to decide that the number 1 seeds will beat the 16 seeds… In fact, I typically spend so much effort maintaining the site that I just pick randomly late Wednesday evening.
However, if I can give people a chance to try to learn something about statistics in a very fun environment, it is well worth the effort. So, just click on 2013 Schedule in the Admin menu and get an Excel spreadsheet with summary box score and standing / RPI information for each game for every Division I team.
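For anyone curious what a Bradley-Terry model like the ones mentioned above looks like in practice, here is a minimal sketch in Python. It is only an illustration, not anyone's actual pool model: the team names and results are made up, the function names are my own, and I am assuming you have already boiled the spreadsheet down to a list of (winner, loser) pairs. The fitting loop uses a standard iterative update for Bradley-Terry strengths, and it assumes every team has at least one win and one loss.

```python
from collections import defaultdict


def fit_bradley_terry(games, iterations=200):
    """Estimate Bradley-Terry strengths from a list of (winner, loser) pairs.

    Returns a dict mapping team -> strength, normalized to sum to 1.
    Assumes every team has at least one win and one loss.
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)  # games played between each unordered pair
    teams = set()
    for winner, loser in games:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        teams.update((winner, loser))

    # Start every team at the same strength.
    strength = {team: 1.0 / len(teams) for team in teams}

    for _ in range(iterations):
        updated = {}
        for team in teams:
            denom = 0.0
            for pair, n_games in pair_counts.items():
                if team in pair:
                    (other,) = pair - {team}
                    denom += n_games / (strength[team] + strength[other])
            updated[team] = wins[team] / denom
        total = sum(updated.values())
        strength = {team: s / total for team, s in updated.items()}
    return strength


def win_probability(strength, team_a, team_b):
    """Model probability that team_a beats team_b."""
    return strength[team_a] / (strength[team_a] + strength[team_b])


# Hypothetical results, invented for illustration (not from the spreadsheet).
sample_games = [
    ("Team A", "Team B"), ("Team A", "Team B"), ("Team B", "Team A"),
    ("Team B", "Team C"), ("Team B", "Team C"), ("Team C", "Team B"),
    ("Team A", "Team C"), ("Team C", "Team A"),
]

strength = fit_bradley_terry(sample_games)
for team in sorted(strength, key=strength.get, reverse=True):
    print(f"{team}: {strength[team]:.3f}")
print(f"P(Team A beats Team C) = {win_probability(strength, 'Team A', 'Team C'):.3f}")
```

A real model would probably also want to account for home court and margin of victory, both of which the box score data should let you explore.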
One potential error – I did not have the RPI data available to help me error check the final week of games. So, I made some educated guesses on venues – for example, considering Vanderbilt, St. John’s and Tulsa as hosts of their conference tournaments since the games were played in the same city as their schools. Not sure this is how the NCAA considers it, but I figured it was probably an accurate reflection.
If you notice something terribly wrong, let me know – no promises I have time to fix it, but at least everyone will know.
Enjoy the data!!!!