Background
The NBA Data Project started as an ECON 103 – Statistics final group project at Lewis & Clark College during the Spring 2015 semester. Visiting professor Zhaochen He limited the data sources for the project to American FactFinder, FRED, and DatabaseSports. We found that DatabaseSports had a downloadable archive of ABA/NBA player and team stats, current through the end of the 2008-09 season.
The original CSVs were cleaned and then loaded into Microsoft Access for initial analysis. One of the goals of the final project was to do descriptive statistics, graphing, and regression analysis using R. While I have past relational database experience, R programming is completely new to me (as of April 2015). Still, I figured that capturing some of the steps for others to review might be helpful (feedback appreciated).
Project Goals
- Make example NBA data easily accessible for those learning relational databases and analysis software packages.
- Share the DatabaseBasketball.com archive in alternative file formats, including MS Access, MS SQL Server, MySQL, and R.
- Iterate on the database schema so it can serve as a “practice” data source for analysis & visualization.
Disclaimers
- I am not affiliated with the National Basketball Association and do not intend to update any player/team data beyond the source data’s scope (i.e., I will not add stats newer than 2008-09).
- The validity of the source data from DatabaseBasketball.com has not been verified. The data is shared primarily for educational purposes.
- I am not a data scientist/statistician… and am a noob with regard to R programming. Feel free to give feedback, but you probably shouldn’t use my code if money is on the line.