About

Background

The NBA Data Project started as an ECON 103 – Statistics final group project at Lewis & Clark College during the Spring 2015 semester. Visiting professor Zhaochen He limited the data sources for the project to the American FactFinder, FRED and DatabaseSports. We found that DatabaseSports had a downloadable archive of ABA/NBA player and team stats current through the end of the 2008-09 season.

The original CSVs were cleaned and then loaded into Microsoft Access for initial analysis. One of the goals of the final project was to do descriptive statistics, graphing and regression analysis using R. I have past relational database experience, but R programming is completely new to me (as of April 2015). Still, I figured capturing some of the steps for others to review might be helpful (feedback appreciated).

Project Goals

  • Make example NBA data easily accessible for those learning relational databases and analysis software packages.
  • Share the DatabaseBasketball.com archive in alternative data file formats, including MS Access, MS SQL Server, MySQL and R versions
  • Iterate on the database schema so it can serve as a “practice” data source for analysis & visualization

Disclaimers

  • I am not affiliated with the National Basketball Association and do not intend to update any player/team data outside of the source data’s scope (i.e., I will not add stats newer than 2008-09).
  • The validity of the source data from DatabaseBasketball.com has not been verified/validated. The primary intent is for educational purposes.
  • I am not a data scientist/statistician… and am a noob with regard to R programming. Feel free to give feedback, but you probably shouldn’t use my code if money is on the line.
