
The Society for American Baseball Research
Presents
The SABR Minor Leagues Database

Now helping power baseball-reference.com!

The SABR Minor Leagues Database now powers the minor league history section of the newly relaunched baseball-reference.com. Visit http://www.baseball-reference.com/minors to dive on in!

About the project

Mission statement

The SABR Minor Leagues Database is a community project to document the statistical history of professional baseball. The core focus of the project is to compile statistics for each league-season, using the best information available. Each season's statistics are critically examined before publication, and known errors and omissions from Guides and other sources are corrected.

Data coverage and sources

The project uses the league-season (one full season of one league) as the basic unit of statistical compilation. Leagues are scheduled for compilation in reverse chronological order. The current focus of new input and evaluation is the 1993 season.

Once the statistics are in electronic form, we review them for errors, both in transcription and in balance. We check whether each team's statistics equal the sum of its players' statistics, and whether totals such as runs and hits balance between batters and pitchers. This process ensures data quality and often catches errors in published totals. It is also labor-intensive, which means it takes a while for a season to achieve quality certification in the database. Most league-seasons from 1994 through 2007 have been acquired in electronic format and await completion of the review process. We appreciate your patience in allowing us to bring you a quality resource.
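As a rough illustration of the balance checks described above, the following Python sketch compares published team totals against the sum of player lines, and league-wide batting totals against pitching totals. It is not the project's actual tooling: the function names, the dictionary layout, and the field names ("team", "runs", "hits") are assumptions made for this example, and the sample numbers are invented.

```python
from collections import defaultdict

# Hypothetical stat names; the database's real field names are not documented here.
BALANCED_STATS = ("runs", "hits")


def team_sum_mismatches(team_totals, player_lines, stats=BALANCED_STATS):
    """Return (team, stat, published, summed) wherever a published team
    total differs from the sum of that team's player lines."""
    summed = defaultdict(lambda: defaultdict(int))
    for line in player_lines:
        for stat in stats:
            summed[line["team"]][stat] += line[stat]

    mismatches = []
    for team, totals in team_totals.items():
        for stat in stats:
            if totals[stat] != summed[team][stat]:
                mismatches.append((team, stat, totals[stat], summed[team][stat]))
    return mismatches


def league_balance_mismatches(batting_lines, pitching_lines, stats=BALANCED_STATS):
    """Return (stat, batting_total, pitching_total) wherever the league's
    batting total differs from what its pitchers allowed."""
    mismatches = []
    for stat in stats:
        batting_total = sum(line[stat] for line in batting_lines)
        pitching_total = sum(line[stat] for line in pitching_lines)
        if batting_total != pitching_total:
            mismatches.append((stat, batting_total, pitching_total))
    return mismatches


# Invented sample data, for illustration only.
published = {"Team A": {"runs": 10, "hits": 20}}
batters = [
    {"team": "Team A", "runs": 4, "hits": 12},
    {"team": "Team A", "runs": 5, "hits": 8},
]
for mismatch in team_sum_mismatches(published, batters):
    print(mismatch)
```

A mismatch reported by either function does not say which figure is wrong. It only flags a league-season for a closer look at the published sources.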

We use the best information available in compiling statistics for a league. Official league statistics and tabulations published in major guides are used for most leagues. We also build on research done by SABR members and others in correcting and extending those publications.

What leagues have completed the quality certification process?

The current list of leagues that have passed the quality certification process can be viewed here. When a league is certified, we believe that we have exhausted all known resources in compiling statistics for the league. This does not mean that the statistics are free of errors. We are especially eager to receive reports about possible new information to help correct errors or deficiencies in the statistics we present for these certified leagues.

Information from other sources

For leagues whose full statistics have not yet been compiled and vetted in electronic format, we offer selected statistics for players based on a database compiled by Ed Washuta and donated to SABR in 2007. Due to the sheer size of the task, we regret that changes to the Washuta data, including corrections of statistical errors and the addition of statistics for unlisted players, will only be made in the case of extreme and egregious errors.

The biographical and demographic data displayed is drawn from several sources, including the Washuta database and the SABR Biographical Committee (for players with major league experience). The database is not the originating source for this information.

Is the database available in (MySQL, CSV, etc.) format?

The statistical history of minor league baseball is very poorly documented, and as such the database is provisional, and will remain so for the foreseeable future. Much like Retrosheet, we believe it is unwise to release downloadable datasets which are immature and have not been cross-checked for quality. It is our plan to offer downloads of a full year's worth of statistics (for all leagues) once all leagues in that year have completed the certification process. We plan to release statistics for the 2008 season in the near future, and will proceed backwards in time from there.

We may be able to offer to run specific queries against the dataset for research projects. Please contact John Zajc in the SABR office at jzajc@sabr.org for information on custom querying of the database.

How can I help?

The development of the database is powered entirely by volunteer effort. With a goal of providing statistics for over 4,000 league-seasons, volunteers are needed to compile, cross-check, verify, and fill in gaps in statistics. Much of the work involved is data entry and validation. To find out more about volunteering, contact the database steering committee at sabrmilb@gmail.com.