A Networks and Machine Learning Approach to Determine the Best College Coaches of the 20th-21st Centuries

Tian-Shun Jiang, Zachary Polizzi, Christopher Yuan

Our objective is to find the five best college sports coaches of the past century in each of three sports: men's basketball, football, and baseball. We wanted an approach that could definitively determine team skill from the games played, and then use a machine-learning algorithm to estimate the coach skill for each team in a given year. We created a networks-based model to calculate team skill from historical game data. A digraph was created for each year in each sport. Nodes represented teams, and each directed edge represented a game played between two teams, with the arrowhead pointing towards the losing team. We calculated the skill of each team in each graph using a right-hand eigenvector centrality measure. This way, teams that beat good teams are ranked higher than teams that beat mediocre teams. The eigenvector centrality rankings for most years were well correlated with tournament performance and poll-based rankings. We assumed that the relationship between coach skill $C_s$, player skill $P_s$, and team skill $T_s$ was $C_s \cdot P_s = T_s$. We then created a function to describe the probability that a given score difference would occur based on player skill and coach skill. We multiplied the probabilities of all edges in the network together to find the probability that the observed network would occur with any given player skill and coach skill matrix. We were able to express player skill as a function of team skill and coach skill, eliminating the need to optimize over two unknown matrices. The top five coaches in each year were noted, and the top coach of all time was determined by dividing the number of times each coach ranked in the yearly top five by the number of years that coach had been active.
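As an illustration of the team-skill step, the following is a minimal sketch of a right-hand eigenvector centrality computed by power iteration on a small, made-up set of game results. It is not the authors' implementation: the function name `eigenvector_centrality`, the four-team schedule, and the `baseline` weight (a small value added to every matrix entry so the iteration converges even when the results contain no cycles) are all assumptions introduced here.

```python
def eigenvector_centrality(teams, games, baseline=0.01, iterations=200):
    """Power iteration for the right-hand eigenvector of the win matrix.

    games: list of (winner, loser) pairs, i.e. directed edges winner -> loser.
    baseline: hypothetical small weight added to every entry so the matrix is
    strictly positive and the iteration converges (an assumption of this sketch).
    """
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)

    # wins[i][j] counts how often team i beat team j (plus the baseline).
    wins = [[baseline] * n for _ in range(n)]
    for winner, loser in games:
        wins[index[winner]][index[loser]] += 1.0

    # Right-hand eigenvector: a team's score is the sum of the scores of the
    # teams it beat, so beating strong teams is worth more.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = [sum(wins[i][j] * scores[j] for j in range(n)) for i in range(n)]
        total = sum(new)
        scores = [value / total for value in new]  # normalize to sum to 1
    return dict(zip(teams, scores))


# Made-up season: each pair is (winner, loser).
teams = ["A", "B", "C", "D"]
games = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "A")]
ranking = eigenvector_centrality(teams, games)

for team in sorted(ranking, key=ranking.get, reverse=True):
    print(team, round(ranking[team], 3))
```

In this toy schedule, team D has a single win, but that win is over the strongest team, which is the kind of result the centrality measure is meant to reward.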
