Strategic analysis of American football is important for live commentary, fans, coaches, and players. For the latter, this analysis is essential to prepare appropriately for the next match. Conventionally, this analysis is done manually: people watch and analyze videos, which is tedious and time-consuming. To alleviate some of the challenges of manual strategy analysis, this thesis describes an approach that identifies defensive coverage schemes from single image frames. To increase robustness while reducing the learning effort for classifiers, the approach includes several pre-processing steps. Professional game analysts often use the so-called “all 22” camera angle, in which the camera is placed in a high position and films all 22 players on the field. The approach in this thesis uses the “all 22” camera angle as well. To be less susceptible to varying lighting and weather conditions, a pose estimation network that is already robust against those conditions extracts the body key points of the players. Most of the time, the camera is situated in the middle of the playing field. Therefore, plays and the corresponding player key points on the camera’s left side look different from plays on the right, because the viewing angles differ substantially. Thus, domain knowledge is incorporated to transform the poses into a 3D model in two steps. In the first step, a projective transformation is calculated to obtain a top-down view of the playing field. In the second step, this projective transformation is applied to the key points. Additionally, the projected key points are stacked in layers and centered above the center of the feet. This transformation results in a global 3D model of the playing field and players for a single frame, which is entirely independent of the initial frame’s camera angle, resolution, and aspect ratio. Finally, this 3D model is used for two sample tactical analyses. For both analyses, a baseline, a convolutional neural network, and a custom ResNet were trained. The first task was to distinguish between man and zone coverage. The best-performing model predicts the correct coverage on the validation and test data sets with 85% accuracy. For the second analysis, four more specific coverage classes were analyzed. In this case, the best model predicts the correct class with 58.07% (validation) and 46.00% (test) accuracy. Overall, the work described in this thesis resulted in a robust and modular pre-processing pipeline which, instead of relying on end-to-end learning, incorporates modeled domain knowledge to facilitate the learning of labels.
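
The two-step projection described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that four reference points on the field (for example, yard-line intersections) have already been located in the frame, and it estimates the projective transformation with a plain direct linear transform in NumPy. The function names `estimate_homography` and `project_points`, the pixel coordinates, and the field coordinates in yards are all hypothetical values chosen for illustration. The layering and feet-centering step is not covered here.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate a 3x3 projective transformation H such that dst ~ H @ src.

    Uses the direct linear transform (DLT) on at least four point pairs,
    no three of which may be collinear. For noisy real data, normalizing
    the coordinates first is advisable; this sketch omits that step.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector belonging to the
    # smallest singular value of the constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    # Fix the arbitrary scale (assumes h[2, 2] is not zero).
    return h / h[2, 2]


def project_points(h, points):
    """Apply the projective transformation h to an (N, 2) array of points."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ h.T
    # Divide by the third homogeneous coordinate to return to 2D.
    return mapped[:, :2] / mapped[:, 2:3]


# Step 1: hypothetical pixel positions of four field reference points in
# the frame, and their assumed top-down field coordinates in yards.
image_corners = np.array([[300, 200], [980, 200], [1240, 650], [40, 650]])
field_corners = np.array([[0, 53.3], [100, 53.3], [100, 0], [0, 0]])
H = estimate_homography(image_corners, field_corners)

# Step 2: apply the same transformation to (hypothetical) player key
# points, such as the ankle positions returned by a pose estimator.
ankle_keypoints = np.array([[640, 400], [700, 520]])
top_down_keypoints = project_points(H, ankle_keypoints)
```

Because the transformation maps every frame into the same field coordinate system, the projected key points no longer depend on where the camera was pointing or on the frame's resolution.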