Challenge Description
One way to optimize performance is to monitor the balance between exertion and regeneration. Force8 has been monitoring the data of the athletes collected over longer time periods.
The challenge is to identify a predictive pattern in the data that could help explain performance in the competition.
Challenge Owner
Force8 and Swissski make up the Swiss national ski federation, which helps athletes develop and is always looking for ways to optimize their performance. Force8 provides the athlete management system for Swissski and manages the data on the athletes.
Solution
Pitch
The observations come from a REST questionnaire about training intensity and mood and date back several years. The questions mainly addressed physical and psychological aspects and were worded differently each time to keep them interesting. The answers were collected twice a week. The athletes' performance is available for the different events during the year. The goal was to use the questionnaire data to predict the injuries an athlete might sustain during an event or training session, and to gauge performance from the subjective questionnaire answers.
We used logistic regression and Random Forest models to predict athletes' injuries from the most important factors, such as mood, sleep, muscle pain, training intensity, and fatigue. The Random Forest model proved more accurate. One major challenge was getting the data into a suitable form. To predict from the current questionnaire values (Pa_t), we attached the injury value of the next time step (I_{t+1}) to each questionnaire as the target. The models now predict whether an athlete will suffer an injury within the next three days, based on the answers to the current questionnaire.
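The target construction described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual pipeline: the record layout, the field names (athlete, date, fatigue, injured, injured_next) and the sample values are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical questionnaire records: one row per athlete and survey date.
records = [
    {"athlete": "A", "date": date(2020, 1, 6), "fatigue": 2, "injured": 0},
    {"athlete": "A", "date": date(2020, 1, 9), "fatigue": 4, "injured": 0},
    {"athlete": "A", "date": date(2020, 1, 13), "fatigue": 5, "injured": 1},
    {"athlete": "B", "date": date(2020, 1, 6), "fatigue": 1, "injured": 0},
    {"athlete": "B", "date": date(2020, 1, 9), "fatigue": 2, "injured": 0},
]


def add_next_step_target(rows):
    """Pair each questionnaire with the injury flag of the same athlete's next one."""
    by_athlete = defaultdict(list)
    for row in rows:
        by_athlete[row["athlete"]].append(row)

    labelled = []
    for athlete_rows in by_athlete.values():
        athlete_rows.sort(key=lambda r: r["date"])
        # zip stops at the shorter sequence, so each athlete's last
        # questionnaire (which has no successor) is dropped.
        for current, following in zip(athlete_rows, athlete_rows[1:]):
            labelled.append({**current, "injured_next": following["injured"]})
    return labelled


labelled = add_next_step_target(records)
for row in labelled:
    print(row["athlete"], row["date"], row["fatigue"], row["injured_next"])
```

Grouping by athlete before shifting matters: without it, the last questionnaire of one athlete would be paired with the first injury value of the next. With pandas, the same idea is usually written as a per-athlete `groupby(...).shift(-1)` on the injury column.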
While testing the models, we discovered that they had no memory of previous questionnaires. We therefore enriched the current features (Pa_t) with the values from the previous questionnaire (Pa_{t-1}), which also adds a trend component to the models.
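A possible way to build such lagged features is sketched below. The field names (athlete, step, mood, fatigue) are hypothetical, and expressing the trend as the difference between the current and the previous answer is an assumption of this sketch; the text does not say how the trend was encoded.

```python
from collections import defaultdict

FEATURES = ("mood", "fatigue")  # hypothetical questionnaire items


def add_lag_features(rows, features=FEATURES):
    """Attach the previous questionnaire's answers and their change to each row."""
    by_athlete = defaultdict(list)
    for row in rows:
        by_athlete[row["athlete"]].append(row)

    enriched = []
    for athlete_rows in by_athlete.values():
        athlete_rows.sort(key=lambda r: r["step"])
        # The first questionnaire of each athlete has no predecessor and is skipped.
        for previous, current in zip(athlete_rows, athlete_rows[1:]):
            new_row = dict(current)
            for name in features:
                new_row[f"{name}_prev"] = previous[name]
                # Assumed trend encoding: current answer minus previous answer.
                new_row[f"{name}_trend"] = current[name] - previous[name]
            enriched.append(new_row)
    return enriched


rows = [
    {"athlete": "A", "step": 0, "mood": 4, "fatigue": 2},
    {"athlete": "A", "step": 1, "mood": 3, "fatigue": 4},
    {"athlete": "A", "step": 2, "mood": 2, "fatigue": 5},
]
enriched = add_lag_features(rows)
for row in enriched:
    print(row)
```

A tree-based model cannot see the order of rows, so any information about the past has to be placed in the same row as the current answers, which is what the extra columns do.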
Enriching the features with this additional time step improved our RandomForestClassifier to a precision of 0.94 and a recall of 0.32. The model is still conservative: it misses many injuries, but when it does predict one, the prediction is usually correct, and the athlete should reduce training or take similar measures. The most important features were age and training intensity. Although age strongly influences the injury prediction, we think it is not age itself that drives it. Age more likely indicates where an athlete is in their career. An athlete in the middle or at the end of their career is more likely to train more intensively or take more risks to improve performance than a young athlete, which can result in a higher chance of injury.
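To make the two scores concrete, the short sketch below computes precision and recall from confusion-matrix counts. The counts are hypothetical and were chosen only so that the result roughly matches the reported 0.94 and 0.32; they are not taken from the project's data.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for the positive class, here 'injury'."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Hypothetical counts: 100 real injuries, 34 injury alerts, 32 of them correct.
precision, recall = precision_recall(
    true_positives=32, false_positives=2, false_negatives=68
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Read this way, a model with these scores raises few false alarms (about 1 in 17 alerts in this made-up example) but catches only about a third of the injuries that actually occur.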