Integrative data analysis to identify persistent post-concussion deficits and subsequent musculoskeletal injury risk: project structure and methods

Abstract
Concussions are a serious public health problem, with significant healthcare costs and risks. One of the most serious complications of concussions is an increased risk of subsequent musculoskeletal injuries (MSKI). However, there is currently no reliable way to identify which individuals are at highest risk for post-concussion MSKIs. This study proposes a novel data analysis strategy for developing a clinically feasible risk score for post-concussion MSKIs in student-athletes. The data set combines measures collected at a single time point with clinical assessments collected at multiple time points. The single-time-point measures are one-time tests (eg, mental health questionnaires), demographics, health history (including details regarding the concussion such as day of the year and time lost) and athletic participation (current sport and contact level). The clinical assessments (ie, cognitive, postural stability, reaction time and vestibular and ocular motor testing) were collected at baseline and at follow-up time points after the concussion. The follow-up measurements were treated both as individual variables and as differences from the baseline. Our approach used a weight-of-evidence (WoE) transformation to handle missing data and variable heterogeneity, and machine learning methods for variable selection and model fitting. We applied a training-testing sample splitting scheme and performed variable preprocessing with the WoE transformation. Machine learning methods were then applied to predict the MSKI indicator, thereby constructing a composite risk score for the training-testing sample. This methodology demonstrates the potential of using machine learning methods to improve the accuracy and interpretability of risk scores for MSKI.
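As an illustration of the WoE preprocessing step mentioned in the abstract, the following is a minimal sketch only, not the authors' implementation. The abstract does not give the exact formula, so the sketch assumes one common definition: for each category, the natural log of the share of events (MSKI = 1) divided by the share of non-events (MSKI = 0). Missing values are given their own category. The function names (fit_woe, apply_woe), the "MISSING" label and the smoothing constant are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict

MISSING = "MISSING"  # hypothetical label for the missing-value category


def fit_woe(values, outcomes, smoothing=0.5):
    """Estimate a weight-of-evidence value per category of one variable.

    values   : category labels (or pre-binned values); None means missing
    outcomes : 0/1 indicators (1 = event, e.g. a subsequent MSKI)
    smoothing: assumed additive constant that avoids log(0) in sparse cells
    """
    events = defaultdict(float)
    nonevents = defaultdict(float)
    for value, outcome in zip(values, outcomes):
        key = MISSING if value is None else value
        if outcome == 1:
            events[key] += 1
        else:
            nonevents[key] += 1

    categories = set(events) | set(nonevents)
    total_events = sum(events.values()) + smoothing * len(categories)
    total_nonevents = sum(nonevents.values()) + smoothing * len(categories)

    woe = {}
    for category in categories:
        share_events = (events[category] + smoothing) / total_events
        share_nonevents = (nonevents[category] + smoothing) / total_nonevents
        woe[category] = math.log(share_events / share_nonevents)
    return woe


def apply_woe(values, woe_map):
    """Replace each value with its WoE; categories not seen when fitting map to 0.0."""
    return [woe_map.get(MISSING if v is None else v, 0.0) for v in values]


# Toy data, invented for illustration only. Fit on the training split,
# then reuse the same mapping on the testing split.
train_values = ["high", "high", "low", "low", None, None]
train_outcomes = [1, 1, 0, 0, 1, 0]
woe_map = fit_woe(train_values, train_outcomes)
test_encoded = apply_woe(["high", None, "low"], woe_map)
```

Fitting the mapping on the training split alone and then applying it unchanged to the testing split keeps outcome information from the testing sample out of the preprocessing. Because every variable ends up on the same log-odds scale, with missingness encoded as just another category, this kind of transformation is one way to handle both missing data and heterogeneous variable types before model fitting.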
Description
This article was originally published in BMJ Open Sport & Exercise Medicine. The version of record is available at: https://doi.org/10.1136/bmjsem-2023-001859. © Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
Citation
Anderson M, Claros CC, Qian W, et al. Integrative data analysis to identify persistent post-concussion deficits and subsequent musculoskeletal injury risk: project structure and methods. BMJ Open Sport & Exercise Medicine 2024;10:e001859. doi: 10.1136/bmjsem-2023-001859