Software

PCPP is a MATLAB application for the detection of abnormal infant movements associated with cerebral palsy. The system uses 2D skeletal data extracted from videos and consists of a full pipeline providing data pre-processing, data normalization, feature extraction and classification. Evaluation metrics, such as accuracy, sensitivity, specificity, F1 score and Matthews Correlation Coefficient (MCC), are computed to facilitate full assessment of performance and allow for comparison with other methods from the literature. These evaluations are conducted on the MINI-RGBD and RVI-38 datasets using the code and data provided.


Introduction
Cerebral Palsy (CP) is one of the most common motor and movement disabilities in childhood, and has a lifetime impact on people with the condition. To identify infants who are at risk of CP, diagnostic tools such as the General Movements Assessment (GMA) [1] can be used at an early stage in development prior to further assessment (e.g. Magnetic Resonance Imaging (MRI)) [2]. Currently, the GMA is carried out manually by highly qualified assessors, requiring extensive experience to identify atypical infant movement patterns. However, this manual assessment can be sensitive to observer fatigue, as well as the subjectivity of the assessor. Additionally, significant investment in both time and money is required for assessors to reach a suitable level of diagnostic accuracy. As a result, researchers in this field have been actively conducting interdisciplinary work which combines clinical knowledge, computer vision and artificial intelligence.
DOI of original article: https://doi.org/10.1109/TNSRE.2021.3138185. The code (and data) in this article has been certified as Reproducible by Code Ocean (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.
Early work in this area [3,4] typically focuses on extracting image features, such as optical flow, from videos of an infant's spontaneous movements. However, the low-level features extracted from image appearance using these methods are highly sensitive to variations caused by body size, clothing, camera movement, and external anomalies in shot. To address these issues, the use of pose estimation algorithms, such as OpenPose [5], has provided encouraging results using skeletal data for abnormal movement detection, as reported in several recent works [6–11]. The software provided here, namely Pose-based Cerebral Palsy Prediction (PCPP), is developed as part of the abnormal infant movements detection pipeline recently proposed in [6,11].
The effectiveness of detecting abnormal infant movements using this software has been rigorously evaluated using a real-world dataset gathered in a clinical setting [11]. By making this software available to the public, along with the full implementation details provided in this article, we aim to further stimulate scientific research and industrial development in this area.

System overview
In this system, the input file format is JSON (JavaScript Object Notation), since OpenPose [5] uses JSON as the default output format for 2D pose estimation on each frame of a video. OpenPose [5] supports three different pose estimation versions, predicting the locations of either 15, 18 or 25 keypoints on the body. We selected the 25-keypoint version (Fig. 1) since this contains the most information and is suggested as being the most accurate variant in the literature. However, our system only uses the pose and motion features from 15 of the extracted keypoints, as in [11], because less relevant keypoints (such as facial landmarks and feet) are excluded from the GMA. In the following subsections, the details of each module in our system are discussed.
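To make the input format concrete, the following is a minimal Python sketch (not the MATLAB code shipped with PCPP) of reading one OpenPose frame. It assumes the common OpenPose layout in which each detected person carries a flat `pose_keypoints_2d` list of (x, y, confidence) triples, and it assumes that the 15 retained keypoints are the first 15 entries of the 25-keypoint output. The function name `read_openpose_frame` is hypothetical.

```python
import json


def read_openpose_frame(json_text, num_joints=15):
    """Return [(x, y, confidence), ...] for the first detected person.

    Assumption: the retained joints are the first `num_joints` entries of
    the 25-keypoint output; the mapping used by PCPP may differ.
    """
    frame = json.loads(json_text)
    people = frame.get("people", [])
    if not people:
        # No person was detected in this frame.
        return []
    flat = people[0]["pose_keypoints_2d"]
    # Regroup the flat list [x0, y0, c0, x1, y1, c1, ...] into triples.
    triples = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
    return triples[:num_joints]
```

Frames with no detection yield an empty list, so a caller can treat them as missing data to be handled by the pre-processing stage.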

Data preprocessing
As discussed, the pose is estimated using real-world videos recorded during routine clinical care. However, the raw pose data (i.e. 2D joint locations) generated using OpenPose [5] can be noisy due to uncontrolled lighting conditions and self-occlusion of the infant's body parts. Furthermore, OpenPose [5] extracts poses on a frame-by-frame basis. As a result, the estimated poses in consecutive frames can be discontinuous.
To alleviate this problem, we replace estimated poses that have a low confidence score using 1-D data interpolation, as illustrated in Algorithm 1. Specifically, we calculate the average confidence score for each joint over all frames in the video. Next, estimated joint locations with a confidence score lower than the average confidence score − 0.07 are discarded and replaced with the results of interpolation. Here, the 1-D data interpolation function interp1 in MATLAB is used with an interpolation method in which the slope on the interval between $x_i$ and $x_{i+1}$ is determined from a set of control points, where $n$ is the number of points, $d_i$ is the derivative at the sample point $x_i$, and $w_1$ and $w_2$ are weights. Finally, the motion is further smoothed by applying the moving mean function movmean in MATLAB with a 5-frame sliding window to ensure the continuity of the joint trajectories. An example of a noisy pose, corrected using our proposed data pre-processing, is illustrated in Fig. 2.
Algorithm 1 (core step): if the confidence score of the current frame < average confidence score − 0.07, then the joint location is replaced by the interpolated value.
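The pre-processing steps can be sketched for a single joint coordinate as follows. This is an illustrative Python sketch rather than the MATLAB implementation: plain linear interpolation is swapped in for the interp1 method used in the software, and the moving mean is assumed to use a centred window that is truncated at the ends of the sequence. The function names are hypothetical.

```python
def interpolate_low_confidence(values, confidences, margin=0.07):
    """Replace samples whose confidence is below (mean confidence - margin).

    Assumption: linear interpolation between the nearest reliable frames
    stands in for the MATLAB interp1 call used by the software.
    """
    threshold = sum(confidences) / len(confidences) - margin
    good = [i for i, c in enumerate(confidences) if c >= threshold]
    out = list(values)
    for i, c in enumerate(confidences):
        if c >= threshold:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            out[i] = values[right]   # no reliable frame before: hold the next one
        elif right is None:
            out[i] = values[left]    # no reliable frame after: hold the previous one
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] + t * (values[right] - values[left])
    return out


def moving_mean(values, window=5):
    """Centred moving mean; the window is truncated at the sequence ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out
```

In this sketch, each x and y trajectory of each joint would be passed through `interpolate_low_confidence` and then `moving_mean`, using that joint's per-frame confidence scores.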

Pose normalization
After data pre-processing, the holistic translation of each pose is corrected by subtracting the 2D location of the root joint from all body joints. In doing so, the 2D coordinates of the root joint are always at the origin (i.e. (0, 0)) to facilitate comparison of poses between different frames and sequences.
We further normalize the data by aligning the spinal column (i.e. the central line between joints 2 (sternum) and 9 (root)) with the vertical axis of the coordinate system. Specifically, we calculate the rotation $\theta_t$ required at frame $t$ to align the spinal column with the vertical axis $\mathbf{v} = (0, 1)$, i.e. the angle between $\mathbf{p}_2 - \mathbf{p}_9$ and $\mathbf{v}$, where $\mathbf{p}_2$ and $\mathbf{p}_9$ are the filtered 2D coordinates of joints 2 and 9, respectively, and the cross product $(\mathbf{p}_2 - \mathbf{p}_9) \times \mathbf{v}$ is used to determine the direction of the rotation (i.e. clockwise or counter-clockwise). Finally, the normalized position of each joint $j$ is computed by applying this rotation to its root-centred coordinates, where $j \in [1, 15]$.
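The two normalization steps (root-centring and spine alignment) can be illustrated with the short Python sketch below. It is a sketch under stated assumptions, not the MATLAB implementation: joints are stored 0-based, so joints 2 and 9 of the text become indices 1 and 8, and the signed angle is obtained with `atan2`, which combines the dot product (magnitude) and the 2D cross product (direction) in a single call. The function name `normalize_pose` is hypothetical.

```python
import math


def normalize_pose(joints, sternum=1, root=8):
    """Translate the root joint to the origin and rotate the spine onto (0, 1).

    `joints` is a list of (x, y) tuples for one frame. Indices are 0-based,
    so the defaults correspond to joints 2 and 9 in the text (an assumption
    about how the data is stored).
    """
    rx, ry = joints[root]
    centred = [(x - rx, y - ry) for x, y in joints]
    sx, sy = centred[sternum]
    # Signed angle that rotates the spine vector (sx, sy) onto v = (0, 1):
    # cross(spine, v) = sx and dot(spine, v) = sy.
    theta = math.atan2(sx, sy)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(x * cos_t - y * sin_t, x * sin_t + y * cos_t) for x, y in centred]
```

After normalization the root joint lies at the origin and the sternum lies on the positive vertical axis, at a distance equal to the spine length.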

Feature extraction
Using the normalized pose data, a wide range of histogram-based feature descriptors can be extracted. Readers are referred to [6,11] for the design and details of each feature descriptor. In the implementation provided in this software system, both the 8-bin and 16-bin versions are extracted for each type of feature descriptor.
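As a rough illustration of what a histogram-based descriptor with a configurable number of bins looks like, the Python sketch below bins the orientations of 2D vectors (for example, limb directions or frame-to-frame joint displacements) into a normalized histogram. It is a generic example only and does not reproduce the descriptors of [6,11]; the function name and the choice of orientation as the binned quantity are assumptions.

```python
import math


def orientation_histogram(vectors, bins=8):
    """Normalized histogram of the orientations of 2D vectors.

    Angles are measured in [0, 2*pi) and split into `bins` equal sectors.
    The result sums to 1, so sequences of different lengths are comparable.
    """
    counts = [0] * bins
    width = 2 * math.pi / bins
    for dx, dy in vectors:
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Guard against a floating-point angle landing exactly on 2*pi.
        counts[min(int(angle / width), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins
```

Calling the function twice, with `bins=8` and `bins=16`, yields a coarse and a fine version of the same descriptor, mirroring the two resolutions extracted by the software.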

Classification
Finally, the detection of abnormal infant movement is formulated as a binary classification of the extracted feature descriptor. MATLAB built-in classifiers are used, including Support Vector Machines (SVMs), Decision Trees, Nearest Neighbour (1-NN and 3-NN), Linear Discriminant Analysis (LDA), Ensemble, and Logistic Regression. In the implementation, we follow the experimental settings of [6,11] and use leave-one-subject-out cross-validation. Based on the classification results, our system outputs a wide range of evaluation metrics, including classification Accuracy, Sensitivity, Specificity, Precision, Recall, F1 Score, and Matthews Correlation Coefficient (MCC).
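The evaluation protocol can be sketched in Python as follows. The sketch shows how leave-one-subject-out folds can be generated and how the listed metrics follow from the confusion-matrix counts using their standard definitions (Recall is identical to Sensitivity). It is independent of the MATLAB classifiers used in the software; the function names are hypothetical, and the convention that label 1 denotes the abnormal class is an assumption.

```python
import math


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) once per subject."""
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test


def binary_metrics(y_true, y_pred):
    """Standard binary metrics; label 1 is taken as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Report 0.0 for undefined ratios (empty denominator).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # also called recall
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "recall": sensitivity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
        "mcc": ratio(tp * tn - fp * fn,
                     math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))),
    }
```

With one recording per infant, each fold holds out a single sample; the per-fold predictions are pooled and passed to `binary_metrics` once to obtain the overall scores.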

Software impact
As mentioned in Section 1, GMA is currently carried out by experienced assessors. Automating the GMA using computer vision and machine learning techniques can greatly reduce the resources required from health organizations and such resources can be re-distributed to further benefit patients. In addition to automating the GMA, the implementation of the GMA-inspired pose and motion features can potentially be used to visualize the abnormal movement patterns of infants, as demonstrated in [8,9]. This further opens the door to improving the interpretability of the prediction results obtained from the machine learning based algorithms. By moving towards explainable AI, additional feedback can be provided to clinical experts and AI researchers to improve AI-based healthcare applications, further enhancing their robustness and reliability.
Furthermore, the algorithms proposed in the literature in this area are typically not available to the public. As a result, it is difficult for researchers to compare the performance of newly proposed methods with existing work, and extra effort is required to re-implement the algorithms, as we have done in our recent work [11]. The availability of data is another significant challenge in this area: the lack of benchmark or publicly available datasets makes it difficult to evaluate the effectiveness of each proposed system. In this system, we share both the code and the anonymized skeletal data, which enables the evaluation of our method on the MINI-RGBD [12] and RVI-38 [11] datasets. We strongly believe that making our code and data available to the public will have a positive impact on future research in related areas.
In future versions, we would like to provide a Python version of the system to allow further comparisons, as well as integration with other deep learning based approaches such as [7,8,10]. We will also explore the feasibility of integrating visualization features, such as those proposed in [8–10], to enhance the interpretability of the abnormal infant movement detection system.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.