
Salvatore MD, Miller KM, Green NL, Gassman BJ, Ghazinour K, Shirima E and Saunders NW. Validity and feasibility of a cell phone application for the assessment of gait and other functional fitness outcomes. Phys Ther Rehabil. 2018; 5:19. http://dx.doi.org/10.7243/2055-2386-5-19
Megan D. Salvatore1, Kennady M. Miller2†, Natasha L. Green2†, Brianna J. Gassman2†, Kambiz Ghazinour3, Emil Shirima3 and Nathan W. Saunders4*
*Correspondence: Nathan W. Saunders, Saundenw@mountunion.edu
†These authors contributed equally to this work.
1. Faculty, Department of Physical Therapy, University of Mount Union, USA.
2. Student, Department of Human Performance and Sport Business, University of Mount Union, USA.
3. Department of Computer Science, Kent State University, USA.
4. Faculty, Department of Human Performance and Sport Business, University of Mount Union, USA.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Background: Our lab previously demonstrated that manual assessment (with a cell phone stopwatch) of the 30-s Chair Stand, 8-ft Up-and-Go, and 6 Minute Walk Test was as valid and reliable as, or more so than, the use of body-worn sensors. Based on this finding, we developed an Android cell phone application (App) to improve the feasibility of data collection and to automatically calculate outcomes. The main purpose of the present study was to validate the App against slow-motion video analysis for the assessment of outcomes from the 30-s Chair Stand, 8-ft Up-and-Go, and 400 m Walk Test.
Methods: A subset of 15 subjects was randomly selected from our larger Steps Taken Against Neuromuscular Decline (STAND) Initiative cohort for this validity assessment. Each subject completed a single trial of the 30-s Chair Stand, 8-ft Up-and-Go, and 400 m Walk Test (in that order). One investigator assessed the subject using the App, while another filmed the test at 240 frames per second. The videos were played back frame-by-frame to manually timestamp the same events that were timestamped live using the App. The mean difference between the App and video (bias) was assessed with paired t-tests, and 95% Limits of Agreement were established to represent the random error between the two measures.
Results: No significant bias between the App and slow-motion video was detected for any 30-s Chair Stand or 400 m Walk Test outcome, or for the total time taken to complete the 8-ft Up-and-Go. Paired t-tests did, however, suggest a numerically small but systematic negative bias for all other 8-ft Up-and-Go outcomes, indicating that subcomponents of the 8-ft Up-and-Go are timed as slightly faster by the App than by slow-motion video. The 95% Limits of Agreement were narrowest for the 30-s Chair Stand and 400 m Walk Test components, as well as the total time on the 8-ft Up-and-Go, with broader limits for the other 8-ft Up-and-Go components.
Conclusions: The App validly assessed all 30-s Chair Stand and 400 m Walk Test outcomes, as well as the total time taken to complete the 8-ft Up-and-Go. Given that the App is as valid as, or more valid than, existing technologies with respect to the outcomes it reports, its greater feasibility may make it an appropriate addition to clinical and fitness settings.
Keywords: Senior Fitness Test, 30-s Chair Stand, 8-ft Up-and-Go, 400 m Walk Test, gait assessment, older adults
With the population of older adults (age 65 and older) in the United States expected to more than double in size over the next 50 years [1,2], our healthcare system is struggling to prepare for this shift. It has become critical to adapt our paradigm of healthcare from the treatment of existing diseases to one of disease prevention and the promotion of health and wellness. In order to do this, we need to better understand the characteristics of healthy aging so that we know which ones have the biggest impact on quality of life. As we collect information about strength, balance, gait, and functional mobility in this population, it has become important to develop tests and measures with normative data and criterion-referenced standards that are reliable and valid, but also easy to administer at relatively low cost [3].
New technologies that incorporate and analyze data from sources like computerized gait analysis systems, force plates, accelerometers, motion capture systems, cell phone applications, and body-worn sensors have provided many new ways to quantify human movement during transfers (e.g., sit-to-stand and stand-to-sit) and during gait. This allows researchers and clinicians to analyze the subcomponents of functional activity tests in order to determine which ones have the largest impact on performance of the overall tasks. While many of these new technologies have proven to be reliable and valid for measuring components of gait and functional mobility tasks [4-10], their feasibility of use varies widely. Some of the equipment used for instrumented testing is cost-prohibitive, or requires a dedicated research space and specially-trained testers. Some of the equipment is subject to technical glitches that can impact data collection during testing [11]. Additionally, the technology needs to remain stable enough to track an individual's change in function over time, or to compare results at two different points in time in order to determine the effect of an intervention. Equipment that relies on continually updated technology can therefore make it difficult to analyze changes over time.
Researchers and clinicians need an evaluation tool that will allow them to collect functional assessment data and outcomes from a wide variety of testing centers that represent patient populations across the continuum of healthcare settings. Ideally, this tool would not only be reliable and valid, but would also be affordable, easily adaptable to varying testing environments and set-ups, and easy to operate accurately with very little tester training and experience.
The purpose of the present study was to investigate the validity, inter-rater reliability, and feasibility of an Android cell phone application (App) that we developed to assess the 30-s Chair Stand, 8-ft Up-and-Go, and the 400 m Walk Test. It was hypothesized that the App would demonstrate good validity, compared with slow-motion video analysis, but would demonstrate much better feasibility.
Subjects
A subset of 15 subjects was randomly selected from our larger STAND Initiative cohort for this validity assessment. Subjects were required to be at least 40 years of age, and no exclusion criteria were applied. Table 1 shows the broad range of age, height, weight, and fitness of these subjects. This study was approved by the University of Mount Union Institutional Review Board, and all subjects signed an informed consent form prior to participation.
Table 1: Subject characteristics.
Cell Phone Application
An Android cell phone application (App) was developed to assess the 30-s Chair Stand, 8-ft Up-and-Go, and 400 m Walk Test (the 30-s Chair Stand and 8-ft Up-and-Go implemented according to the protocol developed by Rikli and Jones [3]). The App permits the user to manually timestamp events (e.g., sitting or standing position, heel strike, or when a person’s line of gravity intersects a marker) during these tests. At the conclusion of each test, the App then automatically calculates and reports the traditional test outcomes (i.e., number of chair stands in 30 seconds, time to complete the 8-ft Up-and-Go, and time to complete the 400 m Walk Test), as well as subcomponents of those tests (see Tables 2 and 3).
Table 2: Agreement between a cell phone application and slow-motion video analysis for the scoring of functional fitness tests.
Table 3: Inter-rater reliability of functional fitness test outcomes using the cell phone application.
For the 30-s Chair Stand, the investigator tapped the Start button on the command “GO.” Tapping the Start button initiated a 30 second countdown. The investigator then tapped a button every time the subject stood, and every time they sat. The App automatically stopped after 30 seconds, regardless of the subject’s position.
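The App's source code is not published with this paper, but the outcome calculations follow directly from the tap protocol above. The following Kotlin sketch (Kotlin being a natural fit for an Android app) illustrates how the 30-s Chair Stand outcomes could be derived from the recorded tap timestamps; all type and function names are our own hypothetical choices, not the App's actual implementation.

```kotlin
// Hypothetical reconstruction of the 30-s Chair Stand scoring, derived from
// the tap protocol described above. Timestamps are seconds relative to "GO".
data class ChairStandTaps(
    val standTimes: List<Double>, // one tap each time the subject reaches standing
    val sitTimes: List<Double>    // one tap each time the subject returns to sitting
)

fun scoreChairStand(taps: ChairStandTaps) {
    // Traditional outcome: the number of stands completed in 30 seconds.
    val stands = taps.standTimes.size
    val sits = taps.sitTimes.size

    // Subcomponent: sit-to-stand time i runs from sit tap i-1 (or from "GO" for
    // the first repetition, since the subject starts seated) to stand tap i.
    val sitToStand = taps.standTimes.mapIndexed { i, t ->
        t - (if (i == 0) 0.0 else taps.sitTimes[i - 1])
    }
    // Subcomponent: stand-to-sit time i runs from stand tap i to sit tap i.
    val standToSit = taps.sitTimes.mapIndexed { i, t -> t - taps.standTimes[i] }

    println("stands=$stands sits=$sits " +
            "meanSitToStand=${"%.2f".format(sitToStand.average())} s " +
            "meanStandToSit=${"%.2f".format(standToSit.average())} s")
}
```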
The setup for the 8-ft Up-and-Go is depicted in Figure 1. The floor was taped 1.5 feet from the front of the chair (line A) and 1.5 feet from the end of the 8-ft course (line B), leaving 5 feet between the lines for the assessment of steady-state gait speed. An object was placed on the floor 8 feet from the front edge of the chair to indicate the 180 degree turn. The investigator tapped the Start button on the command "GO." They then tapped a button each time the subject's line of gravity intersected lines A and B (twice on the walk away from the chair, and twice on the walk back toward the chair). The investigator tapped the Stop button once the subject returned to a seated and upright position.
Figure 1: Methodology for the assessment of 8-ft Up-and-Go outcomes using the cell phone application.
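Under the same caveats as the previous sketch, the 8-ft Up-and-Go subcomponents reduce to simple arithmetic on the crossing timestamps. Below, steady-state gait speed is the marked 5-ft distance divided by the A-to-B crossing time, averaged over the out and back passes; treating the time between the two line B crossings as the turn duration is our assumption about one plausible operationalization (it also includes the 1.5-ft approach on either side of the turn marker).

```kotlin
// Hypothetical reconstruction of 8-ft Up-and-Go subcomponents from the four
// line-crossing timestamps (seconds): A then B walking out, B then A walking back.
const val SEGMENT_M = 5 * 0.3048 // the 5 ft between lines A and B, in meters

fun upAndGoGaitSpeed(aOut: Double, bOut: Double, bBack: Double, aBack: Double): Double {
    val speedOut = SEGMENT_M / (bOut - aOut)    // m/s on the walk away from the chair
    val speedBack = SEGMENT_M / (aBack - bBack) // m/s on the walk back to the chair
    return (speedOut + speedBack) / 2.0
}

// Assumed operationalization: turn duration bracketed by the two crossings of
// line B (includes the short approach to and from the turn marker).
fun turnDuration(bOut: Double, bBack: Double) = bBack - bOut
```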
The setup for the 400 m Walk Test is depicted in Figure 2 (adapted from Salvatore et al. [11]). The course was 20 m long and had tape on the floor 2.5 m from each end, leaving a 15 m straightaway for the assessment of steady-state gait speed and cadence. Subjects were asked to walk 20 lengths of the course as quickly as possible without running. The investigator tapped the Start button on the command “GO.” The investigator then tapped a button each time the subject’s line of gravity intersected lines A and B, and tapped a button for each of 5 consecutive strides between the interior lines (this was done for all 20 lengths). A Stop button was pressed at the completion of 20 lengths.
Figure 2: Methodology for the assessment of 400 m Walk Test outcomes using the cell phone application.
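The 400 m Walk Test outcomes follow the same pattern, computed per length of the course. One detail the protocol leaves open is exactly how taps map to strides; the sketch below assumes that n stride taps bound n - 1 strides and that one stride equals two steps, both of which are our assumptions rather than stated features of the App.

```kotlin
// Hypothetical reconstruction of per-length 400 m Walk Test subcomponents.
const val STRAIGHTAWAY_M = 15.0 // meters between the interior lines A and B

// Steady-state gait speed (m/s) for one length, from the A and B crossing times (s).
fun lengthGaitSpeed(aCross: Double, bCross: Double) =
    STRAIGHTAWAY_M / (bCross - aCross)

// Cadence (steps/min) from the timed strides within one length. Assumes taps
// fall on successive ipsilateral heel strikes (n taps bound n - 1 strides)
// and that 1 stride = 2 steps.
fun lengthCadence(strideTaps: List<Double>): Double {
    require(strideTaps.size >= 2) { "need at least two taps to time a stride" }
    val strides = strideTaps.size - 1
    val elapsed = strideTaps.last() - strideTaps.first()
    return strides * 2 / elapsed * 60.0
}
```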
Validation
Each subject completed a single trial of the 30-s Chair Stand, 8-ft Up-and-Go, and 400 m Walk Test (in that order). One investigator assessed the subject using the App, while another filmed the test at 240 frames per second. The App automatically provides onboard results at the completion of each test. The videos were played back frame-by-frame to manually timestamp the same events that were timestamped live using the App. The investigator analyzing the videos was blind to the App results.
Inter-Rater Reliability
Unlike some cell phone applications and body-worn sensors that automatically assess movement characteristics with accelerometers and gyroscopes, the App evaluated here must be manually operated by the investigator, which potentially introduces a source of user error. It was therefore important to assess inter-rater reliability. To achieve this aim, 5 investigators (one Exercise Physiologist, one Physical Therapist, and three Physical Therapy students) with a wide range of experience using the App simultaneously evaluated 15 trials of the 30-s Chair Stand, 8-ft Up-and-Go, and 400 m Walk Test. The 15 trials for each test were completed by a single subject who was instructed to perform the first trial at a very slow speed and then to incrementally increase speed on each of the remaining 14 trials. In doing so, the reliability analysis would be relevant to a broad population, including frail and very fit individuals.
Statistical Analyses
All statistical analyses were performed using SPSS version 24. Given that the acceptable level of agreement between two measurement tools is situation-specific and must be evaluated based on the analytical goals and expected test-retest or individual differences [12], several different statistical analyses were performed for each fitness test outcome. Coefficient of Variation (CV) was calculated as 100 times the standard deviation of the two measures divided by the mean of the two measures. The correlation between the App and slow-motion video was assessed using Pearson's Product-Moment Correlation Coefficients. Systematic bias between the two measures was calculated as the mean difference between the two measures and was tested for statistical significance using a paired t-test. The 95% Limits of Agreement (LOA) were calculated as the mean difference +/- 1.96 times the standard deviation of the differences between the two measures [13]. Though not presented in this paper, Bland-Altman plots were created for each fitness test outcome, presenting the mean of the two measures vs. the difference between the two measures for each subject [13]. Linear regression was used to test for proportional bias (i.e., the difference between measures depending on the magnitude of the results). Intraclass Correlation Coefficients (ICC) and CV were used to assess inter-rater reliability. Significance was established a priori at alpha=0.05.
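For readers who want to reproduce the agreement statistics outside of SPSS, the following sketch implements the CV, bias, paired t statistic, and 95% LOA as defined above, on placeholder data (the study's raw data are not reproduced here; the p-value would come from the t distribution with n - 1 degrees of freedom, which is omitted).

```kotlin
import kotlin.math.abs
import kotlin.math.sqrt

// Minimal re-implementation of the agreement statistics described above.
// app[i] and video[i] are the paired measurements of one outcome for subject i.
fun agreementStats(app: DoubleArray, video: DoubleArray) {
    val n = app.size
    val diffs = DoubleArray(n) { app[it] - video[it] }
    val pairMeans = DoubleArray(n) { (app[it] + video[it]) / 2.0 }

    // Coefficient of Variation (%): per-pair SD over per-pair mean, times 100,
    // averaged across subjects. For two values, SD = |difference| / sqrt(2).
    val cv = (0 until n).map { 100.0 * (abs(diffs[it]) / sqrt(2.0)) / pairMeans[it] }
        .average()

    // Systematic bias: mean difference (tested with a paired t-test in SPSS;
    // only the t statistic is computed here).
    val bias = diffs.average()
    val sdDiff = sqrt(diffs.map { (it - bias) * (it - bias) }.sum() / (n - 1))
    val t = bias / (sdDiff / sqrt(n.toDouble()))

    // 95% Limits of Agreement (Bland & Altman): bias +/- 1.96 * SD of differences.
    val loaLow = bias - 1.96 * sdDiff
    val loaHigh = bias + 1.96 * sdDiff

    println("CV=%.2f%%  bias=%.3f  t=%.2f  95%% LOA=[%.3f, %.3f]"
        .format(cv, bias, t, loaLow, loaHigh))
}

fun main() {
    // Placeholder data only, for illustration.
    val app = doubleArrayOf(12.0, 14.0, 11.0, 13.0, 15.0)
    val video = doubleArrayOf(12.0, 13.0, 11.0, 14.0, 15.0)
    agreementStats(app, video)
}
```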
Validity
Table 2 shows the level of relative and absolute agreement between the App and slow-motion video analysis of the fitness test outcomes. The CV was less than 2% for all 30-s Chair Stand and 400 m Walk Test outcomes. While slightly higher for the 8-ft Up-and-Go outcomes, the CV was still less than 10%. Pearson’s Product-Moment Correlation Coefficients, a measure of relative reliability, were excellent for all outcomes (r>0.9).
No significant bias between the App and slow-motion video was detected for any 30-s Chair Stand or 400 m Walk Test outcome, or for the total time taken to complete the 8-ft Up-and-Go. Paired t-tests did, however, suggest a numerically small but systematic negative bias for all other 8-ft Up-and-Go outcomes (bold font values), indicating that subcomponents of the 8-ft Up-and-Go are timed as slightly faster by the App than by slow-motion video. The 95% LOA are a measure of random error, with a broader range indicating more random error. Consistent with the CV and paired t-test results, the 95% LOA were narrowest for the 30-s Chair Stand and 400 m Walk Test components, as well as the total time on the 8-ft Up-and-Go, with broader limits for the other 8-ft Up-and-Go components. Linear regression suggested no evidence of proportional bias for any outcome, indicating that the differences between the App and slow-motion video were similar for all participants, regardless of test performance.
Inter-Rater Reliability
Inter-rater reliability results are shown in Table 3. The only CV greater than 1.1% was for the sit-to-stand component of the 8-ft Up-and-Go, and even that value was still quite good at 3.7%. Likewise, the lowest ICC for any outcome was 0.988. Collectively, these results suggest the 5 raters exhibited excellent inter-rater reliability.
We developed an Android cell phone application (App) to provide a feasible means of assessing multiple components of common functional fitness tests. The purpose of this study was to validate the outcomes delivered by the App against slow-motion video analysis. The App was statistically no different from video analysis for all 30-s Chair Stand and 400 m Walk Test outcomes, and for the total time taken to complete the 8-ft Up-and-Go. However, a small systematic negative bias was indicated for all other 8-ft Up-and-Go outcomes. The App also exhibited excellent inter-rater reliability for all outcomes.
Many studies focus their validation analysis on the mean difference in an outcome between two measures, and a small mean difference is presented as evidence of good agreement. To an extent that is true because a small mean difference suggests little systematic bias, but the mean difference alone should be interpreted cautiously for two reasons. First, a mean difference of zero can be achieved when there is an even mixture of very large positive and negative differences. Second, the mean difference only accounts for systematic error, not random error. It is for these reasons that the 95% Limits of Agreement (LOA) should be reported along with the mean difference, as it is expected that 95% of the differences between two measurement outcomes will be within these limits.
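In Bland and Altman's notation [13], writing d̄ for the mean of the paired differences and s_d for their standard deviation, the limits are

95% LOA = d̄ +/- 1.96 × s_d

As a worked illustration with our own numbers rather than the study's: differences with d̄ = 0.10 s and s_d = 0.20 s give limits of 0.10 +/- 0.39 s, i.e., -0.29 to 0.49 s, so individual disagreements several times larger than the bias are still expected.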
30-s Chair Stand Test
The mean difference in the number of chair stands between the App and video in the present study (0.1 stands) was smaller than the difference between body-worn sensors and manual counting (0.5 stands) reported by Salvatore et al. [11] and the difference between an automated cell phone application and manual counting (-0.6 stands) reported by Adusumilli et al. [4]. More importantly, our 95% LOA (-0.6 to 0.8 stands) were much narrower than both studies (-3.0 to 2.0 stands, and -3.9 to 2.6 stands, respectively).
The mean difference in sit-to-stand time between the App and video (0.03 s) and the 95% LOA (-0.12 to 0.19 s) were also better than those between a body-worn sensor and a force plate [14] (bias=0.1 s, 95% LOA=-0.25 to 0.5 s), similar to those between an automated cell phone application and a force plate [8] (bias=0.004 s, 95% LOA=-0.1 to 0.1 s), and better than those between a body-worn sensor and video [7] (bias=0.06 to 0.18 s, no 95% LOA reported). To our knowledge, we are the first to report the number of sits and the stand-to-sit time, but the magnitude of error for those measures was quite comparable to what we reported for the number of stands and sit-to-stand time.
8-ft Up-and-Go
Several others have studied the agreement between two methods for the assessment of the total time taken to complete the 8-ft Up-and-Go [11] or some version of the Timed Up-and-Go [15-18]. The mean difference between the App and video here (0.08 s) was as good as or better than what has been reported elsewhere [11,15,18]. Dubois et al. [16] reported a negligible systematic bias between body-worn sensors and a stopwatch (-0.001 s), but the 95% LOA were wide relative to ours (-2.242 to 2.240 s vs. -0.459 to 0.305 s). Mellone et al. [17] reported a lower mean difference (0 s) and narrower 95% LOA (-0.045 to 0.040 s) for the comparison of an automated cell phone application with a body-worn sensor, but it should be noted that the cell phone and sensor were affixed to the same location on the body in tandem, and both devices used the same sensor technology (i.e., an accelerometer). Both devices reporting the same information is not evidence that either is reporting valid information. Importantly, larger differences have been observed when comparing body-worn sensors to other technologies [11,15,16,18].
Like the present study, Beyea et al. [15] additionally investigated the sit-to-stand time, gait speed, turn duration, and stand-to-sit time. While the differences were statistically significant, the App here exhibited a smaller mean difference from video analysis than the difference between body-worn sensors and motion capture reported by Beyea et al. [15] for the sit-to-stand time (-0.09 s vs. -0.14 s, respectively), turn duration (-0.10 s vs. -0.23 s, respectively), and the stand-to-sit time (-0.14 s vs. -0.21 s, respectively). However, the body-worn sensors and motion capture agreed better than the App and video with respect to average gait speed (-0.017 m/s vs. -0.103 m/s, respectively). Unfortunately, Beyea et al. [15] did not report 95% LOA, so we are unable to compare the random error observed here with theirs.
400 m Walk Test
Given that we are the first to assess steady-state gait speed, cadence, and turn duration during the 400 m Walk Test, our findings must be discussed relative to other, much shorter gait tests. The steady-state gait speed of the App was only 0.002 m/s different from that of slow-motion video analysis. To put this into perspective, for someone walking at 1.500 m/s (assessed by the App) and 1.498 m/s (assessed by slow-motion video), the times to complete a 400 m Walk Test would be 266.6 s vs. 267.0 s. Even at the 95% LOA (-0.03 to 0.03 m/s), the time difference would only be 5.5 s.
Good gait speed agreement has also been shown between a Multi-Kinect v2 Instrumented 10-Meter Walkway and motion capture [19] (0.011 m/s for comfortable walking speed and 0.006 m/s for fast walking speed), between a GAITRite mat and body-worn sensors [20] (0.024 m/s for slow, -0.050 m/s for normal, and -0.302 m/s for fast walking speeds), and between Microsoft Xbox One Kinect and motion capture [21] (0.010 m/s for comfortable walking speed, and 0.000 m/s for fast walking speed). The 95% LOA, for all studies that reported them, were relatively narrow and comparable to what was observed here.
The difference in cadence between the App and video here (0.5 steps/min) is well supported by other studies, having reported differences in the range of 0 to 2 steps/min [11,19,22,23].
App Usability
More important than the statistical agreement between two measures, though, is how much systematic and random error is tolerable in a specific setting, and how the magnitude of that error relates to the feasibility of the measurement protocol (i.e., is a slightly more valid measure worth the additional purchase cost, expertise, and restricted laboratory space that may be needed?). We suggest that as long as the error in measurement is much smaller than the expected change, the error is sufficiently small, regardless of a statistical finding. We previously published the minimum detectable change for many of the outcomes discussed here [11]. In that study, subjects completed two trials of the 30-s Chair Stand, 8-ft Up-and-Go, and 6 Minute Walk Test, with a 1-hour rest between trials. The minimum detectable change values derived were: 2.6 chair stands, 1.02 s for the time taken to complete the 8-ft Up-and-Go, 0.100 m/s for the steady-state gait speed during the walk test, 7.0 steps/min for cadence during the walk test, and 0.34 s for turn duration during the walk test.
Only the App had 95% LOA that were smaller than the minimum detectable change for the 30-s Chair Stand and for the total time on the 8-ft Up-and-Go. The App, and all technologies reviewed here, exhibited error in the assessment of gait speed and cadence that was smaller than the minimum detectable change. The App was the only technology to assess turn duration during the 400 m Walk Test, and its 95% LOA were much smaller than the minimum detectable change.
Given that there are many valid technologies available for the assessment of functional fitness, we would argue that the selection of a technology should be based primarily on feasibility. The App does not require 1) the purchase of equipment, 2) set-up time or calibration, 3) equipment maintenance, 4) post-test processing of data, 5) the person being tested to be dressed a certain way, 6) the person being tested to wear a device or marker, or 7) an isolated testing space. In fact, it should be noted that the data for this validation study were collected in the heavily traversed hallway of a university academic building. It was often the case that classes let out while a subject was being tested, an event that did not disrupt the assessment process. It is also worthy of comment that the excellent inter-rater reliability exhibited for all outcomes was achieved with the 5 raters using different phones and Android software versions.
Limitations
It would be irresponsible to omit the limitations of our validation study design and the App itself. First, while slow-motion video analysis at 240 frames per second was a good standard, motion capture would have provided a gold-standard reference. Though we did have access to that technology, it would have been impractical to validate the 400 m Walk Test in that environment. Second, the App requires the technician to manually timestamp events. The excellent inter-rater reliability observed in this study might not be replicated by five other technicians with differences in clinical training, exposure to the App, and attention during a test. Finally, while the App was found to be valid for the outcomes it assesses, it is not capable of assessing many of the outcomes reported by other studies (e.g., jerk, ground reaction forces, step duration, and joint angles). The App is not suitable to replace existing technologies, but instead provides a valid and feasible means of assessing the functional fitness outcomes it does report.
The App validly assessed all 30-s Chair Stand and 400 m Walk Test outcomes, as well as the total time taken to complete the 8-ft Up-and-Go. While a small systematic negative bias was indicated for all other 8-ft Up-and-Go outcomes, even those numerical differences were smaller than the magnitude of change one would reasonably expect following exercise therapy/training. The App also exhibited excellent inter-rater reliability for all outcomes. Given that the App is as valid as, or more valid than, existing technologies with respect to the outcomes it reports, its greater feasibility may make it an appropriate addition to clinical and fitness settings.
The authors declare that they have no competing interests.
Authors' contributions | MDS | KMM | NLG | BJG | KG | ES | NWS |
Research concept and design | √ | √ | √ | √ | √ | √ | √ |
Collection and/or assembly of data | √ | √ | √ | √ | √ | √ | √ |
Data analysis and interpretation | √ | √ | √ | √ | √ | √ | √ |
Writing the article | √ | √ | √ | √ | -- | -- | √ |
Critical revision of the article | √ | √ | √ | √ | -- | -- | √ |
Final approval of article | √ | √ | √ | √ | -- | -- | √ |
Statistical analysis | -- | √ | -- | -- | -- | -- | √ |
While their contributions did not warrant authorship, we would like to thank (in no particular order) Brianna Blohm, Cameron Ressel, Abigail Matsushima, Uriel Ibarra-Moreno, Kristen Fouts, Joshua Lawhorne, Meghan Hess, Alexandra Colacino, Samuel Todd, Valerie Russel, and Alyssa Braun for their integral role in designing this study and collecting data.
Editor: Catherine Ortega, University of Texas Health Science Center, USA.
Received: 09-Sept-2018 Final Revised: 05-Nov-2018
Accepted: 08-Nov-2018 Published: 23-Nov-2018