Running Head: Objective Measures of Physical Activity in COPD
Funding Support: Funding for the Chronic Lung Disease Biomarker Qualification and Clinical Outcome Assessment Qualification Consortium was provided by AstraZeneca, Boehringer Ingelheim, GlaxoSmithKline, and Chiesi
Date of Acceptance: July 26, 2021 │ Published Online: August 25, 2021
Abbreviations: physical activity, PA; chronic obstructive pulmonary disease, COPD; Chronic Lung Disease Biomarker and Clinical Outcome Assessment Qualification Consortium, CBQC; Innovative Medicines Initiative, IMI; Physical Activity as Crucial Patient-Reported Outcome in COPD project, PRO-active; clinical outcome assessment, COA; Food and Drug Administration, FDA; European Medicines Agency, EMA; St George’s Respiratory Questionnaire, SGRQ; patient-reported outcomes, PROs; metabolic equivalent of tasks, METs; resting metabolic rate, RMR; oxygen uptake, VO2; moderate to vigorous physical activity, MVPA; vector magnitude units, VMU; physical activity level, PAL; micro electromechanical systems, MEMs; COPD Genetic Epidemiology, COPDGene®; Global initiative for chronic Obstructive Lung Disease, GOLD; standardized response means, SRM; minimal important difference, MID; minimal clinical important difference, MCID; context of use, COU
Citation: Demeyer H, Mohan D, Burtin C, et al. Objectively measured physical activity in patients with COPD: recommendations from an international task force on physical activity. Chronic Obstr Pulm Dis. 2021; 8(4): 528-550. doi: http://doi.org/10.15326/jcopdf.2021.0213
In 2016, the Chronic Lung Disease Biomarker and Clinical Outcome Assessment Qualification Consortium (CBQC) of the COPD Foundation launched an initiative to explore whether measures of physical activity (PA) could be qualified as efficacy endpoints or as biomarkers and used in clinical trials submitted to regulatory authorities.1 PA was suggested by the COPD Foundation as an important end-point from the perspective of people with COPD and, although with less certainty, as a potential short-term surrogate for important COPD outcomes, such as occurrence of exacerbations and survival that take longer than the typical study duration (months) to assess.
A group of experts convened in Leuven, Belgium, with further meetings during international conferences of the American Thoracic Society and the European Respiratory Society, to outline a position. After extensive review of the existing literature, the panel concluded that, while much effort has been made to promote objectively measured PA as a valid and responsive endpoint in COPD,2,3 much uncertainty remained regarding the best methodology, monitoring instruments, and most acceptable and accurate physical activity endpoints. This “white paper” provides a summary of the rationale behind using objectively measured PA and proposes a standardized methodology for assessment, including standard operating procedures for future research. The task force included a global panel of key opinion leaders from the field as well as key industry partners conducting research in COPD with physical activity endpoints. The consortium hopes that the proposed recommendations will become widely adopted and pave the way to further research. This will ensure that sufficient data can be accumulated using standardized procedures to successfully propose a physical activity endpoint for regulatory qualification in the future.
This paper elaborates on the rationale for using PA as an endpoint in clinical trials as well as a proposed methodology that can be adopted in future trials to make results more comparable. Whenever possible, the CBQC has pooled data from existing studies to answer key methodologic questions. To that end, data from the U.S.-based COPD Genetic Epidemiology study4 and the EU-based Innovative Medicines Initiative (IMI)-Joint Understanding (JU) Physical Activity as a Crucial Patient-Reported Outcome in COPD project (PRO-active) consortium5 were used as well as studies from individual investigator members of the consortium. We propose a minimum set of data required to report PA as the outcome of a study. This, however, does not preclude investigators from recording more data in order to have a richer assessment of PA patterns when there is a need to answer specific research questions. Although this project originated from the COPD Foundation, a patient organization, we acknowledge that for this specific project the CBQC lacks a patient representative. However, several previous projects, e.g., the IMI PRO-active project, involved patients in the study design and execution and provided information on patient acceptability of PA monitoring.6 The task force has been managed by the COPD Foundation, acknowledging that this patient organization considers PA and its assessment important to patients.
The Concept of Physical Activity
PA is defined as “any bodily movement produced by skeletal muscles resulting in an increase in energy expenditure of the body.”7 It reflects the overall amount of PA undertaken by people. As a concept, PA is distinct from exercise capacity which relates to the ability to undertake PA and to the performance on tests of physical function. An individual’s PA is constrained by the limits of their exercise capacity, but as a behavior it is also dependent on psychological, social, cultural, environmental, and/or economic factors.8 An endpoint model linking these relevant concepts is provided in Figure 1. Pharmacological and non-pharmacological interventions in COPD can target 1 or more physiological system functions (e.g., bronchodilators reducing expiratory flow limitation, section ① in Figure 1), exercise capacity (e.g., exercise training, section ② in Figure 1) or PA (e.g., self-management, coaching interventions, policy measure to enhance PA, section ③ in Figure 1). Providing detailed information on effective interventions to increase PA, which was recently detailed in a systematic review,9 goes beyond the scope of this paper.
In this review, PA as a clinical outcome assessment (COA) is discussed within the framework of the regulatory qualification requirements for novel methodologies for medicine development as detailed by both the U.S. Food and Drug Administration (FDA) and the European Medicines Agency (EMA). In this context, it is important to identify the concept of interest and the context of use of the COA (see below). PA biomarkers can be used as predictive biomarkers for prognosis. In clinical trials, PA can be used to enrich a sample for treatment response and/or to quantify the effect of interventions.
The Importance of Physical Activity
In adults, lack of PA is associated with several potentially modifiable adverse outcomes and comorbidities. These include obesity, cardiovascular diseases, diabetes, cancer, poor mobility, impaired bone health, depression, cognitive impairment, impaired health-related quality of life, and all-cause mortality.10-16 Recent data in the general population show that better survival was associated with increasing activity, independent of its intensity.17,18 For example, in 1 study of 16,741 women aged 62-101, it was found that the number of steps accumulated per day, rather than the intensity, was of clinical importance for mortality, with survival rates increasing up to approximately 7500 steps per day.18 Difficulty participating in PA is a cardinal feature and consequence of COPD, occurring in the context of symptoms of breathlessness and fatigue. Breathlessness during PA typically drives an avoidance of PA.
Relation to Clinically Important COPD Outcomes
PA is related to diverse health outcomes in individuals with chronic respiratory diseases, although the evidence available for these is variable in amount and quality, depending on each specific outcome. Higher levels of PA have consistently been related to a lower risk of acute exacerbations, hospitalization, and death for COPD patients,19 across different individual characteristics, geographic settings, and instruments for measuring PA and independent of spirometric severity and other predictors of COPD prognosis.20 Moreover, physical inactivity is likely an important contributor to much of the multimorbidity typically observed in patients with COPD.
Other important and clinically relevant endpoints relate to disease progression. Longitudinal studies are scarce, with only a few based on an objective PA assessment (summarized in Table 1).21-23 First, many studies report a cross-sectional association between PA and lung function.19 Most studies propose the hypothesis that lung function determines PA, despite several studies in the general population supporting bi-directionality for this association.24-26 One recent study, though, found that higher PA was associated with an attenuated lung function decline in COPD.23 Second, longitudinal studies in COPD show conflicting results about the association between PA and exercise capacity, muscle strength, or body composition outcomes, which might partly be explained by a difference in methodology when analyzing the data and a difference in the observed progression in the different cohorts. It is also conceivable that activities of daily living, generally of low intensity, would not be associated with future changes in functional exercise capacity or muscle strength, which may require regularly scheduled intense activities. Finally, only 1 study investigated the relationship between PA and progression in quality of life and found a relationship with the symptom subdomain of the St George’s Respiratory Questionnaire (SGRQ).23 The fact that PA was not associated with changes in the activity domain supports the finding that the amount of PA and experienced difficulties with PA are distinct concepts.27 It should be noted that trajectories for disease progression are very heterogeneous in COPD. Therefore, longitudinal studies with repeated measures of both PA and outcomes of interest, as well as intervention studies using PA as a key outcome, are needed.
Finally, in the interpretation of the reported associations it should be taken into account that longitudinal data are scarce, influence of unmeasured confounding factors cannot be ruled out, and clinical trial evidence about the effect of changing PA on long-term outcomes such as mortality is currently lacking.
The Relevance to Patients
The ability to participate actively in daily life is important to patients.27-29 Patients are typically also able to define the concept of PA as “any lifestyle activity including walking, gardening, and housework as part of their daily routine.”30 This definition, close to the operational concept of interest used by the CBQC working group (see below), does not single out one specific activity as the most relevant. Purely from a patient perspective, PA becomes relevant when amount, difficulty, and adaptations to patients’ daily life are considered.27 Then, the experience with PA becomes an essential part of quality of life and PA limitations impact on the global burden of the disease. Most patient-reported outcomes (PROs) assessing the broader concept of health status have items related to PA, reiterating the importance patients attribute to this concept in qualitative studies.31-33 In addition to direct relevance to the patient’s experience of their disease, engaging in PA also has social, psychological, and physiological downstream effects.
In summary, PA is now recognized as: (1) a marker related to important endpoints that is directly measured under real life conditions; (2) a distinct concept that contributes to predicting prognosis, in addition to system and integrated physiological markers (e.g., forced expiratory volume in 1 second [FEV1], 6-minute walk test distance); (3) a measure that is understood by and directly relevant to patients as well as health care providers; and (4) an outcome feasible to change, at least in the short-term. As such, PA in COPD is an important focus of investigation and intervention.
The Dimensions of Physical Activity
As outlined above, PA relates to all purposeful movements of patients during the day and, therefore, includes more than sports or exercise activities. Leidy et al34 identified the following categories: household maintenance, movement, family activities, social activities, work, altruistic avocation, and recreation. PA is a complex behavior, which makes it difficult to capture with a simple measure. Moreover, it has important day-to-day variability. Activity monitors provide insight into patients’ bodily movements in terms of frequency (distribution over the day, week), intensity (of specific movements or averaged out over the day), accumulated time (minutes of activity per day), and, in some cases, type (walking, cycling, sitting, etc) of PA. When PA is objectively monitored, and data are collected in small bins (e.g., per minute) more granular information becomes available. For example, PA in specific moments may be studied, which could be relevant when interventions are expected to have greater effects at a specific time of the day or type of activity.
Physical activity can be approached in several ways. One can measure the patient’s movements (e.g., steps, walking time, movement intensity) or estimate energy expenditure (e.g., active energy expenditure). These different concepts are explained below.
Concepts of Interest
Overall Amount of Physical Activity:
This concept quantifies the amount of PA performed irrespective of its intensity or duration of PA performed above a specific intensity threshold (e.g., time in moderate to vigorous intense activities). The amount of PA independent of intensity can be captured by the number of steps per day. Step count is an easy-to-understand metric and it captures what is the most relevant and problematic daily activity for the majority of patients with COPD.35 In the general population, steps per day have been used to classify people as more or less active (Table 2).36 Alternatives to total steps per day include descriptions of time spent during a specific activity (e.g., time in any activity, walking time, cycling time, shuffling time).
The duration of PA performed above a specific intensity threshold quantifies the time spent above a threshold of PA intensity. The thresholds are set to approximately reflect the metabolic equivalent of tasks (METs) where 1 MET is the energy expenditure during rest (resting metabolic rate [RMR]; typically standardized in adults to an oxygen uptake [VO2] of 3.5 ml/min/kg body weight). Table 3 provides intensity thresholds used to identify mild, moderate, and vigorous exercise intensities.37 The thresholds can also be determined relative to the capacity of the patient (e.g., 50% VO2 reserve). This gives the opportunity to relate the intensity to the individual capacity of the patient, which is often significantly constrained. This concept gives an interesting insight into how the capacity is constraining the patient’s activity (e.g., in the case where the relative intensity is high despite absolute intensity being low). However, when using this approach, one should be cautious because it might wrongly classify those with a very low capacity as active when they are not.38,39 To the best of our knowledge, there is currently no clear evidence showing the added value of using individually-anchored measures of relative intensity when relating the physical activity to health-related outcomes. This remains an intriguing research question. Most guidelines advocate regular periods of PA above a threshold of 3 METs (i.e., moderate-to-vigorous physical activity [MVPA]) to maintain or improve health.15 MVPA and total amount of PA (irrespective of intensity) are different outcomes, but they are closely related in patients with COPD.
Severely inactive patients will be characterized by low overall amount as well as low MVPA (Figure 2).22,38,40-44 Those with more severe COPD may have difficulties achieving physical activities with higher metabolic demand41 and these activities are the first to be reduced in early stages of the disease.45 However, it should be noted that patients with COPD may consume more energy to perform the same task than a healthy individual of the same age and, therefore, MET thresholds, which are derived from healthy populations, may be less applicable to patients with COPD.
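The difference between absolute and relative intensity thresholds described above can be illustrated with a short Python sketch. It converts an oxygen uptake to METs using the 3.5 ml/min/kg convention, flags MVPA at 3 METs or more, and expresses the same effort as a percentage of VO2 reserve. The function names, the example patient, and the peak VO2 value are hypothetical and serve only as an illustration; they are not part of the task force's methodology.

```python
RESTING_VO2_ML_MIN_KG = 3.5   # conventional adult value assigned to 1 MET
MVPA_THRESHOLD_METS = 3.0     # absolute threshold for moderate-to-vigorous PA


def vo2_to_mets(vo2_ml_min_kg: float) -> float:
    """Convert oxygen uptake (ml/min/kg) to metabolic equivalents (METs)."""
    return vo2_ml_min_kg / RESTING_VO2_ML_MIN_KG


def is_mvpa(vo2_ml_min_kg: float) -> bool:
    """True when the absolute intensity reaches the 3-MET MVPA threshold."""
    return vo2_to_mets(vo2_ml_min_kg) >= MVPA_THRESHOLD_METS


def percent_vo2_reserve(vo2: float, vo2_peak: float,
                        vo2_rest: float = RESTING_VO2_ML_MIN_KG) -> float:
    """Relative intensity as a percentage of VO2 reserve (peak minus rest)."""
    if vo2_peak <= vo2_rest:
        raise ValueError("peak VO2 must exceed resting VO2")
    return 100.0 * (vo2 - vo2_rest) / (vo2_peak - vo2_rest)


# Hypothetical patient with a peak VO2 of 14 ml/min/kg walking at 8.75 ml/min/kg.
walking_vo2 = 8.75
print(vo2_to_mets(walking_vo2))                # absolute intensity in METs
print(is_mvpa(walking_vo2))                    # below the 3-MET threshold
print(percent_vo2_reserve(walking_vo2, 14.0))  # relative intensity in % VO2 reserve
```

In this hypothetical case the activity stays below the absolute MVPA threshold while already reaching 50% of VO2 reserve, which is the situation in which a relative threshold could classify a patient with very low capacity as active.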
Time in activities of certain intensity may be reported as total minutes or as bouts of activity where a given intensity is maintained. An example might be “MVPA in bouts of at least 10 minutes” but these bouts of uninterrupted activity are scarce among patients with COPD, rendering the concept less useful, especially for those with more severe COPD,41 whereas bouts of activities for time spent in PA at lower METs may be more relevant. While such bouts have been related to health benefits in the healthy population, recent evidence suggests that the total volume of MVPA relates to better outcomes, with no clear additional benefits driven by “bouted activity.”46
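The distinction between total MVPA minutes and "bouted" MVPA can be sketched in Python as follows. The sketch assumes that a monitor exports one MET estimate per minute and that a bout is strictly uninterrupted; the function name, the default parameters, and the example day are hypothetical, and other bout definitions (e.g., allowing brief interruptions) would give different results.

```python
from typing import Sequence, Tuple


def mvpa_minutes(mets_per_minute: Sequence[float],
                 threshold: float = 3.0,
                 min_bout: int = 10) -> Tuple[int, int]:
    """Return (total MVPA minutes, MVPA minutes accumulated in bouts).

    A bout is a run of consecutive minutes at or above `threshold`
    lasting at least `min_bout` minutes, with no interruption allowed.
    """
    total = 0
    bouted = 0
    run = 0  # length of the current uninterrupted run of MVPA minutes
    for value in mets_per_minute:
        if value >= threshold:
            total += 1
            run += 1
        else:
            if run >= min_bout:
                bouted += run
            run = 0
    if run >= min_bout:  # close a run that lasts until the end of the record
        bouted += run
    return total, bouted


# Hypothetical excerpt: 4 min MVPA, 3 min rest, 12 min MVPA, 5 min rest, 6 min MVPA.
excerpt = [3.5] * 4 + [1.2] * 3 + [3.2] * 12 + [1.0] * 5 + [4.0] * 6
print(mvpa_minutes(excerpt))
```

In this excerpt, 22 minutes are spent in MVPA but only the 12-minute run counts as bouted activity, which illustrates why bout-based outcomes can be close to zero in patients who accumulate MVPA in short fragments.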
Intensity of Physical Activity:
Intensity of PA may be reported as: (1) overall intensity of a period of time such as 1 day, or waking hours (e.g., mean vector magnitude units [VMU] per minute, i.e., the magnitude of the acceleration vector in 3 orthogonal planes); or (2) the intensity of specific activities (e.g., movement intensity during walking). An interesting concept that has been introduced and qualified by the EMA in patients with neuromuscular disease is the 95th centile of stride velocity, a measure of walking intensity.47 Whether this endpoint could be of relevance in patients with COPD is not known.
The amount and intensity of PA can be combined as a measure of volume of PA (e.g., total VMU).
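As an illustration, the vector magnitude and the two summaries mentioned above (mean VMU per minute as a measure of intensity and total VMU as a measure of volume) could be computed as in the following Python sketch. It assumes that per-minute values for the 3 axes have already been exported from the monitor; in practice, filtering and count definitions are device-specific and often proprietary, and the function names and example values here are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

AxisTriple = Tuple[float, float, float]


def vector_magnitude(x: float, y: float, z: float) -> float:
    """Magnitude of the acceleration vector across 3 orthogonal axes."""
    return math.sqrt(x * x + y * y + z * z)


def summarize_vmu(minutes: Sequence[AxisTriple]) -> Dict[str, float]:
    """Summarize per-minute axis values as mean VMU/min and total VMU."""
    vmu = [vector_magnitude(x, y, z) for x, y, z in minutes]
    total = sum(vmu)
    mean = total / len(vmu) if vmu else 0.0
    return {"mean_vmu_per_min": mean, "total_vmu": total}


# Three hypothetical minutes of axis values (arbitrary units).
print(summarize_vmu([(3, 4, 0), (0, 0, 0), (2, 3, 6)]))
```

Because the mean is taken over every minute in the assessment interval, minutes at rest lower the mean VMU per minute but leave the total VMU unchanged.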
Estimates of Energy Expenditure:
Human energy expenditure is highly complex and depends on a wide range of factors. Activity monitors estimate energy expenditure based on 1 or more measured parameters such as acceleration, heart rate, or skin temperature as well as wearer-specific information such as body weight. Energy expenditure can be summarized, for example, as total energy expenditure or active energy expenditure (both in kcal or kJ). METs or physical activity level (PAL), which can be averaged over a day, normalize the energy expenditure to resting metabolic rate, thereby avoiding the need to correct for individual factors such as body weight. It is important to note that energy expenditure-related outcomes are estimates based on the modelled relation of acceleration and other sensor information to true energy expenditure. Although such models may be valid in healthy controls,48 a comprehensive validation study showed that these estimates lack accuracy in patients with COPD. This is partly explained by the impaired total efficiency and increased work of breathing in patients with lung disease.6
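The normalization to resting metabolic rate can be made concrete with a minimal Python sketch, assuming the common definition of PAL as total daily energy expenditure divided by resting energy expenditure over the same period. The function names and the example values are hypothetical, and the inputs would in practice be device estimates with the limitations noted above.

```python
KJ_PER_KCAL = 4.184  # conversion factor between kilocalories and kilojoules


def kcal_to_kj(kcal: float) -> float:
    """Convert an energy value from kcal to kJ."""
    return kcal * KJ_PER_KCAL


def physical_activity_level(total_ee: float, resting_ee: float) -> float:
    """PAL as total energy expenditure divided by resting energy expenditure.

    Both values must cover the same period (e.g., 24 hours) and use the
    same unit, so the ratio itself is unitless.
    """
    if resting_ee <= 0:
        raise ValueError("resting energy expenditure must be positive")
    return total_ee / resting_ee


# Hypothetical daily estimates: 2100 kcal total, 1500 kcal at rest.
print(physical_activity_level(2100, 1500))
print(kcal_to_kj(2100))
```

Because the result is a ratio, it is the same whether the inputs are expressed in kcal or kJ.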
The Different Concepts Summarized by 2 Factors:
To investigate whether the above-mentioned concepts are statistically distinct, a factor analysis including 1753 days of the baseline PA measurement of 410 patients with COPD included in the Urban Training Study49 was performed.
The factor analysis retained 2 independent factors (dimensions). Based on the contribution of each physical activity parameter to the 2 factors, measured by the coefficients (factor loadings) in Table 4, we interpret that the 2 factors correspond to “amount of physical activity” and “intensity of activity.”49 This data-driven analysis supports the PA concepts of interest identified by experts and used in previous research. Interestingly, and as indicated before, the measure of MVPA, which is included as a measure of intensity, is also related to the amount of PA in patients with COPD.
Related Concepts of Interest
Patient Experience of Physical Activity:
A different concept from objectively assessed PA is the experience patients have of PA. Recently, the PRO-active tools were developed and qualified by the EMA to assess this dimension of PA. Physical activity experience consists of 2 complementary domains: the experienced amount of PA and the experienced difficulty with PA,50 which match concepts recognized by patients.27 This concept is captured by the PRO-active tools, which were developed in line with the methodology proposed by the FDA PRO guidance19 and properly validated in multicenter clinical trials using interventions likely to alter the experienced amount of PA (e.g., tele-coaching), the experienced difficulty with PA (e.g., bronchodilators), or both (e.g., rehabilitation).51 The PRO-active tools provide insight into how patients experience PA, rather than capturing how much PA is effectively performed and the intensity thereof, which is likely more related to physiologic or health outcomes.
Symptoms Experienced During Physical Activity:
Several questionnaires aim to investigate the symptoms patients with COPD experience during PA. Common symptoms include shortness of breath, fatigue, pain, and sometimes anxiety (fear). These are beyond the scope of this review and details are provided in a systematic review.52
Sedentary behavior has been defined as “any waking behavior characterized by an energy expenditure of <1.5 METs in a sitting or reclining posture.” In the healthy population sedentary behavior and PA are clearly distinct concepts, with an independent relationship to mortality.53 In other words, a high physical activity level does not mean one has a low sedentary time and both behaviors have prognostic value. One paper also suggested this independent association of sedentary behavior and physical activity with mortality in patients with COPD.42 However, other papers have shown that sedentary time and physical activity are strongly, negatively related.23,54 In other words, in patients with COPD, a higher physical activity is accompanied by lower sedentary time and changing one behavior might result in a change in the other. This stronger association can likely be explained by the narrower spectrum in physical activity with which patients present. However, more research on sedentary behavior in patients with COPD is needed. Whether our recommendations to measure PA are also appropriate to measure sedentary behavior is not yet clear. However, these discussions go beyond the scope of this paper, which is focused on objectively measured PA.
How to Monitor Physical Activity Objectively
It is generally accepted that PA measured by questionnaires can be used to categorize patients in large epidemiological studies. Objective measurements have become more feasible as technology advances. An objective assessment is needed when the aim is to provide a directly measured and accurate assessment of an individual patient’s PA pattern within a clinical trial. As currently no standardized methodology exists to assess and process PA data,9 we aim to provide rationale for a standardized approach in the next paragraphs. We will focus only on objective measurement of PA using activity monitors in patients with no apparent locomotor impairments (e.g., tremor).
Types of Monitoring Devices
Currently, PA is most effectively assessed in daily living using small, unobtrusive PA monitors. Micro electromechanical systems (MEMs), e.g., accelerometers, gyroscopes, and pressure sensors, can objectively and accurately quantify movements and the context of the movements (e.g., stair climbing with differences in altitude) under controlled as well as free living conditions. The complexity of the monitor drives its use. Step counters, for example, provide only step counts often without time stamps and may suffice for feedback to the user as part of behavioral interventions.55 Other activity monitors provide more detail on quantity and quality of activities and may be more appropriate for outcome measurements for clinical trials. Activity monitors may be used in conjunction with positioning systems (although this raises potential privacy issues) and physiological sensors (e.g., heart rate, skin temperature). Integration of data can potentially increase the accuracy of estimated PA and energy expenditure. Whether these combinations will lead to clinically relevant improvements over existing algorithms needs to be confirmed.56
Consumer Versus Medical Grade Monitors
Figure 3 provides a comparison of consumer and medical grade monitors. Devices included in interventional clinical trials, especially phase 3 or later studies, are subject to greater regulatory requirements and scrutiny than monitors that are used personally, in the clinic, or for observational studies. Specifically, interventional clinical trial considerations within the pharmaceutical industry require devices not only to meet requirements associated with a medical-grade CE mark and/or 510(k) approval in the European Union and United States, respectively, but also to provide full audit trails to demonstrate data integrity, security, and privacy throughout the signal chain from initial data collection through long-term storage; this makes such devices considerably more expensive.57,58 Activity monitors that store raw data require more sophisticated signal processing after data collection and can detect detailed patterns of PA. They can be more sensitive and accurate in detection of motion even in less active individuals. Size of the device depends on the intended use. For example, if the intended use does not allow intermittent charging, a larger battery and, hence, a larger casing is needed. Similarly, continuously storing raw data including time stamps increases energy consumption of the device and, hence, the size of the battery, which in turn may impact wearability and adherence over longer periods of time.
Observational, clinical, and personal use of monitors, however, do not necessitate the use of medical research grade devices. Alternative devices are commercially available for these purposes that can bring additional features, reduced size, and/or reduced cost. Due to the complex requirements of patient safety, data integrity, security, and privacy, medical grade devices, together with secure data servers, may be preferable for use in multi-site interventional clinical trials, but these requirements may be different for an observational study, or for routine clinical or personal use.
Consideration of the intent of the PA measurement within a clinical trial is also important. Some devices give direct feedback to the study participant (e.g., step counts or sensory cues to increase activity), and, therefore, are useful if the intention is to intervene on the participant’s normal PA patterns. Typically, medical grade devices are “closed,” meaning that they are designed to be as unobtrusive as possible to the participant and, therefore, assess normal spontaneous PA during an observational trial or in response to a study drug, device, or behavioral intervention.
Sampling and Algorithms
The sampling rates and (often proprietary) algorithms used to generate outputs can also vary greatly between devices, making comparisons across devices difficult. Even medical-grade PA monitors employ different step-detection strategies, which impact on outputs, even for steps per day.59 Therefore, for repeated measures, patients should be measured with the same type of device. In addition, comparison of different populations is more accurate if the same device has been used. For clinical trial purposes, the FDA has indicated through the Clinical Trials Transformation Initiative that consideration of any PA outcome should be device agnostic; however, device sensitivity and accuracy as well as data verification and documentation of validity remain critical considerations.60 Higher sampling rates can capture faster movements but require more data storage and may reduce battery life, so sampling rate should be carefully balanced with study needs including transmission and storage of the data volume.
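The trade-off between sampling rate and data volume can be approximated with simple arithmetic, as in the following Python sketch. It assumes a tri-axial sensor storing 2 bytes per sample per axis and ignores time stamps, compression, and file overhead; all parameter values and the function name are hypothetical and would need to be replaced with the specifications of the device actually used.

```python
def raw_storage_bytes(sampling_hz: float,
                      hours_per_day: float,
                      days: int,
                      axes: int = 3,
                      bytes_per_sample: int = 2) -> int:
    """Rough size of uncompressed raw accelerometer data for one recording."""
    seconds = hours_per_day * 3600 * days
    return int(sampling_hz * axes * bytes_per_sample * seconds)


# One week of 15 waking hours per day at two hypothetical sampling rates.
for rate_hz in (30, 100):
    size_mb = raw_storage_bytes(rate_hz, hours_per_day=15, days=7) / 1e6
    print(f"{rate_hz} Hz: {size_mb:.1f} MB per patient-week")
```

Under these assumptions a week of raw data grows from roughly 68 MB at 30 Hz to roughly 227 MB at 100 Hz per patient, which scales directly with the number of participants and assessment periods in a trial.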
The validity of activity monitors to detect PA in patients with COPD has been the subject of several recent studies. In 2012 and 2014, the IMI PRO-active consortium published a methodologic standard for validation of activity monitors using a “lab-based” (validation against indirect calorimetry using a portable metabolic system)61 and a “real life” approach (validation against doubly labeled water indirect calorimetry).6 This consortium found that the DynaPort MoveMonitor (McRoberts BV, The Hague, the Netherlands), the Actigraph GT3X (Actigraph, Pensacola, Florida) and the SenseWear Armband (BodyMedia, Inc., Pittsburgh, Pennsylvania) (each employing bi- or tri-axial MEMs accelerometers) were valid and responsive for use in patients with COPD. These activity monitors showed similar properties in studies performed by other research groups.62 Other medical grade devices have been validated as well. One example is the StepWatch Activity Monitor (an ankle-worn accelerometer) which has been validated in a lab-based approach (validation against manual step count) in a U.S. COPD cohort.63 Newer consumer devices (wearables) are available, such as Fitbit devices and Polar watches as well as medical devices such as Philips Health watch, Apple watch series 4 and Verily Study watch. These sensors are more user-friendly and are preferred by patients but lack accuracy64; we would, therefore, recommend testing the relative accuracy in a representative COPD population prior to using any new device. In general, wrist worn monitors tend to have lower accuracy for step counts compared to monitors worn closer to the center of body mass (e.g., on the belt).65
Recommendations for Standard Operating Procedure for Data Collection
A wide variability in PA measurement methodology is present in the existing literature.9 The way PA data are collected and processed after collection (post processing) has an impact on the psychometric properties of the outcome. The CBQC, therefore, recommends that a standard operating procedure be used for data collection regarding the outcome of PA. A standardized methodology may guide investigators to obtain a more precise and robust outcome and would enable comparisons of outcomes to be made across studies. Several methodological aspects should be considered. While these decisions may vary depending on the aim of the assessment and the included population or can be changed to answer a specific research question, it is an important aim of this paper to make suggestions for standardization of the assessments. The recommendations provided in the present paper apply to PA assessments in stable patients with COPD, focusing on the assessment of overall PA. An overview of the recommendations is provided in Table 5. Further discussion of each recommendation is provided below.
To date, the most commonly used measurement intervals for PA assessment are “24-hours” or “during waking hours.” Patients with COPD typically perform most activity between 7AM and 10PM,43,44,66,67 across different centers in different countries (Figure 4). Although the PA pattern varies throughout the day among patients measured in different parts of Europe, patients across the different centers have on average taken 95% of their total daily steps by 10PM. This timeframe is not different from the PA pattern of a population-based cohort with comparable age68 and does not differ between seasons44 or across disease severity.66 When using the total amount of PA as the outcome (e.g., steps, total time in activity), a restriction of the sample interval to daytime hours will not noticeably influence the outcome. However, when using a measure of average intensity (e.g., VMU/min), including the sleep period will considerably affect the outcome. This is because when including sleep period hours, the calculated average per minute will include many hours where PA is at or close to resting, rather than reflecting the average per minute VMU during hours where PA is more likely to occur. Variation in sleeping time, therefore, has the potential to affect daytime VMU/min calculated, if included within the assessment interval. To optimize the adherence of patients to wearing the device, lower the burden, and to standardize the sampling interval, we recommend a measurement during waking hours to measure physical activity. If a 24-hour assessment is performed, it is advised to standardize the sampling interval for PA assessment towards an assessment of waking time, based on an individual’s own sleeping time or using the hours between 7AM and 10PM. The rest of the collected data can be relevant to assess sleep-related outcomes. When patients work night shifts these hours need to be adjusted.
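The effect of the sampling interval on amount versus average intensity outcomes can be illustrated with a short Python sketch. It assumes per-minute records of (time stamp, steps, VMU) and applies the fixed 7AM to 10PM window; the function name, the record layout, and the synthetic day are hypothetical, and an individual’s own sleeping time or a shifted window for night-shift workers could be substituted.

```python
from datetime import datetime, time, timedelta


def summarize_day(records, start=time(7, 0), end=time(22, 0)):
    """Compare waking-window and 24-hour summaries for one day.

    `records` is an iterable of (timestamp, steps, vmu) tuples, one per minute.
    """
    records = list(records)
    waking = [r for r in records if start <= r[0].time() < end]

    def total_steps(rows):
        return sum(r[1] for r in rows)

    def mean_vmu(rows):
        return sum(r[2] for r in rows) / len(rows) if rows else 0.0

    return {
        "steps_waking": total_steps(waking),
        "steps_24h": total_steps(records),
        "mean_vmu_waking": mean_vmu(waking),
        "mean_vmu_24h": mean_vmu(records),
    }


# Synthetic day: the wearer is active (5 steps/min, 100 VMU/min) from 8AM to 8PM only.
midnight = datetime(2021, 1, 4)
day = []
for minute in range(24 * 60):
    stamp = midnight + timedelta(minutes=minute)
    active = time(8, 0) <= stamp.time() < time(20, 0)
    day.append((stamp, 5 if active else 0, 100.0 if active else 0.0))

print(summarize_day(day))
```

In this synthetic example the step total is identical for both intervals (3600 steps), whereas the mean VMU per minute falls from 80 over the waking window to 50 over 24 hours, simply because the sleep period adds minutes with no activity to the denominator.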
Rabinovich, et al6 showed that almost the entire sample of patients with COPD would be willing to wear an accelerometer for at least 1 week. Several other studies in COPD, and at the population level, were successful in recording almost 7 days.4,5,49,68 By asking patients to wear the monitor for 1 week, there is a high likelihood that a sufficient number of valid days is obtained to be used in statistical analyses (see further discussion below). Moreover, data collected in the COPD Genetic Epidemiology (COPDGene®) study showed a strong week-by-week correlation based on a 3-week assessment, supporting the need to only measure 1 week in stable patients.4 Whereas it is sometimes argued that the measurement itself may influence PA behavior of patients (Hawthorne effect), this has never been convincingly shown in patients with COPD. In Figure 5, we show data of 151 COPDGene participants with PA data collected on 21 consecutive days.4 No differences were found between the first and later days, arguing against a Hawthorne effect. This could be explained by the lack of direct feedback provided by the monitor, as discussed before. It is not advised that the day of a clinical visit is part of the assessment as this does not represent a normal day for the patient’s behavior.
As per expert opinion, individuals are best instructed in person on how to use the monitor, according to the manufacturer’s guidance. Study site staff (where applicable) should be familiarized with the device ahead of trial recruitment, and there should be emphasis on instructing individuals to adhere to the recommended wear time. Ideally, monitors should require minimal instructions in order to obtain a valid measurement. Instructions may be provided by a written instruction sheet, demonstration and/or video.69 This includes: (1) information about the correct positioning of the monitor; (2) the measurement interval (e.g., start wearing from the moment you wake up until the moment you go to bed at night), with specific instructions to keep wearing the device throughout the day, including during sedentary behaviors or when feeling ill; (3) instructions when to take off the monitor (usually during water activities, bathing, and showering); (4) start and stop date of the assessment; and (5) any other instructions such as how and how often to charge the device (if required).
A logbook may be useful to interpret individual patient data. In this logbook patients can: (1) record the start and end of waking hours, (2) note the period(s) of taking off the monitor during the day and the water activities where the monitor has been taken off, mostly if they involve PA (e.g., swimming), and (3) report changes in health status. Based on this logbook, adherence of the patient to the data collection instructions can be verified. The logbook can be paper, website, or application based.
Processing of Data and Standardization of Statistical Analyses
Data analytics for wearable sensors typically go through several steps, as depicted in Figure 6. Each step in the signal chain needs to be thoroughly thought out and tested.
Algorithms and Data Reduction:
Each single individual’s raw sensor data are processed through algorithms to convert them into a meaningful time series prior to generating outcomes that are useful to investigators, clinicians, and/or patients. These algorithms will have an important role in filtering artefactual activity (e.g., sitting in a car) from real activity. Typically, data are first reduced by algorithms from multiple points per second to a less granular level such as minute-by-minute or day-by-day. Various studies have tested activity monitors and the algorithms provided by manufacturers in order to determine their accuracy in a COPD population.6,69 Some medical-grade devices store raw data, enabling a researcher to go back to the raw data and apply or develop the most appropriate algorithms and settings, even applying new algorithms developed after the data were collected. This allows researchers greater flexibility in creating “specific measurements” that are considered to be relevant. Many different features are reported in the literature, ranging from simple concepts such as total steps per day to more complex constructs such as duration and intensity of bouts of PA. This area has previously been reviewed70 and was presented earlier in this paper. An important development for the future could be to derive device-agnostic algorithms that allow open-source data reduction in order to enable better comparisons between different devices. Currently, the European IMI-JU project Mobilise-D is attempting to develop such algorithms.71
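The data reduction step described above can be sketched in a simplified form. This example only shows the aggregation from a fine time resolution to minute and day level; it assumes that steps have already been detected per second, which in practice is done by manufacturer or research algorithms that are not reproduced here. All names and values are hypothetical.

```python
from collections import defaultdict


def reduce_to_minutes(samples):
    """Aggregate (second_of_day, step_count) samples into steps per minute."""
    per_minute = defaultdict(int)
    for second_of_day, step_count in samples:
        per_minute[second_of_day // 60] += step_count
    return dict(per_minute)


# Hypothetical per-second step detections for the first minutes of a day.
samples = [(0, 1), (30, 2), (59, 1), (60, 3), (125, 2)]
minute_steps = reduce_to_minutes(samples)

# A further reduction to day level is the sum over all minutes.
daily_steps = sum(minute_steps.values())
```

Keeping the minute-level series, rather than only the daily total, leaves room to apply wear-time and waking-hour criteria afterwards.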
Statistical Interpretation of the Sensor Data:
Daily patient-level measurements are further analyzed by statisticians or researchers, who generate interpretable patient-, group-, and cohort-level output. This work must take several steps into account, as described below and summarized in Figure 7.
a) Definition of a valid day of assessment:
Validating the assessment based on the wearing time is of utmost importance to ensure that the variables obtained are representative of the patient’s actual daily PA. Insufficient wearing time will result in a lower total amount of PA or an incorrect measure of average PA intensity. Including night-time data lowers average daily measures, as discussed above. Wearing time criteria should balance a representative PA assessment against the exclusion of too many days because the criteria are too stringent for the population studied. Among patients with COPD, the use of at least 8 hours of wearing time during waking hours was previously recommended.43 PA assessment may be done over 24 hours, but it is recommended that a valid day is defined as having at least 8 hours of daytime wearing time in the standardized time frame between 7AM and 10PM (Figure 7). Where possible, one can adjust these times to individuals’ own sleep patterns, as determined from the data and ideally verified using the individual’s logbook or from algorithms if they can reliably detect overnight sleeping and waking moments. Of note, patients should always be asked to wear the monitor during all hours, except for water-based activities, which will normally result in more than 8 hours of wearing time.
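The valid-day criterion can be expressed as a short check. The sketch below assumes that wear status is available per minute as a list of 1440 booleans (how wear is detected differs between devices and is not addressed here); the thresholds are those recommended in the text, and the helper names are hypothetical.

```python
WINDOW_START_MIN = 7 * 60    # 7AM
WINDOW_END_MIN = 22 * 60     # 10PM
MIN_WEAR_MIN = 8 * 60        # at least 8 hours of daytime wear


def daytime_wear_minutes(worn):
    """Count worn minutes inside the 7AM to 10PM window."""
    return sum(worn[WINDOW_START_MIN:WINDOW_END_MIN])


def is_valid_day(worn):
    """A day is valid if daytime wear time reaches the 8-hour threshold."""
    return daytime_wear_minutes(worn) >= MIN_WEAR_MIN


def worn_between(start_hour, end_hour):
    """Hypothetical helper: wear flags for a single continuous wear period."""
    return [start_hour * 60 <= m < end_hour * 60 for m in range(1440)]


full_day = worn_between(9, 18)    # 9 hours of daytime wear
short_day = worn_between(7, 14)   # 7 hours of daytime wear
night_only = worn_between(0, 7)   # worn only outside the window
```

If individual sleep times are available from a logbook, the two window constants could be replaced by patient-specific values.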
b) Weekday versus weekend days:
PA measures during the weekend are typically lower than those obtained during weekdays among individuals with COPD6,72 and across populations (Figure 8). However, the pattern of PA tends to be similar on weekends and weekdays.66 Importantly, data show that adding weekend days increases the variability, but not the observed effect, of the outcome measure. In an interventional design, this resulted in an increase in the sample size needed to obtain a given statistical power.43 Therefore, when aiming to use the obtained variable as an endpoint in a clinical trial, one can consider excluding weekend days from the PA outcome in order to lower the variability and the sample size required to identify a specified interventional effect. If the aim of the measurement is to fully characterize PA of a patient cohort, we recommend including both weekdays and weekend days in the calculation (see step 3 in Figure 7). To be able to compare baseline characteristics across studies, we propose to always report baseline characteristics of the tested population including all measured days. It needs to be recognized, however, that these recommendations are based on a limited number of studies covering a limited range of disease severity and geographic variability, so further research is needed to revise these criteria in other settings.
c) Number of valid days required:
As above, we recommend asking patients to wear the monitor for 1 week, to ensure a high likelihood that a sufficient number of valid days is obtained to be used in statistical analyses. When only weekdays are included, a reliable assessment may be obtained based on at least 2 weekdays.43,67 When combining weekdays and weekend days, Watz et al72 concluded that 2–3 days was sufficient for a reliable PA assessment in patients with Global initiative for chronic Obstructive Lung Disease (GOLD) stage IV but that 5 days of measurement were needed in patients with GOLD stage I. During periods when typical PA is disturbed (e.g., acute exacerbation), fewer days of assessment can be sufficient to identify abnormal PA.73 When the aim is to use PA as an endpoint in an interventional trial, Demeyer et al43 showed that including more weekdays (up to 4) resulted in a decreased variability of the outcome measure.
As a result, having more weekdays (up to 4) in the measurement will decrease the sample size required to obtain an appropriate statistical power. Therefore, when used in clinical trials, one should aim to obtain 4 complete (week) days in the PA assessment, which can be considered as the “ideal recommended situation” (Figure 7). These 4 days need not be consecutive. Since adherence to wearing activity monitors is an important issue, requiring more days of assessment might lead to a loss of patients in the sample due to imperfect adherence, especially in studies with multiple measurements (Figure 9). Therefore, since it is still a reliable assessment, the minimal required number of days to maintain a single patient visit in the analysis can be as low as 2 weekdays (Figure 7).
The day-by-day data that meet the acceptability criteria described above are then summarized into a mean daily PA assessment per patient to use for cohort-level statistical analyses. Of note, this mean should be based on all existing valid days (e.g., in the case of an assessment with 5 valid days, which is judged valid because the number of days is more than 2, the mean will be calculated based on the 5 available data points); see Figure 7. It is important to know that several external factors can impact the data, including weather, time of year, individuals’ routines, occupation, and social and psychological effects.74,75 These can be important covariates, so it is recommended to minimize their impact on the variability of the specific measurement wherever possible and to take these effects into account during statistical interpretation.43
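The selection of days and the patient-level summary described above can be sketched as follows. The sketch assumes that each day has already been flagged as valid or not; the minimum of 2 days and the option to exclude weekend days follow the text, whereas the function name, data layout, and example values are hypothetical.

```python
from datetime import date
from statistics import mean

MIN_VALID_DAYS = 2   # minimum to keep a patient visit in the analysis


def patient_mean_steps(days, weekdays_only=True, min_days=MIN_VALID_DAYS):
    """Mean daily steps over all valid days, or None if too few days remain.

    days: list of (date, steps, is_valid) tuples for one patient visit.
    """
    kept = [
        steps
        for day, steps, is_valid in days
        if is_valid and (not weekdays_only or day.weekday() < 5)
    ]
    if len(kept) < min_days:
        return None
    return mean(kept)


# Hypothetical week of data (1 March 2021 was a Monday).
week = [
    (date(2021, 3, 1), 4000, True),
    (date(2021, 3, 2), 5000, True),
    (date(2021, 3, 3), 1000, False),  # insufficient wear time
    (date(2021, 3, 4), 6000, True),
    (date(2021, 3, 6), 2000, True),   # Saturday
    (date(2021, 3, 7), 3000, True),   # Sunday
]

trial_endpoint = patient_mean_steps(week, weekdays_only=True)
cohort_description = patient_mean_steps(week, weekdays_only=False)
too_few = patient_mean_steps(week[:1], weekdays_only=True)
```

The mean uses every valid day that is available rather than a fixed number of days, in line with the text.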
Standardization of Reporting
In all cases, all of the components of PA measurement and post-processing of data discussed above should be reported in the methods of reports describing PA in a COPD population. This includes information about data collection (sampling period, number of days, instructions to patients), the algorithms used (if applicable), data reduction, and statistical interpretation (including definition of a valid day based on the wearing time, type of days, and number of days included in the summarized outcome).
As an example:
Methods: “Patients were asked to wear the [manufacturer] [device name] [device version] activity monitor for 7 consecutive days. They were instructed to wear the monitor from the moment they woke up until the moment they went to sleep. Patients were asked to note in a logbook when the monitor was not worn during the waking hours of the assessment period for quality control. Day-by-day data were exported using the company’s algorithms ([software name] [version xx]) to retrieve wearing time, steps per day, and time in at least moderate intense activity with settings [yy] and [zz]. All valid days (at least 8 hours of wear time in each) were included in the analyses. A PA assessment was judged adequate and representative if it included at least 4 valid days.”
Results: “From a total of [x] patients, [x] were excluded because they did not fulfil the criteria of at least 4 valid days of at least 8 hours’ wear time. Among the included patients, the mean (SD) number of wearing days was [x (SD)] and mean wearing time was [x (SD)].”
Psychometric Properties of Physical Activity
Reliability and Sensitivity of End Points
Reliability (i.e., the ability to produce similar results under consistent conditions) is challenging to assess in real life, as PA has a large inherent day-to-day variability within an individual. Therefore, the assessment of a patient’s PA as a concept becomes more reliable when more assessment days are combined (as discussed above). Of note, among patients with less severe disease, the statistical reliability seems lower, as the day-to-day variability in PA is larger in these patients. However, in a test-retest based on 2 consecutive weeks of PA assessment, reliability was high (intraclass correlation coefficient 0.93).76 It is concluded that contemporary PA monitors reliably assess PA, but that the concept of an individual’s PA may be inherently more variable than other measures such as exercise capacity or lung function.
The sensitivity of a test to identify change due to clinical interventions is linked to both the effect of the interventions on the outcome and the reliability of the outcome measure. This is captured by the standardized response mean ([SRM]; mean Δ/SD Δ). Demeyer et al showed that the SRM (for an identical intervention) was greater when more days of assessment were included and when weekends were excluded from the analysis.43 The main reason was a reduction in standard deviation, rather than differences in effects. In addition, these authors showed that using daily step count as an outcome resulted in a greater SRM compared to other activity monitor outcomes tested (e.g., time in at least moderate intense activity, or mean METs as outcome measures), meaning that daily step count was a more sensitive endpoint than the other measures tested in that study.
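As an illustration of the SRM (mean Δ/SD Δ), the following minimal sketch computes it from paired baseline and follow-up values. It assumes the sample standard deviation of the change scores is used, which the text does not specify; the data are invented and do not come from any study.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of the change scores divided by their standard deviation."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical daily step counts for 4 patients before and after an intervention.
baseline = [4000, 5000, 3000, 6000]
follow_up = [5000, 5500, 4500, 6000]
srm = standardized_response_mean(baseline, follow_up)

# Same mean change (750 steps) but more variable changes: the SRM is lower.
noisy_follow_up = [6000, 4500, 5500, 5000]
srm_noisy = standardized_response_mean(baseline, noisy_follow_up)
```

The second example has the same mean change but a larger spread of changes, which mirrors the observation that lower variability, rather than a larger effect, drives a greater SRM.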
The Minimal Important Difference
Assessment of the minimal important difference (MID) is a standard method to interpret whether or not an intervention effect is clinically meaningful. The availability of a MID also allows the presentation of “responder analyses” and is a basis for sample size calculations. The aim of the MID is to reflect both a minimally important difference between groups (such as derived from distribution-based methods) and a minimally important difference or response within an individual over time (such as derived from anchor-based methods). Therefore, a frequently used approach to estimate a MID is to combine anchor-based and distribution-based methods and to triangulate a single value or small range of values for the MID. The MID in PA for patients with COPD has been described in only 2 studies.76,77 Both studies targeted daily steps as the PA endpoint, used triangulated anchor-based methods combining a clinical indicator with distribution-based methods, and resulted in similar MID estimations (600 to 1100 steps76 and 350 to 1100 steps77). One study was conducted in an outpatient pulmonary rehabilitation setting and used the beneficial effect of pulmonary rehabilitation on PA and hospital readmission to estimate the MID of increased PA,76 whereas the other used the negative effect of clinically significant medical events (i.e., acute exacerbations or hospital admission) to assess the MID of decreased PA.77 Hence, the MID seems to coincide with important clinical outcomes (minimal clinical important difference [MCID]). In light of the biomarker qualification process, it is important to note that none of the MID estimations currently available were derived in the context of a drug intervention. Because the MID is population specific and estimates of MIDs that have been obtained from pulmonary rehabilitation studies do not necessarily directly translate to drug interventions, more research is needed in this area.
Of note, recently the MID for the PRO-active tool, which measures PA from the patient perspective, has been estimated to be 6 for amount and difficulty scores and 4 for the total score.51
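A responder analysis based on the daily step MID can be sketched as below. Because the MID for increased PA was reported as a range (600 to 1100 steps), the sketch evaluates both bounds; the function name and the change scores are hypothetical, and defining a responder as a patient whose change reaches the threshold is an assumption made for illustration.

```python
def responder_rate(changes, mid_steps):
    """Proportion of patients whose change in daily steps reaches the MID."""
    responders = sum(1 for change in changes if change >= mid_steps)
    return responders / len(changes)


# Hypothetical changes in mean daily steps for 5 patients.
changes = [1200, 700, 300, -100, 1500]

rate_lower_bound = responder_rate(changes, 600)    # lower end of the MID range
rate_upper_bound = responder_rate(changes, 1100)   # upper end of the MID range
```

Reporting the responder proportion at both ends of the range makes the dependence of the result on the chosen MID explicit.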
Expectations of Regulatory Authorities
To be accepted in the context of a “labeling claim” request (e.g., “in patients with moderate to severe COPD, product [xxx] was shown to improve PA”), any outcome measure used needs to be either already established as valid or to be developed appropriately prior to undergoing a thorough qualification process as detailed in EMA78 and FDA guidance.79 An important prerequisite is to define the context of use (CoU), which is a critical element for the regulatory assessment of any qualification application. This specifies the specific use of the instrument, e.g., digital biomarker, in the drug development process. For the FDA, the CoU provides the boundaries within which the biomarker (e.g., steps per day) may be adequately used.
As an example, based on current knowledge, we would propose the following CoU for physical activity:
Biomarkers of overall physical activity, such as steps per day, are valid, reliable, and sensitive endpoints to evaluate efficacy of pharmacological and non-pharmacological interventions in patients with chronic obstructive pulmonary disease (COPD).
Over time the CoU may be extended as more data become available. Table 6 provides the outstanding questions for which data are needed to support a biomarker for a given CoU.78
The use of digital tools (e.g., activity monitors) to capture PA complicates the regulatory process as, at present, devices are not interchangeable and technical and regulatory requirements in the different jurisdictions need to be met (e.g., the 2017 EU Medical Device Regulation57). Requirement criteria relate to:
(1) the safety, usability, and acceptability of the device for patients with COPD. For activity monitors there are data to suggest that these criteria are met.6
(2) the device produces reliable and accurate data (see above); and
(3) in clinical trials and in accordance with its CoU, a specific device needs to be approved or cleared by regulatory authorities, as detailed by the FDA58 or in other existing60 or more recent80 recommendations.
In summary, this group of experts is of the opinion that regulatory agencies may be willing to consider PA endpoints (e.g., steps per day or other) to support labeling claims around engagement in PA in COPD, if they are used as secondary endpoints, employ validated sensors, and follow the recommendations detailed in this paper. Currently, steps per day is supported by the largest body of clinical evidence across a spectrum of interventions and COPD populations. It meets most, if not all, criteria required for qualification. Other, less frequently used endpoints also have potential, particularly those that capture PA intensity or PA characteristics not captured by measurements of steps per day. So far, only physical activity experience has been recognized by regulators for labeling claims of drugs in COPD.81 As outlined above, this is a different, patient-reported outcome.
Physical activity is of key importance for health and clinical outcomes among healthy persons and individuals with COPD. PA has multiple dimensions that can be assessed and quantified objectively using activity monitors. The variable methodologies used in the existing literature to quantify PA among individuals with COPD preclude clear comparisons of outcomes across studies and hinder the acceptance of PA as a clinical trial outcome by regulatory agencies such as the FDA and EMA. The CBQC of the COPD Foundation recommends implementation of a standard operating procedure for PA data collection and reporting, which should, over time, further clarify the relationship between PA and clinical outcomes, clarify the impact of treatment interventions on PA, and enable use of PA endpoints to support labeling claims around engagement in PA in COPD.
The authors would like to acknowledge the COPD Foundation for their support in the organization of the authors’/CBQC meetings. We would like to acknowledge the late John W. Walsh (March 7th, 2017) for his inspiring support in the initiation of the project.
Author Contributions: All authors were involved in data review and interpretation, were involved at all stages of manuscript development, writing, and revision, approved the final manuscript, and agree to be held accountable for all aspects of the work.
Data Sharing Statement: Data presented in the present manuscript can be made available upon request.
Declaration of Interest
Dr. Heleen Demeyer is a post-doctoral research fellow of FWO Flanders; Drs. Divya Mohan and Ruth Tal-Singer are former employees and current shareholders of GlaxoSmithKline; Dr. Tal-Singer reports personal fees from Immunomet and Vocalis Health. Dr. Mohan is a current employee of Genentech. Matthew Heasley is a full-time employee and shareholder at GlaxoSmithKline. Dr. Richard Casaburi reports personal fees from AstraZeneca, Boehringer Ingelheim, GlaxoSmithKline, Genentech, Respinova, and Regeneron. Dr. Christopher Cooper reports grants from the National Institutes of Health/the National Heart, Lung and Blood Institute, the Foundation of the National Institutes of Health and the COPD Foundation during the conduct of the study. He also reports personal fees from PulmonX, GlaxoSmithKline, NUVAIRA, and MGC Diagnostics, outside the submitted work. Dr. Stephen Rennard was employed by AstraZeneca and holds shares. Dr. Alan Hamilton is an employee of Boehringer Ingelheim (Canada) Ltd. Niklas Karlsson is employed by AstraZeneca. Dr. William Man reports grants from the National Institutes for Health, grants from the British Lung Foundation, personal fees from Jazz Pharmaceuticals, personal fees from Mundipharma, personal fees from Novartis, non-financial support from GlaxoSmithKline, and grants from Pfizer, outside the submitted work. Dr. Michael Polkey is a paid consultant for Philips Respironics, JFD, and has received fees for lecturing from Genzyme Sanofi, and GlaxoSmithKline. His institution has received fees for research from GlaxoSmithKline and Novartis, relating to Dr. Polkey’s work. Dr. Carolyn Rochester participates in clinical trials sponsored by AstraZeneca and has received personal fees for scientific advisory board participation from GlaxoSmithKline and Boehringer Ingelheim. Dr. Henrik Watz received payments for lectures/consulting honorarium/travel support from Almirall, AstraZeneca, BerlinChemie, Boehringer Ingelheim, Chiesi, GlaxoSmithKline, Janssen, and Novartis and received unrestricted research grants from AstraZeneca and GlaxoSmithKline. The employer of Dr. Watz (Pulmonary Research Institute at LungenClinic Grosshansdorf) received compensation for participation in clinical trials and consulting fees from Almirall, Takeda, AstraZeneca, Boehringer Ingelheim, GlaxoSmithKline, Merck, Novartis, Pfizer, TEVA, Bayer HealthCare, Revotar, Sterna, Roche, AB2BIO, and Philips. Dr. Martijn Spruit reports grants from the Netherlands Lung Foundation, grants from Stichting Astma Bestrijding, grants and personal fees from AstraZeneca, grants and personal fees from Boehringer Ingelheim, all outside the submitted work. Dr. Judith Garcia-Aymerich reports other from AstraZeneca, other from Chiesi, outside the submitted work. Dr. Thierry Troosters’ institute received speaker/consultancy fees on the topic of physical activity from Boehringer Ingelheim, AstraZeneca, Chiesi, and Bayer. All other authors have nothing to disclose relevant to the submitted work.