Accuracy of vital parameters measured by a wearable patch following major abdominal cancer surgery

h of measurements was collected in 26 patients. For RR, a total of 102 h of measurements was collected in 21 patients. On second-to-second analysis, 97% of the HR and 87% of the RR measurements were within 5 bpm and 3 rpm of the reference monitor, respectively. Assessment of 5-min averaged data resulted in 96% of the HR and 95% of the RR measurements within 5 bpm and 3 rpm of the reference monitor. A Clarke error grid analysis showed that 100% of the HR and 99.4% of the RR 5-min averaged data were clinically acceptable. Conclusion: The Healthdot accurately measured HR and RR in a cohort of patients recovering from major abdominal surgery, provided that good quality data was obtained. These results push the Healthdot forward as a clinically acceptable tool in low acuity settings for unobtrusive, automatic, wireless and continuous monitoring.


Introduction
After major surgery, the recovery of patients is frequently affected by complications. Previous literature showed that up to 25% of patients suffer from major events such as pneumonia, anastomotic leakage, abscesses or bleeding [1]. Besides short-term effects such as mortality, re-operations and (re)admission to the intensive care unit (ICU), postoperative complications have also been found to influence long-term survival [1]. This emphasizes the importance of early recognition and treatment of patients who deteriorate towards a complication. Substantial disturbances in vital parameters have been found to precede clinical deterioration [2–6]. When these disturbed vital parameters are captured accurately using a wearable sensor, prediction or recognition of the deteriorating patient may be improved.
To help healthcare professionals identify patients at risk, early warning scores such as the National Early Warning Score (NEWS) and the Modified Early Warning Score (MEWS) were developed [7,8]. At the general ward, vital parameters and the accompanying early warning scores are assessed by nursing staff in so-called spot checks. These spot checks are typically performed manually once every 8 h and therefore result in a substantial workload. Additionally, the vital parameters are often poorly registered, which is especially the case for respiration rate [9–12]. Moreover, these spot checks only represent the vital parameters at the moment of assessment, while the vital parameters during the rest of the time, and their trends, remain unknown.
Recent advances in technology allow for the development of wearable sensors that continuously measure vital parameters in the general ward or even at home. One of these techniques is seismocardiography, in which signals obtained from an accelerometer positioned at the chest are used to assess the heart rate (HR). Additionally, an accelerometer is capable of measuring chest movement and can thereby determine the respiration rate (RR) of patients. The Healthdot (Philips Electronic Nederland BV) is a recently developed accelerometer-based wearable in the form of a small patch. The device is attached to the patient's left lower rib by an adhesive layer. It can unobtrusively measure vital parameters for up to two weeks without requiring any effort from the patient or nursing staff for data capture and transmission. The measured HR and RR are transmitted via a nationwide low-power wide-area network (LoRa), both in- and outside the hospital. The accuracy of the Healthdot was previously studied in postoperative bariatric patients [13]. That population is characterized by a short postoperative hospital stay and a low complication rate. In contrast, the present study assesses the accuracy of the Healthdot in patients after major abdominal cancer surgery, a population with a high risk of complications and a prolonged hospital stay. This population could therefore benefit during the hospital stay from continuous monitoring for deterioration instead of spot-check measurements.
In the present study, we aim to determine the accuracy of the Healthdot for continuous monitoring of HR and RR in postoperative major abdominal surgery patients by comparison of the measured vital parameters to the reference provided by ECG and capnography from a patient monitor in ICU and post-anesthesia care unit (PACU).

Study population
The present study population is part of the TRICA study (NCT03923127), a single-center study on data collected using wearable sensors in postoperative patients in a tertiary hospital (Catharina Hospital, Eindhoven, The Netherlands). The trial was approved by the medical ethical committee (Maxima Medical Center, Veldhoven, The Netherlands (W19.001)).
A total of 150 patients was included in the TRICA trial. All adult patients scheduled for major abdominal oncological surgery from April 2019 to August 2020 were eligible for participation. Patients were not included if they met any of the following exclusion criteria: pregnant or breastfeeding, allergy to tissue adhesives, antibiotic resistant skin infection, active implantable device or any skin condition at the area of application of the devices. Additionally, patients had to be willing and able to sign informed consent prior to the start of the research procedures.

Data collection
The investigated device, the Healthdot, is a wearable patch of 5 × 3 cm that weighs 13.6 g. The device is applied to the patient's lower left rib on the mid-clavicular line. The accelerometer-based wearable combines seismocardiography and the displacement along different reference axes to calculate HR and RR [14,15]. The HR and RR measurements are stored in the internal memory of the Healthdot with a time interval of 8 s and 1 s, respectively. Every 5 min, averages of the data collected over the past 5 min are transferred to a cloud server. The HR and RR measurements from the internal memory of the Healthdot were used for the present analysis; HR data was resampled to a 1 s interval using linear interpolation. For every HR and RR measurement, the Healthdot reports a quality index between 0 and 100. Low quality scores are mostly caused by motion artefacts. Only measurements with a quality index >0 were included in the analysis.
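As an illustration of the resampling and quality filtering described above, the following minimal Python sketch interpolates an 8 s HR series onto a 1 s grid and drops samples with a quality index of 0. The function names, the data layout and the example values are assumptions made for illustration only; the order in which filtering and resampling were applied in the actual analysis is not stated here.

```python
def resample_linear(times, values, step=1.0):
    """Linearly interpolate (times, values) onto a regular grid with spacing `step` seconds."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two samples of equal length")
    out_t, out_v = [], []
    i = 0
    n_steps = int((times[-1] - times[0]) / step)
    for k in range(n_steps + 1):
        t = times[0] + k * step
        # Advance to the segment [times[i], times[i + 1]] that contains t.
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        v0, v1 = values[i], values[i + 1]
        frac = (t - t0) / (t1 - t0)
        out_t.append(t)
        out_v.append(v0 + frac * (v1 - v0))
    return out_t, out_v


def keep_valid(samples, min_quality=0):
    """Keep (time, value) for samples whose quality index exceeds min_quality."""
    return [(t, v) for t, v, q in samples if q > min_quality]


# Hypothetical HR samples stored every 8 s.
hr_times = [0.0, 8.0, 16.0]
hr_values = [60.0, 68.0, 64.0]
grid_t, grid_hr = resample_linear(hr_times, hr_values)
```

In this example the 1 s grid runs from 0 s to 16 s, and the value at 4 s lies halfway between the first two stored samples.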
Immediately after surgery, the Healthdot was applied in the PACU or ICU, whichever was applicable for the patient. The wearable patch sensor then collected vital parameters (HR and RR) for a period of two weeks. Vital parameters of 34 major abdominal cancer surgery patients were extracted from the bedside monitor in the PACU or ICU and saved in real time, allowing comparison between the vital parameters measured by the Healthdot and the reference monitor. The logging of measurements from the reference monitor started once the patient was no longer mechanically ventilated. The majority of patients arrive at the ICU without mechanical ventilation.
The reference data for HR was obtained by logging the ECG, and the reference data for RR by logging the capnography signal from the bedside monitor. In the ICU, vital parameters were extracted from the Philips IntelliVue MP70 monitor using the IntelliVue software with a sample frequency of 250 Hz for HR and 1 Hz for RR. In the PACU, vital parameters were extracted from the CARESCAPE Monitor B650 (GE Healthcare, Milwaukee, WI, USA) using the iCollect software (GE Healthcare) with a sample frequency of 100 Hz for HR and 0.1 Hz for RR. The ECG and capnography signals from both reference monitor systems were visually inspected for low quality measurements; only good quality measurements were included in the analysis.

Data analysis
The Healthdot and reference monitor measurements were synchronized based on the temporal sequence of HR values, using cross-correlation and manual correction when needed, in order to determine the time offset and drift between the clocks of the two devices. Any clock drift was corrected by linear interpolation of the time base of the Healthdot dataset [13]. Patients with a reference recording length of at least 30 min were included in the analysis. This minimum recording length was chosen because the busy clinical postoperative environment could impair the data collection. Additionally, patients with less than 5% Healthdot data of sufficient quality were excluded from further analysis.
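The cross-correlation step can be sketched as a search for the sample shift that maximizes the Pearson correlation between the two HR series. The sketch below assumes both series are already on the same 1 s grid and estimates a constant offset only; drift correction and the manual correction mentioned above are not covered. The function name and the example series are hypothetical.

```python
def best_lag(reference, device, max_lag):
    """Return the shift (in samples) of `device` that maximizes its Pearson correlation with `reference`."""

    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        syy = sum((b - my) ** 2 for b in y)
        if sxx == 0 or syy == 0:
            return float("-inf")  # correlation undefined for a constant series
        return sxy / (sxx * syy) ** 0.5

    best, best_r = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Positive lag: device sample i + lag is paired with reference sample i.
        if lag >= 0:
            x, y = reference[: len(reference) - lag], device[lag:]
        else:
            x, y = reference[-lag:], device[: len(device) + lag]
        n = min(len(x), len(y))
        if n < 3:
            continue
        r = pearson(x[:n], y[:n])
        if r > best_r:
            best, best_r = lag, r
    return best


# Hypothetical example: the device series is the reference delayed by 3 samples.
reference_hr = [60, 61, 63, 70, 80, 75, 66, 62, 61, 60, 60, 61]
device_hr = [60, 60, 60] + reference_hr[:-3]
offset = best_lag(reference_hr, device_hr, max_lag=5)
```

Here `offset` is the number of samples by which the device series trails the reference; subtracting it from the device time base aligns the two recordings.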
The agreement between the Healthdot and reference monitor measurements was visualized using Bland-Altman plots [16]. The mean difference (bias), the limits of agreement, and the within- and between-subject variation were calculated using the method of Zou [17], which accounts for repeated measures collected from the same patient. The corresponding 95% confidence intervals around the limits of agreement were assessed using the method of variance estimates recovery (MOVER) [17].
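For orientation, the basic Bland-Altman quantities can be sketched as follows. This simplified version treats all paired measurements as independent and therefore is not the repeated-measures method of Zou used in this study, which separates within- and between-subject variation and generally yields wider limits. The sign convention (device minus reference) and the example values are assumptions.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Simplified: assumes independent pairs and ignores repeated measures per patient.
    """
    diffs = [d - r for d, r in zip(device, reference)]  # assumed sign: device minus reference
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired HR values in bpm.
bias, lower, upper = bland_altman([61, 59, 62, 60], [60, 60, 60, 60])
```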
The American National Standards Institute consensus standard states that for HR measurements the error should not exceed 10% or 5 bpm, whichever is greater [18]. We consider an error of at most 5 bpm for HR and 3 rpm for RR to be clinically acceptable.
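The difference between the two criteria can be made concrete with a small sketch that computes the fraction of paired measurements within a tolerance. Setting `rel_tol` to 0 gives the fixed threshold used in this study, while `rel_tol=0.10` gives the "10% or 5 bpm, whichever is greater" rule. Treating the threshold as inclusive is an assumption, as are the function name and the example values.

```python
def within_tolerance(device, reference, abs_tol, rel_tol=0.0):
    """Fraction of pairs whose absolute error is at most max(abs_tol, rel_tol * reference)."""
    hits = 0
    for d, r in zip(device, reference):
        allowed = max(abs_tol, rel_tol * abs(r))
        if abs(d - r) <= allowed:  # inclusive threshold (assumption)
            hits += 1
    return hits / len(reference)


# Hypothetical HR pairs in bpm; absolute errors are 4, 7, 8 and 11 bpm.
reference_hr = [60, 60, 100, 100]
device_hr = [64, 67, 108, 111]

fixed = within_tolerance(device_hr, reference_hr, abs_tol=5)               # 5 bpm only
ansi = within_tolerance(device_hr, reference_hr, abs_tol=5, rel_tol=0.10)  # 5 bpm or 10%
```

In this example the fixed 5 bpm criterion accepts one of the four pairs and the percentage-based rule accepts two, which shows that the fixed threshold is the stricter choice at higher heart rates.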
To quantify the implications of the difference between the Healthdot and the reference monitor, a Clarke error grid analysis was performed [19]. Since the normal resting HR and RR of the included patients are unknown, the size of the alteration from a normal resting HR and RR could not be estimated. Therefore, the collected data was distributed between different zones based on the cut-off boundaries of the Modified Early Warning Score [8,20]. Zone A represents data points that differ less than 20% from the reference or are correctly identified as bradycardia/bradypnea. Zone B represents data points that differ more than 20% but would not cause unnecessary treatment. Zone C represents points that would lead to unnecessary treatment for patients with normal vital parameters. Zone D represents failure to detect impaired HR or RR. Zone E represents data points where impaired HR or RR are confused. Since the Healthdot transfers 5-min averages to a cloud server, and in future clinical practice treatment decisions would therefore be made on 5-min intervals rather than on second-to-second data, the Clarke error grid analysis was repeated for data averaged over 5-min intervals as well.
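The 5-min averaging used for this repeated analysis can be sketched as a block average over consecutive 300 s windows. The sketch assumes that low-quality samples have already been removed and that windows are aligned to the start of the time base; how the Healthdot itself handles low-quality samples within a window is not specified here. The function name and the example data are hypothetical.

```python
def block_average(times, values, window=300.0):
    """Average values in consecutive windows of `window` seconds, keyed by window start time."""
    sums, counts = {}, {}
    for t, v in zip(times, values):
        start = (t // window) * window  # start time of the window containing t
        sums[start] = sums.get(start, 0.0) + v
        counts[start] = counts.get(start, 0) + 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Hypothetical 10 min of 1 s HR data: 60 bpm for 5 min, then 70 bpm for 5 min.
times = list(range(600))
values = [60] * 300 + [70] * 300
averages = block_average(times, values)
```

Windows without any valid sample simply do not appear in the result, which corresponds to a 5-min interval for which no high-quality data is available.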

Results
A total of 34 patients was enrolled (Fig. 1). For HR analysis, 9 patients were excluded due to insufficient recording length (n = 6) or insufficient high-quality data (n = 3). For RR analysis, 13 patients were excluded due to insufficient recording length (n = 9) or unavailable high-quality capnography reference data (n = 4). Demographics of the study population are shown in Table 1.
For HR, a total of 106 h of data was collected from 25 patients with a median of 3 h per patient (min: 32 min, max: 10 h). Of the second-to-second measurements, 60% had sufficient quality (quality index >0) for analysis. After discarding the low-quality proportion, 1% of the total amount of measurements had to be discarded due to low quality reference measurements.
For RR, a total of 102 h of data was collected, originating from 21 patients. A median of 5 h of RR measurements was collected per patient (minimum: 40 min, maximum: 11 h). Of the second-to-second measurements, 97% had sufficient quality to be analyzed. After this filter, another 25% had to be excluded due to poor reference quality. Details on the availability are shown in Table 2.
Fig. 2 and Table 2 show the results of the Bland-Altman analysis. For HR, the bias was 0.23 bpm and 97% of the Healthdot measurements were within 5 bpm of the reference measurements. The Bland-Altman plot shows a small number of data points in the upper right corner for which the Healthdot measurement is twice as high as the reference measurement. These 'doubled' measurements originate from a single patient. For RR, the bias was −0.28 rpm, and the absolute difference between the Healthdot and the reference monitor was within 3 rpm in 87% of the measurements.
The Clarke error grid analysis in Fig. 3 and Table 3 illustrates the clinical relevance of the differences. For both HR and RR, the majority of the measurements lie in zone A (HR: 99.8%, RR: 83.2%). Zone B includes all remaining data points for HR and the majority of the remaining data points for RR (HR: 0.2%, RR: 14.7%). For RR, a limited percentage of the measurements lies in zones C, D and E (0.1%, 1.9% and 0.1%, respectively). Data points in these zones might cause unnecessary or inadequate treatment and are therefore undesirable. The combination of zones A and B, which are the clinically acceptable zones since data points in these zones will not lead to any unnecessary treatment, includes 100% of the HR measurements and 97.8% of the RR measurements.
Next, the data was averaged over 5-min intervals, leading to a reduction in the number of data points. The downsampled HR data showed a bias of −0.28 bpm, and 96% of the measured HR values were within the predefined 5 bpm. For 96% of the 5-min intervals, high quality HR data was available. Excluding the data from the patient with the 'doubled' frequency would result in a bias of −1.17 bpm. For RR, the downsampled data had a bias of −0.29 rpm, and 95% of the measurements were within the predefined 3 rpm. For 99% of the 5-min intervals, high quality RR data was available.
The Clarke error grid analysis was repeated for the downsampled data; the results are shown in Fig. 3 and Table 3. For HR, 100% of the downsampled data points were within the clinically acceptable zones A and B. For RR, 99.4% was within these zones, indicating that in less than 1% of the RR data points the difference between the Healthdot and the reference could result in failure to detect tachy/bradypnea.

Discussion
The accuracy of the Healthdot in measuring vital parameters in patients after major abdominal surgery was assessed by comparison with vital parameters measured by the reference patient monitor. On second-to-second analysis, 97% of the HR and 87% of the RR measurements were within 5 bpm and 3 rpm of the reference monitor, provided that good quality data was obtained. Since good quality data was only 63% of the measurements, the Healthdot is not able to replace beat-to-beat measurements. The Healthdot is intended to monitor patients remotely using the 5-min averaged data that it sends to a cloud server. Therefore, the accuracy of 5-min averaged data was also assessed. Averaging reduced the proportion of low-quality measurement intervals (from 37% to 4%) and resulted in 96% of the HR and 95% of the RR measurements within 5 bpm and 3 rpm of the reference monitor, demonstrating that the Healthdot is able to accurately measure HR and RR on 5-min intervals.
To assess the clinical implications of the differences between the Healthdot's vital measurements and the reference monitor's measurements, a Clarke error grid analysis was performed. The 5-min averaged data was in the clinically acceptable zones for at least 99% of both HR and RR measurements. We therefore conclude that the clinical performance of the Healthdot in the current cohort is good.
The accuracy of the Healthdot for 5-min averaged data was previously studied in bariatric patients, where 90.5% of the HR measurements were within the 5 bpm threshold and 88.5% of the RR measurements were within the 3 rpm threshold [13]. The higher percentages found in the present study could be explained by the higher BMI in the bariatric study population, as the presence of subcutaneous fat around the chest might impair accelerometer measurements. In both the bariatric cohort and the present major abdominal cancer surgery cohort, the Healthdot was accurate and can be considered for remote monitoring of postoperative patients.
The accuracy of other wearable devices that monitor HR and/or RR has been published, including the performance of the Philips biosensor, where 72.8% of the RR measurements lie within the 3 rpm threshold in a minute-to-minute comparison [21]. In the present study, 87% of the second-to-second comparisons and 95% of the 5-min averaged Healthdot data met the 3 rpm threshold; the Healthdot thus appears to be able to measure RR more accurately than the biosensor. Breteler et al. studied the clinical performance of two other wearable patch sensors, the SensiumVitalis system and the HealthPatch MD, in high-risk surgical patients. For the SensiumVitalis patch, 99% of the HR and 92% of the RR measurements were within the clinically acceptable zones A and B in a Clarke error grid analysis of data analyzed with a frequency of once a minute [20]. Our second-to-second analysis showed a similar performance despite the higher sampling frequency. For the HealthPatch, Breteler et al. reported 100% of the HR and 77% of the RR measurements in zones A and B on 1-min interval data [20]. Again, our results for the Healthdot show similar accuracy for HR. For RR, however, the Healthdot results show a markedly higher percentage of data in the clinically acceptable zones compared to the HealthPatch.
Fig. 3. Clarke error grid for HR (left) and RR (right) comparing the measurements of the reference monitor (x-axis) and the Healthdot measurements (y-axis). Panel I shows results for the second-to-second comparisons, panel II shows results for the 5-min averaged data. Zone A represents data points that differ less than 20% from the reference or are correctly identified as bradycardia/bradypnea. Zone B represents data points that differ more than 20% but would not cause unnecessary treatment. Zone C represents points that would lead to unnecessary treatment for patients with normal vital parameters. Zone D represents failure to detect brady- or tachycardia/pnea. Zone E represents data points where brady- and tachycardia/pnea are mixed up.
A limitation of the Healthdot is the large amount (37%) of low-quality data for HR in 1 s measurements. This was reduced to 4% after 5-min averaging. Additionally, in 3 patients the HR measured by the Healthdot was of insufficient quality to include them in the analysis. The amount of missing HR data and the delay in reported vitals due to the 5-min averaging make the device unsuitable for high-care environments such as the operating theatre, post-anesthesia care unit or ICU, where real-time beat-to-beat measurements are required. However, the device's 5-min interval results in an up to 96-fold increase in the number of measurements in the general ward, where spot checks are typically performed once every 8 h. Even in case of missing data, the device is still able to measure vital parameters much more frequently than in current clinical practice and will therefore be a valuable tool. Moreover, the device could be suitable for monitoring vital parameters of patients at home after discharge.
A limitation of the reference data is the use of capnography, despite it being the gold standard for perioperative RR monitoring [22]. Measurements can be compromised by movement or removal of the nasal cannula or by occlusion of the nasal cannula against the nasal mucosa [23]. Therefore, measurement of respiration rate remains a challenging task in the spontaneously breathing population. These difficulties resulted in 27% low-quality capnography reference data and 4 patients in whom no or insufficient reference data could be obtained. Additionally, recording length was impaired as the logging software occasionally failed to work.
We observed an abnormally high HR in one of the patients, approximately twice as high as the reference HR. These 'doubled' HR measurements were previously observed in 3 patients from the bariatric cohort in which the Healthdot's accuracy was studied. The hypothesis was that contraction of the atria followed by contraction of the ventricles could be assessed by the Healthdot as separate heart beats. Consequently, the calculated heart rate is doubled [13]. These 'doubled' HR measurements could trigger unnecessary alarms in clinical practice. Nevertheless, inclusion of the patient with outlying HR measurements did not compromise the overall clinical performance of the Healthdot.
Future research should focus on the real-world clinical use of vital parameters measured by a wearable in the general ward and at home after discharge. New early warning scores incorporating continuous measurements of merely HR and RR can be developed, compared to conventional early warning scores, validated and implemented in clinical practice for the purpose of early detection of postoperative complications. The present study establishes the accuracy of vital parameters measured by a wearable in a realistic, clinically relevant, high-risk postoperative population and is an important first step towards the development of wearable-based early warning scores.

Conclusions
The Healthdot accurately measured HR and RR in the present cohort of patients recovering from major abdominal surgery. Providing unobtrusive, automatic, wireless and continuous measurement of heart and respiratory rate with sufficient coverage on 5-min intervals, the Healthdot is clinically acceptable when measurements of sufficient quality have been obtained. The Healthdot may therefore be valuable in low-acuity care settings such as the general ward or at home.

Funding
The author(s) received no specific funding for this work.

Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: R. Bouwman acts as clinical consultant for Philips Research in Eindhoven, The Netherlands. This did not affect the work or decision to submit this paper for publication.