Heart Rate Accuracy from Wearable Wristbands

Labfront's data science team ran experiments comparing a Polar H10 and a Garmin Vivosmart 4 to answer the question, "How good is the quality of heart rate data from a smartwatch?"

Feb 7, 2022
By Dr. Francis Hsu and Dr. Han-Ping Huang

Since we launched Labfront, the most frequently asked question has probably been “Is the quality of signals from the watch any good?” It is usually followed by “Under which circumstances can we reliably trust the signals?” We wanted answers to these questions too, so we ran a number of experiments. In this blog post, we share our results in the hope of advancing your understanding of the strengths and limitations of wearable devices and of making Labfront more useful in your research.

Heart Rate Accuracy from a Wearable Wristband during Daily Activity

1. Motivation

Heart rate (HR) and heart rate variability (HRV) are related to mental and physical health. Wearable smart bands are a convenient way to collect HR/HRV data and are becoming increasingly popular for measuring stress and exercise intensity, detecting arrhythmia, and so on. Watches derive heart rate by measuring the changes in vascular blood flow during the cardiac cycle using a photoplethysmography (PPG) sensor. However, recent studies have suggested that the accuracy of HR measured by a PPG-based sensor is susceptible to confounding factors such as physical movement and upper arm muscle contractions. To investigate this further, we started a series of experiments to validate the accuracy of heart rate data measured by the smart band.

2. Experiment 

Two young, healthy male participants wore the Garmin Vivosmart 4 and Polar H10 for 24 hours on a regular workday. They manually recorded the time and duration of daily activities, including sleeping, (computer) typing, chatting, and walking. 

The devices’ beat-to-beat HR data was uploaded to the cloud through the PhysioQ app and then downloaded from the Labfront website.

We tested the accuracy of HR data from the Garmin Vivosmart 4 wristband by comparing it with data from the Polar H10 HR strap (Polar Electro Oy). The Polar H10 is a chest-strap device that is widely considered the most accurate method for obtaining heart rate, since it uses electrocardiographic (EKG) signals rather than PPG-based signals. In other words, HR is determined by directly measuring the electrical activation of the heart and not indirectly through blood flow changes in the wrist.

To test the differences in HR data between the Vivosmart 4 and the Polar H10 during different daily activities, we extracted three separate, continuous 10-minute segments for each activity type from each participant.
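
The article does not describe how the two devices' samples were paired within a segment, so the following is only a minimal sketch of one plausible approach: average each device's readings into fixed time bins inside a 10-minute window and keep the bins where both devices have data. The function names, the `(timestamp_seconds, hr_bpm)` sample format, and the 5-second bin width are all assumptions made for illustration, not details from the study.

```python
from collections import defaultdict


def bin_means(samples, start, end, bin_s):
    """Average (timestamp_s, hr_bpm) samples into fixed bins within [start, end)."""
    sums = defaultdict(lambda: [0.0, 0])  # bin index -> [sum of HR, sample count]
    for t, hr in samples:
        if start <= t < end:
            idx = int((t - start) // bin_s)
            sums[idx][0] += hr
            sums[idx][1] += 1
    return {idx: total / n for idx, (total, n) in sums.items()}


def paired_segment(test, reference, start, duration_s=600, bin_s=5):
    """Return (test_hr, reference_hr) pairs for bins where both devices have data.

    duration_s=600 matches the 10-minute segments in the text;
    bin_s=5 is an arbitrary illustrative choice.
    """
    end = start + duration_s
    a = bin_means(test, start, end, bin_s)
    b = bin_means(reference, start, end, bin_s)
    common = sorted(a.keys() & b.keys())
    return [(a[i], b[i]) for i in common]


# Made-up samples, for illustration only.
wrist = [(0, 60), (1, 62), (6, 64), (700, 90)]  # the sample at 700 s is outside the window
chest = [(0.5, 61), (7, 65), (12, 66)]
pairs = paired_segment(wrist, chest, start=0)
print(pairs)
```

Binning trades time resolution for robustness to small clock offsets between devices; a different pairing scheme (for example, interpolation onto a common grid) would be equally reasonable.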

3. Results

Figure 1 shows sample HR time series from the Vivosmart 4 and the Polar H10 during unrestricted, real-life scenarios. In general, the HR from the two devices matched well across all four activities, with most HR differences below 5 bpm. In particular, the two time series are nearly identical during sleep. However, when typing, chatting, or walking, the HR series from the Vivosmart 4 appears noisier than that from the Polar H10.

Figure 1. Examples of Heart Rate Series during Different Activities Measured by Vivosmart 4 and Polar H10

There are several established ways to objectively quantify the accuracy of a device when compared to a “gold standard”. In this case, the device being evaluated is the PPG-based Garmin Vivosmart 4, and the “gold standard” is the Polar H10. A Bland-Altman plot is one method for evaluating the level of agreement between two devices. Suppose that one morning your Vivosmart 4 shows an HR of 102 bpm while the Polar H10 shows 106 bpm. This difference of 4 bpm might be considered acceptable when the HR is in the 100s but less so when the HR is in the 40s. For this reason, the HR difference is plotted against the average HR of the two measures, in this case 104 bpm.

Figure 2a-d show the Bland-Altman plots for the HR data shown in Figure 1a-d, respectively. The y-axis is the difference in HR obtained simultaneously by the two devices; the x-axis is the average of the HR values obtained from the Vivosmart 4 and the Polar H10. In each subplot, the solid line represents the mean of the HR differences, and the two dashed lines mark the mean ± two standard deviations (SD). Assuming a normal distribution, about 95% of the data would fall within the range demarcated by the two dashed lines. In other words, if your Garmin watch shows an HR of 81 while chatting (Figure 2c), you could be ~95% confident that the true HR (as determined by the Polar H10) is within 71 to 91 bpm, since 2 SD was approximately 10 bpm.
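
As a rough illustration of the quantities behind a Bland-Altman plot, here is a minimal Python sketch that computes the per-sample averages (x-axis), the per-sample differences (y-axis), the mean difference (solid line), and the mean ± 2 SD limits (dashed lines). The function name, the sign convention (Vivosmart minus Polar), and the sample values are assumptions for illustration; they are not taken from the study data.

```python
import statistics


def bland_altman(test_hr, reference_hr):
    """Compute Bland-Altman quantities for two paired HR series (bpm)."""
    if len(test_hr) != len(reference_hr) or len(test_hr) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    # Assumed sign convention: test device minus reference device.
    diffs = [t - r for t, r in zip(test_hr, reference_hr)]
    means = [(t + r) / 2 for t, r in zip(test_hr, reference_hr)]
    bias = statistics.mean(diffs)   # solid line in the plot
    sd = statistics.stdev(diffs)    # sample standard deviation of the differences
    return {
        "means": means,             # x-axis values
        "diffs": diffs,             # y-axis values
        "bias": bias,
        "lower": bias - 2 * sd,     # lower dashed line
        "upper": bias + 2 * sd,     # upper dashed line
    }


# Made-up paired readings; the first pair is the 102 vs. 106 bpm example from the text.
vivosmart = [102, 98, 100, 95]
polar = [106, 97, 101, 96]
result = bland_altman(vivosmart, polar)
print(result["means"][0], result["bias"], result["lower"], result["upper"])
```

Many references use 1.96 SD rather than 2 SD for the limits; the sketch follows the 2 SD described in the text.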

If you are interested purely in the size of the difference (regardless of which device read higher), you can calculate the mean absolute error (MAE). As noted in Figure 2a-d, the MAE was < 5 bpm in all four activities. But as noted previously, an MAE of 2 bpm matters less at an average HR of 140 bpm than at an HR of 35 bpm. To account for this, the MAE can be “corrected” by dividing it by the HR measured by the Polar H10 to obtain the mean absolute percentage error (MAPE). A MAPE below 10% is generally the criterion for data of sufficient quality [1-3], so by this criterion the Vivosmart 4 is sufficiently reliable for determining HR in general use.
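
The two error metrics can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for the study: the function names are hypothetical, and MAPE is computed with the common per-sample definition (each absolute error divided by the reference HR at that sample), which gives a value close to "MAE divided by reference HR" when HR varies little within a segment.

```python
def mae(test_hr, reference_hr):
    """Mean absolute error in bpm between paired HR series."""
    return sum(abs(t - r) for t, r in zip(test_hr, reference_hr)) / len(test_hr)


def mape(test_hr, reference_hr):
    """Mean absolute percentage error (%), relative to the reference device."""
    return 100 * sum(abs(t - r) / r for t, r in zip(test_hr, reference_hr)) / len(test_hr)


# The same 2 bpm error weighs very differently at high and low heart rates.
high = mape([142], [140])  # 2 bpm error at a reference HR of 140 bpm
low = mape([37], [35])     # 2 bpm error at a reference HR of 35 bpm
print(round(high, 2), round(low, 2))

# Made-up paired readings: absolute errors of 4 and 1 bpm.
print(mae([102, 98], [106, 97]))
```

Under the 10% criterion cited in the text, both example values would pass, but the low-HR case sits much closer to the threshold for the same absolute error.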

Figure 2. Bland-Altman Plots of HR Data from the Two Devices during Different Activities

The results are summarized in Table 1. For each type of activity, there are 6 sessions (3 sessions × 2 participants) of 10-minute HR data.

Table 1. Summary of the mean error, MAE, and MAPE during different activities.

4.  Conclusions

During sleep, both the MAE and MAPE values are quite low, suggesting that the HR from the Vivosmart 4 wristband is accurate during sleep. The MAE and MAPE values are greater during the other three activities, indicating that the Vivosmart 4 may not be as accurate during physical activity. Nevertheless, for all four activities, the MAPE values are less than 10%, so the HR from the Vivosmart 4 may be considered reliable during these daily activities.

As a general rule, based on this data, we would be comfortable using a Garmin wristband to measure heart rate over the course of the day, although the PPG sensor does tend to give noisier HR during more physically active motions.

5.  Future Validations 

In the current study, we tested HR accuracy during four controlled activities. The influence of other factors on HR accuracy, such as watch snugness or motion artifacts (for example, from several simultaneous activities or sudden movements), still needs to be explored.

What other validation studies would you like to see? Tell us what you want to know at hello@labfront.com

6. References

[1] B. W. Nelson and N. B. Allen, “Accuracy of consumer wearable heart rate measurement during an ecologically valid 24-hour period: Intraindividual validation study,” JMIR mHealth uHealth, vol. 7, no. 3, p. e10828, Mar. 2019.

[2] B. D. Boudreaux et al., “Validity of wearable activity monitors during cycling and resistance exercise,” Medicine Sci. Sports Exercise, vol. 50, no. 3, pp. 624–633, 2018.

[3] H. W. Chow and C. C. Yang, “Accuracy of optical heart rate sensing technology in wearable fitness trackers for young and older adults: Validation and comparison study,” JMIR mHealth uHealth, vol. 8, no. 4, 2020.
