Why Device Data Matter So Much to Population Health Analytics

While positive behavioral change from wearable usage hasn’t been conclusively demonstrated, the value of the data they produce is indisputable.

Daniel Addyson is a data scientist for Aetna. His writing is based on his experience and opinions on data analysis. It is not company-specific information or “insider knowledge.” This is intended to be a general overview, not to be interpreted as speculation on any agreements or future actions by Aetna, Apple, or any other company.

Both the healthcare and tech industries are showing great interest in getting consumers to track their health with mobile devices, and more specifically, wearables. The market for them is projected to double by 2021. There is a lot of optimism, which is not necessarily well-founded, that health-tracking devices can become important tools in population health management, since consumers will have insight into a vast array of personal health metrics.

There is, however, a less-obvious drive towards wearable adoption in the healthcare industry: data collection. While positive behavioral change from wearable usage hasn’t been conclusively demonstrated, the value of the data that wearables produce is indisputable.

In short order, I’ll discuss why healthcare companies (insurers, hospitals, pharma) care about device data; what kinds of data can be collected from a device; and how those data are collected, aggregated, and turned into information that can be used for modeling consumer behavior.

Wearable data represent variety

We now live thoroughly in the age of “big data*.” To most analysts, that term doesn’t mean much in practice. Data are either useful or they aren’t. But there are 4 very loosely-defined criteria that determine whether an organization has big data:

  • Volume (a lot of data)
  • Velocity (high frequency intake)
  • Veracity (data are good quality)
  • Variety (data come from a variety of different sources)

The one criterion that may be hardest to achieve, at least in healthcare, is variety. The general goal is to collect as many types of data as possible to determine what factors drive an outcome. For a health insurer, variety might include claims data, electronic health record (EHR) data, lab and pharmacy data, and publicly-available demographic data (e.g. Census data).

When trying to predict health events, there’s a problem with most of the sources listed above: they’re largely outcomes. Hospital visits, costs, and even lab results are all the products of underlying conditions and events. The actual drivers of health (what you eat, how much you exercise, how well you sleep) are vastly more difficult to derive. And this is why sources of behavioral data are so important to population health applications.

Consequently, modeling healthcare data isn’t so different from quantitative stock trading: essentially “backtesting” with a few controlling covariates. As anyone invested in the stock market will know, past performance does not guarantee future returns.

What data can be collected from wearable devices?

Data from wearables can capture some of our basic behaviors, or at least their correlates. For example, the Apple Watch essentially collects 2 types of data: heart rate and movement. From these 2 data types, Apple can infer how much and what kind of exercise you’ve done, as well as your length and quality of sleep. Additionally, the wearer can set their own exercise and sleep goals, which can also be tracked by the company.

But there are other, less-obvious data that can be collected from devices, particularly smartphones. Depending on licensing agreements, 1 app may be able to access other apps on the phone, which can be used to directly collect data like shopping habits. (The retail industry has this down to a science; Joseph Turow’s The Aisles Have Eyes provides an extensive overview of that industry’s well-monitored digital ecosystem.)

The ways and times you use your smartphone are themselves behaviors, which have been shown to predict mental illness events. And, although this is not directly health-related, Aetna has begun implementing behavior-based security protocols to protect against unauthorized access to its app on users’ phones.

Turning data into information: machine learning applications

Once collected, those data points must be turned into usable information. There are numerous events and trends that payers and providers are interested in using machine learning and big data to predict, like ER visits, disease management, hospital readmissions, and costs. In all of these cases, behavioral metrics can be added as input data, providing a more complete picture of the individual.
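
As a rough illustration of adding behavioral metrics as input data, the sketch below collapses daily device readings into member-level features and attaches them to claims-derived features. Everything in it is hypothetical: the member IDs, field names (er_visits_12m, avg_daily_steps, and so on), and values are invented for the example and do not describe any company's actual data or pipeline.

```python
from statistics import mean

# Hypothetical claims-derived features keyed by member ID (illustrative values only).
claims_features = {
    "m1": {"er_visits_12m": 2, "total_cost_12m": 8400.0},
    "m2": {"er_visits_12m": 0, "total_cost_12m": 950.0},
}

# Hypothetical daily device readings: (member_id, steps, sleep_hours).
device_readings = [
    ("m1", 3200, 5.5),
    ("m1", 4100, 6.0),
    ("m1", 2900, 5.0),
]


def summarize_device_data(readings):
    """Collapse daily readings into one set of behavioral features per member."""
    by_member = {}
    for member_id, steps, sleep_hours in readings:
        by_member.setdefault(member_id, []).append((steps, sleep_hours))
    return {
        member_id: {
            "avg_daily_steps": mean(s for s, _ in days),
            "avg_sleep_hours": mean(h for _, h in days),
            "days_observed": len(days),
        }
        for member_id, days in by_member.items()
    }


def build_model_inputs(claims, behavior):
    """Attach behavioral features to claims features, flagging members without device data."""
    rows = {}
    for member_id, features in claims.items():
        row = dict(features)
        device = behavior.get(member_id)
        row["has_device_data"] = device is not None
        for name in ("avg_daily_steps", "avg_sleep_hours", "days_observed"):
            # Leave the value missing (None) rather than guessing when no device is linked.
            row[name] = device[name] if device else None
        rows[member_id] = row
    return rows


model_inputs = build_model_inputs(claims_features, summarize_device_data(device_readings))
```

Keeping an explicit has_device_data flag, instead of silently filling in zeros, lets a downstream model distinguish "did not exercise" from "did not wear a device."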

Additionally, healthcare data are often held up while claims, pharmacy, and lab records are processed and eventually collected and stored for analysis. The consumer can also delay the healthcare process simply by choosing to put off care for a condition. Wearable and app data can circumvent these delays by providing a continuous data stream that reflects health status.

The shortfalls of device data

Device and app data are not, however, a panacea. A few of the challenges are:

  • Selection bias. Device wearers tend to be already more engaged in improving their health. The characteristics of the “wearable-tracking” population need to be compared with those of the “non-wearable” population to see if they can reasonably be analyzed together. There are additional considerations with including these data points in machine learning pipelines if the entire population won’t have those features available for future scoring.
  • The wearer can unplug. People forget to charge their watches, or choose to turn off their phone’s geotracking. So even the data from your digitally-connected consumers can be spotty or incomplete.
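
One way to make both challenges concrete is sketched below. It uses a standardized mean difference to compare a characteristic across the two populations, and a simple coverage ratio to quantify how spotty a wearer's data are. The age values, the function names, and the 90-day window are assumptions made up for illustration, and the standardized mean difference is one common choice of comparison, not a method prescribed here.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Difference in group means, scaled by the pooled standard deviation."""
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    if pooled_sd == 0:
        raise ValueError("both groups have zero variance; the difference cannot be scaled")
    return (mean(group_a) - mean(group_b)) / pooled_sd


def coverage(days_with_data, days_in_window):
    """Fraction of days in the observation window with at least one device reading."""
    return days_with_data / days_in_window


# Hypothetical ages for members with and without a linked wearable.
wearable_ages = [34, 41, 29, 38, 45]
non_wearable_ages = [52, 61, 47, 58, 66]

smd = standardized_mean_difference(wearable_ages, non_wearable_ages)

# Hypothetical member who wore the device on 63 of the last 90 days.
member_coverage = coverage(63, 90)
```

A common rule of thumb treats an absolute standardized mean difference above roughly 0.1 as a sign that the groups differ meaningfully on that characteristic; the right cutoff, like the minimum acceptable coverage, is a judgment call for the analyst.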

Nevertheless, healthcare companies (payers and providers, in particular) are hoping to gain greater digital connection with their consumers. As this digital connection increases, organizations are banking on potential cost savings derived from richer data sources. As long as we remember to recharge the batteries, that is…or until we all get chipped.

* All of these terms are relative. Bookkeepers in the 1800s probably saw their accounting logs as big data, and any company that owned an IBM mainframe in 1952 would have claimed they had a “big data platform,” had the term existed then.