EUROPEAN SOCIETY OF CARDIOLOGY ®

State-of-the-art paper

Data-driven healthcare: from patterns to actions

M Grossglauser1 and H Saner2

European Journal of Preventive Cardiology 2014, Vol. 21(2S) 14–17
© The European Society of Cardiology 2014
Reprints and permissions: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/2047487314552755
ejpc.sagepub.com

Abstract

The era of big data opens up new opportunities in personalised medicine, preventive care, chronic disease management and in telemonitoring and managing of patients with implanted devices. The rich data accumulating within online services and internet companies provide a microscope to study human behaviour at scale, and to ask completely new questions about the interplay between behavioural patterns and health. In this paper, we shed light on a particular aspect of data-driven healthcare: autonomous decision-making. We first look at three examples where we can expect data-driven decisions to be taken autonomously by technology, with no or limited human intervention. We then discuss some of the technical and practical challenges that can be expected, and sketch the research agenda to address them.

Keywords

Big data, analytics, data mining, preventive care

Accepted 4 September 2014

1 School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Switzerland
2 Department for Preventive Cardiology and Sports Medicine, University Clinic for Cardiology, Bern, Switzerland

Corresponding author: M Grossglauser, School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland. Email: [email protected]

Downloaded from cpr.sagepub.com at UCSF LIBRARY & CKM on March 10, 2015

Introduction

Electronic traces of human behavioural patterns have accumulated at a breathtaking pace over the last two decades, along with the spread of the social web and of mobile applications running on smartphones. This has opened up new and exciting avenues for scientific research, both in computer science, where the technologies to handle this information glut are developed, and in other disciplines, such as the social sciences, humanities, linguistics, and medicine, where completely new research questions about large populations can be asked.

Usually, such user data is not primarily collected for research purposes, but for operational reasons. The primary business model for online and mobile services is currently based on different forms of advertisement, and great efforts are expended to optimise its effectiveness. This in turn has driven technology developments in databases, data mining, and machine learning to handle massive amounts of data, build models of human behaviour patterns, and take optimal decisions in real time, e.g. for advertisement placement.

This era of ‘big data’ is reflected in the research agenda in information and communication technology (ICT). We identify three major trends.

The first trend concerns uncertainty in data. At the risk of a gross oversimplification, classical statistics is concerned with squeezing out as much basic information as possible from limited data, while data mining and machine learning tend to focus on extracting more complex patterns and insights from very large datasets. Furthermore, a lot of the data collected in the digital economy are not structured, i.e. do not conform to a strong a priori specification. For example, the structure of social networks, the content of text messages, or human mobility patterns are not well modelled in the classical relational data model, and there is a lot of noise and uncertainty inherent in such data. A major focus of machine learning is to deal with such complex and noisy data, and to extract robust patterns from it.

The second trend is scale. The explosion of information volumes is well known. While these exponential growth trends are quite predictable at a macroscopic scale, the rate of change for an individual service, for example, can be breathtaking. A smartphone app can become very popular almost overnight, with an explosion in traffic that can be challenging to handle. Over the last decade, the cloud computing paradigm has taken over: all but the largest service providers no longer own their infrastructure, but rent it on demand from a small number of cloud providers. Among the advantages of this paradigm are the flexibility to bring in resources only as needed, and reduced capital investments.

The third trend is energy, which is increasingly the most critical resource in the big data universe. While computing power and storage continue to get exponentially cheaper, the energy needs of the global cloud computing infrastructure are slated to surpass those of other major industries (one recent estimate expects the carbon footprint of the global cloud infrastructure to surpass that of the airline industry in 2020). A lot of research effort is currently going into reducing the energy footprint of the digital economy. This ranges from semiconductor improvements all the way to smarter algorithms that dynamically deactivate unnecessary resources in data centres.

For the medical community, these trends are relevant for several reasons. First, they drive the explosion of accumulating user data, some of which can be studied, e.g. to ask new questions about the interplay of human behaviour and health outcomes. Second, they imply that the barrier to developing new user-facing tools, e.g. for preventive care or patient assistance, is now much lower than before. Most patients in the developed world possess a smartphone, which makes it quite easy to get these tools into their hands.

Historically, using data in medicine has meant analysing it in an ‘offline’ manner to discern patterns, correlations, and predictors. Such analytical tools have had tremendous success, and are embodied, for example, in the rigorous framework of medical trials that governs how new drugs and treatments come to market.

The world of big data provides new opportunities to the medical community. There are massive amounts of new data to be harnessed; at the same time, this data is often unstructured, and not curated in the rigorous manner that classical data analysis relies on. This calls for new methods, and new ways to ask questions, and has recently been recognised as an exciting new research frontier.1,2

In this paper, we consider a specific aspect of this development: when data-driven actions are taken autonomously by systems for the benefit of an individual patient, or even for society as a whole. We illustrate this aspect with a few examples that point the way, and then develop the main research thrusts to bring this agenda to fruition.

Individual patients: monitoring and analytics

Big data technologies are currently making strong inroads in the areas of medical devices and home care. Research in remote monitoring of electrocardiogram (ECG) tracings and tracking of sensor data from devices such as pacemakers or implanted defibrillators has led to substantial progress in the management of such patients and the prevention of their rehospitalisation.3 Using data from multiple sensors to assist independent living of elderly people has great potential to improve the complex healthcare process and to facilitate enhanced and efficient care for the elderly under these circumstances.4 These two contexts have significant potential to improve the quality of care at lower cost.

We can classify the functionality of these systems according to their time-scale. At the shortest time-scale (minutes to hours), critical conditions can be detected more quickly, and actions initiated. At a longer time-scale (days to months), analytics tools can help discern relevant trends and patterns that contribute to personalised treatments.

Some of the actions initiated by such systems might not directly involve medical or care staff, but rely on a degree of autonomy of the system. For example, a home-monitoring system might automatically propose exercises or entertainment options based on monitoring data. Or a patient in a critical situation might be given advice to self-medicate, even before an ambulance can reach the patient.

The promise of such systems is more autonomy for patients, and reductions in cost. On the other hand, the threats to privacy are obvious if data collected by such a system were to leak. Also, a system that takes automatic actions will fundamentally have to trade off ‘false alarms’ against situations where actions are taken too late or inadequately. While such trade-offs exist for human decision-making as well, the legal and ethical issues will be challenging when technology becomes more central.
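The trade-off between false alarms and late or missed actions can be made concrete with a small sketch. The code below is illustrative only: the risk scores, the event labels, and the function name alarm_tradeoff are hypothetical and do not come from any of the monitoring systems cited here. It assumes that a system reduces its sensor data to a single score per observation and raises an alarm whenever that score reaches a threshold.

```python
from typing import List, Tuple


def alarm_tradeoff(
    scores: List[float],
    is_event: List[bool],
    thresholds: List[float],
) -> List[Tuple[float, int, int]]:
    """For each threshold, count false alarms and missed events.

    An alarm is raised when score >= threshold (an assumed, simplified rule).
    Returns a list of (threshold, false_alarms, missed_events).
    """
    results = []
    for t in thresholds:
        # Alarm raised although no critical event occurred.
        false_alarms = sum(1 for s, e in zip(scores, is_event) if s >= t and not e)
        # Critical event occurred but no alarm was raised.
        missed = sum(1 for s, e in zip(scores, is_event) if s < t and e)
        results.append((t, false_alarms, missed))
    return results


# Hypothetical scores and ground-truth labels, for illustration only.
scores = [0.1, 0.3, 0.35, 0.5, 0.6, 0.8, 0.9]
is_event = [False, False, True, False, True, True, True]

for t, fa, miss in alarm_tradeoff(scores, is_event, [0.2, 0.4, 0.7]):
    print(f"threshold={t}: false alarms={fa}, missed events={miss}")
# threshold=0.2: false alarms=2, missed events=0
# threshold=0.4: false alarms=1, missed events=1
# threshold=0.7: false alarms=0, missed events=2
```

In this toy data, raising the threshold removes false alarms at the price of more missed events. Where to set the threshold is the kind of choice that carries the legal and ethical weight discussed above.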

Communities of patients: networking and sharing

People influence each other, and some of this influence today flows through online social networks (Facebook, Twitter, etc.). This fact is relevant both as a tool that can be harnessed, including for health-related questions, and as a model to understand how behavioural patterns emerge. The latter has been well documented, for example, in the work by Christakis and Fowler.2 Their basic thesis is that behavioural patterns can spread like epidemics of infectious diseases, for which they show statistical evidence based on several large-scale datasets.

Although the soundness of their statistical methods has been disputed,5 their findings nevertheless provide new opportunities to predict behavioural patterns of patients based on their social neighbours, and also new ways to influence them for their own benefit.

Beyond this, social networks can be directly harnessed as a tool to help patients. Several online services exist (e.g. http://www.patientslikeme.com/) for patients with specific conditions to join communities, exchange information, and provide and receive support. These networks become rich data sources over large patient populations, which can help uncover new and unexpected patterns.

There are several ways in which such services might take automatic actions on behalf of patients. For example, the way communities form and evolve determines what interactions and information exchanges take place. Also, personalised recommendations and suggestions can be provided to patients. For example, the system might proactively ask questions of a patient or recommend specific medical tests if behavioural patterns suggest a particular risk (e.g. if a patient describes specific symptoms).

Society as a whole: managing epidemics

A third area where we see significant potential for big data technologies is in the management of infectious diseases. Epidemics spreading through person-to-person contacts or through the environment pose a great danger to society, especially in the developing world. It is well known that human behaviour influences the dynamics of epidemics. Counter-measures such as quarantines have historically been used to change these contact dynamics, along with other tools such as immunisation and hygiene, depending on the disease.

The broad availability of mobile communications technology provides new possibilities for countering epidemics. Mobile phones can be used both to collect behavioural fingerprints of individual users, and to provide individualised and real-time recommendations to users, with the goal of changing the contact dynamics of the target population and slowing down or even stopping the epidemic. This should be achieved with less intrusion than a full-scale quarantine or curfew.

For example, even during an outbreak, people have to pursue some specific activities, such as shopping, working, looking after other people or livestock. The resulting contacts provide opportunities for the disease to be passed on to new patients. This is particularly harmful if a contact, at a supermarket, say, is between two people who belong to two different social circles; if they belong to the same social circle (i.e. people who encounter each other on other occasions anyway), then that contact is less likely to lead to the infection of a whole new subcommunity. Therefore, if people went to the supermarket in a staggered manner, to minimise contact with people in other subcommunities, the epidemic could be slowed without actually quarantining people at home.

A recent study explores this idea using real mobility data from a mobile phone operator.6 This study looks at simulated cholera epidemics in a country, based on actual mobility patterns reproduced from mobile phone call-data records (CDRs). The authors show that such an epidemic could be slowed significantly, even if only some fraction of the population (20%, say) participates in the scheme and follows the recommendations. While this study is preliminary, it does suggest the potential of the combination of mobile and big data technology as a tool to combat large-scale epidemics.
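The supermarket example can be illustrated with a toy calculation. The sketch below is not the method of the cited study: it does not simulate an epidemic or use mobility data, and it only counts how many pairs of people from different subcommunities would share a shopping time slot. The community sizes, the number of slots, the rule that each community is recommended one slot, and all function names are assumptions made for illustration.

```python
import random
from collections import defaultdict
from itertools import combinations
from typing import Dict, Hashable


def cross_community_contacts(
    slots: Dict[int, int], community: Dict[int, Hashable]
) -> int:
    """Count pairs that share a time slot but belong to different communities."""
    by_slot = defaultdict(list)
    for person, slot in slots.items():
        by_slot[slot].append(person)
    count = 0
    for people in by_slot.values():
        for a, b in combinations(people, 2):
            if community[a] != community[b]:
                count += 1
    return count


def assign_slots(
    community: Dict[int, int],
    n_slots: int,
    participation: float,
    rng: random.Random,
) -> Dict[int, int]:
    """Participants use the slot recommended for their community;
    everyone else picks a slot uniformly at random (assumed behaviour)."""
    slots = {}
    for person, c in community.items():
        if rng.random() < participation:
            slots[person] = c % n_slots  # hypothetical recommendation rule
        else:
            slots[person] = rng.randrange(n_slots)
    return slots


# Hypothetical population: 200 people in 4 communities of 50.
community = {p: p % 4 for p in range(200)}
rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

for participation in (0.0, 0.2, 1.0):
    slots = assign_slots(community, n_slots=4, participation=participation, rng=rng)
    contacts = cross_community_contacts(slots, community)
    print(f"participation={participation}: cross-community contacts={contacts}")
```

With full participation and one slot per community, cross-community contacts at the supermarket vanish in this toy model. How much partial participation helps depends on the epidemic dynamics, which this sketch deliberately leaves out.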

Discussion and conclusion

In this paper, we outline a research agenda for data-mining technologies to improve health care. Specifically, we argue that significant potential exists for scenarios that go beyond classical statistical analysis and pattern recognition. In these scenarios, data-driven actions are taken in real time and in an automated way.

We have provided three areas where we expect this approach of ‘closing the control loop’ to lead to new solutions in both preventive and non-preventive care. First, the newest generations of implantable devices generate monitoring data that can be harnessed in order to provide feedback and recommendations to patients and health personnel alike; also, home-monitoring solutions can turn raw sensor data into detailed fingerprints of patient behaviour and of trends and anomalies. Second, carefully managed social interactions within communities of interest can provide patients with psychological support and advice, and constitute a rich source of research data. Third, mobile networking technologies provide a new platform to implement real-time recommendation tools to change contact dynamics in a large population (e.g. a whole city), with the aim of slowing or stopping an epidemic of an infectious disease.

Realising this vision will require advances in big data technologies to overcome several challenges. The first challenge is certainly privacy: although privacy threats have been well documented in social networks, payment systems, etc., the risks for patients’ medical data are of course in another league. These risks span the range from wrong treatment decisions and abuses (employment, insurance, etc.) to malicious manipulation with the intent to do harm. If strong assurances cannot be given to patients, the adoption of these technologies will be in jeopardy. This is not only a legal issue but also a technical issue, and research on privacy-preserving data mining will need to provide the tools to extract knowledge from data under strict and provable privacy guarantees.

A second challenge lies in dealing with the inherent uncertainties in such systems. Data that measures human behaviour from sensors that are subject to noise and errors will necessarily lead to decisions that are themselves subject to uncertainties. Although this is true for human decision-making as well, the acceptance of systems that make autonomous decisions may very well hinge on their ability to handle uncertainty in a systematic way. At scale, errors will be inevitable given these uncertainties, and the legal and ethical frameworks to deal with this issue will need to be developed and accepted by patients. We can ask: if one could prove that a technology or system makes mistakes, but, statistically speaking, performs at least as well as a human decision-maker, would this suffice to gain patient acceptance? In the end, this will be the litmus test for the degree of automation that patients will tolerate without threatening their bond of trust with medical and care personnel.
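One way to give the phrase ‘performs at least as well as a human decision-maker’ a statistical meaning is a non-inferiority comparison of error rates. The sketch below is an assumption on our part, not a procedure proposed in this paper: it uses a one-sided z-test with a normal approximation for two independent proportions, and the error counts, the tolerance margin, and the function name noninferiority_test are hypothetical.

```python
import math
from typing import Tuple


def noninferiority_test(
    err_sys: int, n_sys: int, err_human: int, n_human: int, margin: float = 0.0
) -> Tuple[float, float]:
    """One-sided z-test for H0: p_sys - p_human >= margin
    against H1: p_sys - p_human < margin.

    A small p-value supports the claim that the system's error rate exceeds
    the human error rate by less than `margin`.
    Uses the unpooled normal approximation, which assumes large samples.
    """
    p_sys = err_sys / n_sys
    p_human = err_human / n_human
    se = math.sqrt(p_sys * (1 - p_sys) / n_sys + p_human * (1 - p_human) / n_human)
    if se == 0:
        raise ValueError("standard error is zero; the approximation does not apply")
    z = (p_sys - p_human - margin) / se
    # Standard normal CDF evaluated at z, i.e. P(Z <= z).
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return z, p_value


# Hypothetical counts: 40 errors in 1000 automated decisions,
# 60 errors in 1000 human decisions, tolerance of one percentage point.
z, p_value = noninferiority_test(40, 1000, 60, 1000, margin=0.01)
print(f"z={z:.2f}, one-sided p-value={p_value:.4f}")
```

Even when such a test comes out in favour of the system, it speaks only to average performance. It says nothing about which errors are made or how patients perceive them, which is the question of acceptance raised above.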

Conflict of interest

None declared.

References

1. Weber GM, Mandl KD and Kohane IS. Finding the missing link for big biomedical data. JAMA 2014; 311: 2479–2480.
2. Christakis NA and Fowler JH. Connected: The surprising power of our social networks, and how they shape our lives. Little, Brown and Company, 2011.
3. Small RS, Whellan DJ, Boyle A, et al. Implantable device diagnostics on day of discharge identify heart failure patients at increased risk for early readmission for heart failure. Eur J Heart Fail 2014; 16: 419–425.
4. Wood AD, Stankovic JA, Virone G, et al. Context-aware wireless sensor networks for assisted living and residential monitoring. IEEE Network 2008; 22: 26–33.
5. Lyons R. The spread of evidence-poor medicine via flawed social-network analysis. Stat Politics Policy 2011; 2.
6. Kafsi M, Kazemi E, Maystre L, et al. Mitigating epidemics through mobile micro-measures. Presented at: NetMob Conference, Boston, MA, May 2013, arXiv paper, abs/1307.2084.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
