This editorial refers to ‘Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis’, by M. Motwani et al., on page 500.

Decision-making in medicine is based on the factual knowledge of the physician/practitioner, but is significantly influenced by various humanistic factors that play a role in the physician–patient relationship. Despite the increasing amount of electronic data available ‘online’ during the patient encounter, the ability to utilize a patient-centred approach is a central quality of superior practitioners. However, electronic health records (EHRs) and use of smart computer systems are having an increasing impact on established approaches in medicine.

The basic EHR collects and stores all data generated in a healthcare system (e.g. clinical notes, test results, and imaging studies). In addition to these data traditionally generated within the clinical setting, there are several novel feeds of information that can impact patient care. Specifically, wearable devices, sensors, and apps that capture and track various environmental, socio-economic, and other personal data can be valuable in understanding an individual's health status and variations in patient outcomes (Figure 1).1 For example, a recent study reported that a model including socio-economic and geographic variables of patients with heart failure predicted 30-day readmission and all-cause mortality better than a standard assessment alone.2
Figure 1

An illustration of the online availability of patient-centred data at the time of the encounter. The clinical impact of this ‘data cloud’ will need to be examined in future studies. EHR, electronic health record.

‘Smart’ computer systems are not limited to data storage, but also contribute to its collection and analysis. Examples are computer-aided detection (CAD) systems in diagnostic imaging/radiology and automatic data analysis. A recent paper examined the usefulness of CAD for the diagnosis of lung nodules on computed tomography (CT) scans.3 CAD systems detected up to 70% of lung cancers that were initially missed by a radiologist, but failed to detect ∼20% of the lung cancers identified by the radiologist. This suggests that CAD cannot replace the radiologist, but may be useful in the role of ‘second reader’. Automatic analysis (‘data-mining’) of large data sets can be performed with minimal human input. This process is called ‘machine learning’ (ML) and is used for various applications, including weather forecasting and recommending items of interest to consumers using online search engines.4–6 ML uses algorithms to identify expected and unexpected patterns and can consider a greater number and complexity of variables than traditional methods of predictive analysis. ML techniques are increasingly applied to large data sets in healthcare in order to build predictive models, both for individual patients and for larger populations.7,8
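The ‘second reader’ argument can be made concrete with a small worked calculation. The sketch below is illustrative only: the 70% and 20% figures are the ones quoted above, whereas the radiologist sensitivity of 0.80 is a hypothetical value chosen for the example and is not taken from the cited study. False-positive findings are not modelled.

```python
def second_reader_sensitivity(reader_sens: float, cad_recovery: float) -> float:
    """Sensitivity when CAD re-reads only the cancers the radiologist missed."""
    missed_by_reader = 1.0 - reader_sens
    return reader_sens + missed_by_reader * cad_recovery


def cad_alone_sensitivity(reader_sens: float, cad_recovery: float,
                          cad_miss_of_reader_hits: float) -> float:
    """Sensitivity if CAD replaced the radiologist entirely."""
    found_among_reader_hits = reader_sens * (1.0 - cad_miss_of_reader_hits)
    found_among_reader_misses = (1.0 - reader_sens) * cad_recovery
    return found_among_reader_hits + found_among_reader_misses


READER_SENS = 0.80   # hypothetical assumption, not from the cited study
CAD_RECOVERY = 0.70  # "up to 70%" of cancers missed by the radiologist
CAD_MISS = 0.20      # ~20% of radiologist-detected cancers missed by CAD

as_second_reader = second_reader_sensitivity(READER_SENS, CAD_RECOVERY)
as_replacement = cad_alone_sensitivity(READER_SENS, CAD_RECOVERY, CAD_MISS)
print(f"radiologist + CAD as second reader: {as_second_reader:.2f}")
print(f"CAD alone: {as_replacement:.2f}")
```

Under these assumed inputs, adding CAD as a second reader raises sensitivity above that of the radiologist alone, whereas CAD on its own falls slightly below it, which is consistent with the supporting rather than replacing role described above.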

In this issue of the journal, Motwani et al. describe the application of machine learning to predict 5-year all-cause mortality (ACM) in >10 000 patients undergoing coronary computed tomographic angiography (CCTA).9 The complex model used 25 clinical and 44 CCTA parameters, including extent of atherosclerotic changes on the CT scan, demographic variables, and standard cardiovascular risk factors. A total of 745 patients died during the 5-year follow-up. Compared with existing clinical or CCTA metrics, ML exhibited a higher area under the curve for predicting ACM. The authors note that, in contrast to traditional clinical risk assessment, the ML algorithm does not make a priori assumptions about causative factors and may therefore identify unexpected patterns of risk factors. They emphasize that ML models can automatically incorporate new data, using them continually to update and optimize their algorithms, with gradual improvement of predictive performance over time.4,10 However, analysis and research using such large, complex data sets are still limited because current statistical methods are not optimized for them and carry a risk of overfitting.
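For readers less familiar with the metric, the area under the curve (AUC; assumed here to refer to the receiver operating characteristic curve) can be read as the probability that a model assigns a higher risk score to a randomly chosen patient who died than to a randomly chosen survivor. The following minimal sketch computes it by pairwise comparison. The labels and scores are invented toy values for illustration and are unrelated to the registry data; the function name `auc` is likewise only an illustrative choice.

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of (event, non-event) pairs ranked correctly.

    Ties between an event and a non-event count as half a correct ranking.
    """
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    correct = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                correct += 1.0
            elif e == n:
                correct += 0.5
    return correct / (len(events) * len(non_events))


# Toy data (hypothetical): 1 = died during follow-up, 0 = survived.
labels = [1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]  # hypothetical risk scores
model_b = [0.6, 0.5, 0.4, 0.7, 0.2, 0.8]  # hypothetical comparator scores

print(f"model A AUC: {auc(model_a, labels):.3f}")
print(f"model B AUC: {auc(model_b, labels):.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect discrimination. A value estimated on the same data used to fit the model can be optimistic, which is one reason overfitting is a concern with many predictors.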

While these approaches are exciting, it is important to consider potential limitations. The tremendous amount and complexity of patient-related data accumulating in large healthcare institutions require advanced software and hardware and a dedicated group of IT professionals for maintenance.11 Data have to be available 24/7 without downtime, and there are mandated requirements for long-term data storage. A particular challenge is unequivocal patient identification, especially if data are to be shared across large healthcare systems and with other institutions. Complex issues related to data security and ownership (hospital system or patient) have to be addressed. Because of the complexity and associated cost of maintaining such ‘data clouds’, healthcare systems may consider external providers for data management, further complicating oversight and regulation.

In summary, the combination of large EHRs and automated analysis with ML algorithms allows automated information gathering, data analysis, and feedback to the practitioner. Organized in complex network ‘cloud’ structures, these systems allow sharing of data using mobile, online access and communication.12 This has, for example, been described in the context of emergency triage of acute aortic syndromes.13 It is likely that these approaches will increasingly affect medical education.14 As smart computer systems are able to parse a vast amount of data and share the analyses with providers, the focus of medical training may shift to best practices in the use of these systems for patient-centred care. Similar to the impact of data technology in many aspects of daily life, these changes will affect current models of the doctor–patient relationship, with potential benefit for the individual patient and also for larger patient populations. However, evidence demonstrating an impact on clinical outcomes is still limited, and clinical trials will be required before the role of these tools can be established in clinical practice. Clinical decision-making depends on complex human factors and personal preferences, and it is likely that in the short term these data-driven approaches with automated data collection and machine learning will serve mainly a supporting role in the physician–patient relationship.

Conflict of interest: N.M. reports grants from IBM (IBM Watson Research group) paid to the Cleveland Clinic, outside the submitted work. P.S. has no conflicts to declare.

References

1. Rumsfeld JS, Joynt KE, Maddox TM. Big data analytics to improve cardiovascular care: promise and challenges. Nat Rev Cardiol 2016; in press.

2. Huynh QL, Negishi K, Blizzard L, Sanderson K, Venn AJ, Marwick TH. Predictive score for 30-day readmission or death in heart failure. JAMA Cardiol 2016; in press.

3. Liang M, Tang W, Xu DM, Jirapatnakul AC, Reeves AP, Henschke CI, Yankelevitz D. Low-dose CT screening for lung cancer: computer-aided detection of missed lung cancers. Radiology 2016; in press.

4. Waljee AK, Higgins PDR. Machine learning in medicine: a primer for physicians. Am J Gastroenterol 2010;105:1224–1226.

5. Deo RC. Machine learning in medicine. Circulation 2015;132:1920–1930.

6. Dietterich TG. Ensemble methods in machine learning. Lect Notes Comput Sci 2000;1857:1–15.

7. Waljee AK, Joyce JC, Wang S, Saxena A, Hart M, Zhu J, Higgins PD. Algorithms outperform metabolite tests in predicting response of patients with inflammatory bowel disease to thiopurines. Clin Gastroenterol Hepatol 2010;8:143–150.

8. Singal AG, Mukherjee A, Elmunzer BJ, Higgins PD, Lok AS, Zhu J, Marrero JA, Waljee AK. Machine learning algorithms outperform conventional regression models in identifying risk factors for hepatocellular carcinoma in patients with cirrhosis. Am J Gastroenterol 2013;108:1723–1730.

9. Motwani M, Dey D, Berman DS, Germano G, Achenbach S, Al-Mallah MH, Andreini D, Budoff MJ, Cademartiri F, Callister TQ, Chang HJ, Chinnaiyan K, Chow BJ, Cury RC, Delago A, Gomez M, Gransar H, Hadamitzky M, Hausleiter J, Hindoyan N, Feuchtner G, Kaufmann PA, Kim YJ, Leipsic J, Lin FY, Maffei E, Marques H, Pontone G, Raff G, Rubinshtein R, Shaw LJ, Stehli J, Villines TC, Dunning A, Min JK, Slomka PJ. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur Heart J 2017;38:500–507.

10. Mjolsness E, DeCoste D. Machine learning for science: state of the art and future prospects. Science 2001;293:2051–2055.

11. Drowning in Big Data? Reducing Information Technology Complexities and Costs for Healthcare Organizations. http://www.emc.com/collateral/analyst-reports/frost-sullivan-reducing-information-technology-complexities-ar.pdf

12. Schoenhagen P, Zimmermann M, Falkner J. Advanced 3-D analysis, client–server systems, and cloud computing. Integration of cardiovascular imaging data into clinical workflows of transcatheter aortic valve replacement. Cardiovasc Diagn Ther 2013;3:80–92.

13. Schoenhagen P, Roselli EE, Harris CM, Eagleton M, Menon V. Online network of subspecialty aortic disease experts: impact of ‘cloud’ technology on management of acute aortic emergencies. J Thorac Cardiovasc Surg 2016; in press.

14. Mehta NB, Hull AL, Young JB, Stoller JK. Just imagine: new paradigms for medical education. Acad Med 2013;88:1418–1423.

Author notes

The opinions expressed in this article are not necessarily those of the Editors of the European Heart Journal or of the European Society of Cardiology.