
The Big Health Data–Intelligent Machine Paradox

      What Happens When an Unstoppable Force Meets an Immovable Object?

      This paradox plays out in many high-stakes global arenas: geopolitics, climate change, and financial markets, to name a few. Artificial intelligence thrives on massive, disparate data sets, so it is not surprising that these data-dense megatrends are being shaped by artificial intelligence.
      • Le Cun Y
      • Bengio Y
      • Hinton GE
      Deep learning.
      Although most humans do not face election, live near receding glaciers, or trade in cryptocurrencies, we are all learning that when manmade forces and intelligent machines collide, the costs and consequences can be real.
      So it is with human health data.
      Humans share 1 inherent right and bear 1 attendant risk—good health. In an inexorably older and progressively sicker world,

      Frischetti, M. Developing countries are battling diseases of the rich and the poor. Available at: https://www.scientificamerican.com/article/developing-countries-are-battling-diseases-of-the-rich-and-poor/. Accessed February 20, 2018.

      another potential cataclysm confronts patients and physicians—the silent explosion of health data. The uses and fates of these health data, which will have grown from an estimated 153 exabytes in 2013 to 2314 exabytes by 2020 in the United States alone, are already being influenced by artificial intelligence.
      • Beam AL
      • Kohane IS
      Big data and machine learning in health care.
      Medical students and trainees learn humanistic principles based on the Hippocratic precept, Primum non nocere (‘First do no harm’). Will humans use intelligent machines wisely, using big data responsibly for health care and medical training? Or will their power and immensity paradoxically produce unintended harms?
      The expanding digital health data universe is the unstoppable force saturating the Cloud with big data droplets.
      Individual personal health information resides in and fluxes through 2 types of data repositories. Administrative health care databases are operated by entities responsible for resourcing care and managing costs in socialized and quasi-market health insurance systems. Administrative health care databases are massive multigenerational payloads of demographic and utilization data (ie, pharmacy, physician, ambulatory, and hospital services)
      • Cadarette SM
      • Wong L
      An introduction to health care administrative data.
      linked to multiple electronic medical records (EMRs) and increasingly homed in Cloud platforms compliant with the Health Insurance Portability and Accountability Act. Despite informatician and analyst expertise, their complexity and the density of their data defy standard statistical methods, limiting their applications to health care process and utilization management.
      The EMR is the immovable object of health care; it's not going anywhere.
      EMR databases (and picture archiving and communication systems) are intended to help providers coordinate complex patient care. Under federal meaningful use mandates, documentation now consumes ∼50% of health professionals’ working time.

      Wachter R, Goldsmith J. To combat physician burnout and improve care, fix the electronic health record. Available at: https://hbr.org/2018/03/to-combat-physician-burnout-and-improve-care-fix-the-electronic-health-record. Accessed June 29, 2018.

      The computer technologies underpinning EMRs remain relatively antiquated, with user-unfriendly dropdown menus, free typing entries, cut-and-paste functions, and point-and-click navigation. In the era of health insurance payments only for what is documented, physicians (and increasingly ‘scribes’) are rewarded for overinclusivity and data redundancy in EMR entries, adding quantity without necessarily adding quality to bloated health records.
      Efforts to insert policy between the unstoppable force and immovable object have largely failed—miserably, some might say.
      Population health research on repository data intended to design better health care delivery has morphed into population management. Population management is derivative of policymakers’ desire to align provider behaviors through the allocation of resources within well-defined health populations in accountable care organizations or patient-centered medical homes. One reason public policy carrot-and-stick forays into the hypercompetitive health care business have fallen short is that private EMR vendors, health insurers, and pharmacy vendors wield tremendous big data market power.
      To paraphrase India's Prime Minister Narendra Modi, “They who control the data control the world.”
      Again, health care is no exception.
      Data integrity requires the migration and storage of deidentified anonymized data. Safeguards restrict repository access for researchers and data miners. Notwithstanding this, Apple and 13 prominent U.S. health systems recently announced plans to download EMR data (patient permission pending) onto Apple iCloud servers. Hailed by some as “truly disruptive” and “game-changing,”

      Blumenthal, D, Chopra, A. Apple's pact with 13 health care systems might actually disrupt the industry. Available at: https://hbr.org/2018/03/apples-pact-with-13-health-care-systems-might-actually-disrupt-the-industry. Accessed June 29, 2018.

      the degree to which affected patients will own their individual data deposits remains unclear. Absent political courage and health insurer permission to grant citizens rights to own their individual personal health information for health benefits, the promise of personalized medicine remains adrift in big data space.
      In the profitable U.S. $570 billion health insurance sector (2.7% of the U.S. gross domestic product), it comes as no surprise to find the biggest insurers using self-learning artificial intelligence machines to mine their data troves. Insurers contend that artificial intelligence helps customers choose the right health plan (virtual assistants), professionals manage chronic diseases (biometric device tracking), and providers boost their star quality ratings (compliance-enhancing bot calls and text messaging).
      Recent health insurance market consolidations, both horizontal mergers (eg, Anthem's attempted Cigna purchase) and vertical acquisitions (eg, CVS Health-Aetna, Cigna-Express Scripts, Walmart-Humana), were pursued (in part) to accumulate valuable patient data. Mergers and acquisitions activity has also secured in-house artificial intelligence analytics capabilities for health systems (eg, Health Share of Oregon buying Health Catalyst) and insurers (eg, Cigna buying Brighter AI).
      An industry-insider maxim is that artificial intelligence will “improve health insurance claims management” (ie, coverage eligibility, explanation of benefits, payments).

      Hehner, S, Kors, B, Martin, M, et al. Artificial intelligence in health insurance. Available at: https://healthcare.mckinsey.com/sites/default/files/Artificial%20intelligence%20in%20Health%20Insurance.pdf. Accessed March 15, 2018.

      Insurers’ message to plan beneficiaries, whose big and little data are being scraped in Facebook-ian fashion, doesn't yet include lower premiums or better coverage. A 2017 industry survey

      From mystery to mastery: unlocking the business value of artificial intelligence in the insurance industry. Available at: https://www2.deloitte.com/content/dam/Deloitte/de/Documents/Innovation/Artificial-Intelligence-in-Insurance-Whitepaper-deloitte-digital.pdf. Accessed March 15, 2018.

      showed that 40% of U.S. insurance customers would track and share their health behavior data for more accurate premiums. However, when a London National Health Service trust trusted Google enough to provide its artificial intelligence subsidiary (DeepMind) with >1.6 million patients’ data, the privacy regulator determined that the trust had failed to comply with United Kingdom data protection laws.
      The immovable object and unstoppable force have converged and are hurtling Earthward.
      What can be done to save humanity from big health data annihilation? Achieving elusive EMR interoperability, impenetrable Cloud security, and actionable analytics would each be welcome relief. New artificial intelligence computing technologies poised on the launch pad offer more potent solutions.
      Simple recurrent neural networks are trained on large data sets (10^8 to 10^10 elements) to carry forward recent data run solutions (at time t – 1) to impact current (time t) and future (time t + 1) outputs (Figure 1A). This time-spanning neural network ‘hidden layer’ back-propagation approach creates an algorithm-derived vector-weighted short-term memory that is well suited to sequential data input tasks
      • Le Cun Y
      • Bengio Y
      • Hinton GE
      Deep learning.
      (Figure 1B). Recurrent neural networks can read EMR text entries to generate problem lists

      Tsou, C-H, Devarakonda, M, Liang, JJ. Toward generating domain-specific / personalized problem lists from electronic medical records. Available at: https://www.aaai.org/ocs/index.php/FSS/FSS15/paper/viewFile/11733/11479. Accessed March 1, 2018.

      and translate digital images into meaningfully worded descriptive reports
      • Li X
      • Jin Q
      Improving image captioning by concept-based sentence reranking.
      (Figure 2).
      Figure 1. (A) Recurrent neural network (RNN) architecture makes use of sequential information. RNNs are called recurrent because they perform the same task for each element of a sequence, with the output being dependent on the prior computations. This creates a short-term ‘memory’ functionality that captures information about the prior calculations. This simple RNN is unrolled into a neural network of 3 layers designed to decode a 3-word phrase; the input at the time step (t) is a vectorial representation of word 2 in the phrase. The main feature of an RNN is the so-called hidden state, which comprises the interconnected memories at each time step (the blue arrows from and to the gray boxes). This memory is actually a mathematical function calculated based on the previous hidden state at time t – 1 and the current input at time t. The final output is a vector of probabilities of word 3 in the phrase from a vocabulary of choices available at time t + 1. (B) For an RNN to predict the next word in a sentence (ie, language modeling), it is helpful to know which words came before it. In these 2 sentences, a multilayer neural network is used to sequentially predict the next word from the unrolled RNN's hidden state memory of prior layers’ outputs and the current input (ie, “I have a pen. I have an ???”). Performing the same tasks at each step in the sequence with different inputs generates a vector of mathematical probabilities (ie, a generative model) that the final word in the second sentence is apple and not pen, red, or hello. High-probability sentences are typically correct (ie, “I have an apple”). This explains (in part) how RNNs (and more sophisticated long short-term memory units) can successfully carry out natural language processing tasks like reading a medical record.
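The hidden-state update described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration and not the implementation behind the figure: the 4-word vocabulary, the 2-unit hidden state, the tanh activation, and every weight value are assumptions chosen by hand, and nothing here has been trained.

```python
import math

# Hypothetical toy vocabulary; the real systems use far larger ones.
VOCAB = ["I", "have", "an", "apple"]

def one_hot(word):
    """Vector representation of a word: 1.0 at its vocabulary index, else 0.0."""
    return [1.0 if w == word else 0.0 for w in VOCAB]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy):
    """One time step: the new hidden state depends on the current input
    (time t) and the previous hidden state (time t - 1)."""
    pre = [a + b for a, b in zip(matvec(W_xh, x_t), matvec(W_hh, h_prev))]
    h_t = [math.tanh(p) for p in pre]
    y_t = softmax(matvec(W_hy, h_t))  # probabilities over the next word
    return h_t, y_t

# Arbitrary, untrained weights (assumptions for illustration only).
W_xh = [[0.5, -0.3, 0.8, 0.1],
        [-0.2, 0.7, 0.4, -0.6]]
W_hh = [[0.9, -0.4],
        [0.3, 0.6]]
W_hy = [[0.2, -0.5],
        [-0.7, 0.1],
        [0.4, 0.3],
        [1.0, 0.8]]

def run(words):
    """Apply the same step to each word, carrying the hidden state forward."""
    h = [0.0, 0.0]
    probs = None
    for word in words:
        h, probs = rnn_step(one_hot(word), h, W_xh, W_hh, W_hy)
    return h, probs

hidden, next_word_probs = run(["I", "have", "an"])
for word, p in zip(VOCAB, next_word_probs):
    print(f"{word}: {p:.3f}")
```

Because the weights are untrained, the printed probabilities carry no linguistic meaning. The point is only that the same function is applied at every step and that the output at time t + 1 depends on everything seen before it through the hidden state.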
      Figure 2. Researchers registered with CLEF dev set can access open-source captioned image databases (ie, plant specimen photos, digital medical images) and submit requests to use large training data sets to run artificial intelligence analytics.
      • Li X
      • Jin Q
      Improving image captioning by concept-based sentence reranking.
      The upper run illustrates training a sentence-generating model on an ImageCLEF dev set. The lower run first trains the model on a Microsoft MS COCO image data set, then tests it on the ImageCLEF dev set. An algorithm scoring system (ie, METEOR) is used to assess the performance of different image-captioning software for concept detection (ie, a higher METEOR score equals better concept detection). Concept-based sentence re-ranking can then be applied on sentences generated by these LSTM-RNN models. The outcome sentence, “A plant with pink flower and brown stem…,” reflects the transformed hidden state description of the original image as generated by the neural network system. CLEF = Cross Language Evaluation Forum; CNN = convolutional neural network; LSTM = long short-term memory unit; METEOR = Metric for Evaluation of Translation with Explicit Ordering; MS COCO = Microsoft Common Objects in Context; RNN = recurrent neural network.
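One plausible reading of the re-ranking step in Figure 2 is sketched below. It is an assumption-laden illustration, not the cited authors' method: the scoring rule (model score plus a weighted fraction of detected concepts mentioned), the `weight` parameter, the candidate sentences, and their scores are all hypothetical. METEOR itself is not implemented here.

```python
def rerank(candidates, detected_concepts, weight=0.5):
    """Re-order candidate captions so that sentences mentioning more of the
    detected concepts move up.

    candidates: list of (sentence, model_score) pairs from a caption generator.
    detected_concepts: words a separate concept detector found in the image.
    weight: hypothetical knob for how much concept overlap counts.
    """
    concepts = {c.lower() for c in detected_concepts}

    def combined(item):
        sentence, model_score = item
        words = {w.strip(".,").lower() for w in sentence.split()}
        overlap = len(words & concepts) / len(concepts) if concepts else 0.0
        return model_score + weight * overlap

    return sorted(candidates, key=combined, reverse=True)

# Hypothetical generator output and detector output.
candidates = [
    ("A plant with green leaves", 0.60),
    ("A plant with pink flower and brown stem", 0.55),
]
ranked = rerank(candidates, ["pink", "flower", "stem"])
print(ranked[0][0])
```

In this toy case the second caption starts with the lower model score but names all three detected concepts, so it moves to the top after re-ranking.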
      Memory networks have greatly expanded recurrent neural networks’ limited short-term memory into a tape-like ‘remember-the-story’ functionality for later data queries.
      • Weston J
      • Chopra S
      • Bordes A
      Memory networks.
      Memory networks read a story, are trained to keep track of embedded data imagery, and can correctly answer questions like “Where is our story's heroine walking right now?” The game-changing question for medicine, “Where is my diabetic patient in her chronic disease trajectory?,” can now be answered by long short-term memory units that learn to gate the storage, release, or erasure of recurrent neural network hidden layer data, effectively linking causes to effects.
      • Pham T
      • Tran T
      • Phung D
      • Venkatesh S
      Predicting healthcare trajectories from medical records: a deep learning approach.
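The gating of storage, release, and erasure mentioned above can be illustrated with a single-unit long short-term memory cell. This is a minimal sketch using the commonly published LSTM gate equations; it is not the model from the cited trajectory-prediction work, and the scalar input, the parameter values, and the parameter names are assumptions picked by hand to make each gate's effect visible.

```python
import math

def sigmoid(z):
    """Squash a value into (0, 1) so it can act as a gate."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a single-unit LSTM cell with a scalar input.

    params maps a gate name to (input weight, recurrent weight, bias).
    """
    def gate(name, squash):
        w_x, w_h, b = params[name]
        return squash(w_x * x_t + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # erasure: fraction of old memory kept
    i = gate("input", sigmoid)         # storage: fraction of new content written
    g = gate("candidate", math.tanh)   # the new content proposed for storage
    o = gate("output", sigmoid)        # release: fraction of memory exposed
    c_t = f * c_prev + i * g           # updated cell (long-term) memory
    h_t = o * math.tanh(c_t)           # hidden state passed to the next step
    return h_t, c_t

# Hand-set, hypothetical parameters: keep old memory and ignore new input.
keep = {"forget": (0.0, 0.0, 10.0), "input": (0.0, 0.0, -10.0),
        "candidate": (1.0, 0.0, 0.0), "output": (0.0, 0.0, 10.0)}
# Hand-set, hypothetical parameters: erase old memory and ignore new input.
erase = {"forget": (0.0, 0.0, -10.0), "input": (0.0, 0.0, -10.0),
         "candidate": (1.0, 0.0, 0.0), "output": (0.0, 0.0, 10.0)}

_, c_kept = lstm_step(0.5, 0.0, 0.8, keep)
_, c_erased = lstm_step(0.5, 0.0, 0.8, erase)
print(f"kept: {c_kept:.4f}  erased: {c_erased:.4f}")
```

In a trained network the gate parameters are learned from data rather than set by hand, which is what would let the cell decide which earlier events in a patient record to retain and which to discard.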
      Memory networks and long short-term memory units will soon exert a powerful gravitational pull on both the unstoppable force and the immovable object. Despite cautionary human wisdom about artificial intelligence technology insertion,
      • Beam AL
      • Kohane IS
      Big data and machine learning in health care.
      • Miller DD
      • Brown EW
      Artificial intelligence in medical practice: the question to the answer?.
      perhaps the ultimate paradox is that the very same intelligent machines some fear will hurt jobs are the last best hope for saving the health care universe from entering deeper into a data explosion–provider documentation ‘black hole.’

      References

      1. Le Cun Y, Bengio Y, Hinton GE. Deep learning. Nature. 2015; 521: 436-444
      2. Frischetti M. Developing countries are battling diseases of the rich and the poor. Available at: https://www.scientificamerican.com/article/developing-countries-are-battling-diseases-of-the-rich-and-poor/. Accessed February 20, 2018.
      3. Beam AL, Kohane IS. Big data and machine learning in health care. JAMA. 2018; 319: 1317-1318
      4. Cadarette SM, Wong L. An introduction to health care administrative data. Can J Hosp Pharm. 2015; 68: 232-237
      5. Wachter R, Goldsmith J. To combat physician burnout and improve care, fix the electronic health record. Available at: https://hbr.org/2018/03/to-combat-physician-burnout-and-improve-care-fix-the-electronic-health-record. Accessed June 29, 2018.
      6. Blumenthal D, Chopra A. Apple's pact with 13 health care systems might actually disrupt the industry. Available at: https://hbr.org/2018/03/apples-pact-with-13-health-care-systems-might-actually-disrupt-the-industry. Accessed June 29, 2018.
      7. Hehner S, Kors B, Martin M, et al. Artificial intelligence in health insurance. Available at: https://healthcare.mckinsey.com/sites/default/files/Artificial%20intelligence%20in%20Health%20Insurance.pdf. Accessed March 15, 2018.
      8. From mystery to mastery: unlocking the business value of artificial intelligence in the insurance industry. Available at: https://www2.deloitte.com/content/dam/Deloitte/de/Documents/Innovation/Artificial-Intelligence-in-Insurance-Whitepaper-deloitte-digital.pdf. Accessed March 15, 2018.
      9. Tsou C-H, Devarakonda M, Liang JJ. Toward generating domain-specific / personalized problem lists from electronic medical records. Available at: https://www.aaai.org/ocs/index.php/FSS/FSS15/paper/viewFile/11733/11479. Accessed March 1, 2018.
      10. Li X, Jin Q. Improving image captioning by concept-based sentence reranking. In: Chen E, Gong Y, Tie Y (eds.) Advances in Multimedia Information Processing – PCM 2016. 17th ed. Springer, New York, NY; 2016: 231-240 (Part 2)
      11. Weston J, Chopra S, Bordes A. Memory networks. (Available at:) (Accessed February 27, 2018)
      12. Pham T, Tran T, Phung D, Venkatesh S. Predicting healthcare trajectories from medical records: a deep learning approach. J Biomed Inform. 2017; 69: 218-229
      13. Miller DD, Brown EW. Artificial intelligence in medical practice: the question to the answer? Am J Med. 2018; 131: 129-133