Healthcare Data Privacy, Security, and Accuracy Are Dependent on Safe and Successful Digital Transformation

Patient data is increasingly digitized and accessible for patient care through the modernization of healthcare information technology and devices. Healthcare organizations were already well on their way to digital transformation, and the shift to remote and hybrid work during the coronavirus pandemic has only accelerated the digitization of patient data and systems.

The goal of healthcare digital transformation is to improve patient well-being. However, with the transition to a digital health world, there needs to be oversight and risk management, including data governance and monitoring to ensure that ingested data remains intact, secure, and accurate for decision-making.

Patient care can be affected if records are lost, stolen via data breach, or contain inaccuracies due to digitization problems. Theft of personal identities also leads to fraud and other nefarious activities by threat actors. From a business perspective, digital transformation and storage costs may also expand out of control if data requirements, processes, and goals are not well-defined upfront, resulting in excessive data cleansing and recovery later on.

HIPAA Requires Digital Data Accuracy & Protection

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) is a United States federal law that protects patient data privacy with the creation of national standards to prevent disclosure of personal health information without the patient’s consent or knowledge. A goal of HIPAA is to make sure Protected Health Information (PHI) is properly secured while allowing the flow of necessary health information.

Due to the massive ingestion and flows of health data, it’s important that healthcare leadership become proactive digital data stewards with clear data governance and privacy policies. For healthcare research, data also must be anonymized to protect data privacy. Patient data must be comprehensively identified and appropriately tagged and protected, and any inaccuracies, such as misfiled records, should be corrected. High-quality patient care relies on complete and secure data records.

Healthcare professionals and administrative staff deliver patient care, but now they must also become medical technologists of electronic health record (EHR) systems, Internet of Things (IoT) devices, and other emerging technologies. With the fast pace of emerging technologies, there are learning curves for adoption and these human risks to data must be acknowledged and managed appropriately.

Innovative systems and devices have tremendous potential to make healthcare and data more accessible, create efficiencies, and improve healthcare procedures – but there is also the potential for human error and security risks. The growth of digital health has been accompanied by a large rise in data breaches targeting healthcare organizations and associated federal agencies.

Digital transformation and cybersecurity go hand in hand. Data governance will increasingly move to artificial intelligence (AI) and machine learning (ML) technologies to properly manage and control how data is used. Data security is important, but so is ensuring the reliability of data to protect its value.

The Nextgov article, “The CX Executive Order Turns One,” discusses the White House memo on the federal government’s efforts to shift towards a human-centered design methodology for public interactions, including digital options for paper processes and redesigned websites. Panelists at an Urban Institute event also talked about moving beyond surveys to leverage technology, big data, and blended data to support evidence-based policymaking. Accurate, representative, and accessible data are central to healthcare research and patient care.

As reported by Health IT Security, the Health Sector Cybersecurity Coordination Center (HC3) also recently issued a detailed brief regarding automation and its impacts on healthcare cybersecurity. Examples of automation in cybersecurity include machine learning and artificial intelligence, penetration testing, and automated intelligence collection.

HC3 noted that it can be time-consuming for attackers to manually go through stolen data, and they may use automated software to identify valuable information like credit cards and passwords. However, defenders can also leverage automation for identifying hidden data risks, vulnerability scanning, detecting data exfiltration, and other data protection efforts.
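
As a rough illustration of the kind of automated scanning described above, the following Python sketch searches free text for strings that look like payment card numbers and keeps only those that pass the Luhn checksum. The pattern and the function names (`luhn_valid`, `find_card_numbers`) are assumptions made for this example; they do not represent HC3 guidance or any particular product, and a real data discovery tool would cover many more data types and formats.

```python
import re

# Candidate card numbers: 13-19 digits, optionally separated by spaces or hyphens.
# This pattern is an illustrative assumption, not an exhaustive card format list.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in text that look like card numbers and pass Luhn."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

The checksum step filters out long digit runs that only resemble card numbers, which reduces false positives when the same scan is run defensively over notes, exports, or file shares.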

Unstructured data is not exempt from compliance requirements and must be properly managed and protected, along with structured data. Data retention is a key part of data compliance, and data backup teams must understand the different locations and types of data.

Additionally, an organization’s data is often subject to different compliance rules. Healthcare has many types of unstructured data including medical notes and imaging. Identifying, classifying, and protecting different types of PHI and PII is critical to modern data governance and privacy protection.

The Human and Business Cost of Unsafe and Unreliable Data

If the process of digitization, data discovery, and classification does not use appropriate controls with visibility into the data estate, patient data cannot be accurately transformed into a digital record, properly segmented according to policy, or anonymized for analytics projects.

The cost of unsafe or disorganized healthcare digital transformation includes:

    • fines and legal costs
    • excessive operational expenses
    • business interruption revenue loss
    • human resource costs
    • decline in quality of delivered care
    • long-term impact to patients whose data is lost, misfiled, or stolen
    • missed opportunities for data-driven insights

    To avoid ballooning digitization costs and burden on human resources, human-in-the-loop intelligent automation using artificial intelligence is increasingly being used to aid in digitizing records and to manage and protect the exponential growth of data.

    Data indexing, classification, and microsegmentation can be done at the time of digitization, or by using data discovery and risk assessment on existing data that may be hiding PHI and PII risks. Other risks can be identified using data discovery federated search, including exposed intellectual property or unprotected passwords.

    Over 40 million Americans’ health data is stolen or exposed each year due to security vulnerabilities in electronic health care systems, often due to problems with digitization of records or by failing to protect legacy systems. According to USA TODAY, “Hacking accounts for about half of all security breaches, while about one-third are caused by employee errors, such as lost computers or accidental disclosures.”

    The U.S. Department of Health and Human Services (HHS) recently warned of ongoing Royal ransomware attacks targeting healthcare entities, with threat actors aiming to exfiltrate sensitive data and often using social engineering such as malicious ads, fake forum pages, blog comments, or phishing emails. There were multiple ransomware alerts in late 2022, with variants including BlackCat, LockBit 3.0, Hive, Lorenz, and Cuba. Clop ransomware is reported to be infecting medical images.

    In early November 2022, the HC3 issued a detailed brief exploring the Iranian threat landscape and its implications for the U.S. healthcare sector. Iranian threat actors have been known to steal personally identifiable information (PII) using spear phishing and other tactics.

    According to BankInfoSecurity, Stoddard Manikin, CISO at Children’s Healthcare of Atlanta, said hackers are targeting children’s hospitals to use data from pediatric health records to apply for loans. Often the damage to the patients’ credit may go undetected until victims are adults. Adults’ health records are also targeted and used for insurance or prescription fraud. 

    Theft of PHI and PII also contributes to government fraud, including state-backed actor fraud of around $20 million in U.S. Covid benefits. CPO Magazine reported that the Secret Service was investigating a known Chinese advanced persistent threat group related to these activities. The ability to commit this type of fraud is likely due to personal data that has been collected on U.S. citizens by China over the years.

    Patient care and data are also at risk when hospitals are forced to use paper records after a ransomware attack and staff attention is suddenly split between patient care and new administrative procedures. New York City-based One Brooklyn Health had to rely on paper for weeks following a November 2022 cybersecurity incident that caused three of its hospitals to shut down their EHR and IT systems. The new paper records must eventually be safely merged with existing digital records to ensure continuity of care.

    In November 2022, Senator Mark Warner, D-Va., issued a white paper titled “Cybersecurity is Patient Safety,” which focuses on current cybersecurity challenges facing the healthcare industry and suggests policy changes that could help to improve healthcare cybersecurity and better protect all health information, including health data not currently protected under the HIPAA Rules.

    Lawsuits in the aftermath of a data breach can be lengthy, costly, and complicated. For example, Mass.-based Sturdy Memorial Hospital experienced a breach of PHI affecting 60,000 patients in September 2021 and reached a settlement in December 2022, agreeing to refund up to $375 to each class member for ordinary losses and up to three hours of lost time at $20 per hour.

    The settlement also provides reimbursement of up to $5,000 in extraordinary losses and free credit monitoring. The lawsuit alleged that the health system maintained the private information in a “reckless” manner on a system and network that was “vulnerable” to a data breach without proper monitoring.

    Specifically, the claim stated that:

    “Armed with the Private Information accessed in the Ransomware Attack, data thieves can commit a variety of crimes including, e.g., opening new financial accounts in Class Members’ names, taking out loans in Class Members’ names, using Class Members’ names to obtain medical services, using Class Members’ health information to target other phishing and hacking intrusions based on their individual health needs, using Class Members’ information to obtain government benefits, filing fraudulent tax returns using Class Members’ information, obtaining driver’s licenses in Class Members’ names but with another person’s photograph, and giving false information to police during an arrest.”

    A Health IT Security article discussed Forrester experts’ best practices for secure healthcare digital transformation from the HIMSS Healthcare Cybersecurity Forum in Boston. The analysts suggested that an organization can meet all the minimum requirements under HIPAA on paper but still be at risk. Organizations must go beyond compliance and practice risk assessment to manage factors like the speed of innovation.

    Being proactive about data identification, management, and protection is also important due to concerns about future quantum computing exploits that can breach encrypted data that is stolen now. The U.S. Senate passed the bipartisan Quantum Computing Cybersecurity Preparedness Act in December 2022 with the goal of fortifying the federal government’s defenses against future quantum-computing-enabled data breaches.

    The new Act will require the Office of Management and Budget (OMB) to prioritize federal agencies’ transition to post-quantum cryptography. The White House already mandated in a November memo that by May 4, 2023 federal agencies must provide an inventory of assets containing cryptographic systems that could be cracked by quantum computers.

    Another important piece of legislation is the Cures Act, also known as the 21st Century Cures Act, which provides the National Institutes of Health (NIH) with the resources to improve healthcare in the United States. The Act has many different goals; however, the final rule that took effect October 6, 2022 prohibits information blocking and requires that patients have ready access to their data.

    To empower patients in their healthcare decision-making, the Act states that healthcare organizations must facilitate, in a timely manner, patients’ access to and sharing of their data when requested. To comply with the Cures Act, healthcare organizations need to have processes in place to ensure PHI data is complete, properly protected, and accessible.

    Wherever patient data lives, whether in paper files, EHR systems, unstructured data stores, healthcare professional communications, or elsewhere, it must be identified, safely digitized, protected, anonymized when necessary, and retained or deleted according to regulation. This is the future of digital health.

    With the exponential rise in data also come concerns about the cost and control of data storage. Scientists at Aston University warn that there is not enough space to handle the 300% increase in information projected for 2025.

    Data discovery, data governance, and data cleansing need to become a business priority to reduce unnecessary storage costs now and in the future as data grows. Knowing your data estate also improves its security and usability. Increasing regulation will also require organizations to understand their data and properly classify and protect it for appropriate user and device access.

    Data Quality Requires Safe Digitization

    Data asset and PHI identification and management are necessary for both Zero Trust strategy and healthcare analytics projects. Business goals and regulation align by ensuring data-based decisions are made using quality data that is intact, complete, and protected. If data can be tampered with either accidentally or intentionally, data-based decisions will be impacted.

    An S4x22 event video from 2022 features John Kindervag, who is considered a founder of Zero Trust with his 2010 Forrester research and later efforts at other organizations. In the first four minutes he provides an excellent summation of Zero Trust and emphasizes that you have to know what your assets are in order to protect them.

    Often the sheer number of devices and technologies can give a false sense of security – the belief that data assets are protected and accurate because a lot of tools have been thrown at the problem. The reality is that without taking the time to create a clear inventory of what your data assets are and who has access to them (digital identities composed of people and their devices), you will not be able to properly risk assess your data estate.

    As John Kindervag emphasized, you must know and have ongoing visibility of what needs protection. Zero Trust security controls should be prioritized based on comprehensive asset inventories that can then be risk assessed to meet business goals, as well as budget and resource constraints. This process includes data discovery, classification, and risk management of data assets to enact data governance policies, as well as the user and device controls necessary for Zero Trust.

    Continuous monitoring is a key feature of Zero Trust so that you are alerted if critical data is being either accidentally or intentionally tampered with or exfiltrated by unauthorized digital identities. To properly protect data, data must be first identified and assigned to those with authorized access.
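
    One simple way to picture the tamper-detection side of continuous monitoring is a hash baseline: record a cryptographic digest of each file, then compare later snapshots against it. The Python sketch below uses only the standard library. The function names (`file_digest`, `snapshot`, `diff_snapshots`) are hypothetical, and the approach is a minimal illustration rather than a description of how any Zero Trust product works; it detects changes but does not attribute them to an identity or prevent exfiltration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its current digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed, or modified since the baseline."""
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(
            name
            for name in baseline.keys() & current.keys()
            if baseline[name] != current[name]
        ),
    }
```

    In practice the baseline would be stored somewhere the monitored systems cannot modify, and a non-empty report would feed an alerting workflow rather than being inspected by hand.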

    For patient care and healthcare analytics, data digitization accuracy and classification should be emphasized right from the start so that mistakes are avoided that could make a big difference in patient outcomes and costs. Auditing the accuracy of the digitization process should be a priority to avoid unnecessary HIPAA violations, such as misfiled patient records, security risks due to poor data governance, or lost records that could affect quality of care.

    For research, decision-making based on data analytics is dependent on data quality with data mapping and lineage. For existing data stores used for data research, automated data discovery and tagging can locate and fix data retention issues or past file-handling mistakes to avoid compliance issues and data quality problems.

    Data Discovery and Intelligent Document Processing (IDP) can also be used to automate the anonymization and redaction of data so that data can be accessible but also protected for research, Freedom of Information Act (FOIA) requests, or other Data Subject Access Requests (DSARs).
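
    To make the idea of automated redaction concrete, here is a deliberately small Python sketch that replaces a few US-style identifiers with bracketed labels. The patterns and the `redact` function are assumptions for illustration only and are not how any specific IDP product works. Pattern matching of this kind catches only well-formatted identifiers; names, dates, free-text clinical details, and other HIPAA identifiers need far more than regular expressions, so this should not be read as a de-identification method.

```python
import re

# Hypothetical, deliberately simple patterns for a few US-style identifiers.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed label such as [SSN]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Patient SSN 123-45-6789, call 555-867-5309 or jane.doe@example.org."))
# prints: Patient SSN [SSN], call [PHONE] or [EMAIL].
```

    Keeping the label in the output, rather than deleting the match, preserves the structure of the record for researchers while signaling what kind of value was removed.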

    Solutions to Ensure Patient Data Accuracy, Privacy & Security

    Anacomp’s D3 AI/ML Data Discovery and Intelligent Document Processing solutions provide automated healthcare digital transformation, data asset management, and targeted data processing for Electronic Health Record (EHR) systems, data risk management, data privacy and compliance projects, data analytics for research, and other custom data projects.

    • Securely and accurately digitize and classify patient records
    • Correct data-handling mistakes and PHI errors to protect record accuracy and prevent HIPAA violations
    • Instantly locate risky data even at the content-level using risk filters, advanced queries, and federated search
    • Anonymize or redact data for research and other data requests
    • Monitor data for any unauthorized changes

    Our solutions reduce digital transformation, security, and storage costs by helping you control your data with confidence using automation implemented by our expert professional services staff:

    D3 Digital Transformation Solutions 

    Our Intelligent Document Processing (IDP) and high-speed scanning solutions use technologies like Artificial Intelligence (AI), Machine Learning (ML), Natural Language Processing (NLP), and Optical Character Recognition (OCR) to process and ingest many types of data, including handwriting, poor-quality documents, and images, enabling you to incorporate more data into patient records or data projects. You can also flag data privacy issues, HIPAA violations, or other data risk concerns.

    We offer secure digitization and indexing of all types of sensitive records for Electronic Health Record (EHR) systems, claims processing, benefits delivery, compliance, data analytics, intellectual property, human capital management, and secure records management. Advanced capture technology automates classification and data extraction with minimal operator assistance.

    D3 Data Discovery and Distillation Solution

    Our Data Discovery solution provides a customizable single pane view of both structured and unstructured data stores for over 950 file types with visualization of all file properties and both standard and user-defined metadata. D3 crawls your entire data estate and uses artificial intelligence and machine learning to see risks hidden in actual file content – not just file attributes.

    Risk filters, workflows, data tagging, and federated search help to identify, manage, clean, and protect data and keep it that way with ongoing, automated monitoring.

    You can also quickly and easily fulfill Data Subject Access Requests (DSARs), search for PHI/PII, or handle other sensitive data requests, such as those involving intellectual property, using advanced queries. D3 is unique in that it provides actionable visibility and filters for many data types down to the content level.

    These solutions can be combined and customized to validate and improve data quality for security, compliance, and analytics projects. Anacomp conversion centers, infrastructure, and processes are certified compliant with the highest federal security protocols.

    We invite you to test out data discovery on your own data with a free 1 TB Test Drive of Anacomp’s D3 AI/ML Data Discovery Solution.

    Anacomp has served healthcare organizations, the U.S. government, military, and Fortune 500 companies with data visibility, digital transformation, and OCR intelligent document processing projects for over 50 years.