14-07-2025
From Strategy To Implementation: Leveraging Unstructured Health Data
Dr. Tim O'Connell is a practicing radiologist and the founder and CEO of emtelligent, a developer of clinical-grade AI software.
The healthcare industry is experiencing a data transformation that began with the 2009 HITECH Act and has since gained momentum through initiatives like the 21st Century Cures Act and CMS's Promoting Interoperability Program. These policies have shifted the focus from electronic health record (EHR) adoption to value-based care, emphasizing interoperability, data sharing and patient access.
By 2020, U.S. healthcare data had reached 2,314 exabytes—15 times more than in 2013—thanks to connected devices and remote monitoring. This surge has turned data into a critical asset, with unstructured clinical data offering particularly untapped value.
Until recently, this data could only be accessed via labor-intensive manual review. Organizations seeking to unlock this critical 'last mile' of clinical data and put it to work across multiple use cases face numerous considerations and pitfalls along the way.
In my experience working with clients across healthcare industry sectors, I have seen a maturity curve emerge as organizations move toward using unstructured clinical data. This curve includes the following stages:
1. Opportunity: Realizing there is valuable data hiding within unstructured clinical records that can be extracted and analyzed to improve care, increase efficiency and inform research.
2. Competency: Understanding that advanced tools like artificial intelligence (AI) and natural language processing (NLP) hold the key to unlocking unstructured data at scale, readying it for insights and business processes.
3. Viability: Identifying and defining the value dimensions and key performance indicators (KPIs) that unstructured data can impact and that align with business and clinical objectives, such as reducing time to diagnosis, identifying gaps in care or streamlining reimbursement.
4. Feasibility: Building and operationalizing a scalable data pipeline that can rapidly and efficiently process high volumes of unstructured data across diverse clinical sources and formats.
5. Extensibility: Scaling proven use cases across the enterprise and embedding unstructured data analysis into core workflows, strategic initiatives and population health efforts.
This article focuses on the first two stages: identifying the opportunity and understanding enabling technologies.
Buried Treasure
An estimated 80% of clinical data is unstructured. Even documents using structured formats like C-CDA often include vast narrative content—especially for patients with complex conditions. Much of this critical information remains invisible to conventional analytics.
Keyword search tools, still common in healthcare, lack contextual understanding. They often miss key insights or provide irrelevant results because they can't interpret negation, chronology or relationships between concepts. Without sophisticated tools, unstructured data is underused—resulting in missed clinical context, risk miscalculations and missed research opportunities.
Enabling Intelligent Data Access At Scale
Advanced technologies like AI and NLP are rapidly transforming how healthcare organizations engage with unstructured data—replacing manual review processes with intelligent automation that is faster, more scalable and more accurate.
And this isn't just an opinion; peer-reviewed studies back it up. A 2024 review of how the wider healthcare sector is implementing NLP and AI found that 81% of systems were using NLP to extract clinical data from EHRs. That matters because it means faster access to important information across workflows. And in a study of data from more than 4,000 stroke patients admitted to Massachusetts General Hospital, NLP accurately pulled stroke severity scores from doctors' notes, matching expert reviews more than 92% of the time and removing the need for manual chart review. It's a powerful example of how this technology can drive real impact at scale.
Unlike traditional search tools that rely on static keyword matching, these advanced systems understand the context, semantics and structure of language. They recognize synonyms, interpret negation (e.g., 'no history of diabetes'), differentiate between historical and current conditions and extract relationships between clinical concepts (e.g., linking a symptom to a diagnosis or a medication to a specific condition). This deeper understanding enables them to surface more relevant and actionable insights while minimizing false positives and irrelevant matches.
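To make the negation point concrete, here is a minimal Python sketch contrasting a plain keyword match with a simple negation-aware check. It is an illustration only, not how any particular clinical NLP product works: the trigger list, the function names and the sample notes are all assumptions, and treating any trigger earlier in the same sentence as negating the term is a deliberately crude rule that real systems refine considerably.

```python
import re

# Hypothetical, deliberately tiny trigger list (an assumption for this sketch);
# production systems rely on far richer rules or trained models.
NEGATION_TRIGGERS = ("no history of", "no evidence of", "denies", "negative for", "without", "no")


def keyword_match(note: str, term: str) -> bool:
    """Naive keyword search: true whenever the term appears anywhere in the note."""
    return term.lower() in note.lower()


def negation_aware_match(note: str, term: str) -> bool:
    """True only if the term appears in some sentence with no negation trigger before it."""
    term = term.lower()
    # Split into rough sentences so a negation in one sentence does not affect another.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        idx = sentence.find(term)
        if idx == -1:
            continue
        prefix = sentence[:idx]
        negated = any(
            re.search(r"\b" + re.escape(trigger) + r"\b", prefix)
            for trigger in NEGATION_TRIGGERS
        )
        if not negated:
            return True
    return False


note = "Patient has no history of diabetes. Reports chest pain."
print(keyword_match(note, "diabetes"))           # keyword search flags the note
print(negation_aware_match(note, "diabetes"))    # negation-aware check does not
print(negation_aware_match(note, "chest pain"))  # an affirmed finding is still found
```

On the sample note, the keyword search reports diabetes even though the note rules it out, which is the kind of false positive described above. Chronology and relationships between concepts need more than this sketch offers.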
These technologies also eliminate the need for time-consuming manual chart review, freeing up clinicians, analysts and administrative teams to focus on higher-value tasks. Rather than reading through hundreds of pages of clinical notes, users can instantly extract structured summaries, quality measures, risk indicators and cohort-specific criteria.
By transforming narrative data into structured, searchable insights, AI and NLP enable a wide range of use cases:
• Supporting real-time clinical decision-making
• Powering predictive analytics for earlier interventions
• Identifying gaps in care for population health management
• Accelerating patient recruitment for clinical trials
• Enhancing claims processing and risk adjustment accuracy
• Supporting public health surveillance
Best Practices For Harnessing AI In Healthcare
Implementing AI in healthcare isn't just about choosing the right tools—it's about making them work in the real world. In my experience, the biggest challenges show up after the technology is in place. Success depends on how well teams understand the problem they're trying to solve and how much trust exists in the system.
A few best practices can help:
• Start with one clear use case. Whether it's chart abstraction, quality reporting or cohort identification, narrowing the focus makes it easier to prove value and build momentum.
• Prioritize transparency. If users can't trace an insight back to the source, they're not going to trust it. Make sure outputs are verifiable and easy to audit.
• Support the humans doing the work. AI should reduce manual effort, not override clinical judgment. Adoption improves when teams see that it will make their jobs easier.
• Be clear about who is accountable. Even when AI does much of the work, someone still needs to own the final decision. Build governance around who reviews outputs and how errors are caught and corrected.
• Broadcast success and enlist champions. Identify one or two business or clinical advocates to embed AI into their workflows and showcase throughput gains, cost savings and how AI can free up clinicians for higher-value work and patient interaction.
These practices don't merely help with implementation. They lay the groundwork for everything that comes after.
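As one way to picture the transparency practice, the short Python sketch below stores the character offsets of each extracted concept so a reviewer can jump straight to the supporting text. The `Finding` class, the `extract_with_provenance` function and the sample note are hypothetical names invented for this illustration, and the simple term matching stands in for whatever extraction method a real system would use.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """An extracted concept plus where it came from in the source note (hypothetical structure)."""
    concept: str
    start: int  # character offset where the evidence begins
    end: int    # character offset just past the evidence

    def evidence(self, note: str) -> str:
        """Return the exact source text that supports this finding."""
        return note[self.start:self.end]


def extract_with_provenance(note: str, terms: list[str]) -> list[Finding]:
    """Find each term (case-insensitive) and record its position for later audit."""
    findings = []
    for term in terms:
        for match in re.finditer(re.escape(term), note, flags=re.IGNORECASE):
            findings.append(Finding(term, match.start(), match.end()))
    return sorted(findings, key=lambda f: f.start)


note = "Started Metformin for type 2 diabetes."
for finding in extract_with_provenance(note, ["metformin", "diabetes"]):
    print(finding.concept, finding.start, finding.end, repr(finding.evidence(note)))
```

Keeping offsets alongside every output is a small design choice, but it lets an auditor confirm each insight against the original note instead of taking it on faith.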
Finally, with those foundations in place, teams can move from theory to real-world results.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.