Health

September 09, 2021

Taming the data tsunami: Leverage clinical terminology for high-value use cases

Welcome to part two of our “Taming the Data Tsunami” Expert Insights series. In the next several installments, we are going to get a bit more detailed about the tactics that we introduced in our recent webinar, Taming the Data Tsunami: 5 Terminology Tactics. If you listened to the webinar, I hope it was clear that I am passionate about making data interoperable, meaningful, and actionable. Clinical data is a rich and meaningful source of information about a patient's journey; the challenge is that it is often stored in free-text narrative or in non-standardized data tables.

Claims data alone isn’t enough to improve the quality of care

For decades, we have relied on claims data, which is highly structured and codified to coding standards like ICD-10 and CPT, to answer questions such as:

How many diabetics are covered under each of the plans in my organization?
What tests/treatments have they undergone related to their condition?
When did they last see their internal medicine physician?
Which medications is this member taking?

The last two questions are hard and are probably only answerable if you have access to ALL of a patient's claims data, including prescription data and history from past insurers.

Now, let’s try to answer some more complex questions, such as:

Which histologies are most prevalent in my member population?
Of the diabetic patients identified in my population, which ones have uncontrolled A1C results over the course of the last year?
What interventions have been tried to help members with chronic kidney disease manage their condition before they require dialysis?

We can start to see the problem with relying solely on claims data to answer important questions that can dramatically improve the quality of care delivered, while also lowering the cost of that care. We need meaningful, actionable, and sharable data to directly affect outcomes. I could go on and on asking questions (some people call me Cheryl “The Why” Mason and with good reason); but then we will never get to the solution, which is ultimately what I’m passionate about.

We need meaningful, actionable, and sharable health data to directly affect care outcomes.

Cheryl Mason - Director, Content and Informatics

So, how do we take disparate data from various sources and in various formats and make it actionable to improve care and reduce cost, while trying to improve the day-to-day lives of clinicians and consumers of healthcare?

Connecting disparate healthcare data

Let’s first review how we are doing this today - it’s not pretty. As an industry, we are sharing more and more data. This will continue to increase with the ONC and CMS rules requiring all participants within the healthcare ecosystem to share more and more data with each other and with patients, hence the term “data tsunami.” Like a tsunami, it's lots and lots of data; unlike a tsunami, it is not causing deep destruction. However, by sharing data without sharing information, knowledge, or wisdom, we are perpetuating the age-old problem of healthcare data being shared in multiple unusable formats. In fact, more often than not, the clinical notes that contain the most relevant and impactful information are not even shared in a computer-readable format. We see PDF documents that have been faxed, copied, faxed again, and rendered useless because they are not readable, much less computable.

Information that is shared in more structured and computable ways – C-CDA documents and HL7 messages – is often not recorded in or translated into standard terminologies as required by the “promoting interoperability” requirement. At the end of the day, it’s not about the requirements; it’s about being able to easily answer clinical questions like the ones listed above to reduce redundant testing, operate in value-based care arrangements, and identify and close gaps in care, just to name a few. Really, it’s about serving the consumer, reducing administrative burden, and saving money.

The solution is multifaceted, which is why there are so many organizations working on it. Let’s start with a couple of definitions.

What is clinical data?

When I say clinical data – what comes to your mind? You might be surprised to hear that the answer is different depending on your perspective. At Health Language, we define clinical data in this context:

Data that is generated at the point of care and is directly related to the patient’s conditions and the management, evaluation, assessment, or treatment of those conditions. This can include concrete information such as lab results, or it could be information about a patient’s social determinants of health, data from remote monitoring devices, and observations by the clinician.

Some might say that claims data recorded in ICD-10, CPT, and HCPCS is clinical data, but we know better. You cannot answer complex questions or understand the condition of a patient using that data.

What are interoperability and semantic interoperability?

When we talk about interoperability, we like to use the HIMSS definition:

It is the ability of different information systems, devices and applications (systems) to access, exchange, integrate and cooperatively use data in a coordinated manner, within and across organizational, regional and national boundaries, to provide timely and seamless portability of information and optimize the health of individuals and populations globally.

“Optimize the health of individuals and populations globally” is my favorite part of the definition, and after all, is the ultimate goal. In order to do so, it is important that the data we send and receive is parsed, transformed, tagged, and enriched to add value and lead us to wisdom. That is the holy grail, and it requires a more technical level of interoperability: semantic interoperability.

Again, at Health Language, we use the HIMSS definition for semantic interoperability:

Provides for common underlying models and codification of the data including the use of data elements with standardized definitions from publicly available value sets and coding vocabularies, providing shared understanding, and meaning to the user.

Leveraging clinical terminology for data validation, transformation, tagging, and enrichment

This is where partnering with a company that understands not only healthcare data, but also the terminologies that underpin the understanding of rich clinical data, is so important.

While some clinical data might be codified at the point-of-care via problem list entries, medication lists, past medical histories, and allergy lists, you still need to make sure that the codes are valid codes that will be recognized by clinical decision support, surveillance, and downstream analytics applications. Without proper clinical data mapping, data codified to invalid codes will be lost as it is shared between systems.
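To make the validation step above concrete, here is a minimal Python sketch, not a description of any particular product. It assumes a tiny in-memory set of valid codes (`VALID_CODES`, hypothetical) standing in for a real terminology service, and it separates entries with unrecognized codes into a review list instead of letting them disappear downstream. The names `CodedEntry` and `split_by_validity` are illustrative only.

```python
from dataclasses import dataclass

# Hypothetical, tiny stand-in for a terminology service's code list.
# A real check would query the current release of each code system.
VALID_CODES = {
    "ICD-10-CM": {"E11.9", "I10"},
}


@dataclass
class CodedEntry:
    system: str  # code system the sender claims to use
    code: str    # code exactly as received
    text: str    # human-readable description as received


def split_by_validity(entries):
    """Return (valid, invalid) so invalid codes are flagged, not silently lost."""
    valid, invalid = [], []
    for entry in entries:
        known = VALID_CODES.get(entry.system, set())
        normalized = entry.code.strip().upper()  # tolerate case/whitespace noise
        (valid if normalized in known else invalid).append(entry)
    return valid, invalid


problem_list = [
    CodedEntry("ICD-10-CM", "E11.9", "Type 2 diabetes mellitus without complications"),
    CodedEntry("ICD-10-CM", "e11.9 ", "Type 2 diabetes"),
    CodedEntry("ICD-10-CM", "250.00", "Diabetes (legacy-style code)"),
]

valid, invalid = split_by_validity(problem_list)
print(len(valid), "valid;", len(invalid), "need mapping review")
```

The entries in the review list are the ones that would need clinical data mapping before being shared with downstream applications.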

Much of the data coming from HL7 messages, such as laboratory orders, is what I refer to as ‘semi-structured’ data. Anyone who has spent much time with HL7 messages - or tried to use them in clinical decision support, surveillance, or analytics applications - knows that most of it is not codified to LOINC, the Meaningful Use standard for sharing laboratory data. It is critically important that this data be codified to LOINC for it to be usable. This is not an easy task and requires deep domain knowledge. At Health Language, we have mapped hundreds of thousands of laboratory orders and results to LOINC for multiple use cases including those listed above, plus many more.
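As a rough illustration of what codifying local laboratory codes to LOINC looks like once a mapping exists, the sketch below applies a small lookup table to incoming results. The source system name, the local codes, and the function name are hypothetical, and the two-row table is only a placeholder: building a real map requires expert review of method, specimen, and units, which is the hard part described above.

```python
# Illustrative local-to-LOINC map. The source name and local codes are hypothetical;
# the LOINC codes are real but are shown only as examples.
LOCAL_TO_LOINC = {
    ("LAB_A", "HBA1C"): "4548-4",  # Hemoglobin A1c/Hemoglobin.total in Blood
    ("LAB_A", "GLU"): "2345-7",    # Glucose [Mass/volume] in Serum or Plasma
}


def normalize_results(results):
    """Attach a LOINC code where a mapping exists; queue the rest for review."""
    mapped, unmapped = [], []
    for result in results:
        key = (result["source"], result["local_code"].strip().upper())
        loinc = LOCAL_TO_LOINC.get(key)
        if loinc is None:
            unmapped.append(result)  # needs a terminologist's attention
        else:
            mapped.append({**result, "loinc": loinc})
    return mapped, unmapped


incoming = [
    {"source": "LAB_A", "local_code": "hba1c", "value": "9.2", "unit": "%"},
    {"source": "LAB_A", "local_code": "XYZ123", "value": "4.1", "unit": "mg/dL"},
]

mapped, unmapped = normalize_results(incoming)
print(mapped)
print(unmapped)
```

Keeping unmapped results in a separate queue, instead of dropping them, makes the mapping gap visible and measurable.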

Industry statistics claim more than 80% of healthcare data is locked in unstructured data types, such as the PDF clinical notes I mentioned earlier. Today, clinicians must spend hours manually reviewing the documents in order to pull the useful information into spreadsheets or databases, sometimes tagging the notes using an annotation tool, to make the data usable for things like identifying gaps in care, ensuring that patients are getting the best care possible, and optimizing value-based payment programs. Even when the information is extracted, it often isn’t tagged with a valid code from a standard terminology so that value can be extracted from it. Leveraging technology such as NLP for medical records, with a good Clinical Natural Language Processing (cNLP) tool, can save a lot of time and lead to an overall higher quality of data. Make sure that whatever system you use is tuned for healthcare data, recognizes medical jargon in context, and can codify that data to the correct terminology for downstream applications and high-value use cases such as risk adjustment.
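The idea of extracting a finding from free text and tagging it with a standard code can be shown with a deliberately simple Python sketch. This is a toy regular expression, not a cNLP engine: it ignores negation, context, dates, and abbreviations that a healthcare-tuned tool would handle. The function name and output shape are assumptions made for illustration, as is the choice to tag every match with LOINC 4548-4.

```python
import re

# Toy pattern: "A1c" or "HbA1c", a few non-digit characters, then a percentage.
A1C_PATTERN = re.compile(
    r"\b(?:hb)?a1c\b[^0-9]{0,20}(\d{1,2}(?:\.\d)?)\s*%",
    re.IGNORECASE,
)


def extract_a1c(note_text):
    """Return A1c values found in free text, tagged with an assumed LOINC code."""
    findings = []
    for match in A1C_PATTERN.finditer(note_text):
        findings.append(
            {
                "code_system": "LOINC",
                "code": "4548-4",  # assumption: treat every hit as this A1c test
                "value": float(match.group(1)),
                "unit": "%",
            }
        )
    return findings


note = "Patient seen for diabetes follow-up. HbA1c was 9.2% today; prior A1C: 6.8 %."
for finding in extract_a1c(note):
    print(finding)
```

Once a value is captured as a coded, numeric observation like this, a question such as which diabetic patients had uncontrolled A1C results over the last year becomes a query instead of a manual chart review.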

Once your data has been enriched using a terminology server and cNLP, it can be leveraged for many different use cases, as it is now organized and actionable.

If you're interested in learning more, watch our on-demand webinar, Ninja Tactics for Curating, Cleansing & Enriching Clinical Data for High-Value Use Cases, where we go even deeper into these exact topics.


Cheryl Mason

Director, Content and Informatics, Health Language

As the Director of Content and Informatics, Cheryl supports the company’s Health Language solutions, leading a team of subject matter experts who specialize in data quality. Together, they consult with clients across the health care spectrum regarding standardized terminologies, data governance, data normalization, and risk mitigation strategies.
