Adapting a Clinical Data Repository to ICD-10-CM through the use of a Terminology Repository

James J. Cimino, MD; Lyubov Remennick, MD
Laboratory for Informatics Development, NIH Clinical Center, Bethesda, MD

Clinical data repositories frequently contain patient diagnoses coded with the International Classification of Diseases, Ninth Revision (ICD-9-CM). These repositories now need to accommodate data coded with the Tenth Revision (ICD-10-CM). Database users wish to retrieve relevant data regardless of the system by which they are coded. We demonstrate how a terminology repository (the Research Entities Dictionary or RED) serves as an ontology relating terms of both ICD versions to each other to support seamless version-independent retrieval from the Biomedical Translational Research Information System (BTRIS) at the National Institutes of Health. We make use of the Centers for Medicare and Medicaid Services’ General Equivalence Mappings (GEMs) to reduce the modeling effort required to determine whether ICD-10-CM terms should be added to the RED as new concepts or as synonyms of existing concepts. A divide-and-conquer approach is used to develop integration heuristics that offer a satisfactory interim solution and facilitate additional refinement of the integration as time and resources allow.

Introduction

Many patient diagnoses are recorded in electronic databases using codes from the International Classification of Diseases, Ninth Revision (ICD-9-CM, or I9 for short).1 When these data are included in longitudinal databases, such as clinical data repositories, challenges arise with each annual I9 update. For example, when searching for * patients with septic shock, a researcher looking for data prior to 2008 would need to use the I9 codes 785.59 and 998.02 (Shock without mention of trauma, not elsewhere classified and Postoperative shock, septic, respectively).
However, a longitudinal study of patient records coded with the I9 code 785.59 would show a sharp decrease in incidence in 2008. Unfortunately, this is not due to development of effective measures for the prevention of septic shock but rather due to the addition of the I9 code 785.52 (Septic shock), with a concomitant change in the implicit meaning of 785.59 and a corresponding decrease in its use to code data. Researchers are often unaware of how such changes impact their work.2 Mitigating the impact of updates on the use of data repositories requires special measures.3

Such challenges are likely to be even greater for US researchers when, in 2014, I9 is replaced by the International Classification of Diseases, Tenth Revision (ICD-10-CM, or I10 for short).4 I10 is much larger than I9; some of this increase is due to more finely grained terms (for example, Postoperative shock, septic is being replaced by the three terms Postprocedural septic shock, initial encounter, Postprocedural septic shock, subsequent encounter, and Postprocedural septic shock, sequela). I10 also contains terms that are similar to, but not synonymous with, I9 terms (for example, Septic shock is being replaced by Severe sepsis with septic shock). Past I9 updates, involving dozens or hundreds of changes, pale in comparison to the change that I10 brings, with tens of thousands of new codes.

We have previously described a method for coping with annual I9 updates through the use of a controlled terminology resource that maps multiple external terminologies to a single, coherent terminology for use within a single institutional clinical data repository.3 This paper describes the application of that methodology to adapt to the inclusion of I10 data into a repository that contains patient diagnostic data coded in part with I9.
Background

The Biomedical Translational Research Information System (BTRIS)

BTRIS is a repository of clinical research data collected at the Clinical Center, a 240-bed hospital on the National Institutes of Health campus in Bethesda, Maryland. It includes data from electronic health records and clinical trials management systems dating back to 1953. Along with clinical notes, laboratory test results, procedure reports, vital sign measurements, medication administration records, electrocardiograms, radiologic images, mass spectrograms, and exome data, BTRIS includes 1.9 million discharge diagnoses assigned to hospital admissions, coded in I9.5

* In this paper, codes and terms from the various versions of the International Classification of Diseases will be distinguished by the use of a different typeface.


The Research Entities Dictionary (RED)

The RED is a terminology repository modeled after Columbia University Medical Center’s Medical Entities Dictionary (MED).6 The RED contains all the terms from the various controlled terminologies that are used by each of the BTRIS data sources to code their data. Each term is mapped to a corresponding concept in the RED, usually in a one-to-one manner, although where terms from multiple sources are clearly synonymous, they may be mapped to a single concept. Each concept is assigned a unique identifier (the “RED Code”), which is stored with the source data in BTRIS. RED concepts are organized in a directed acyclic graph of hierarchical is-a relationships that can be used to query BTRIS in a class-based manner. Using a single RED code, one can find all patients with a diagnosis of any infectious disease, any foodborne infectious disease, any disease caused by the infectious organism Salmonella, or a particular disease (e.g., typhoid fever). Terms from disparate data sources are organized into the same classification structure so that, for example, a query for a disease term will retrieve data coded with I9 codes from the current EHR and previous Clinical Center EHRs, rare disease terms coded by the Medical Records Department, and problem list terms from one of the clinical trials management systems. Figure 1 shows an example of the RED hierarchy.

RED maintenance is an ongoing task, involving two full-time ontologists. Changes to source terminologies must be characterized as changes in syntax and/or semantics. For example, if a data source changes the name of a term (syntax), a determination must be made as to whether the change reflects a minor name change (no change in meaning, requiring a simple update to the term information assigned to the corresponding RED concept) or a major name change (a change in meaning, requiring retirement of the existing RED concept and the creation of a new one).3

The addition of a new data source to BTRIS usually requires addition of one or more new terminologies in the RED. Here the task is somewhat different. The first priority is to make sure there is a RED concept that corresponds to each source term; this allows data to be stored in the BTRIS database and supports simple queries (e.g., “get all the patient diagnoses”). The second priority is to place the new concepts into the existing RED hierarchy, or add new hierarchical concepts, as appropriate to the meaning of the terms; this supports the more sophisticated class-based queries (as described in Figure 1). Ideally, if the new term is synonymous with an existing source term, it will be added as a synonym to an existing RED concept. However, this usually requires time-consuming manual review and is actually the lowest priority. Redundant concepts clutter the RED and might confuse a user looking for terms with which to search BTRIS, but they do not adversely affect retrieval of data and can be corrected retroactively by merging concepts in the RED and replacing RED Codes in BTRIS.

Figure 1: A sample of concepts in the RED hierarchy. The concept “Acute Hemolytic Reaction – Grade 1” corresponds to a term from the system in the Clinical Center’s Department of Transfusion Medicine, while “Acute Hemolytic Transfusion Reaction, Incompatibility Unspecified” corresponds to the I9 term with the code 999.84. Note that a BTRIS user wishing to retrieve data about acute hemolytic reactions can simply request the parent concept, “Acute Hemolytic Transfusion Reaction” and data coded by all three concepts will be retrieved.
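The class-based retrieval illustrated by Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the BTRIS implementation: the RED codes (`R100`, `R101`, `R102`, `R999`), the record layout, and the class and function names are hypothetical; only the concept names and the I9 code 999.84 come from the figure caption.

```python
from collections import defaultdict


class ConceptHierarchy:
    """Toy is-a hierarchy. A concept may have several parents, so this is a DAG."""

    def __init__(self):
        self._children = defaultdict(set)

    def add_is_a(self, child, parent):
        """Record that `child` is-a `parent`."""
        self._children[parent].add(child)

    def descendants(self, concept):
        """Return `concept` together with every concept below it."""
        found, stack = set(), [concept]
        while stack:
            node = stack.pop()
            if node not in found:
                found.add(node)
                stack.extend(self._children.get(node, ()))
        return found


def class_based_query(hierarchy, records, concept):
    """Return patient ids whose records carry `concept` or any of its descendants."""
    wanted = hierarchy.descendants(concept)
    return sorted({patient for patient, red_code in records if red_code in wanted})


# Hypothetical RED codes for the concepts named in Figure 1.
PARENT = "R100"   # Acute Hemolytic Transfusion Reaction
GRADE_1 = "R101"  # Acute Hemolytic Reaction - Grade 1 (Transfusion Medicine term)
I9_TERM = "R102"  # I9 999.84, Acute Hemolytic Transfusion Reaction, Incompatibility Unspecified

red = ConceptHierarchy()
red.add_is_a(GRADE_1, PARENT)
red.add_is_a(I9_TERM, PARENT)

# Made-up records: (patient id, RED code stored with the source data).
records = [("p1", GRADE_1), ("p2", I9_TERM), ("p3", PARENT), ("p4", "R999")]

matches = class_based_query(red, records, PARENT)
```

Because the query expands the requested concept to its descendants before filtering, a user never needs to know which source terminology produced a given record.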


General Equivalence Mappings (GEMs)

The National Center for Health Statistics (NCHS) has developed a bidirectional general equivalence mapping (GEM) system to assist users of I9 and I10 in deciding how to change their coding practices and understand how data coded in I9 and I10 relate to each other. NCHS, together with the Centers for Medicare and Medicaid Services (CMS), has published files that can be used as the basis for translating between I9 and I10 terms.7 The GEMs consist of two files: one in which each I9 code is listed at least once, with a set of flags associating it with one or more I10 codes, and one in which each I10 code is listed at least once, with a set of flags associating it with one or more I9 codes.8 The flags consist of five digits, with meanings as shown in Figure 2.

I10 Code | I10 Name | I9 Code | I9 Name | GEM | Meaning
A00.9 | Cholera, unspecified | 001.9 | Cholera, unspecified | 00000 | Synonymous
A01.00 | Typhoid fever, unspecified | 002.0 | Typhoid fever | 10000 | I10 term similar to I9 term
A01.01 | Typhoid meningitis | 002.0 | Typhoid fever | 10000 | Another I10 term similar to the same I9 term
A02.1 | Salmonella sepsis | 003.1 | Salmonella septicemia | 10111 | I10 term best expressed with two I9 terms (one scenario)
A02.1 | Salmonella sepsis | 995.91 | Sepsis | 10112 | I10 term best expressed with two I9 terms (one scenario)
R40.2131 | Coma scale, eyes open, to sound, in the field | | | 01000 | No mapping in I9

Flag Position | Flag Name | Interpretation
1 | “Approximate” | 0=exact match (synonymous), 1=approximate match (similar meaning)
2 | “No Map” | 0=some plausible mapping; 1=no plausible mapping
3 | “Combination” | 0=mapping to single term, 1=mapping to multiple options (“scenarios”), multiple terms, or both
4 | “Scenarios” | With the Combination flag=1, a coding option; may be one scenario (numbered “1”) with multiple options or multiple scenarios (numbered “1”, “2”, etc.) each with one or more choices
5 | “Choice” | With the Combination and Scenario flags, one or more options for a particular scenario (numbered “1”, “2”, etc.)

Figure 2: Understanding the General Equivalence Mapping Codes. Examples of the I10 to I9 GEM mappings are shown, with interpretation of each of the five flag positions. Note that GEM mapping files do not include term names; these were added here for clarity.

Methods

Our general approach to adding I10 terms to the RED is to examine the pairings with I9 codes provided in the GEMs and use the flags to suggest synonymy with or relationships to existing RED concepts. Relationships can include “child” (added to the hierarchy under the existing concept), “parent” (inserted into the hierarchy above the existing concept), or “sibling” (added to the hierarchy under the parent(s) of the existing concept). Using a “divide and conquer” approach,9 we partitioned GEM mappings according to the following steps:

1. GEM flags were used to create four partitions: 00000 – potential synonyms; 10000 (“similar” flag set) – potential children, parents or siblings; 101xx (“combination” flag set) – potential children, parents or siblings; and 01000 (“no mapping” flag set) – new concept

2. We divided the 10000 (“similar”) partition into four partitions based on the cardinality of terms involved in the I10-to-I9 mappings: one-to-one mappings, one-to-many mappings, many-to-one mappings and many-to-many mappings (note that the “potential synonyms” partition is by definition one-to-one and the “combination” partition is by definition one-to-many or many-to-many but was not further subdivided; thus, the fourth and fifth flags were not considered further, as their complexity requires case-by-case review)

3. Each of the above partitions (except the new concept partition, in which there are no corresponding I9 terms) was further divided based on whether any of the I10 or I9 terms contained the word “other” or the phrase “not elsewhere classified”; this was done because the semantics of these terms can never be precisely determined (even the GEM documentation states that identical term names are not synonymous if they are “other” terms10); the phrases “not otherwise specified” and “other and unspecified” were excluded from this consideration since they are actually equivalent to the generic class terms; regardless of their original partition, “other” mappings were evaluated as potential children, parents or siblings

4. Within each of the resulting partitions (except the new concept partition, in which there are no corresponding I9 terms), we performed automated comparisons of the I10 and I9 term names for similarity:

   o Name match: exact match between terms; treated as potential synonyms

   o Normalized name match: terms were normalized as follows prior to comparison: all letters capitalized, punctuation removed, words replaced with preferred forms using a set of previously described word synonyms,11 stop words (“A”, “AN”, “AND”, “OR”, “THE”, “UNSPECIFIED”, and “WITH”) removed, and remaining words sorted alphabetically; for example, “Typhoid meningitis” becomes “MENINGITIDES TYPHOID”; treated as potential synonyms

   o Non-name match: no exact match between original names or normalized names; treated as potential children, parents or siblings
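The four partitioning steps can be sketched as small Python functions. This is an illustrative reading of the steps rather than the authors' code. Several details are assumptions: the one-entry `PREFERRED_FORMS` table stands in for the published word-synonym set (which is not reproduced in this paper), punctuation is replaced by spaces rather than deleted outright, and cardinality is judged per mapping pair.

```python
import re
import string
from collections import defaultdict

STOP_WORDS = {"A", "AN", "AND", "OR", "THE", "UNSPECIFIED", "WITH"}
# Hypothetical one-entry stand-in for the word-synonym table cited in the text.
PREFERRED_FORMS = {"MENINGITIS": "MENINGITIDES"}
# Phrases that do not, by themselves, make a term an "other" term.
NOT_OTHER = ("not otherwise specified", "other and unspecified")
# Assumption: punctuation becomes a space so hyphenated words stay separate.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def flag_partition(flags):
    """Step 1: assign a five-digit GEM flag string to one of four partitions."""
    if flags[1] == "1":
        return "no mapping"
    if flags[2] == "1":
        return "combination"
    return "approximate" if flags[0] == "1" else "synonymous"


def cardinalities(pairs):
    """Step 2: label each (I10, I9) pair by how many partners each code has."""
    i9_per_i10, i10_per_i9 = defaultdict(set), defaultdict(set)
    for i10, i9 in pairs:
        i9_per_i10[i10].add(i9)
        i10_per_i9[i9].add(i10)
    labels = {}
    for i10, i9 in pairs:
        i10_side = "many" if len(i10_per_i9[i9]) > 1 else "one"
        i9_side = "many" if len(i9_per_i10[i10]) > 1 else "one"
        labels[(i10, i9)] = f"{i10_side}-to-{i9_side}"
    return labels


def is_other(name):
    """Step 3: true if the name has the word "other" or "not elsewhere classified"."""
    text = name.lower()
    for phrase in NOT_OTHER:
        text = text.replace(phrase, " ")
    return bool(re.search(r"\bother\b", text)) or "not elsewhere classified" in text


def normalize(name):
    """Step 4: capitalize, drop punctuation, use preferred forms, drop stop words, sort."""
    words = name.upper().translate(_PUNCT_TO_SPACE).split()
    words = [PREFERRED_FORMS.get(word, word) for word in words]
    return " ".join(sorted(word for word in words if word not in STOP_WORDS))


def name_match(i10_name, i9_name):
    """Step 4: classify a pair of term names by the kind of match."""
    if i10_name == i9_name:
        return "name match"
    if normalize(i10_name) == normalize(i9_name):
        return "normalized name match"
    return "non-name match"


# Example pairs taken from Figure 2.
pairs = [("A01.00", "002.0"), ("A01.01", "002.0"), ("A02.1", "003.1"), ("A02.1", "995.91")]
labels = cardinalities(pairs)
```

Keeping each step as a separate function mirrors the divide-and-conquer design: any one test can be refined (for example, a fuller synonym table) without touching the others.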

We manually examined each partition to identify patterns of semantic relationships between the I10 and I9 terms. For example, we did not automatically assume that I10 and I9 terms were synonymous just because their GEM mapping was “00000” if the names (or normalized names) did not match. Similarly, we did not assume that terms were not synonymous just because their GEM mapping was “10000” if their names (or normalized names) matched and they were not “other” terms. These manual examinations of each partition led to development of general decision rules, or heuristics, to apply to each I10-I9 pair. Based on the decisions, the following actions were taken:

   o synonym - I10 term code and name added to existing I9 RED concept

   o child - new term added as child of existing I9 RED concept (new leaf node in hierarchy)

   o sibling - new term added as new leaf node child of the parent(s) of the existing I9 RED concept

   o parent - new term added as parent of existing RED concept (inserted into hierarchy above I9 concept)

Once the general rule was established, we began to manually review these decisions to find exceptions. This process allowed us to add I10 terms to the RED in bulk fashion with subsequent review and, where appropriate, correction (split synonyms into new and old concepts, merge synonymous concepts, and reclassify new concepts).

Results

General RED Inclusion Rules for Partitions of GEM Mappings

Table 1 shows the size of each of the partitions of GEM mappings. Also shown are the initial mapping decisions for each partition, plus any decisions made based on manual review to date. Table 2 shows examples of each of these mappings. For example, all “non-other” I10-I9 pairs with the GEM “synonymous” (00000) mapping were initially considered to be synonyms. After manual review, we agreed with these mappings for all pairs in which the names (or normalized names) matched. Figure 3 shows an example of a synonym mapping in the RED. However, our manual review of the “non-name match” partition discovered 95 pairs that we considered to be non-synonymous, such as the I10 term Newborn (suspected to be) affected by maternal hypertensive disorders (P00.0) and the I9 term Maternal hypertensive disorders affecting fetus or newborn (760.0); we understand the I9 term as a diagnosis that is applied to the parent of the newborn and the I10 term as a diagnosis that is applied to the newborn itself.

An example of combination mappings can be found with the I10 term Salmonella sepsis (A02.1), which has a GEM map of 10111 to the I9 term Salmonella septicemia (003.1) and a GEM map of 10112 to the I9 term Sepsis (995.91). In this example, there is only one scenario (fourth flag) with two choices (fifth flag). In this case, we determined that the I10 term should be added as a new concept in the RED and placed in the hierarchy under the existing concepts for both of the I9 terms.
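The four addition actions (synonym, child, sibling, parent) can be illustrated with a small Python sketch. It is not the RED's actual data model: the `Concept` class, the concept identifiers (`R0` to `R3`, and the `I10:` prefix for new concepts), and the reading of “parent” as taking over the existing concept's former parents are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    terms: set = field(default_factory=set)    # (terminology, code) pairs
    parents: set = field(default_factory=set)  # ids of parent concepts


def apply_decision(red, decision, i10_code, i10_name, i9_id):
    """Apply one addition decision for an I10 term relative to an existing I9 concept."""
    existing = red[i9_id]
    if decision == "synonym":
        existing.terms.add(("I10", i10_code))
        return i9_id
    new_id = "I10:" + i10_code  # hypothetical identifier scheme
    new = red.setdefault(new_id, Concept(i10_name, {("I10", i10_code)}))
    if decision == "child":
        new.parents.add(i9_id)
    elif decision == "sibling":
        new.parents |= existing.parents
    elif decision == "parent":
        # Assumed reading: the new concept is spliced in above the existing one.
        new.parents |= existing.parents
        existing.parents = {new_id}
    else:
        raise ValueError(f"unknown decision: {decision}")
    return new_id


# Toy dictionary; "R0" is a made-up common ancestor.
red = {
    "R0": Concept("Infectious disease"),
    "R1": Concept("Salmonella septicemia", {("I9", "003.1")}, {"R0"}),
    "R2": Concept("Sepsis", {("I9", "995.91")}, {"R0"}),
    "R3": Concept("Cholera, unspecified", {("I9", "001.9")}, {"R0"}),
}

# Synonym: the I10 code joins the existing I9 concept.
apply_decision(red, "synonym", "A00.9", "Cholera, unspecified", "R3")

# Combination mapping: one new concept placed under both I9 concepts.
apply_decision(red, "child", "A02.1", "Salmonella sepsis", "R1")
apply_decision(red, "child", "A02.1", "Salmonella sepsis", "R2")
```

Reusing the same new concept on the second call is what lets a combination mapping such as A02.1 end up with two parents instead of two duplicate concepts.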


Table 1: Partitions of GEM Mappings Based on Flag Pattern, Combinatorial Patterns, “Non-Other” versus “Other” Terms, and Type of Name Matching. RED addition decisions are shown as counts of synonyms, siblings and children; letters next to decision counts refer to examples shown in Table 2.

GEM Flags | I10 to I9 Mapping | "Non-Other" vs. "Other" Terms | Name Matching | Synonyms | Siblings | Children
00000 (Synonymous) | One to One n=3528 | Non-Other n=3241 | Name Match n=1554 | 1554 a | 0 | 0
 | | | Normalized Match n=262 | 262 b | 0 | 0
 | | | Non-Name Match n=1425 | 1251 c | 85 d | 89 e
 | | Other n=287 | Name Match n=100 | 0 | 100 f | 0
 | | | Normalized Match n=31 | 0 | 31 g | 0
 | | | Non-Name Match n=156 | 62 h | 76 i | 18 j
10000 (Approximate) | One to One n=1012 | Non-Other n=812 | Name Match n=41 | 41 k | 0 | 0
 | | | Normalized Match n=37 | 37 l | 0 | 0
 | | | Non-Name Match n=734 | 0 | 647 m | 87 n
 | | Other n=200 | Name Match n=19 | 0 | 19 o | 0
 | | | Normalized Match n=7 | 0 | 7 p | 0
 | | | Non-Name Match n=174 | 0 | 173 q | 1 r
 | One to Many n=358 with 844 mappings | Non-Other n=592 mappings | Name Match n=28 | 28 s | 0 | 0
 | | | Normalized Match n=29 | 29 t | 0 | 0
 | | | Non-Name Match n=535 | 0 | 458 u | 77 v
 | | Other n=252 mappings | Name Match n=14 | 0 | 14 w | 0
 | | | Normalized Match n=4 | 0 | 4 x | 0
 | | | Non-Name Match n=234 | 0 | 95 y | 139 z
 | Many to One n=57,236 (4192 I9 terms) | Non-Other n=46,323 | Name Match n=83 | 83 aa | 0 | 0
 | | | Normalized Match n=292 | 292 bb | 0 | 0
 | | | Non-Name Match n=45,948 | 0 | 0 | 45,948 cc
 | | Other n=10,913 | Name Match n=0 | 0 | 0 | 0
 | | | Normalized Match n=0 | 0 | 0 | 0
 | | | Non-Name Match n=10,913 | 0 | 0 | 10,913 dd
 | Many to Many n=3221 with 7550 mappings | Non-Other n=5260 mappings | Name Match n=9 | 9 ee | 0 | 0
 | | | Normalized Match n=13 | 13 ff | 0 | 0
 | | | Non-Name Match n=5238 | 0 | 5238 gg | 0
 | | Other n=2290 mappings | Name Match n=13 | 0 | 13 hh | 0
 | | | Normalized Match n=17 | 0 | 17 ii | 0
 | | | Non-Name Match n=2260 | 0 | 2260 jj | 0
101xx (n=3808) | One to Many n=3808 with 7825 mappings | Non-Other n=5880 mappings | Name Match n=1 | 1 kk | 0 | 0
 | | | Normalized Match n=0 | 0 | 0 | 0
 | | | Non-Name Match n=5879 | 0 | 0 | 5879 ll
 | | Other n=1945 mappings | Name Match n=0 | 0 | 0 | 0
 | | | Normalized Match n=0 | 0 | 0 | 0
 | | | Non-Name Match n=1945 | 0 | 0 | 1945 mm
None (n=669) | No mappings | No mappings | No mappings | 0 | 669 nn | 0

Table 2: Examples of Addition Decisions from Table 1

Example | I10 Code | I10 Name | I10 Flag | I9 Code | I9 Name | I9 Flag | Addition
a | A00.9 | Cholera, unspecified | 00000 | 001.9 | Cholera, unspecified | 00000 | Synonym
b | G50. | Atypical facial pain | 00000 | 350.2 | Atypical face pain | 00000 | Synonym
c | A05.0 | Foodborne staphylococcal intoxication | 00000 | 005.0 | Staphylococcal food poisoning | 00000 | Synonym
d | C91.10 | Chronic lymphocytic leukemia of B-cell type not having achieved remission | 00000 | 204.10 | Chronic lymphoid leukemia, without mention of having achieved remission | 00000 | Sibling
d | P00.0 | Newborn (suspected to be) affected by maternal hypertensive disorders | 00000 | 760.0 | Maternal hypertensive disorders affecting fetus or newborn | 00000 | Sibling
e | M72.0 | Palmar fascial fibromatosis [Dupuytren] | 00000 | 728.6 | Contracture of palmar fascia | 00000 | Child
e | A31.0 | Pulmonary mycobacterial infection | 00000 | 031.0 | Pulmonary diseases due to other mycobacteria | 00000 | Child
f | A02.8 | Other specified salmonella infections | 00000 | 003.8 | Other specified salmonella infections | 00000 | Sibling
g | D56.8 | Other thalassemias | 00000 | 282.49 | Other thalassemia | 00000 | Sibling
h | D49.7 | Neoplasm of unspecified behavior of endocrine glands and other parts of nervous system | 00000 | 239.7 | Neoplasm of unspecified nature of endocrine glands and other parts of nervous system | 00000 | Synonym
i | A03.8 | Other shigellosis | 00000 | 004.8 | Other specified shigella infections | 00000 | Sibling
j | A35 | Other tetanus | 00000 | 037 | Tetanus | 00000 | Child
j | D26.0 | Other benign neoplasm of cervix uteri | 00000 | 219.0 | Benign neoplasm of cervix uteri | 00000 | Child
k | A52.11 | Tabes dorsalis | 10000 | 094.0 | Tabes dorsalis | 10000 | Synonym
l | A09 | Infectious gastroenteritis and colitis, unspecified | 10000 | 009.0 | Infectious colitis, enteritis, and gastroenteritis | 10000 | Synonym
m | A08.11 | Acute gastroenteropathy due to Norwalk agent | 10000 | 008.63 | Enteritis due to norwalk virus | 10000 | Sibling
n | A74.81 | Chlamydial peritonitis | 10000 | 079.88 | Other specified chlamydial infection | [none provided] | Child
o | B10.89 | Other human herpesvirus infection | 10000 | 058.89 | Other human herpesvirus infection | 10000 | Sibling
p | A18.13 | Tuberculosis of other urinary organs | 10000 | 016.30 | Tuberculosis of other urinary organs, unspecified | 10000 | Sibling
q | A18.03 | Tuberculosis of other bones | 10000 | 015.60 | Tuberculosis of mastoid, unspecified | 10000 | Sibling
r | J61 | Pneumoconiosis due to asbestos and other mineral fibers | 10000 | 501 | Asbestosis | 10000 | Parent
s | B56.9 | African trypanosomiasis, unspecified | 10000 | 086.5 | African trypanosomiasis, unspecified | 10000 | Synonym
t | A20.2 | Pneumonic plague | 10000 | 020.5 | Pneumonic plague, unspecified | 10000 | Synonym
u | A15.5 | Tuberculosis of larynx, trachea and bronchus | 10000 | 011.30 | Tuberculosis of bronchus, unspecified | 10000 | Sibling
v | A54.83 | Gonococcal heart infection | 10000 | 098.85 | Other gonococcal heart disease | 10000 | Child
w | A07.8 | Other specified protozoal intestinal diseases | 10000 | 007.8 | Other specified protozoal intestinal diseases | 10000 | Sibling
x | A08.39 | Other viral enteritis | 10000 | 008.69 | Enteritis due to other viral enteritis | 10000 | Sibling
y | A07.8 | Other specified protozoal intestinal diseases | 10000 | 007.8 | Other specified protozoal intestinal diseases | 10000 | Sibling
z | A54.83 | Gonococcal heart infection | 10000 | 098.85 | Other gonococcal heart disease | 10000 | Child
aa | A18.50 | Tuberculosis of eye, unspecified | 10000 | 017.30 | Tuberculosis of eye, unspecified | 10000 | Synonym
bb | A01.00 | Typhoid fever, unspecified | 10000 | 002.0 | Typhoid fever | 10000 | Synonym
cc | A01.01 | Typhoid meningitis | 10000 | 002.0 | Typhoid fever | 10000 | Child
dd | A01.09 | Typhoid fever with other complications | 10000 | 002.0 | Typhoid fever | 10000 | Child
ee | E46 | Unspecified protein-calorie malnutrition | 10000 | 263.9 | Unspecified protein-calorie malnutrition | 10000 | Synonym
ff | A51.5 | Early syphilis, latent | 10000 | 092.9 | Early syphilis, latent, unspecified | 10000 | Synonym
gg | A17.83 | Tuberculous neuritis | 10000 | 013.62 | Tuberculous encephalitis or myelitis, bacteriological or histological examination unknown (at present) | [none provided] | Sibling
hh | A48.8 | Other specified bacterial diseases | 10000 | 040.89 | Other specified bacterial diseases | 10000 | Sibling
ii | A54.39 | Other gonococcal eye infection | 10000 | 098.49 | Other gonococcal infection of eye | 10000 | Sibling
jj | A18.89 | Tuberculosis of other sites | 10000 | 017.80 | Tuberculosis of esophagus, unspecified | 10000 | Sibling
kk | A22.1 | Pulmonary anthrax | 10111 | 022.1 | Pulmonary anthrax | 10000 | Synonym
ll | A02.1 | Salmonella sepsis | 10111 | 003.1 | Salmonella septicemia | 10000 | Child
ll | A02.1 | Salmonella sepsis | 10112 | 995.91 | Sepsis | [none provided] | Child
mm | A37.81 | Whooping cough due to other Bordetella species with pneumonia | 10111 | 033.8 | Whooping cough due to other specified organism | [none provided] | Child
mm | A37.81 | Whooping cough due to other Bordetella species with pneumonia | 10112 | 484.3 | Pneumonia in whooping cough | [none provided] | Child
nn | R40.2131 | Coma scale, eyes open, to sound, in the field [EMT or ambulance] | | | | |

Typhoid Fever
Code: C4198842
Preferred Term: Typhoid Fever
Full_Syn: Typhoid fever
   Syn_Source: SoftMed-CC
   Syn_Type_Term: PT
   Syn_Source_Local_Code: 002.0
   Syn_Source_Domain: Diagnosis_ICD-9-CM
Full_Syn: Typhoid fever
   Syn_Source: CRIMSON NIAID
   Syn_Type_Term: PT
   Syn_Source_Local_Code: 2068
   Syn_Source_Domain: Problem
Full_Syn: Typhoid fever, unspecified
   Syn_Source: SoftMed-CC
   Syn_Type_Term: PT
   Syn_Source_Local_Code: A01.00
   Syn_Source_Domain: Diagnosis_ICD-10-CM

Figure 3: Example of a RED concept that maps to an I9 term and an I10 term; that is, the terms are considered synonymous. This is also a term used by an NIH clinical trials data management system (CRIMSON).

Repeating Patterns of Many-to-One Mappings

By far, the largest partition is the many-to-one approximate (10000) mappings, with 57,236 I10 codes mapping to 4192 I9 codes. Our initial assessment was that most of these mappings represent a refinement of an I9 term into many I10 terms. However, when we reviewed specific sets of mappings, we noticed recurring patterns. For example, 996 of the I9 codes map to exactly two I10 terms. In 261 cases, one of these is an “other” term and the other is not. More broadly, the mappings to 768 I9 terms (18.3%) include exactly one “non-other” I10 term. Examination of these sets shows that, in general, if the I9 term is represented as “X”, one I10 term can be represented as “Other specified X” and the remaining I10 terms are more specific forms of X. If this is true for all such “one ‘other’ and many ‘non-other’” patterns, we can with confidence add all of these I10 terms as children of their corresponding I9 term. A similar rule may apply for the 2498 (59.6%) I9 terms that have no “other” I10 terms in their set of maps.
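The pattern counts described above could be produced by grouping the many-to-one mappings by I9 code and counting the “other” terms in each group. The sketch below is one possible way to do so, not the analysis actually run for this paper; the function names and the tuple layout are assumptions, and the sample rows are the three Typhoid fever (002.0) mappings from Table 2.

```python
import re
from collections import defaultdict

# Phrases that do not, by themselves, make a term an "other" term.
NOT_OTHER = ("not otherwise specified", "other and unspecified")


def is_other(name):
    """True if the name contains the word "other" or "not elsewhere classified"."""
    text = name.lower()
    for phrase in NOT_OTHER:
        text = text.replace(phrase, " ")
    return bool(re.search(r"\bother\b", text)) or "not elsewhere classified" in text


def summarize_by_i9(mappings):
    """Group (I10 code, I10 name, I9 code) rows by I9 code and count "other" terms."""
    groups = defaultdict(list)
    for _i10_code, i10_name, i9_code in mappings:
        groups[i9_code].append(i10_name)
    return {
        i9_code: {
            "i10_terms": len(names),
            "other_terms": sum(is_other(name) for name in names),
        }
        for i9_code, names in groups.items()
    }


# Many-to-one example rows from Table 2 (examples bb, cc and dd).
mappings = [
    ("A01.00", "Typhoid fever, unspecified", "002.0"),
    ("A01.01", "Typhoid meningitis", "002.0"),
    ("A01.09", "Typhoid fever with other complications", "002.0"),
]

summary = summarize_by_i9(mappings)
```

Groups with exactly one “other” term, or with none, can then be selected from the summary and reviewed as candidates for a bulk “child” rule.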

Class-Based Retrieval of BTRIS Data across ICD Versions

The ultimate purpose of all this work is to support the storage and retrieval of BTRIS data in as accurate and seamless a manner as possible. A perfect ontology is probably impossible, given all the subtle nuances of the differences among and between I9 and I10 codes. Recoding all the I9 data with I10 codes is not practical and would likely lead to inaccuracies. Requiring users seeking to retrieve data to select a code from each terminology version might be accurate but would be confusing and awkward. The I10 update we are applying to the RED provides a practical solution. Figure 4 shows an example of the new hierarchy of RED concepts that include mappings to I9 and I10 terms. BTRIS supports class-based queries so that a user could, for example, query for the RED concept Herpes Zoster Complicated and retrieve all data coded with the I9 codes 053.8, 053.2 and 053.29 and with the I10 codes B02.30, B02.31 and B02.34.

Concepts shown in Figure 4: Herpes Zoster Complicated {I9: 053.8}; Herpes Zoster of Eye {I9: 053.2; I10: B02.30}; Herpes Zoster Ophthalmicus; Zoster Conjunctivitis {I10: B02.31}; Herpes zoster with other ophthalmic complications {I9: 053.29}; Zoster Scleritis {I10: B02.34}

Figure 4: Example of RED hierarchy after inclusion of I10 terms. Light boxes represent concepts that were in the RED prior to inclusion of I10 terms; dark boxes represent newly added concepts. I9 and I10 codes are shown in {curly braces}.

Discussion

A clinical data repository that combines I9 and I10 data without a coordinated terminology solution has two options: either re-code all the old I9 data with I10 or force users to query using both I9 and I10 terms. Our use of a single repository terminology, the RED, provides significant benefits to our users: they perform queries with a single terminology that includes not only I9 and I10 terms but also several other local diagnosis terminologies. The RED also provides the benefit of multiple hierarchies, so that users are not restricted to the single hierarchies of I9 or I10, and they can still query for specific I9 or I10 terms if desired.
The ontology approach, regardless of how the maintenance is accomplished, reduces the need to include multiple terms in queries. While this probably improves query execution somewhat, the real saving is a reduction in the BTRIS user’s effort to construct a comprehensive query.

RED maintenance requires significant human effort. The manual integration of a new terminology of I10’s magnitude would require many months of work, much of it mind-numbing and prone to error. The GEM mappings have been of tremendous benefit for providing a first approximation of synonymy and classification of I10 terms. The result is that I10 terms can be added quickly and be made useful immediately for both storing and retrieving patient data. Where revisions are required to merge redundant terms, split ambiguous ones, or reorganize the hierarchy, the amount of effort needed is no more than would be needed if we were to try to get everything “right” prior to loading I10 into the RED. Meanwhile, our use of GEM-based heuristics allows us to have a working, integrated repository while proceeding with “tuning” the RED. We now have the luxury of prioritizing our reviews based on the I10 terms as they actually start appearing in patient records.

In addition to continued work at reviewing and revising the I10 additions to the RED, we will continue to study the patterns of the many-to-one “similar” GEM mappings to determine if they can be used to improve the relationships between I9 and I10 terms. We will also need to make changes as updates to I10 occur; one of these has taken place already. While the GEMs are obviously limited to use with I9 and I10, their existence raises the possibility of similar approaches to other mappings, such as between various drug terminologies or between I10 and the Systematized Nomenclature of Medicine (SNOMED-CT). The actual methods used and level of effort needed to develop the GEMs are not published, but it is interesting to consider whether similar tools could be developed for other term mappings.


When our ontology-based method for coping with I9 changes was originally published in 1996, Tuttle and Nelson praised it in an accompanying editorial but lamented the need for having to carry out “reverse engineering”. They went on to say:

   Such methods should be viewed as necessary short-term expedients only, and all parties concerned should work toward an incremental plan by which the intent of changes to controlled health-care vocabularies can be made both explicit and machine processible. Only then can the comparability of patient descriptions be sustained.12

We, too, await such changes.13 In the meantime, while the GEMs are not machine processable, they are at least machine readable and appear to be based on sound terminologic principles. Furthermore, the ontologic approach of mapping terms to concepts and then coding data based on those concepts has proven to be a valuable precedent that continues to provide value in dealing with local and standard terminologies alike.

Conclusion

The practice of using an ontology that integrates disparate clinical terminologies has proven to be a powerful method for meeting the challenges of adapting a clinical data repository to include I10 data while maintaining the value of legacy I9 data. Our approach is able to take advantage of the GEM mappings from CMS to provide a rapid solution that can evolve gracefully.

Acknowledgments

Drs. Cimino and Remennik are supported in part by research funds from the NIH Clinical Center and the National Library of Medicine. The opinions expressed in this article are the authors’ own and do not reflect the view of the National Institutes of Health or the Department of Health and Human Services.

References

1. United States National Center for Health Statistics. The International Classification of Diseases, 9th Revision, with Clinical Modifications. Washington, DC; 1980.
2. Yu AC, Cimino JJ. A comparison of two methods for retrieving ICD-9-CM data: the effect of using an ontology-based method for handling terminology changes. J Biomed Inform. 2011 Apr;44(2):289-98.
3. Cimino JJ. Formal descriptions and adaptive mechanisms for changes in controlled medical vocabularies. Methods Inf Med. 1996 Sep;35(3):202-10.
4. Boyd AD, Li JJ, Burton MD, Jonen M, Gardeux V, Achour I, Luo RQ, Zenku I, Bahroos N, Brown SB, Vanden Hoek T, Lussier YA. The discriminatory cost of ICD-10-CM transition between clinical specialties: metrics, case study, and mitigating tools. J Am Med Inform Assoc. 2013 Jul-Aug;20(4):708-17.
5. Cimino JJ, Ayres EJ, Remennik L, Rath S, Freedman R, Beri A, Chen Y, Huser V. The National Institutes of Health's Biomedical Translational Research Information System (BTRIS): Design, contents, functionality and experience to date. J Biomed Inform. 2013 Nov 19. pii: S1532-0464(13)00181-0. doi: 10.1016/j.jbi.2013.11.004.
6. Cimino JJ. From data to knowledge through concept-oriented terminologies: experience with the Medical Entities Dictionary. J Am Med Inform Assoc. 2000 May-Jun;7(3):288-97.
7. Ross-Davis SV. Preparing for ICD-10-CM/PCS: one payer's experience with general equivalence mappings (GEMs). Perspect Health Inf Manag. 2012;9:1e.
8. Centers for Medicare and Medicaid Services. Diagnosis Code Set General Equivalence Mappings: ICD-10-CM to ICD-9-CM and ICD-9-CM to ICD-10-CM 2009 Version: Documentation and User's Guide. Available at https://www.cms.gov/ICD10/11b1_2011_ICD10CM_and_GEMs.asp.
9. Gu H, Perl Y, Elhanan G, Min H, Zhang L, Peng Y. Auditing concept categorizations in the UMLS. Artif Intell Med. 2004 May;31(1):29-44.
10. Centers for Medicare and Medicaid Services. 2014 General Equivalence Mappings (GEMs) – Diagnosis Codes and Guide. https://www.cms.gov/Medicare/Coding/ICD10/Downloads/DiagnosisGEMs-2014.zip
11. Cimino JJ. Use of the Unified Medical Language System in patient care at the Columbia-Presbyterian Medical Center. Methods Inf Med. 1995 Mar;34(1-2):158-64.
12. Tuttle MS, Nelson SJ. A poor precedent. Methods Inf Med. 1996 Sep;35(3):211-7.
13. Cimino JJ. An approach to coping with the annual changes in ICD9-CM. Methods Inf Med. 1996 Sep;35(3):220.
