Clinical data in digital form constitutes a “digital library” that faces many of the same issues as digital libraries in other fields. Thought needs to be given to how health-related information stored in electronic health records (EHRs) can be preserved over the long term and yet remain fully accessible. So far, these issues have not been defined for EHRs.
Unless they are addressed, valuable and irreplaceable information will disappear over time, affecting both patient care and research. Replacing lost data can also entail huge costs for patients, clinicians, administrators, and pharmacists.
Some of the issues facing the health IT industry concerning EHRs are:
• Which data to retain and for how long
• Dealing with the obsolescence of hardware and software
• The interchange of information
• Costs involved
• Developing standards
• Addressing privacy issues along with data ownership
• Legal constraints
NIST is collaborating with the National Library of Medicine, the National Archives and Records Administration, the Department of Veterans Affairs (VA), and others, such as Health Level Seven (HL7), to identify best practices and support the standards development needed for the long-term preservation and lifecycle management of EHRs.
Through these collaborations, NIST will work to develop an interoperable framework to support a wide variety of data types, data formats, and data delivery mechanisms, while providing a technology-independent infrastructure to acquire, store, search, retrieve, migrate, replicate, and distribute EHRs over time.
Another major issue concerns accessing EHRs by content, which is a fundamental usage requirement for today’s electronic health record management systems. Today’s systems provide access based on structured fields, that is, data elements in the record that are coded to allow effective access.
However, the majority of a record's content is often in the care providers’ notes and other free-text fields. Because these fields are not structured, standard text processing techniques do not work well on them. Processing is particularly difficult when the text does not contain well-formed grammatical sentences, when it uses highly specialized vocabulary with many non-word terms such as abbreviations and symbols, and when the notes are very brief.
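The gap between structured and free-text access can be illustrated with a small sketch. The record layout, the field names (`diagnosis_code`, `note`), and the sample notes are hypothetical and invented for illustration; they are not taken from any real EHR system or from NIST's work.

```python
# Hypothetical records: one coded (structured) field plus a free-text note.
records = [
    {"id": 1, "diagnosis_code": "I10", "note": "pt c/o HA, BP 160/95, start HCTZ"},
    {"id": 2, "diagnosis_code": "E11.9", "note": "Patient reports headache after meals."},
    {"id": 3, "diagnosis_code": "I10", "note": "f/u HTN; no HA today"},
]


def find_by_code(records, code):
    """Structured access: exact match on a coded data element."""
    return [r["id"] for r in records if r["diagnosis_code"] == code]


def find_by_keyword(records, word):
    """Naive free-text access: case-insensitive substring match on the note."""
    return [r["id"] for r in records if word.lower() in r["note"].lower()]


hypertension_ids = find_by_code(records, "I10")      # reliable: the field is coded
headache_ids = find_by_keyword(records, "headache")  # misses notes that write "HA"
```

In this toy data the coded lookup finds both records carrying the code, while the keyword search finds only the one note that spells the word out. The abbreviated notes are missed, and a plain keyword match would also have no way to tell that "no HA today" is a negation.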
Health records will continue to have free-text fields, since this is the way most users enter information. NIST is therefore studying how to develop technology that allows records to be retrieved based on the semantic content of their free-text fields. The ability to find electronic health records by matching semantic content in free-text fields will make the records more useful, especially in applications such as medical trials and epidemiological studies.
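As a rough illustration of what matching on meaning rather than exact wording might involve, the sketch below expands a few abbreviations before searching. It is a deliberately simplified assumption, not the approach NIST or the research community uses: the `ABBREVIATIONS` table and the function names are hypothetical, and a realistic system would need a curated clinical vocabulary and a way to resolve abbreviations whose meaning depends on context.

```python
import re

# Hypothetical abbreviation table, invented for this example only.
ABBREVIATIONS = {
    "ha": "headache",
    "htn": "hypertension",
    "pt": "patient",
    "sob": "shortness of breath",
}


def normalize(note):
    """Lowercase the note, split it into tokens, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9/]+", note.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def matches_concept(note, concept):
    """Return True if the concept term appears in the normalized note."""
    return concept.lower() in normalize(note)


found = matches_concept("pt c/o HA, BP 160/95", "headache")
```

Here a note that only says "HA" is matched by a search for "headache" because the abbreviation is expanded first. The sketch still ignores negation, misspellings, and ambiguous abbreviations, which is part of why purpose-built algorithms and shared test data are needed.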
The Text Retrieval Conference (TREC) project in NIST’s Information Technology Laboratory is working with the research community to develop test data sets, evaluation methods, and other infrastructure to foster the development of new text processing algorithms designed specifically for EHRs.