Wednesday, September 21, 2011

Effectively Using NLP

The Veterans Administration’s EHR databases include approximately 20,000 unstructured fields containing narrative text and reports with patient-specific information. The databases hold both structured and unstructured data, but unstructured data makes up the majority of the health record.

The unstructured data can include outpatient pharmacy records, laboratory reports, provider notes, nuclear medicine and radiologic reports, electromagnetic images, discharge summaries, physician orders, vital sign measurements, and information on medications administered. This data is rich with information and could provide researchers with a greater opportunity to characterize patients to determine their health status.

Currently, clinical and administrative use of EHR databases largely depends on structured or coded data, so researchers are limited to questions that can be answered using the structured data alone. It is also very difficult for researchers to use free-text information from the databases without first reformatting the text.

One solution is Natural Language Processing (NLP), which, when fully developed, will make it possible to better use all of the data contained in health records, both structured and unstructured. NLP, an invaluable branch of computer science, teaches machines to make sense of human language. The science is already at work in internet search engines and translation programs.

The Veterans Administration and university investigators in Nashville and several other sites are conducting studies on NLP as part of an overall effort known as the “Consortium for Healthcare Informatics Research”.

One study, being conducted at six VA medical centers, is working to interpret free text in veterans’ EMRs, according to the VA’s September issue of “Research Currents”. In this study, researchers are using NLP to interpret doctors’ notes and identify post-surgery complications. The researchers’ findings appeared in the August 24/31 issue of the Journal of the American Medical Association.
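To give a rough sense of what flagging complications in free-text notes can involve, here is a minimal rule-based sketch. It is not the method used in the VA study, which is not described here; the term list, the negation cues, and the function name `find_complications` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mapping from phrases in a note to a complication category.
# These terms are illustrative assumptions, not taken from the study.
COMPLICATION_TERMS = {
    "pneumonia": "lung",
    "renal failure": "kidney",
    "myocardial infarction": "heart",
}

# Very simple negation cues; real clinical NLP handles negation far more carefully.
NEGATION_CUES = ("no ", "denies ", "without ", "negative for ")


def find_complications(note):
    """Return the set of complication categories mentioned and not negated."""
    found = set()
    # Split the note into rough sentences so a negation only affects its own sentence.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        for term, category in COMPLICATION_TERMS.items():
            idx = sentence.find(term)
            if idx == -1:
                continue
            # Skip the term if a negation cue appears earlier in the same sentence.
            if any(cue in sentence[:idx] for cue in NEGATION_CUES):
                continue
            found.add(category)
    return found


note = "Post-op day 2. Patient developed pneumonia. No evidence of renal failure."
print(find_complications(note))
```

In this toy note, the pneumonia mention is flagged while the negated renal failure mention is skipped. Structured billing or diagnosis codes would not capture that distinction unless someone had coded it.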

The study used data on nearly 3,000 VA patients who underwent surgery between 1999 and 2006. Compared with a standard automated method that scans administrative data, NLP was better at picking up adverse post-surgery events such as lung, kidney or heart problems. To provide a benchmark for both approaches, trained nurses manually reviewed the patient records and carefully looked for any clinical notes indicating complications.
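The comparison described above, two automated methods scored against manual nurse review, is commonly summarized with sensitivity and specificity. The sketch below shows that calculation on a tiny made-up set of flags. The numbers are placeholders for illustration only and are not results from the study; the names `sensitivity_specificity`, `nurse_review`, `nlp_flags`, and `admin_flags` are assumptions.

```python
def sensitivity_specificity(predicted, reference):
    """Compare predicted flags with a reference standard (both lists of booleans)."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)          # true positives
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)      # missed events
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)      # false alarms
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)  # true negatives
    sensitivity = tp / (tp + fn)  # share of real complications that were detected
    specificity = tn / (tn + fp)  # share of uncomplicated cases correctly left unflagged
    return sensitivity, specificity


# Made-up example: one entry per patient, True means "complication present/flagged".
nurse_review = [True, True, True, False, False, False, True, False]   # reference standard
nlp_flags    = [True, True, False, False, True, False, True, False]   # hypothetical NLP output
admin_flags  = [True, False, False, False, False, False, False, False]  # hypothetical code-based output

print("NLP:  ", sensitivity_specificity(nlp_flags, nurse_review))
print("Admin:", sensitivity_specificity(admin_flags, nurse_review))
```

A method that is "better at picking up" adverse events has higher sensitivity. Specificity is reported alongside it because a method could otherwise look good simply by flagging every patient.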

Eventually, the Consortium will collaborate with researchers at the various VA Medical Centers, appropriate VA and VHA offices, non-VA research institutions, and other federal agencies to coordinate and apply accepted technical standards so that NLP can be used more effectively.