Health Text Processing

Anthony Nguyen
Team Leader
Health Data Semantics

Patient data is captured in many different formats and electronic repositories, using many different terminologies. Our technologies are targeted at understanding the information in that data, whether it is captured in an electronic health record, coded in a clinical database, captured from sensors, described in free-text medical reports or captured using imaging technology.

The Health Data Semantics team is focused on deriving value from electronic health data in terms of improving patient outcomes, and health system performance and productivity. The group does this by developing and applying machine learning, natural language processing, information retrieval and formal logic approaches to deliver and support meaningful data interoperability and analysis for decision support, analytics, modeling and reporting.

Clinical Terminology Tools
Developing tools to collect health information in a standard way.

Snorocket, a fast classifier for ontologies, such as SNOMED CT.
Snapper Platform, a tool for mapping one terminology to concepts or expressions of another terminology.
Ontoserver, a terminology server.

Medical Text Retrieval & Analytics
Retrieving and extracting clinical information from free-text reports, including:

The analysis of pathology reports and death certificates to provide timely assessment of cancer incidence and the associated mortality rates.
The analysis of radiology reports to support the reconciliation of radiology findings with emergency department discharge records.
The analysis of medical reports to provide medical record search and analytics capability.
The analysis of medical forums to identify adverse drug reactions.