Abstract
Unstructured clinical texts contain patients' disease-related narratives, but mining this information requires elaborate work. In particular, classifying the semantic type of a clinical term requires domain knowledge from resources such as the Unified Medical Language System (UMLS). The UMLS, however, is limited in handling other languages. In this paper, we leverage Wikipedia as well as the UMLS for clinical event extraction, especially from clinical narratives written in mixed language. Semantic features for clinical terms are extracted from the semantic networks of hierarchical categories in Wikipedia. Semantic types for Korean clinical terms are detected using translation links and semantic networks in Wikipedia. An additional notable feature is a controlled vocabulary of clue words, which serve as contextual evidence for determining the clinical semantic type of a word. Experiments on 150 discharge summaries written in English and Korean yielded an F1-measure of 75.9%. This result shows that the proposed features are effective for clinical event extraction.
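To make the clue-word idea concrete, the following is a minimal sketch of how a controlled vocabulary might be turned into context features for a candidate term. It is an illustration only, not the authors' implementation: the vocabulary entries, the label names, the window size, the function name `clue_features`, and the example sentence are all assumptions introduced here.

```python
from typing import Dict, List, Set

# Hypothetical clue-word vocabulary keyed by semantic type.
# The paper's actual vocabulary and type inventory are not given in the abstract.
CLUE_WORDS: Dict[str, Set[str]] = {
    "PROBLEM": {"diagnosed", "complained", "history"},
    "TREATMENT": {"prescribed", "underwent", "started"},
    "TEST": {"checked", "showed", "revealed"},
}


def clue_features(tokens: List[str], index: int, window: int = 3) -> Dict[str, int]:
    """Return one binary feature per semantic type.

    A feature is 1 when any clue word for that type occurs within
    `window` tokens of the candidate term at `index` (the term itself
    is excluded from its own context).
    """
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    context = {
        tok.lower()
        for i, tok in enumerate(tokens[lo:hi], start=lo)
        if i != index
    }
    return {
        f"clue_{label}": int(bool(context & words))
        for label, words in CLUE_WORDS.items()
    }


# Made-up mixed Korean/English sentence, whitespace-tokenized for simplicity.
tokens = "환자는 chest pain 으로 내원하여 CT 를 checked 하였고 aspirin 을 prescribed 받음".split()

ct_features = clue_features(tokens, tokens.index("CT"))
aspirin_features = clue_features(tokens, tokens.index("aspirin"))
```

In this sketch the features are language-independent at the term level: the candidate term may be Korean or English, and only the surrounding clue words are matched. A real system would presumably combine such features with the Wikipedia- and UMLS-derived ones in a trained classifier.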
| Original language | English |
|---|---|
| Pages (from-to) | 302-313 |
| Number of pages | 12 |
| Journal | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
| Volume | 8870 |
| State | Published - 2014 |
Keywords
- Clinical event extraction
- Mixture of language
- Semantic classification
- Wikipedia
Quacquarelli Symonds(QS) Subject Topics
- Computer Science & Information Systems
Title
Leveraging Wikipedia and context features for clinical event extraction from mixed-language discharge summary