Leveraging wikipedia and context features for clinical event extraction from mixed-language discharge summary

  • Kwang Yong Jeong
  • Wangjin Yi
  • Jae Wook Seol
  • Jinwook Choi
  • Kyung Soon Lee
    • Jeonbuk National University
    • Seoul National University

    Research output: Contribution to journal; Journal article; peer-review

    Abstract

    Unstructured clinical texts contain patients’ disease-related narratives, but mining this kind of information requires elaborate work. In particular, classifying the semantic type of a clinical term requires domain knowledge from resources such as the Unified Medical Language System (UMLS). The UMLS, however, is limited in handling languages other than English. In this paper, we leverage Wikipedia as well as the UMLS for clinical event extraction, especially from clinical narratives written in mixed languages. Semantic features for clinical terms are extracted from the semantic networks formed by Wikipedia's hierarchical categories. Semantic types for Korean clinical terms are detected using translation links and semantic networks in Wikipedia. An additional notable feature is a controlled vocabulary of clue words, which provide contextual evidence for determining the clinical semantic type of a word. In experiments on 150 discharge summaries written in English and Korean, the approach achieved an F1-measure of 75.9%. This result shows that the proposed features are effective for clinical event extraction.
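
    The three feature sources named in the abstract (category hierarchy, translation links, clue words) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy category graph, the category-to-type mapping, the translation table, the clue-word list, the type labels, and the function names `semantic_type` and `clue_types` are all hypothetical and are not taken from the paper, from Wikipedia data, or from the UMLS.

```python
from collections import deque

# Toy data, invented for illustration only (not from the paper or Wikipedia).
CATEGORY_PARENTS = {
    "Pneumonia": ["Lung diseases"],
    "Lung diseases": ["Respiratory diseases"],
    "Respiratory diseases": ["Diseases and disorders"],
    "Chest radiograph": ["Medical imaging"],
    "Medical imaging": ["Medical tests"],
}
# Hypothetical mapping from high-level categories to clinical semantic types.
CATEGORY_TO_TYPE = {
    "Diseases and disorders": "PROBLEM",
    "Medical tests": "TEST",
}
# Hypothetical Korean-to-English translation links between article titles.
TRANSLATION_LINKS = {"폐렴": "Pneumonia"}
# Hypothetical controlled vocabulary of clue words and the type they hint at.
CLUE_WORDS = {"showed": "TEST", "diagnosed": "PROBLEM"}


def semantic_type(term, max_depth=5):
    """Walk up the toy category graph (breadth-first) until a typed category is found."""
    # A Korean term is first mapped to an English title via a translation link.
    title = TRANSLATION_LINKS.get(term, term)
    queue = deque([(title, 0)])
    seen = {title}
    while queue:
        node, depth = queue.popleft()
        if node in CATEGORY_TO_TYPE:
            return CATEGORY_TO_TYPE[node]
        if depth < max_depth:
            for parent in CATEGORY_PARENTS.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, depth + 1))
    return None  # no typed ancestor within max_depth


def clue_types(tokens, index, window=2):
    """Collect the types hinted at by clue words within a window around tokens[index]."""
    lo = max(0, index - window)
    context = tokens[lo:index] + tokens[index + 1:index + 1 + window]
    return {CLUE_WORDS[t.lower()] for t in context if t.lower() in CLUE_WORDS}
```

    In a real system such outputs would more plausibly serve as features for a trained classifier than as final labels; the direct lookup here is only meant to make the flow from term to category ancestor to semantic type concrete.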

    Keywords

    • Clinical event extraction
    • Mixture of language
    • Semantic classification
    • Wikipedia

    Quacquarelli Symonds(QS) Subject Topics

    • Computer Science & Information Systems