
Identification of subject shareness for Korean-English machine translation

  • Kye Sung Kim*
  • Seong Bae Park
  • Hyun Je Song
  • Se Young Park
  • Sang Jo Lee
  • *Corresponding author for this work
  • Kyungpook National University

Research output: Contribution to conference; Conference paper; peer-review

Abstract

One of the most critical issues in translating Korean into other languages is the frequent use of empty arguments. Unlike in English, even mandatory elements are often dropped in Korean, so the missing elements must be resolved during translation to obtain grammatical sentences. In this paper, we focus on missing subjects at the intra-sentential level, a problem that can be cast as identifying subject sharing between clauses. To reflect syntactic information in resolving missing subjects, we use a parse tree kernel, a specialized convolution kernel. In the experimental evaluation, syntactic information turns out to be positively related to the identification of subject shareness. Our method achieves an accuracy of 81.39% and outperforms a baseline system that assumes two adjacent clauses share a subject.

Original language: English
Title of host publication: PRICAI 2008
Subtitle of host publication: Trends in Artificial Intelligence - 10th Pacific Rim International Conference on Artificial Intelligence, Proceedings
Pages: 211-222
Number of pages: 12
DOIs
State: Published - 2008
Event: 10th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2008 - Hanoi, Viet Nam
Duration: 2008.12.15 to 2008.12.19

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5351 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2008
Country/Territory: Viet Nam
City: Hanoi
Period: 08.12.15 to 08.12.19
