Journal: Theory and Practice in Language Studies (TPLS), Volume 10, Issue 8, 2020, pp. 879-885
A Corpus-based Case Study on the POS Tagging of Self-referential Lexemes in the Contemporary Chinese Dictionary
Affiliations: Zhaoqing University
Abstract
The part-of-speech (POS) tagging in the 5th edition of the Contemporary Chinese Dictionary (CCD) was revised in the 6th and 7th editions. The noun POS of most sports and science lexemes was deleted, and their noun senses (self-referential senses) were subsumed under the verb senses. However, most of these lexemes can intuitively be used as nouns, so their noun POS and noun senses should be retained. Based on the grammatical functions of words (Xv & Tang, 2006) and the two-level word class categorization theory (Wang, 2014), this study conducts a corpus-based case study of the science lexeme “guina”. The results show that “guina” not only has self-referential usage but also has a high token frequency, with 133 occurrences accounting for 42.8% of all usages, and a rich type frequency, distributed widely across the constructions “guina + (of) + NP”, “NP + (of) + guina” and “VP + guina”. This pattern meets the criterion of conventionalization. Therefore, it is necessary to tag “guina” with the noun POS and to set up a self-referential sense for it. This research has implications for solving the POS tagging problem of self-referential lexemes in the CCD.