Dependency-Parsing in Classical Chinese under Universal Dependencies

Research Project

Project/Area Number	20H04481
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Review Section	Basic Section 90020: Library and information science, humanistic and social informatics-related
Research Institution	Kyoto University
Principal Investigator	YASUOKA Koichi, Kyoto University, Institute for Research in Humanities, Professor (20230211)
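Co-Investigator (Kenkyū-buntansha)	山崎直樹, Kansai University, Faculty of Foreign Language Studies, Professor (30230402); 二階堂善弘, Kansai University, Faculty of Letters, Professor (70292258); 師茂樹, Hanazono University, Faculty of Letters, Professor (70351294); Wittern C., Kyoto University, Institute for Research in Humanities, Professor (20333560); 池田巧, Kyoto University, Institute for Research in Humanities, Professor (90259250); 守岡知彦, Kyoto University, Institute for Research in Humanities, Assistant Professor (40324701); 白須裕之, Kyoto University, Institute for Research in Humanities, Assistant Professor (30828570); 鈴木慎吾, Osaka University, Graduate School of Humanities (Foreign Studies, Japanese Studies), Associate Professor (20513360)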
Project Period (FY)	2020-04-01 – 2023-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥17,420,000 (Direct Cost: ¥13,400,000, Indirect Cost: ¥4,020,000); Fiscal Year 2022: ¥6,760,000 (Direct Cost: ¥5,200,000, Indirect Cost: ¥1,560,000); Fiscal Year 2021: ¥5,850,000 (Direct Cost: ¥4,500,000, Indirect Cost: ¥1,350,000); Fiscal Year 2020: ¥4,810,000 (Direct Cost: ¥3,700,000, Indirect Cost: ¥1,110,000)
Keywords	Language processing / Classical Chinese / Isolating languages
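Outline of Research at the Start	This research targets the large body of Classical Chinese texts accumulated from the Han through the Qing dynasties. After performing morphological analysis with part-of-speech information and dependency-grammar analysis on these texts, we build methods for automatically extracting the dependency structures between words, between clauses, and further between sentences. This research forms a major part of syntactic analysis for Classical Chinese. It provides a foundational method for structuring, that is, grammatically analyzing, the large volume of Classical Chinese texts that have been left as unpunctuated text (mere strings of Chinese characters) without any grammatical structuring.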
Outline of Final Research Achievements	We have developed RoBERTa-Classical-Chinese and fine-tuned models derived from it that perform, for Classical Chinese, sentence segmentation, word tokenization, part-of-speech tagging, dependency-parsing between words, phrase detection, and dependency-parsing between phrases. We have also applied our methods to other isolating languages, such as Vietnamese and Thai.
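Academic Significance and Societal Importance of the Research Achievements	In terms of academic significance, the methods of this research make it possible to segment unpunctuated Classical Chinese text (mere strings of Chinese characters) into sentences, clauses, and words, and moreover to analyze the relations among them (for example, which word is the verb and which words are its subject and object) automatically with very high accuracy. In terms of societal significance, the methods are also applicable to Vietnamese and Thai. In both languages even identifying word boundaries is difficult, so being able to analyze them automatically is of great significance.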

Report

(4 results)

Research Products
(24 results)


Int'l Joint Research (8 results) / Journal Article (11 results; of which Peer Reviewed: 6, Open Access: 11) / Presentation (3 results; of which Int'l Joint Research: 1, Invited: 3) / Remarks (2 results)

[Int'l Joint Research] Charles University (Czech Republic)
- Related Report
  2022 Annual Research Report
[Int'l Joint Research] Charles University (Czech Republic)
- Related Report
  2021 Annual Research Report
[Int'l Joint Research] Stanford University (USA)
- Related Report
  2021 Annual Research Report
[Int'l Joint Research] Beijing Institute of Technology / Nanjing Agricultural University (China)
- Related Report
  2021 Annual Research Report
[Int'l Joint Research] Soochow University (Other countries/regions: Taiwan)
- Related Report
  2021 Annual Research Report
[Int'l Joint Research] Charles University (Czech Republic)
- Related Report
  2020 Annual Research Report
[Int'l Joint Research] Stanford University (USA)
- Related Report
  2020 Annual Research Report
[Int'l Joint Research] Soochow University (Other countries/regions: Taiwan)
- Related Report
  2020 Annual Research Report
[Journal Article] Sequence-Labeling RoBERTa Model for Dependency-Parsing in Classical Chinese and Its Application to Vietnamese and Thai (2023)
- Author(s)
  Yasuoka Koichi
- Journal Title
  
  8th International Conference on Business and Industrial Research
  
  Volume: ICBIR 2023 Pages: 169-173
- DOI
  10.1109/icbir57571.2023.10147628
- Related Report
  2022 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Universal DependenciesとBERT/RoBERTaモデルによる古典中国語情報処理 [Classical Chinese Information Processing with Universal Dependencies and BERT/RoBERTa Models] (in Korean) (2022)
- Author(s)
  安岡孝一
- Journal Title
  
  Journal of Applied Studies on Sinograph and Literary Sinitic
  
  Volume: 1 Pages: 127-163
- Related Report
  2022 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] 古典中国語の形態素解析と係り受け解析 [Morphological Analysis and Dependency Parsing of Classical Chinese] (2022)
- Author(s)
  安岡孝一, 安岡素子
- Journal Title
  
  槿域漢文学会2022年秋季企画学術大会
  
  Volume: 2022 Pages: 171-183
- Related Report
  2022 Annual Research Report
- Open Access
[Journal Article] 画像とテキストの位置づけ [The Positioning of Images and Texts] (2022)
- Author(s)
  二階堂善弘
- Journal Title
  
  KU-ORCASが開くデジタル化時代の東アジア文化研究
  
  Volume: 2022 Pages: 123-130
- URL
  https://kansai-u.repo.nii.ac.jp/records/22571
- Related Report
  2022 Annual Research Report
- Open Access
[Journal Article] 古典中国語（漢文）Universal Dependenciesとその応用 [Universal Dependencies for Classical Chinese (Kanbun) and Its Applications] (2022)
- Author(s)
  安岡孝一, ウィッテルンクリスティアン, 守岡知彦, 池田巧, 山崎直樹, 二階堂善弘, 鈴木慎吾, 師茂樹, 藤田一乘
- Journal Title
  
  情報処理学会論文誌
  
  Volume: 63 Pages: 355-363
- NAID
  120007192875
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Transformersを用いた古典中国語(漢文)文切りモデルの製作 [Building a Sentence Segmentation Model for Classical Chinese (Kanbun) Using Transformers] (2021)
- Author(s)
  安岡孝一
- Journal Title
  
  人文科学とコンピュータシンポジウム「じんもんこん2021」論文集
  
  Volume: 2021 Pages: 104-109
- NAID
  120007174942
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] CHISEのWeb API化の試み、ついでに、RDF化四度目の正直？ [An Attempt at a Web API for CHISE and, While at It, RDF Conversion: Fourth Time Lucky?] (2021)
- Author(s)
  守岡知彦
- Journal Title
  
  東洋学へのコンピュータ利用
  
  Volume: 33 Pages: 69-87
- Related Report
  2021 Annual Research Report
- Open Access
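[Journal Article] TransformersのBERTは共通テスト『国語』を係り受け解析する夢を見るか [Does the BERT of Transformers Dream of Dependency-Parsing the Common Test "Japanese Language"?] (2021)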
- Author(s)
  安岡孝一
- Journal Title
  
  東洋学へのコンピュータ利用
  
  Volume: 33 Pages: 3-34
- NAID
  120006979744
- Related Report
  2020 Annual Research Report
- Open Access
[Journal Article] Kanripo X: A tagset for connecting digital texts (2021)
- Author(s)
  Christian Wittern
- Journal Title
  
  東洋学へのコンピュータ利用
  
  Volume: 33 Pages: 35-67
- Related Report
  2020 Annual Research Report
- Open Access
[Journal Article] Universal Dependenciesにもとづく多言語係り受け可視化ツールdeplacy [deplacy: A Multilingual Dependency Visualization Tool Based on Universal Dependencies] (2020)
- Author(s)
  安岡孝一
- Journal Title
  
  人文科学とコンピュータシンポジウム「じんもんこん2020」論文集
  
  Volume: 2020 Pages: 95-100
- NAID
  170000183904
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Viewpoints on the Structural Description of Chinese Characters (2020)
- Author(s)
  Tomohiko Morioka
- Journal Title
  
  Grapholinguistics in the 21st Century―2020
  
  Volume: Part II Pages: 683-712
- DOI
  10.36824/2020-graf-mori
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Open Access
[Presentation] 古典中国語の形態素解析と係り受け解析 [Morphological Analysis and Dependency Parsing of Classical Chinese] (2022)
- Author(s)
  安岡孝一
- Organizer
  槿域漢文学会2022年秋季企画学術大会
- Related Report
  2022 Annual Research Report
- Int'l Joint Research / Invited
[Presentation] 漢字・漢語・漢文の言語情報処理 [Linguistic Information Processing of Chinese Characters, Chinese Words, and Classical Chinese] (2021)
- Author(s)
  安岡孝一
- Organizer
  日本ソフトウェア科学会第38回大会
- Related Report
  2021 Annual Research Report
- Invited
[Presentation] 世界のUniversal Dependenciesと係り受け解析ツール群 [Universal Dependencies Around the World and Dependency-Parsing Tools] (2021)
- Author(s)
  安岡孝一
- Organizer
  第3回Universal Dependencies公開研究会
- Related Report
  2021 Annual Research Report
- Invited
[Remarks] 「古典中国語のコーパスの研究」共同研究班ログ [Log of the Joint Research Group "Research on Classical Chinese Corpora"]
- URL
  http://kanji.zinbun.kyoto-u.ac.jp/~yasuoka/kyodokenkyu/archive2023.html
- Related Report
  2022 Annual Research Report
[Remarks] 「古典中国語のコーパスの研究」共同研究班ログ [Log of the Joint Research Group "Research on Classical Chinese Corpora"]
- URL
  http://kanji.zinbun.kyoto-u.ac.jp/~yasuoka/kyodokenkyu/
- Related Report
  2021 Annual Research Report, 2020 Annual Research Report
