A Study of Stylistic Change in Japanese Based on Data Science and Modeling of its Structure
Project/Area Number |
18K00627
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 02070: Japanese linguistics-related
|
Research Institution | Doshisha University |
Principal Investigator |
KIN Meitetsu Doshisha University, Faculty of Culture and Information Science, Professor (60275469)
|
Co-Investigator (Kenkyū-buntansha) |
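YAMAZAKI Makoto (山崎 誠) National Institutes for the Humanities, National Institute for Japanese Language and Linguistics, Language Change Division, Professor (30182489)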
|
Project Period (FY) |
2018-04-01 – 2021-03-31
|
Project Status |
Completed (Fiscal Year 2020)
|
Budget Amount |
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2020: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2019: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2018: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
|
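Keywords | change in style / modeling / text mining / particles / sentence-final patterns / stylistic change / corpus construction / regularized regression / random forest / corpus / text analysis / stylometry / style / data science / stylistic analysis |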
Outline of Final Research Achievements |
In this study, we first built a corpus of 592 novels (9,557,078 characters) by 592 authors, sampling works by five to six representative authors per year from a large body of novels spanning more than 100 years. Next, we performed morphological and syntactic analysis on the corpus to extract stylistic features. We used unsupervised methods to obtain an overview of these features, and then supervised learning methods to identify and model the variables that changed significantly over time. The results show that the use of particles and sentence-final patterns increased or decreased markedly over time. In addition, we attempted to interpret these changes from the perspectives of linguistics and stylistics.
|
Academic Significance and Societal Importance of the Research Achievements |
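In this study, we used data science methods such as machine learning and modeling to identify the elements of change over time in the style and language of modern Japanese writing, and at the same time attempted to investigate the factors underlying these phenomena from the perspectives of sociology, stylistics, and linguistics. The results not only provide useful scholarly information for research on Japanese style and linguistics, but also demonstrate the effectiveness of applying data science methods to research in the humanities and social sciences in contemporary society.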
|
Report
(4 results)
Research Products
(75 results)