A Study of Stylistic Change in Japanese Based on Data Science and Modeling of its Structure

Research Project

Project/Area Number 18K00627
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 02070: Japanese linguistics-related
Research Institution Doshisha University

Principal Investigator

KIN Meitetsu  Doshisha University, Faculty of Culture and Information Science, Professor (60275469)

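Co-Investigator (Kenkyū-buntansha) YAMAZAKI Makoto (山崎 誠)  National Institutes for the Humanities, National Institute for Japanese Language and Linguistics, Language Change Division, Professor (30182489)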
Project Period (FY) 2018-04-01 – 2021-03-31
Project Status Completed (Fiscal Year 2020)
Budget Amount
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2020: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2019: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2018: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
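Keywords stylistic change / modeling / text mining / particles / sentence-final patterns / style change / corpus construction / regularized regression / random forest / corpus / text analysis / stylometry / style / data science / stylistic analysis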
Outline of Final Research Achievements

In this study, we first built a corpus of 592 novels (9,557,078 characters) by 592 authors, sampling the works of five to six representative authors per year from a large collection of novels spanning more than 100 years. We then ran morphological and syntactic analysis on the corpus to extract stylistic features. Unsupervised methods were used first to give an overview of these features, followed by supervised learning methods to identify and model the variables that changed significantly over time. The results show marked increases or decreases in the use of particles and sentence-final patterns over time. We also attempted to interpret these changes from the perspectives of linguistics and stylistics.
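
The step of finding features that rise or fall over time can be illustrated with a minimal sketch. It is not the project's actual method: the keywords name regularized regression and random forests, whereas this sketch substitutes a plain least-squares slope per feature to stay dependency-free. It also assumes the texts are already segmented into tokens (the morphological analysis step is not shown). The function names, the feature list, and the three-text toy corpus are all invented for illustration.

```python
from collections import Counter


def relative_frequencies(tokens, features):
    """Return each feature's share of all tokens in one text."""
    counts = Counter(tokens)
    total = len(tokens)
    return {f: counts[f] / total for f in features}


def trend_slope(years, values):
    """Ordinary least-squares slope of values regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


def rank_by_change(corpus, features):
    """Rank features by absolute yearly trend, largest first.

    corpus is a list of (year, tokens) pairs.
    """
    years = [year for year, _ in corpus]
    freqs = [relative_frequencies(tokens, features) for _, tokens in corpus]
    slopes = {f: trend_slope(years, [fr[f] for fr in freqs]) for f in features}
    return sorted(slopes.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical features: two particles and one sentence-final form.
FEATURES = ["は", "が", "なり"]

# Invented toy corpus of pre-tokenized texts, ten tokens each.
TOY_CORPUS = [
    (1920, ["なり"] * 3 + ["は"] + ["本"] * 6),
    (1960, ["なり"] * 2 + ["は"] + ["が"] + ["本"] * 6),
    (2000, ["なり"] + ["は"] + ["が"] * 2 + ["本"] * 6),
]

if __name__ == "__main__":
    for feature, slope in rank_by_change(TOY_CORPUS, FEATURES):
        print(feature, slope)
```

A negative slope marks a feature whose relative frequency declines over the years and a positive slope one that grows. A real analysis would need many texts per period and a model that handles correlated features, which is where regularized regression or a random forest would come in.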

Academic Significance and Societal Importance of the Research Achievements

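This study used data science methods such as machine learning and modeling to identify the elements of diachronic change in style and language in modern Japanese writing, and at the same time sought to uncover the factors behind these phenomena from the perspectives of sociology, stylistics, and linguistics. The results not only provide useful scholarly information for research on Japanese style and linguistics, but also demonstrate the effectiveness of applying data science methods to humanities and social science research in contemporary society.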

Report

(4 results)
  • 2020 Annual Research Report   Final Research Report ( PDF )
  • 2019 Research-status Report
  • 2018 Research-status Report

Research Products

(75 results)

Journal Article (23 results) (of which Int'l Joint Research: 9 results, Peer Reviewed: 22 results, Open Access: 2 results) / Presentation (44 results) (of which Int'l Joint Research: 15 results) / Book (7 results) / Remarks (1 result)

  • [Journal Article] 『明暗』と『続明暗』のトピック変化の計量分析2021

    • Author(s)
      李 広微, 金 明哲
    • Journal Title

      計量国語学

      Volume: 38 Pages: 469-505

    • NAID

      40022524929

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed
  • [Journal Article] 日本語における機能フレーズを特徴量とした著者識別2020

    • Author(s)
      黄 善玉, 金 明哲
    • Journal Title

      情報知識学会誌

      Volume: 30 Pages: 390-400

    • NAID

      130007937007

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed
  • [Journal Article] 菊池寛「受難華」の代筆問題の研究2020

    • Author(s)
      柳 燁佳, 金 明哲
    • Journal Title

      データ分析の理論と応用

      Volume: 9 Pages: 1-11

    • NAID

      130007923307

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed
  • [Journal Article] テキストコーパスマイニングツールMTMineR2020

    • Author(s)
      金 明哲, 鄭 弯弯
    • Journal Title

      計量国語学

      Volume: 32 Pages: 265-276

    • NAID

      130008054298

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed
  • [Journal Article] The Effects of Class Imbalance and Training Data Size on Classifier Learning: An Empirical Study2020

    • Author(s)
      Zheng Wanwan, Jin Mingzhe
    • Journal Title

      SN Computer Science

      Volume: 1 Issue: 2 Pages: 1-13

    • DOI

      10.1007/s42979-020-0074-0

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Classification Analysis of Kouji Uno’s Novels Using Topic Model2020

    • Author(s)
      Xueqin Liu, Mingzhe Jin
    • Journal Title

      Behaviormetrika

      Volume: 47 Pages: 189-212

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] コーパスを用いた仮定形音融合使用に関する計量的研究2020

    • Author(s)
      入江 さやか, 金 明哲
    • Journal Title

      国立国語研究所論集

      Volume: 18 Pages: 1-16

    • NAID

      120006777749

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Comparing Multiple Categories of Feature Selection Methods for Text Classification2020

    • Author(s)
      Wanwan Zheng, Mingzhe Jin
    • Journal Title

      Digital Scholarship in the Humanities

      Volume: 35 Pages: 208-224

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Classification analysis of Kouji Uno’s novels using topic model2020

    • Author(s)
      X. Liu, M. Jin
    • Journal Title

      Behaviormetrika

      Volume: 47 Pages: 189-212

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] コーパスを用いた仮定形音融合使用に関する計量的研究2020

    • Author(s)
      入江 さやか, 金 明哲
    • Journal Title

      国立国語研究所論集/NINJAL Research Papers

      Volume: 18 Pages: 1-16

    • NAID

      120006777749

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] データ処理のためのプログラミング言語[Ⅲ]-R言語編-(Enjoy Data Processing[Ⅲ]: R Language)2019

    • Author(s)
      金 明哲
    • Journal Title

      電子情報通信学会誌

      Volume: 102(8) Pages: 822-828

    • Related Report
      2019 Research-status Report
  • [Journal Article] 統計解析からみた小説『続明暗』の文体模倣2019

    • Author(s)
      李 広微, 金 明哲
    • Journal Title

      計量国語学

      Volume: 32(1) Pages: 19-32

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] 方言録音文字化資料における拍bigramから見た方言分類―岐阜・愛知方言の所属は東か西か―2019

    • Author(s)
      入江 さやか, 金 明哲
    • Journal Title

      計量国語学

      Volume: 32(1) Pages: 1-18

    • NAID

      130007857968

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] 文体の個人差と個人内恒常性の検証―階層的ベイズモデルによる学術論文の比較―2019

    • Author(s)
      財津 亘, 金 明哲
    • Journal Title

      行動計量学

      Volume: 46(2) Pages: 87-95

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Comparing multiple categories of feature selection methods for text classification2019

    • Author(s)
      W. Zheng and M. Jin
    • Journal Title

      Digital Scholarship in the Humanities

      Volume: 35(1) Pages: 208-224

    • DOI

      10.1093/llc/fqz003

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] トピックモデルによる関西私鉄沿線の特徴分析2019

    • Author(s)
      前田 侑亮, 金 明哲
    • Journal Title

      情報知識学会誌

      Volume: 29(1) Pages: 3-22

    • NAID

      130007612494

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] A comparative study of feature selection methods2018

    • Author(s)
      W. Zheng and M. Jin
    • Journal Title

      International Journal on Natural Language Computing

      Volume: 7(5) Pages: 1-9

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] 性別を偽装した文章における文体的特徴変化2018

    • Author(s)
      財津 亘・金 明哲
    • Journal Title

      同志社大学ハリス理化学研究報告

      Volume: 59(3) Pages: 47-54

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] パソコン遠隔操作事件で調著者識別による犯人性立証は可能だったか?2018

    • Author(s)
      財津 亘・金 明哲
    • Journal Title

      情報知識学会誌

      Volume: 28(3) Pages: 2530258-2530258

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] 文末語の使用率に基づいた筆者識別―探索的多変量解析の実施と分析結果に対するスコアリングによる検討―2018

    • Author(s)
      財津 亘・金 明哲
    • Journal Title

      計量国語学

      Volume: 31(6) Pages: 417-425

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] 機械学習を用いた著者の年齢層推定―犯罪者プロファイリング実現に向けて―2018

    • Author(s)
      財津 亘・金 明哲
    • Journal Title

      同志社大学ハリス理化学研究報告

      Volume: 59(2) Pages: 57-65

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] テキストマイニングによる筆者識別の正確性ならびに判定手続きの標準化2018

    • Author(s)
      財津 亘・金 明哲
    • Journal Title

      行動計量学

      Volume: 45(1) Pages: 39-47

    • NAID

      130007504350

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] 川端康成小説『花日記』の代筆疑惑検証2018

    • Author(s)
      孫 昊, 金 明哲
    • Journal Title

      情報知識学会誌

      Volume: 28(1) Pages: 3-14

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Presentation] The Effectiveness of Maximal Information Coefficient in Real-world Classification Tasks2020

    • Author(s)
      Y. Chen, W. Zheng and M. Jin
    • Organizer
      日本分類学会第39回大会
    • Related Report
      2020 Annual Research Report
  • [Presentation] 文字起こしデータを用いた話者識別2020

    • Author(s)
      柳 燁佳, 金 明哲
    • Organizer
      日本分類学会第39回大会
    • Related Report
      2020 Annual Research Report
  • [Presentation] 構造的トピックモデルによる近現代小説の文体変化の考察2020

    • Author(s)
      李 広微, 金 明哲
    • Organizer
      計量国語学会第64回大会
    • Related Report
      2020 Annual Research Report
  • [Presentation] A Fast Class Noise Detector with Multi-factor-based Learning2020

    • Author(s)
      W. Zheng, M. Jin
    • Organizer
      2020年度統計関連学会連合大会
    • Related Report
      2020 Annual Research Report
  • [Presentation] 構造トピックモデルを用いた文体変化の経時的分析2020

    • Author(s)
      劉 雪琴, 金 明哲
    • Organizer
      2020年度統計関連学会連合大会
    • Related Report
      2020 Annual Research Report
  • [Presentation] 異なるジャンルの文章が教材する場合における著者識別分析2020

    • Author(s)
      柳 燁佳, 金 明哲
    • Organizer
      第48回日本行動計量学会
    • Related Report
      2020 Annual Research Report
  • [Presentation] テキストマイニングによる企業倒産分析2020

    • Author(s)
      許 麗夢, 金 明哲
    • Organizer
      第48回日本行動計量学会
    • Related Report
      2020 Annual Research Report
  • [Presentation] 想起されたフレーズの長さから読み解く宇野浩二の文体変化2020

    • Author(s)
      劉 雪琴, 金 明哲
    • Organizer
      第48回日本行動計量学会
    • Related Report
      2020 Annual Research Report
  • [Presentation] トピックモデルに基づいた現代小説の接続表現の分析2020

    • Author(s)
      李 広微, 金 明哲
    • Organizer
      第48回日本行動計量学会
    • Related Report
      2020 Annual Research Report
  • [Presentation] 構文情報に基づく中国語文章の著者識別.2020

    • Author(s)
      李 芸萱, 金 明哲
    • Organizer
      第48回日本行動計量学会
    • Related Report
      2020 Annual Research Report
  • [Presentation] 茶杓造形の計量分析―薮内家の茶杓系統について2020

    • Author(s)
      耕三寺 華蓮, 金 明哲
    • Organizer
      第48回日本行動計量学会
    • Related Report
      2020 Annual Research Report
  • [Presentation] Effects of Training Data Size and Class Imbalance on the Performance of Classifiers2019

    • Author(s)
      W. Zheng, M. Jin
    • Organizer
      Artificial Intelligence and Natural Language (8th Conference, AINL 2019)
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] FTA: a novel feature training approach for classification2019

    • Author(s)
      W. Zheng, M. Jin
    • Organizer
      Proceedings of 33rd Pacific Asia Conference on Language,
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Ghostwriting Analysis Using Outlier Detection methods2019

    • Author(s)
      H. Sun, M. Jin
    • Organizer
      Language and literature 2020
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Diachronic changes of sentence-final expression in modern Japanese novels2019

    • Author(s)
      G. Li and M. Jin
    • Organizer
      International Islamic University Malaysia,
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] Improving the performance of Japanese authorship attribution with phonetic related information2019

    • Author(s)
      H. Sun, M. Jin
    • Organizer
      16th Conference of the International Federation of Classification Societies
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
  • [Presentation] 過去百年間における小説の文体変容についての定量的分析2019

    • Author(s)
      李 広微, 金 明哲
    • Organizer
      第47回日本行動計量学会
    • Related Report
      2019 Research-status Report
  • [Presentation] コーパスを用いた仮定形における音韻融合使用と印象評定に関する研究2019

    • Author(s)
      入江さやか・金明哲
    • Organizer
      シンポジウム「日常会話コーパス」IV
    • Related Report
      2018 Research-status Report
  • [Presentation] Quantitative Analysis of Writing Style Problem in Yasunari Kawabata’s Novels.2018

    • Author(s)
      H. Sun and M. Jin
    • Organizer
      9th International Conference of Digital Archives and Digital Humanities
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Collaborative Writing of Yasunari Kawabata's Novel Otome no minato.2018

    • Author(s)
      H. Sun and M. Jin
    • Organizer
      Proceedings of International Quantitative Linguistics Conference (QUALICO).
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Ghostwriting problem of Yasunari Kawabata's Novel Soranokatakana.2018

    • Author(s)
      H. Sun and M. Jin
    • Organizer
      Digital Humanities Australia 2018
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Evaluate Lexical Richness Measures Using Coefficient of Variation and Relative Value2018

    • Author(s)
      W. Zheng and M. Jin
    • Organizer
      19th International Conference on Computational Linguistics and Intelligent Text Processing
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Comparing feature selection methods by using rank aggregation2018

    • Author(s)
      W. Zheng and M. Jin
    • Organizer
      16th IEEE International Conference on ICT and Knowledge Engineering
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Do we need more samples for text classification?2018

    • Author(s)
      W. Zheng and M. Jin
    • Organizer
      ACM Artificial Intelligence and Cloud Computing Conference
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Feature analysis of paintings using color information of the image2018

    • Author(s)
      R. Yukimura, H. Sun, M. Jin
    • Organizer
      Digital Humanities Austria 2018
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Classification of Osamu Dazai's works based on part-of-speech bigrams and usage of commas2018

    • Author(s)
      N. Oshiro, M. Jin, A. Kawase, H. Sun
    • Organizer
      Digital Humanities Austria 2018
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Epoch changes of stylistic features in modern Japanese novels2018

    • Author(s)
      G. Li and M. Jin
    • Organizer
      Digital Humanities Austria 2018
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Japanese Authorship Attribution Based on Sentence Pattern2018

    • Author(s)
      S. Huang and M. Jin
    • Organizer
      Digital Humanities Austria 2018
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] 太宰治の前期文体における芥川作品からの影響の有無について2018

    • Author(s)
      尾城 奈緒子, 金明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] 文末表現に着目した文学作品の分類.2018

    • Author(s)
      尾城 奈緒子, 金明哲
    • Organizer
      2018年度日本分類学会シンポジウム
    • Related Report
      2018 Research-status Report
  • [Presentation] 絵画作品における色彩的特徴の計量的比較分析2018

    • Author(s)
      行村 隆平, 金 明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] 絵画作品における色彩情報を用いた画家の識別2018

    • Author(s)
      行村 隆平, 金 明哲
    • Organizer
      2018年度日本分類学会シンポジウム
    • Related Report
      2018 Research-status Report
  • [Presentation] 判別分析による宇野浩二と同時代作家の比較分析2018

    • Author(s)
      劉 雪琴, 金 明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] トピックモデルに基づく宇野浩二の創作時期についての検討2018

    • Author(s)
      劉 雪琴, 金 明哲
    • Organizer
      2018年度日本分類学会シンポジウム
    • Related Report
      2018 Research-status Report
  • [Presentation] 宇野文学の計量分析ー同時代の作家との比較として2018

    • Author(s)
      劉 雪琴, 金 明哲
    • Organizer
      第32回日本計算機統計学会
    • Related Report
      2018 Research-status Report
  • [Presentation] 特徴選択方法の性能評価分析2018

    • Author(s)
      鄭 弯弯, 金 明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] 現代日本語小説の文体的特徴の変化について-大正・昭和の作品を中心として-2018

    • Author(s)
      李 広微, 金明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] 戦前・戦後の日本小説の分類とその特徴分析2018

    • Author(s)
      李 広微, 金明哲
    • Organizer
      2018年度日本分類学会シンポジウム
    • Related Report
      2018 Research-status Report
  • [Presentation] 方言録音文字化資料における拍bigramを用いたトピックモデルによる方言分類2018

    • Author(s)
      入江 さやか, 金 明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] 音素を文体特徴量とした日本語著者識別2018

    • Author(s)
      孫 昊, 金 明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] 著者識別における文型特徴量の有効性に関する比較分析2018

    • Author(s)
      黄 善玉, 柳 燁佳, 金 明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] 文型に基づいた著者識別2018

    • Author(s)
      黄 善玉,金 明哲
    • Organizer
      2018年度日本分類学会シンポジウム
    • Related Report
      2018 Research-status Report
  • [Presentation] 日本語文学作品の著者識別におけるfastTextの性能の比較分析2018

    • Author(s)
      柳 燁佳, 金 明哲
    • Organizer
      第46回日本行動計量学会年次大会
    • Related Report
      2018 Research-status Report
  • [Presentation] 複数特徴量を用いた菊池寛代作問題の分類分析2018

    • Author(s)
      柳 燁佳, 金 明哲
    • Organizer
      2018年度日本分類学会シンポジウム
    • Related Report
      2018 Research-status Report
  • [Book] テキストアナリティクスの基礎と実践2021

    • Author(s)
      金 明哲
    • Total Pages
      340
    • Publisher
      岩波書店
    • ISBN
      4000298968
    • Related Report
      2020 Annual Research Report
  • [Book] 文学と言語コーパスのマイニング2021

    • Author(s)
      金 明哲、中村 靖子、上阪 彩香、土山 玄、孫 昊、劉 雪琴、李 広微、入江 さやか
    • Total Pages
      248
    • Publisher
      岩波書店
    • ISBN
      4000299026
    • Related Report
      2020 Annual Research Report
  • [Book] 金融・経済分析のためのテキストマイニング2021

    • Author(s)
      和泉 潔、坂地 泰紀、松島 裕康
    • Total Pages
      172
    • Publisher
      岩波書店
    • ISBN
      4000299018
    • Related Report
      2020 Annual Research Report
  • [Book] テキストマイニングの基礎技術と応用2020

    • Author(s)
      那須川 哲哉、吉田 一星、宅間 大介、鈴木 祥子、村岡 雅康、小比田 涼介
    • Total Pages
      286
    • Publisher
      岩波書店
    • ISBN
      4000298976
    • Related Report
      2020 Annual Research Report
  • [Book] 文化情報学事典2019

    • Author(s)
      村上征勝監修・金明哲・他 編
    • Total Pages
      832
    • Publisher
      勉誠出版
    • Related Report
      2019 Research-status Report
  • [Book] テキストアナリティクス2018

    • Author(s)
      金 明哲
    • Total Pages
      210
    • Publisher
      共立出版
    • ISBN
      9784320112612
    • Related Report
      2018 Research-status Report
  • [Book] 犯罪捜査のためのテキストマイニング2018

    • Author(s)
      金 明哲 監修、財津 亘 著
    • Total Pages
      223
    • Publisher
      共立出版
    • ISBN
      9784320124424
    • Related Report
      2018 Research-status Report
  • [Remarks] テキストマイニング2018

    • URL

      https://www1.doshisha.ac.jp/~mjin/lab/TM2018.html

    • Related Report
      2018 Research-status Report

Published: 2018-04-23   Modified: 2022-01-27  