
Establishment of Automatic Word Segmentation Technology from Large-scale Text Data Independent of Language

Research Project

Project/Area Number 16K01267
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Social systems engineering/Safety system
Research Institution Shonan Institute of Technology

Principal Investigator

Suzuki Makoto  Shonan Institute of Technology, Faculty of Engineering, Professor (80339796)

Co-Investigator (Kenkyū-buntansha) Mikawa Kenta  Shonan Institute of Technology, Faculty of Engineering, Associate Professor (40707733)
Project Period (FY) 2016-04-01 – 2020-03-31
Project Status Completed (Fiscal Year 2019)
Budget Amount
¥3,250,000 (Direct Cost: ¥2,500,000, Indirect Cost: ¥750,000)
Fiscal Year 2018: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2017: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2016: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
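Keywords Multilingual processing / Sentiment polarity dictionary / Text mining / N-gram / Word extraction / Word segmentation / Automatic extraction / Automatic segmentation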
Outline of Final Research Achievements

In this research, we developed a word segmentation technique that processes Unicode text containing a mixture of languages with a single program. The technique is a language-independent word segmentation method based on a simple state-transition model, and it requires no dictionary or grammatical knowledge for any individual language. The research proceeded mainly in two directions: (1) extending the range of languages processed and (2) extending the range of applications. Regarding (1), we confirmed that the method is effective not only for Japanese but also for classical Japanese texts and for foreign languages such as English, Chinese, and Korean. Regarding (2), we proposed a method for automatically creating a sentiment polarity dictionary from user reviews of products and facilities.
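To illustrate what dictionary-free segmentation driven by state transitions can look like, the following is a minimal sketch. It is not the project's model, whose details are not given in this outline. As an assumption made only for illustration, each character is mapped to a coarse state derived from its Unicode character name, and a token boundary is emitted whenever the state changes. The names `char_state` and `segment` are hypothetical.

```python
import unicodedata

# Assumption: coarse "states" taken from Unicode character names.
# This is an illustrative stand-in, not the state set used in the project.
SCRIPTS = ("HIRAGANA", "KATAKANA", "CJK", "HANGUL", "LATIN")


def char_state(ch: str) -> str:
    """Map one character to a coarse state using only Unicode metadata."""
    if ch.isspace():
        return "SPACE"
    name = unicodedata.name(ch, "")
    for script in SCRIPTS:
        if name.startswith(script):
            return script
    if ch.isdigit():
        return "DIGIT"
    return "OTHER"


def segment(text: str) -> list[str]:
    """Split text wherever the character state changes; drop whitespace."""
    tokens: list[str] = []
    current = ""
    prev_state = None
    for ch in text:
        state = char_state(ch)
        # A state transition closes the token collected so far.
        if state != prev_state and current:
            tokens.append(current)
            current = ""
        if state != "SPACE":
            current += ch
        prev_state = state
    if current:
        tokens.append(current)
    return tokens


if __name__ == "__main__":
    print(segment("word 단어 単語"))
```

The same function handles Latin, Hangul, and kanji input without any per-language resource, which is the property the outline emphasizes. A rule this crude over-segments inflected Japanese words that mix kanji and hiragana, so it should be read only as a picture of the general idea.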

Academic Significance and Societal Importance of the Research Achievements

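In this research, we proposed a method for automatically creating a sentiment polarity dictionary from review data on a given target. A sentiment polarity dictionary assigns a polarity value to each word, based on the idea that text contains words carrying a characteristic polarity (positive or negative). Here we used user reviews of products and facilities (text data accompanied by five-point ratings) and proposed a method that creates the dictionary automatically by computing sentiment polarity values from those ratings. This suggests that a computer could automatically collect user reviews and build a sentiment polarity dictionary specialized for a particular product or facility.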
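The idea of deriving polarity values from five-point ratings can be sketched as follows. The scoring formula is an assumption made for illustration, since the summary does not state the one actually used: each rating r is mapped to (r - 3) / 2, so that 1 becomes -1 and 5 becomes +1, and a word's polarity is the mean of that score over the reviews containing it. The function name `build_polarity_dictionary`, the `min_count` threshold, and the whitespace tokenization are likewise hypothetical.

```python
from collections import defaultdict


def build_polarity_dictionary(reviews, min_count=2):
    """Build {word: polarity in [-1, 1]} from (rating, text) pairs.

    Assumption: polarity is the mean of (rating - 3) / 2 over the reviews
    that contain the word. This is an illustrative formula only.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rating, text in reviews:
        score = (rating - 3) / 2.0  # 1 -> -1.0, 3 -> 0.0, 5 -> +1.0
        # Count each word once per review; whitespace splitting stands in
        # for a real word segmenter.
        for word in set(text.lower().split()):
            totals[word] += score
            counts[word] += 1
    # Keep only words seen in at least min_count reviews.
    return {w: totals[w] / counts[w] for w in totals if counts[w] >= min_count}


if __name__ == "__main__":
    sample = [
        (5, "great course"),
        (4, "great greens"),
        (1, "slow course"),
        (2, "slow play"),
    ]
    for word, value in sorted(build_polarity_dictionary(sample).items()):
        print(word, value)
```

Because the dictionary is computed from whichever reviews are supplied, feeding it reviews for a single product or facility yields a dictionary specific to that target.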

Report

(5 results)
  • 2019 Annual Research Report   Final Research Report (PDF)
  • 2018 Research-status Report
  • 2017 Research-status Report
  • 2016 Research-status Report
  • Research Products

    (38 results)


Journal Article (15 results) (of which Int'l Joint Research: 1 result, Peer Reviewed: 15 results, Open Access: 10 results, Acknowledgement Compliant: 1 result) Presentation (23 results) (of which Int'l Joint Research: 11 results)

  • [Journal Article] A Study on an Analytical Model of Consumer Purchasing Behavior Focusing on Periodicity and Event Effects (2019)

    • Author(s)
      安井一貴,中野修平,三川健太,後藤正幸
    • Journal Title

      経営情報学会誌

      Volume: Vol.28, No.2 Pages: 69-87

    • NAID

      40022046971

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] An Analytic Model to Represent Relation between Finish Date of Job-Hunting and Time-Series Variation of Entry Tendencies (2019)

    • Author(s)
      S. Nagamori, K. Mikawa, M. Goto, and T. Ogihara
    • Journal Title

      Industrial Engineering & Management Systems

      Volume: Vol.18, No.3 Pages: 292-304

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A Study on a Sales Price Prediction Model for Second-hand Fashion Items Based on Sales History Data (2019)

    • Author(s)
      仁ノ平 将人,三川健太,後藤正幸
    • Journal Title

      情報処理学会論文誌

      Volume: Vol.60, No.4 Pages: 1151-1161

    • NAID

      170000150291

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
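  • [Journal Article] Quantification of Perceived Temperature through Integrated Analysis of Weather Information and Tweet Data, and Its Application to Demand Forecasting (2018)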

    • Author(s)
      馬賀嵩士, 三川健太,後藤正幸,吉開朋弘
    • Journal Title

      電子情報通信学会論文誌D

      Volume: Vol. J101-D, No. 7 Pages: 1037-1051

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] A Recommender System Based on a Serendipity Index That Balances Estimated Purchase Probability and Predicted Rating (2018)

    • Author(s)
      関口あゆみ, 仁ノ平 将人,三川健太,後藤正幸
    • Journal Title

      経営情報学会誌

      Volume: Vol.27, No.1 Pages: 67-78

    • NAID

      40021645389

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] A Study on Analyzing the Effects of Implemented Measures on an E-commerce Site Based on a Markov Latent Class Model (2017)

    • Author(s)
      松嵜祐樹, 三川健太,後藤正幸
    • Journal Title

      情報処理学会論文誌

      Volume: Vol.58, No.12 Pages: 2034-2045

    • NAID

      170000149126

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Adaptive Prediction Method Based on Alternating Decision Forests with Considerations for Generalization Ability (2017)

    • Author(s)
      S. Misawa, K. Mikawa, and M. Goto
    • Journal Title

      Industrial Engineering & Management Systems

      Volume: Vol.16, No.3 Pages: 384-391

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Multi-Valued Classification of Text Data Based on an ECOC Approach Using a Ternary Orthogonal Table (2017)

    • Author(s)
      L. Suzuki, K. Mikawa, and M. Goto
    • Journal Title

      Industrial Engineering & Management Systems

      Volume: Vol.16, No.2 Pages: 155-164

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] A Proposal for Classification of Document Data with Unobserved Categories Considering Latent Topics (2017)

    • Author(s)
      Y. Yamamoto, K. Mikawa, and M. Goto
    • Journal Title

      Industrial Engineering & Management Systems

      Volume: Vol.16, No.2 Pages: 165-174

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] An Efficient Learning Method for Distributed SVM Based on Data Transfer Control (2017)

    • Author(s)
      湯川 輝一朗, 三川健太,後藤正幸
    • Journal Title

      日本経営工学会論文誌

      Volume: Vol.68, No.2 Pages: 86-98

    • NAID

      130005991018

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] A Proposal of a Purchase Prediction Method Using an Aspect Model That Jointly Represents Browsing and Purchasing Behavior (2017)

    • Author(s)
      藤原直広, 三川健太,後藤正幸
    • Journal Title

      経営情報学会誌

      Volume: Vol.26, No.1 Pages: 1-16

    • NAID

      40021242891

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Construction of an Analytical Model of the End Date of Job Hunting Based on Stratified Trees and Mixed Weibull Distributions (2017)

    • Author(s)
      早川真央, 三川健太,荻原大陸,後藤正幸
    • Journal Title

      情報処理学会論文誌

      Volume: Vol.58, No.5 Pages: 1189-1206

    • NAID

      170000148583

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Data pair selection for accurate classification based on information-theoretic metric learning (2017)

    • Author(s)
      T. Maga, K. Mikawa, and M. Goto
    • Journal Title

      Asian J. Management Science and Applications

      Volume: Vol.3, No.1 Pages: 61-74

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Language-Independent Word Acquisition Method Using a State-Transition Model (2016)

    • Author(s)
      Bin Xu, Naohide Yamagishi, Makoto Suzuki, Masayuki Goto
    • Journal Title

      Industrial Engineering & Management Systems

      Volume: Vol.15, No.3 Pages: 197-207

    • DOI

      10.7232/iems.2016.15.3.224

    • Related Report
      2016 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] A Study on a Distance Metric Learning Method Using a Different Metric Matrix for Each Category (2016)

    • Author(s)
      三川健太,後藤正幸
    • Journal Title

      日本経営工学会論文誌

      Volume: Vol.66, No.4 Pages: 335-347

    • NAID

      130005127091

    • Related Report
      2016 Research-status Report
    • Peer Reviewed / Open Access / Acknowledgement Compliant
  • [Presentation] Analysis of Review Data on Educational Toys (2020)

    • Author(s)
      M.Suzuki, T.Onuma, N.Katsumata and N.Yamagishi
    • Organizer
      International Conference on Education
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Creating a Sentiment Polarity Dictionary Using Review Data on Golf Courses (2019)

    • Author(s)
      勝間田 昇, 山岸 直秀, 鈴木 誠
    • Organizer
      経営情報学会春季全国研究発表大会
    • Related Report
      2019 Annual Research Report
  • [Presentation] Analysis of Review Data on Educational Toys (2019)

    • Author(s)
      小沼 拓也, 山岸 直秀, 鈴木 誠
    • Organizer
      経営情報学会春季全国研究発表大会
    • Related Report
      2019 Annual Research Report
  • [Presentation] Factorization Machine with Mixed Norm Regularization using ADMM (2019)

    • Author(s)
      K. Mikawa
    • Organizer
      20th Asia Pacific Industrial Engineering and Management System (APIEMS2019)
    • Related Report
      2019 Annual Research Report
  • [Presentation] An Estimation Model of Open Price for Second-hand Fashion Items Based on Sales History Data (2019)

    • Author(s)
      Kuwata, T. Sugisaki, K. Mikawa, and M. Goto
    • Organizer
      Proc. 17th Asian Network for Quality Congress (ANQ2019)
    • Related Report
      2019 Annual Research Report
  • [Presentation] An Extension of Semi-Supervised Boosting to Multiclass Classification (2019)

    • Author(s)
      Y. Sakai, K. Yasui, K. Mikawa, and M. Goto
    • Organizer
      Proc. 17th Asian Network for Quality Congress (ANQ2019)
    • Related Report
      2019 Annual Research Report
  • [Presentation] An Analytical Model of Consumers Purchasing Behavior Considering the Variety of Products (2019)

    • Author(s)
      K. Yasui, K. Mikawa, and M. Goto
    • Organizer
      Proc. 17th Asian Network for Quality Congress (ANQ2019)
    • Related Report
      2019 Annual Research Report
  • [Presentation] Factorization Machines Considering the Latent Characteristics Behind Target Data (2019)

    • Author(s)
      T. Sugisaki, K. Mikawa, and M. Goto
    • Organizer
      Proc. 17th Asian Network for Quality Congress (ANQ2019)
    • Related Report
      2019 Annual Research Report
  • [Presentation] A Study of the Application of Canonical Correlation Forests to Text Classification (2018)

    • Author(s)
      Shuhei Nakano, Kenta Mikawa, and Masayuki Goto
    • Organizer
      The 19th Asia Pacific Industrial Engineering and Management Systems (APIEMS2018)
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Proposal for an l1 regularized Factorization Machine (2018)

    • Author(s)
      Kenta Mikawa, Manabu Kobayashi, Masayuki Goto, and Shigeichi Hirasawa
    • Organizer
      The 19th Asia Pacific Industrial Engineering and Management Systems (APIEMS2018)
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] A New Entry Behavior Model of Student Users on Job Board for New Graduates Considering the Interaction between Features (2018)

    • Author(s)
      Tomoya Sugisaki, Yuri Nishio, Kenta Mikawa, Masayuki Goto, and Takashi Sakurai
    • Organizer
      16th Asian Network for Quality Congress
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] A Study on an Analytical Model of Consumer Purchasing Behavior Focusing on Periodicity and Event Effects (2018)

    • Author(s)
      安井一貴,中野修平,三川健太,後藤正幸
    • Organizer
      日本経営工学会平成30年度春季研究大会予稿集
    • Related Report
      2018 Research-status Report
  • [Presentation] A Study on an Analytical Model of Student Users' Company Entry Behavior Considering Interactions between Features (2018)

    • Author(s)
      杉崎智哉,西尾友里,三川健太,後藤正幸,桜井 崇
    • Organizer
      日本経営工学会平成30年度春季研究大会予稿集
    • Related Report
      2018 Research-status Report
  • [Presentation] A Study on Factorization Machines Based on l1 Regularization (2018)

    • Author(s)
      三川健太,小林 学,後藤正幸,平澤茂一
    • Organizer
      日本経営工学会平成30年度春季研究大会予稿集
    • Related Report
      2018 Research-status Report
  • [Presentation] A Study of Automatic Construction of Structures by DQN Using Minecraft (2018)

    • Author(s)
      畠山一輝,三川健太,小林学
    • Organizer
      2018年電子情報通信学会ソサイエティ大会予稿集
    • Related Report
      2018 Research-status Report
  • [Presentation] A Study on a Classification Method for Canonical Correlation Forests Considering the Sparsity of the Label Matrix (2018)

    • Author(s)
      中野修平,三川健太,後藤正幸
    • Organizer
      情報処理学会第121回数理モデル化と問題解決研究発表会
    • Related Report
      2018 Research-status Report
  • [Presentation] Characteristics of a Word Segmentation Method Based on a State-transition Model (2017)

    • Author(s)
      M.Suzuki, N.Yamagishi, K.Mikawa and M.Goto
    • Organizer
      Proc. of Asia Pacific Industrial Engineering and Management Systems Conference (APIEMS2017)
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
  • [Presentation] Distance Metric Learning using Each Category Centroid with Nuclear Norm Regularization (2017)

    • Author(s)
      K. Mikawa, M. Kobayashi, M. Goto, and S. Hirasawa
    • Organizer
      The 2017 IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2017)
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
  • [Presentation] Word Acquisition of Japanese Classical Literature Using State Transition Model (2016)

    • Author(s)
      Makoto Suzuki, Bin Xu, Naohide Yamagishi, Masayuki Goto
    • Organizer
      Asia Pacific Industrial Engineering and Management Systems Conference (APIEMS2016)
    • Place of Presentation
      Taipei, Taiwan
    • Year and Date
      2016-12-07
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
  • [Presentation] A Study on Distance Metric Learning using Distance Structure among Category Centroids (2016)

    • Author(s)
      Kenta Mikawa, Manabu Kobayashi, Masayuki Goto, Shigeichi Hirasawa
    • Organizer
      Asia Pacific Industrial Engineering and Management Systems Conference (APIEMS2016)
    • Place of Presentation
      Taipei, Taiwan
    • Year and Date
      2016-12-07
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
  • [Presentation] A proposal of document recommendation based on topic model (2016)

    • Author(s)
      Yusei Yamamoto, Kenta Mikawa, Masayuki Goto
    • Organizer
      Asia Pacific Industrial Engineering and Management Systems Conference (APIEMS2016)
    • Place of Presentation
      Taipei, Taiwan
    • Year and Date
      2016-12-07
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
  • [Presentation] Modeling customer purchase behavior based on page transitions by latent class model for customer segmentation (2016)

    • Author(s)
      Yuki Matsuzaki, Kenta Mikawa, Masayuki Goto
    • Organizer
      Asia Pacific Industrial Engineering and Management Systems Conference (APIEMS2016)
    • Place of Presentation
      Taipei, Taiwan
    • Year and Date
      2016-12-07
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
  • [Presentation] A study of extended RFM analysis based on PLSA model for Purchase History Data (2016)

    • Author(s)
      Qian Zhang, Haruka Yamashita, Kenta Mikawa, Masayuki Goto
    • Organizer
      Asia Pacific Industrial Engineering and Management Systems Conference (APIEMS2016)
    • Place of Presentation
      Taipei, Taiwan
    • Year and Date
      2016-12-07
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research


Published: 2016-04-21   Modified: 2021-02-19  
