
A STUDY OF RESOLVING LONG TAIL PHENOMENA BY MACHINE LEARNING

Research Project

Project/Area Number 21240011
Research Category

Grant-in-Aid for Scientific Research (A)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution The University of Tokyo

Principal Investigator

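NAKAGAWA Hiroshi  The University of Tokyo, Information Technology Center, Professor (20134893)

Co-Investigator(Kenkyū-buntansha) YOSHIDA Minoru  The University of Tokyo, Information Technology Center, Assistant Professor (40361688)
KIYOTA Youji  The University of Tokyo, Information Technology Center, Assistant Professor (10401316)
SATO Issei  The University of Tokyo, Information Technology Center, Assistant Professor (90610155)
Co-Investigator(Renkei-kenkyūsha) NINOMIYA Takashi  The University of Tokyo, Information Technology Center, Lecturer (20444094)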
Project Period (FY) 2009 – 2012
Project Status Completed (Fiscal Year 2012)
Budget Amount
¥47,060,000 (Direct Cost: ¥36,200,000, Indirect Cost: ¥10,860,000)
Fiscal Year 2011: ¥14,560,000 (Direct Cost: ¥11,200,000, Indirect Cost: ¥3,360,000)
Fiscal Year 2010: ¥14,950,000 (Direct Cost: ¥11,500,000, Indirect Cost: ¥3,450,000)
Fiscal Year 2009: ¥17,550,000 (Direct Cost: ¥13,500,000, Indirect Cost: ¥4,050,000)
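Keywords Knowledge discovery / Data mining / Machine learning / Text mining / Web / Network data / Statistics / Disambiguation / Privacy protection / Language learning / Clustering / Text / Non-negative matrix factorization / GPU / Algorithms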
Research Abstract

In 2009, as planned, we developed a system that clusters web pages returned for a person-name query and evaluated it experimentally. In 2010, we contributed a new non-negative probabilistic matrix decomposition algorithm and applied the variational Bayes method to the Pitman-Yor process. In 2011, our contribution to privacy-preserving data mining (PPDM) was a link analysis algorithm that uses public-key encryption and a specific protocol. In 2012, we developed a new online learning algorithm as well as a new PPDM method.

Report

(4 results)
  • 2012 Final Research Report ( PDF )
  • 2011 Annual Research Report
  • 2010 Annual Research Report
  • 2009 Annual Research Report
Research Products

(57 results)


All Journal Article (21 results) (of which Peer Reviewed: 21 results) Presentation (30 results) Book (2 results) Remarks (4 results)

  • [Journal Article] Personalized Reading Support for Second-Language Web Documents (2013)

    • Author(s)
      Yo Ehara, Nobuyuki Shimizu, Takashi Ninomiya, Hiroshi Nakagawa
    • Journal Title

      ACM Transactions on Intelligent Systems and Technology

      Volume: 4(2)

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Personalized Reading Support for Second-Language Web Documents (2013)

    • Author(s)
      Yo Ehara, Nobuyuki Shimizu, Takashi Ninomiya, Hiroshi Nakagawa
    • Journal Title

      ACM Transactions on Intelligent Systems and Technology

      Volume: 4(2)

    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Privacy-Preserving EM Algorithm for Clustering on Social Network (2012)

    • Author(s)
      Yang Bin, Hiroshi Nakagawa
    • Journal Title

      P.-N. Tan et al. (Eds.): PAKDD 2012, Part I

      Volume: LNAI 7301 Pages: 542-553

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Healing Truncation Bias: Self-weighted Truncation framework for Dual Averaging (2012)

    • Author(s)
      Hidekazu Oiwa, Shin Matsushima, and Hiroshi Nakagawa
    • Journal Title

      IEEE International Conference on Data Mining (ICDM)

      Volume: 12 Pages: 575-584

    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Personalized Reading Support for Second-Language Web Documents (2012)

    • Author(s)
      Yo Ehara, Nobuyuki Shimizu, Takashi Ninomiya, Hiroshi Nakagawa
    • Journal Title

      ACM Transactions on Intelligent Systems and Technology

      Volume: (accepted for publication)

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Probabilistic Matrix Factorization Leveraging Contexts for Unsupervised Relation Extraction (2011)

    • Author(s)
      Shingo Takamatsu, Issei Sato, Hiroshi Nakagawa
    • Journal Title

      PAKDD 2011, Springer Lecture Notes in Artificial Intelligence (LNAI) 6634, Part I

      Pages: 87-99

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
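  • [Journal Article] A Supervised Online Learning Method That Realizes L1 Regularization According to Feature Occurrence Counts (2011)

    • Author(s)
      大岩秀和, 松島慎, 中川裕志
    • Journal Title

      Transactions of the Information Processing Society of Japan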

      Volume: Vol.50 TOM4(3) Pages: 84-93

    • NAID

      170000066490

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Privacy-Preserving Link Analysis on Integrated Graphs (2011)

    • Author(s)
      森井正覚, 佐久間淳, 佐藤一誠, 中川裕志
    • Journal Title

      Transactions of the Information Processing Society of Japan

      Volume: Vol.50 TOD4(2) Pages: 52-60

    • NAID

      40019597872

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Deterministic shift-reduce parsing for unification-based grammars (2011)

    • Author(s)
      Takashi Ninomiya, Takuya Matsuzaki, Nobuyuki Shimizu, Hiroshi Nakagawa
    • Journal Title

      Natural Language Engineering

      Volume: 17(3) Pages: 331-365

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
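  • [Journal Article] A Supervised Online Learning Method That Realizes L1 Regularization According to Feature Occurrence Counts (2011)

    • Author(s)
      大岩秀和, 松島慎, 中川裕志
    • Journal Title

      Transactions of the Information Processing Society of Japan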

      Volume: 50 TOM 4 Pages: 84-93

    • NAID

      170000066490

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Privacy-Preserving Link Analysis on Integrated Graphs (2011)

    • Author(s)
      森井正覚, 佐久間淳, 佐藤一誠, 中川裕志
    • Journal Title

      Transactions of the Information Processing Society of Japan

      Volume: 50 TOD 4 Pages: 52-60

    • NAID

      40019597872

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
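  • [Journal Article] Interpretation and Evaluation of Non-diagonalization of the Singular Value Matrix in Probabilistic Latent Semantic Analysis (2011)

    • Author(s)
      柴山直樹, 中川裕志
    • Journal Title

      Transactions of the Japanese Society for Artificial Intelligence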

      Volume: 26(1) Pages: 262-272

    • NAID

      130000455375

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Succinct Semi-structured Data Mining Based on FREQT (2010)

    • Author(s)
      佐藤一誠, 中川裕志
    • Journal Title

      DBSJ Journal (Database Society of Japan)

      Volume: Vol.9,No.1 Pages: 76-81

    • NAID

      130000337146

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Learning from Unlabeled Data in the PA Algorithm (2010)

    • Author(s)
      松島慎, 佐藤一誠, 二宮崇, 中川裕志
    • Journal Title

      DBSJ Journal (Database Society of Japan)

      Volume: Vol.9,No.1 Pages: 82-87

    • NAID

      40017216480

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Mining Numbers in Text Using Suffix Arrays and Clustering Based on Dirichlet Process Mixture Models (2010)

    • Author(s)
      Minoru Yoshida, Hiroshi Nakagawa
    • Journal Title

      (PAKDD 2010) Part II

      Pages: 230-237

    • NAID

      120007131162

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
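  • [Journal Article] An Efficient Exact Solution Method for the Passive-Aggressive Algorithm in Multiclass Classification Problems (2010)

    • Author(s)
      松島慎, 清水伸幸, 吉田和弘, 二宮崇, 中川裕志
    • Journal Title

      IEICE Transactions (Japanese Edition): Special Issue on Information Explosion

      Volume: Vol. J93-D, No. 6 Pages: 724-732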

    • NAID

      110007618347

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Spectral Methods and Text Mining Automatic Expansion of User (2010)

    • Author(s)
      Nobuyuki Shimizu, Masashi Sugiyama, Hiroshi Nakagawa
    • Journal Title

      IEICE Transactions, E93-D

      Volume: 6 Pages: 1378-1385

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
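  • [Journal Article] Dynamic Extraction of Synonym Candidates for Corpus Search Support (2010)

    • Author(s)
      吉田稔, 中川裕志, 寺田昭
    • Journal Title

      Transactions of the Japanese Society for Artificial Intelligence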

      Volume: 25(1) Pages: 122-132

    • NAID

      130000151243

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
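  • [Journal Article] Interpretation and Evaluation of Non-diagonalization of the Singular Value Matrix in Probabilistic Latent Semantic Analysis (2010)

    • Author(s)
      柴山直樹, 中川裕志
    • Journal Title

      Transactions of the Japanese Society for Artificial Intelligence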

      Volume: Vol.26,No.1 Pages: 262-272

    • NAID

      130000455375

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Person Name Disambiguation Applying Two-Stage Clustering to Word Weighting (2010)

    • Author(s)
      吉田稔, 池田雅紀, 小野真吾, 佐藤一誠, 中川裕志
    • Journal Title

      DBSJ Journal (Database Society of Japan)

      Volume: 9(2) Pages: 19-24

    • NAID

      40017420150

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
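  • [Journal Article] Dynamic Extraction of Synonym Candidates for Corpus Search Support (2009)

    • Author(s)
      吉田稔, 中川裕志, 寺田昭
    • Journal Title

      Transactions of the Japanese Society for Artificial Intelligence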

      Volume: 25(1) Pages: 122-132

    • NAID

      130000151243

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Presentation] Privacy-Preserving EM Algorithm for Clustering on Social Network (2013)

    • Author(s)
      Bin Yang, Issei Sato, Hiroshi Nakagawa
    • Organizer
      The 16th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD2012)
    • Place of Presentation
      Kuala Lumpur, Malaysia
    • Related Report
      2012 Final Research Report
  • [Presentation] Mining words in the minds of second language learners: learner-specific word difficulty (2012)

    • Author(s)
      Yo Ehara, Issei Sato, Hidekazu Oiwa, and Hiroshi Nakagawa
    • Organizer
      25th International Conference on Computational Linguistics (COLING 2012)
    • Place of Presentation
      Mumbai, India
    • Related Report
      2012 Final Research Report
  • [Presentation] Healing Truncation Bias: Self-weighted Truncation framework for Dual Averaging (2012)

    • Author(s)
      Hidekazu Oiwa, Shin Matsushima, and Hiroshi Nakagawa
    • Organizer
      12th IEEE International Conference on Data Mining (ICDM)
    • Place of Presentation
      Brussels
    • Related Report
      2012 Final Research Report
  • [Presentation] Practical Collapsed Variational Bayes Inference for Hierarchical Dirichlet Process (2012)

    • Author(s)
      Issei Sato, Kenichi Kurihara, Hiroshi Nakagawa
    • Organizer
      18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2012)
    • Place of Presentation
      Beijing, China
    • Related Report
      2012 Final Research Report
  • [Presentation] Rethinking Collapsed Variational Bayes Inference for LDA (2012)

    • Author(s)
      Issei Sato, Hiroshi Nakagawa
    • Organizer
      29th International Conference on Machine Learning (ICML 2012)
    • Place of Presentation
      Edinburgh, Scotland
    • Related Report
      2012 Final Research Report
  • [Presentation] Reducing Wrong Labels in Distant Supervision for Relation Extraction (2012)

    • Author(s)
      Shingo Takamatsu, Issei Sato, Hiroshi Nakagawa
    • Organizer
      ACL 2012
    • Place of Presentation
      Jeju, Korea
    • Related Report
      2012 Final Research Report
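  • [Presentation] An Attempt to Support Equipment Anomaly Diagnosis by Text Mining. 4th Forum on Data Engineering and Information Management (2012)

    • Author(s)
      吉田稔, 中川裕志, 渋谷久恵, 前田俊二
    • Organizer
      10th Annual Conference of the Database Society of Japan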
    • Related Report
      2012 Final Research Report
  • [Presentation] Active Learning for Bootstrapping Methods (2012)

    • Author(s)
      江原遥, 佐藤一誠, 中川裕志
    • Organizer
      18th Annual Meeting of the Association for Natural Language Processing
    • Related Report
      2012 Final Research Report
  • [Presentation] Predicting Cold Epidemics Using Social Media (2012)

    • Author(s)
      谷田和章, 荒牧英治, 佐藤一誠, 吉田稔, 中川裕志
    • Organizer
      18th Annual Meeting of the Association for Natural Language Processing
    • Related Report
      2012 Final Research Report
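  • [Presentation] Predicting Cold Medicine Sales Volume Using Social Media (2012)

    • Author(s)
      谷田和章, 荒牧英治, 佐藤一誠, 吉田稔, 中川裕志
    • Organizer
      18th Annual Meeting of the Association for Natural Language Processing, Hiroshima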
    • Related Report
      2012 Final Research Report
  • [Presentation] Privacy-Preserving EM Algorithm for Clustering on Social Network (2012)

    • Author(s)
      Yang Bin, Hiroshi Nakagawa
    • Organizer
      The 16th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)
    • Place of Presentation
      Kuala Lumpur, Malaysia
    • Related Report
      2011 Annual Research Report
  • [Presentation] Reducing Wrong Labels in Distant Supervision for Relation Extraction (2012)

    • Author(s)
      Shingo Takamatsu, Issei Sato, Hiroshi Nakagawa
    • Organizer
      50th annual meeting of the Association for Computational Linguistics (ACL)
    • Place of Presentation
      Jeju, Korea
    • Related Report
      2011 Annual Research Report
  • [Presentation] Probabilistic Matrix Factorization Leveraging Contexts for Unsupervised Relation Extraction (2011)

    • Author(s)
      Shingo Takamatsu, Issei Sato, Hiroshi Nakagawa
    • Organizer
      PAKDD2011
    • Place of Presentation
      Shenzhen, China
    • Year and Date
      2011-05-24
    • Related Report
      2010 Annual Research Report
  • [Presentation] Secure Clustering in Private Networks (2011)

    • Author(s)
      Bin Yang, Issei Sato, Hiroshi Nakagawa
    • Organizer
      11th IEEE International Conference on Data Mining (ICDM)
    • Place of Presentation
      Vancouver, Canada
    • Related Report
      2012 Final Research Report
  • [Presentation] Probabilistic Frequency-aware Truncated methods for Sparse Online Learning (2011)

    • Author(s)
      Hidekazu Oiwa, Shin Matsushima, Hiroshi Nakagawa
    • Organizer
      The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2011)
    • Place of Presentation
      Athens, Greece
    • Related Report
      2012 Final Research Report
  • [Presentation] Estimating Cold Epidemics Using Twitter (2011)

    • Author(s)
      谷田和章, 荒牧英治, 佐藤一誠, 吉田稔, 中川裕志
    • Organizer
      6th Meeting of the JSAI Special Interest Group on Information Compilation, Tokyo
    • Related Report
      2012 Final Research Report
  • [Presentation] Web People Search: Person Name Disambiguation and Other Problems (2010)

    • Author(s)
      Minoru Yoshida, Hiroshi Nakagawa
    • Organizer
      Tutorial of The 2nd Asian Conference on Machine Learning (ACML2010)
    • Place of Presentation
      Tokyo, Japan
    • Year and Date
      2010-11-08
    • Related Report
      2012 Final Research Report
  • [Presentation] ITC-UT: Tweet Categorization by Query Categorization for On-line Reputation Management (2010)

    • Author(s)
      Minoru Yoshida, Shin Matsushima, Shingo Ono, Hiroshi Nakagawa
    • Organizer
      CLEF 2010 Labs WePS
    • Place of Presentation
      Padua, Italy
    • Year and Date
      2010-09-22
    • Related Report
      2010 Annual Research Report
  • [Presentation] Topic Models with Power-Law Using Pitman-Yor Process (2010)

    • Author(s)
      Issei Sato, Hiroshi Nakagawa
    • Organizer
      16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    • Place of Presentation
      Washington, DC, USA
    • Year and Date
      2010-07-26
    • Related Report
      2010 Annual Research Report
  • [Presentation] Person Name Disambiguation by Bootstrapping (2010)

    • Author(s)
      Minoru Yoshida
    • Organizer
      The 33rd ACM SIGIR Conference
    • Place of Presentation
      Geneva, Switzerland
    • Year and Date
      2010-07-20
    • Related Report
      2009 Annual Research Report
  • [Presentation] Deterministic Single-Pass Algorithm for LDA (2010)

    • Author(s)
      Issei Sato, Kenichi Kurihara, Hiroshi Nakagawa
    • Organizer
      Neural Information Processing Systems Conference (NIPS2010)
    • Related Report
      2012 Final Research Report
  • [Presentation] Topic Models with Power-Law Using Pitman-Yor Process (2010)

    • Author(s)
      Issei Sato, Hiroshi Nakagawa
    • Organizer
      16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2010)
    • Related Report
      2012 Final Research Report
  • [Presentation] Collusion-Resistant Privacy-Preserving Data Mining (2010)

    • Author(s)
      Bin Yang, Hiroshi Nakagawa, Issei Sato, Jun Sakuma
    • Organizer
      16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2010)
    • Related Report
      2012 Final Research Report
  • [Presentation] Person Name Disambiguation by Bootstrapping (2010)

    • Author(s)
      Minoru Yoshida, Masaki Ikeda, Shingo Ono, Issei Sato, Hiroshi Nakagawa
    • Organizer
      The 33rd Annual ACM SIGIR Conference
    • Related Report
      2012 Final Research Report
  • [Presentation] Exact Passive-Aggressive Algorithm for Multiclass Classification Using Support Class (2010)

    • Author(s)
      Shin Matsushima, Nobuyuki Shimizu, Kazuhiro Yoshida, Takashi Ninomiya, Hiroshi Nakagawa
    • Organizer
      The 2010 SIAM International Conference on Data Mining (SDM 2010)
    • Remarks
      This paper was selected as one of the top 12 papers of SDM
    • Related Report
      2012 Final Research Report
  • [Presentation] Discovering Serendipitous Information from Wikipedia by Using its Network Structure (2010)

    • Author(s)
      Yohei Noda, Yoji Kiyota, Hiroshi Nakagawa
    • Organizer
      4th International AAAI Conference on Weblogs and Social Media (ICWSM 2010), poster session
    • Place of Presentation
      Washington, D.C., USA
    • Related Report
      2012 Final Research Report
  • [Presentation] Person Name Disambiguation on the Web by Two-Stage Clustering. 2nd Web People Search Evaluation Workshop (WePS 2009) (2009)

    • Author(s)
      Masaki Ikeda, Shingo Ono, Issei Sato, Minoru Yoshida and Hiroshi Nakagawa
    • Organizer
      18th WWW Conference
    • Place of Presentation
      Madrid, Spain
    • Year and Date
      2009-04-21
    • Related Report
      2012 Final Research Report
  • [Presentation] Quantum Annealing for Variational Bayes Inference (2009)

    • Author(s)
      Issei Sato, Kenichi Kurihara, Shu Tanaka, Seiji Miyashita and Hiroshi Nakagawa
    • Organizer
      The 25th Conference on Uncertainty in Artificial Intelligence (UAI 2009)
    • URL

      http://www.cs.mcgill.ca/~uai2009/proceedings.html

    • Related Report
      2012 Final Research Report
  • [Presentation] Deterministic Online Bayesian Learning for Latent Dirichlet Allocation (2009)

    • Author(s)
      佐藤一誠, 中川裕志
    • Organizer
      IPSJ Special Interest Group on Natural Language Processing
    • Related Report
      2012 Final Research Report
  • [Presentation] Extracting Unexpected Information from Wikipedia (2009)

    • Author(s)
      野田陽平, 清田陽司, 中川裕志
    • Organizer
      4th Symposium of the Young Researcher Association for NLP Studies, Kyoto University
    • Related Report
      2012 Final Research Report
  • [Book] Information Law (宇賀克也, 長谷部恭男, eds.; Chapter 8: Database Services and Content) (2012)

    • Author(s)
      中川裕志
    • Publisher
      Yuhikaku
    • Related Report
      2012 Final Research Report
  • [Book] Information Law (Chapter 8) (2012)

    • Author(s)
      中川裕志
    • Publisher
      Yuhikaku
    • Related Report
      2011 Annual Research Report
  • [Remarks]

    • URL

      http://www.r.dl.itc.u-tokyo.ac.jp/node/46/

    • Related Report
      2012 Final Research Report
  • [Remarks] List of published papers

    • URL

      http://www.r.dl.itc.u-tokyo.ac.jp/node/46/

    • Related Report
      2011 Annual Research Report
  • [Remarks]

    • URL

      http://www.r.dl.itc.u-tokyo.ac.jp/node/46/

    • Related Report
      2010 Annual Research Report
  • [Remarks]

    • URL

      http://www.r.dl.itc.u-tokyo.ac.jp/node/10

    • Related Report
      2009 Annual Research Report


Published: 2009-04-01   Modified: 2016-04-21  
