
Hierarchical Classification from Big Data

Research Project

Project/Area Number 25330271
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Intelligent informatics
Research Institution Toyota Technological Institute

Principal Investigator

Yutaka Sasaki, Toyota Technological Institute, Graduate School of Engineering, Professor (60395019)

Project Period (FY) 2013-04-01 – 2016-03-31
Project Status Completed (Fiscal Year 2015)
Budget Amount
¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2015: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2014: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2013: ¥2,210,000 (Direct Cost: ¥1,700,000, Indirect Cost: ¥510,000)
Keywords hierarchical classification / machine learning / LSHTC3 / distributed vector representations / document classification / Big Data / large-scale document classification / DCASVM / ACCS / LSHTC3 Wikipedia data / Pegasos / LSHTC / SVM
Outline of Final Research Achievements

We built fast and accurate hierarchical classification systems using the LSHTC3 Wikipedia data, a collection of very large hierarchical classification datasets. The training time of our system on the LSHTC3 Wikipedia Medium data has been reduced to 30 minutes, whereas conventional methods needed several hours or even several days on the same data. Its predictive performance on the test data achieved the world's highest scores. We also generated new features from distributed embedding vectors derived from the original features; adding these features further improved predictive performance on the test data to 44.92%. We released our hierarchical classification system, Eze, as open-source software.

Report

(4 results)
  • 2015 Annual Research Report   Final Research Report ( PDF )
  • 2014 Research-status Report
  • 2013 Research-status Report

Research Products

(5 results)

Int'l Joint Research (1 result), Presentation (4 results, of which Int'l Joint Research: 1 result)

  • [Int'l Joint Research] TTI at Chicago (USA)

    • Related Report
      2015 Annual Research Report
  • [Presentation] IN-DEDUCTIVE and DAG-TREE Approaches for Large-Scale Extreme Multi-label Hierarchical Text Classification (2016)

    • Author(s)
      Mohammad Golam Sohrab, Makoto Miwa, Yutaka Sasaki
    • Organizer
      17th International Conference on Intelligent Text Processing and Computational Linguistics
    • Place of Presentation
      Konya, Turkey
    • Year and Date
      2016-04-03
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Word Embeddings in Large-Scale Deep Architecture Learning (2016)

    • Author(s)
      Mohammad Golam Sohrab, Makoto Miwa, Yutaka Sasaki
    • Organizer
      The 22nd Annual Meeting of the Association for Natural Language Processing
    • Place of Presentation
      Tohoku University (Sendai, Miyagi Prefecture)
    • Year and Date
      2016-03-07
    • Related Report
      2015 Annual Research Report
  • [Presentation] High-Performance Large-Scale Hierarchical Document Classification Using DCASVM (2015)

    • Author(s)
      Yutaka Sasaki, Mohammad Golam Sohrab, Makoto Miwa
    • Organizer
      The 21st Annual Meeting of the Association for Natural Language Processing
    • Place of Presentation
      Kyoto University
    • Year and Date
      2015-03-18
    • Related Report
      2014 Research-status Report
  • [Presentation] The TTI Document Classification System for LSHTC4 (2014)

    • Author(s)
      Yutaka Sasaki, Mohammad Golam Sohrab
    • Organizer
      The 20th Annual Meeting of the Association for Natural Language Processing
    • Place of Presentation
      Hokkaido University
    • Related Report
      2013 Research-status Report

Published: 2014-07-25   Modified: 2019-07-29  
