
2014 Fiscal Year Final Research Report

Mining Numbers in Text for Various Kinds of Text Data

Research Project

Project/Area Number 24500162
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Intelligent informatics
Research Institution The University of Tokushima (2013-2014)
The University of Tokyo (2012)

Principal Investigator

YOSHIDA Minoru  The University of Tokushima, Institute of Technology and Science, Lecturer (40361688)

Project Period (FY) 2012-04-01 – 2015-03-31
Keywords numerical information extraction / layout analysis
Outline of Final Research Achievements

We studied methods for extracting the contexts (i.e., attributes or topics) of numbers that appear in text. Our goal was to develop a system that accepts numbers as queries and returns relevant data from various kinds of text data, such as Wikipedia and Twitter. To this end, we proposed a method for extracting numbers and their contexts that applies to both unstructured text (e.g., sentences) and semi-structured text (e.g., tables). The method uses unsupervised learning algorithms based on probabilistic generative models of text to extract attributes and hierarchical topics from Web documents. We also proposed a method for extracting corpus-specific number expressions from any kind of text data. For number expressions, we found a coding scheme that can be used both for indexing and in probabilistic generative models.
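The idea of a system that takes a number as a query and returns matching text with its context can be illustrated with a minimal sketch. This is not the project's method: it replaces the unsupervised generative models with a fixed word window, and it does not reproduce the coding scheme mentioned above. The regular expression, the function names `extract_numbers` and `build_index`, and the sample sentences are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed pattern: numbers with thousands separators, or plain integers/decimals.
NUMBER_RE = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?")
WORD_RE = re.compile(r"[A-Za-z]+")


def extract_numbers(text, window=3):
    """Return (value, context_words) for each number found in text.

    The context is simply up to `window` words on each side of the number,
    a crude stand-in for the learned attributes or topics.
    """
    results = []
    for match in NUMBER_RE.finditer(text):
        value = float(match.group().replace(",", ""))
        before = WORD_RE.findall(text[:match.start()])[-window:]
        after = WORD_RE.findall(text[match.end():])[:window]
        results.append((value, [w.lower() for w in before + after]))
    return results


def build_index(documents, window=3):
    """Map each numeric value to the (document id, context) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(documents):
        for value, context in extract_numbers(text, window):
            index[value].append((doc_id, context))
    return index


# Hypothetical sample documents.
docs = [
    "Mount Fuji is 3,776 meters high.",
    "The tower has a height of 634 meters.",
]
index = build_index(docs)

# Query by number: which documents mention 3776, and in what context?
print(index[3776.0])
```

Querying the index with a number returns the documents that contain it along with the neighbouring words, which is the kind of lookup the report describes. A realistic system would also need to handle units, ranges, and approximate matches.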

Free Research Field

Text mining


Published: 2016-06-03  
