Foundational study in extracting, measuring and searching presentation styles and their patterns in digital documents
Project/Area Number |
23650045
|
Research Category |
Grant-in-Aid for Challenging Exploratory Research
|
Allocation Type | Multi-year Fund |
Research Field |
Media informatics/Database
|
Research Institution | Hitotsubashi University |
Principal Investigator |
TAKEMURA Tomoko Hitotsubashi University, Graduate School of Language and Society, Professor (60323896)
|
Co-Investigator (Kenkyū-buntansha) |
HASHIMOTO Kiyota Osaka Prefecture University, School of Humanities and Social Sciences, Associate Professor (50278818)
|
Project Period (FY) |
2011 – 2012
|
Project Status |
Completed (Fiscal Year 2012)
|
Budget Amount |
¥3,640,000 (Direct Cost: ¥2,800,000, Indirect Cost: ¥840,000)
Fiscal Year 2012: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Fiscal Year 2011: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
|
Keywords | Corpus / Style / Website / Digital document |
Research Abstract |
A "Style Corpus" database system, which serves as infrastructure for interdisciplinary research on various aspects of document-based communication, was constructed on the basis of a correlation analysis between the visual presentation styles of digital documents and their underlying markup code. Based on a set of style data extracted from manually selected web resources, algorithms for searching, analyzing, and visualizing the underlying patterns were developed and implemented. Although some areas remain unexplored because of the limited size of the data, the Style Corpus proved effective as a tool for correlation analysis.
|
Report
(3 results)
Research Products
(3 results)