2010 Fiscal Year Final Research Report
Studies on Stream Mining in Web Archive
Project/Area Number |
19500098
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Media informatics/Database
|
Research Institution | Nanzan University |
Principal Investigator |
KAWANO Hiroyuki Nanzan University, Faculty of Information Sciences and Engineering, Professor (70224813)
|
Project Period (FY) |
2007 – 2010
|
Keywords | Digital archive / Content distribution / Reputation model / Web archive / Web crawling |
Research Abstract |
The size of web archives is increasing exponentially, and many national libraries and the IIPC (International Internet Preservation Consortium) are working to establish guidelines for the long-term preservation of digital content. In this research, from the viewpoint of data mining techniques for reputation models, we reconsider a growth model of storage volume in a web archive system. We discuss a basic architecture for a hierarchical storage system based on the characteristics of memory devices such as RAM, HDDs, magnetic tapes, and disks. We improve the file-moving algorithm by using file retrieval patterns and access frequencies.
|
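The abstract does not describe the file-moving algorithm itself, so the following is only a minimal sketch of how access frequencies could drive file placement in a hierarchical storage system. The tier names, the thresholds `hot` and `warm`, and the names `ArchivedFile`, `target_tier`, and `plan_moves` are all assumptions made for illustration and are not taken from the report.

```python
from dataclasses import dataclass


@dataclass
class ArchivedFile:
    name: str
    tier: str      # current tier: "RAM", "HDD", or "TAPE" (assumed tier names)
    accesses: int  # access count in the most recent observation window


def target_tier(accesses, hot=100, warm=10):
    """Map an access count to a storage tier.

    The thresholds are illustrative assumptions, not values from the report.
    """
    if accesses >= hot:
        return "RAM"
    if accesses >= warm:
        return "HDD"
    return "TAPE"


def plan_moves(files):
    """Return (name, from_tier, to_tier) for every file that should move."""
    moves = []
    for f in files:
        dest = target_tier(f.accesses)
        if dest != f.tier:
            moves.append((f.name, f.tier, dest))
    return moves


files = [
    ArchivedFile("a.warc", "TAPE", 250),  # frequently accessed, promote
    ArchivedFile("b.warc", "HDD", 50),    # already on the matching tier
    ArchivedFile("c.warc", "RAM", 2),     # rarely accessed, demote
]
moves = plan_moves(files)
```

In this sketch only the access count decides the tier. A policy that also used file retrieval patterns, as the abstract mentions, would need additional inputs such as which files tend to be retrieved together.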