2021 Fiscal Year Final Research Report
Credibility Analysis of Web contents based on 10 billion Web pages
Project/Area Number |
17KT0085
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Multi-year Fund |
Section | Generative Research Fields |
Research Field |
The Information Society and Trust
|
Research Institution | Waseda University |
Principal Investigator |
YAMANA HAYATO Waseda University, Faculty of Science and Engineering, Professor (40230502)
|
Project Period (FY) |
2017-07-18 – 2022-03-31
|
Keywords | Web content / Credibility / Reliability / Phishing / Web crawler |
Outline of Final Research Achievements |
This research project produced five results: efficient web page crawlers (gathering programs); web page content analysis methods; methods for estimating the reliability of web content from URLs alone, without accessing the content; an analysis revealing the problems of previous benchmarks, whose ground truth is usually based on human first-impression decisions; and a published survey of research on web content reliability. In particular, the crawler achieved a 10% improvement in efficiency over previous methods, and the URL-only credibility judgment method reached an accuracy of 99.4%, a significant result for future practical use because it requires no access to the content.
|
Free Research Field |
Big data analysis
|
Academic Significance and Societal Importance of the Research Achievements |
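Web content has become indispensable to daily life. By devising indicators (judgment methods) for determining its reliability, we were able to build a mechanism that automatically identifies web content of low credibility and reliability, which is expected to become increasingly sophisticated. By building tools on the resulting foundational technology going forward, we have laid a foundation that lets Internet users use web content with confidence. Furthermore, we clarified the problems with benchmarks, which are indispensable to research in this field, and proposed directions for future research in the field.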
|