A Study on Methods for Automatically Finding Important Features in Sequential Labeling
Project/Area Number |
22500121
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Ehime University |
Principal Investigator |
|
Project Period (FY) |
2010-04-01 – 2013-03-31
|
Project Status |
Completed (Fiscal Year 2013)
|
Budget Amount |
¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000)
Fiscal Year 2012: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2011: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2010: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
|
Keywords | natural language processing / machine learning / online learning / feature selection / logistic regression / L1 regularization |
Research Abstract |
In natural language processing, the discriminative models used for many tasks rely on millions of feature functions. These feature functions are designed by human experts, but finding and developing millions of them by hand is difficult even for experts. This research proposes efficient methods for online grafting, along with ensemble methods that improve its accuracy. Online grafting is a method that automatically selects features while optimizing the parameters of L1-regularized logistic regression. Experiments showed that our methods significantly improved the efficiency of online grafting. Although our methods are approximation techniques, the deterioration in prediction performance was negligible. The ensemble methods, which use probabilistic algorithms, succeeded in improving the accuracy of online grafting.
|
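To make the idea of grafting named in the abstract concrete, the following is a minimal Python sketch of generic batch grafting for L1-regularized logistic regression. It is not the project's online or approximate method, which the abstract does not specify. The function names (`graft`, `loss_gradient`, `soft_threshold`), the toy data, the hyperparameters, and the proximal-gradient inner solver are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(X, y, w):
    """Gradient of the mean logistic loss with respect to every weight."""
    n, d = len(X), len(w)
    grad = [0.0] * d
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        for j in range(d):
            grad[j] += (p - yi) * xi[j] / n
    return grad


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def graft(X, y, lam=0.05, lr=0.5, inner_steps=200):
    """Hypothetical batch grafting loop (illustrative, not the project's code).

    Repeatedly activates the inactive feature whose loss gradient has the
    largest magnitude, as long as that magnitude exceeds lam, then
    re-optimizes the active weights with proximal gradient steps.
    """
    d = len(X[0])
    w = [0.0] * d
    active = set()
    while len(active) < d:
        grad = loss_gradient(X, y, w)
        candidates = [j for j in range(d) if j not in active]
        best = max(candidates, key=lambda j: abs(grad[j]))
        # A zero weight is already optimal under L1 when |gradient| <= lam.
        if abs(grad[best]) <= lam:
            break
        active.add(best)
        for _ in range(inner_steps):
            grad = loss_gradient(X, y, w)
            for j in active:
                w[j] = soft_threshold(w[j] - lr * grad[j], lr * lam)
    return w, sorted(active)


# Toy data (assumed): feature 0 determines the label, feature 1 is uninformative.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [1, 1, 0, 0]
w, selected = graft(X, y)
print(selected, w)
```

On this toy data only feature 0 is activated, because the loss gradient for feature 1 never exceeds the threshold `lam`. The selection test is the standard optimality condition for L1-regularized objectives, which is why the regularization strength also acts as the feature-admission threshold.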
Report
(4 results)
Research Products
(28 results)