A Sequence-to-sequence Model based Dissimilarity Measurement for Clustering Structural Data
Project/Area Number |
18K18068
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 61010: Perceptual information processing-related
|
Research Institution | Tokyo University of Agriculture and Technology |
Principal Investigator |
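NGUYENTUAN CUONG, Tokyo University of Agriculture and Technology, Graduate School of Engineering (Institute of Engineering), Specially Appointed Assistant Professor (10814246)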
|
Project Period (FY) |
2018-04-01 – 2021-03-31
|
Project Status |
Completed (Fiscal Year 2020)
|
Budget Amount |
¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2020: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2019: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2018: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
|
Keywords | clustering / online handwriting / offline handwriting / generative sequence / sequence to sequence / handwritten answers / mathematical expressions / handwriting recognition / handwriting / mathematical expression / weakly supervised / hierarchical features / CNN / dissimilarity / semi-supervised learning / sequential data / structural data |
Outline of Final Research Achievements |
We have finished applying the proposed generative sequence dissimilarity measure to the clustering of handwritten mathematical answers. The method outperforms other clustering methods based on global features, such as Deep Embedded Clustering and Siamese Networks. It is also superior to hierarchical feature representations learned by Convolutional Neural Networks with weakly supervised learning. We have also applied the method to clustering online handwritten mathematical expressions and shown that the proposed metric performs better than the edit distance metric. We are continuing to apply the method to a large-scale database of offline handwritten mathematical answers collected from the preliminary examination.
|
Academic Significance and Societal Importance of the Research Achievements |
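If large collections of handwritten mathematical answers can be clustered, identical answers are grouped together, which reduces the effort of marking and improves its efficiency and reliability. This research emphasizes the importance, for clustering, of recognizing structures and learning the relationships among them.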
|
Report
(4 results)
Research Products
(24 results)