Image Representation for Fast Query and Its Applications
Project/Area Number |
12680377
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Shinshu University |
Principal Investigator |
MARUYAMA Minoru Shinshu Univ., Dept. Inf. Eng, Assoc. Prof., Faculty of Engineering, Associate Professor (80283232)
|
Project Period (FY) |
2000 – 2002
|
Project Status |
Completed (Fiscal Year 2002)
|
Budget Amount |
¥2,800,000 (Direct Cost: ¥2,800,000)
Fiscal Year 2002: ¥500,000 (Direct Cost: ¥500,000)
Fiscal Year 2001: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2000: ¥1,400,000 (Direct Cost: ¥1,400,000)
|
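Keywords | data query / hash function / PCA / mixture model / image query / clustering / SVM / multiple hypotheses / class discrimination / multi-class problem / learning from examples / generalization error estimation / nearest neighbor search / handwritten character recognition / multi-class classification / DAG / rank accuracy / dimensionality reduction / EM algorithm / image representation |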
Research Abstract |
In many recognition problems, the basic procedure is matching the input against a data set. This procedure can be characterized as a nearest neighbor problem in a high-dimensional vector space. When the data reside in a high-dimensional space, nearest neighbor search becomes increasingly difficult as the dimensionality grows. For example, the well-known kd-tree is of little use for search in high-dimensional vector spaces. In this report, we first compare search algorithms, including the brute-force method, the kd-tree, and LSH (locality-sensitive hashing). We show that LSH can be very effective, although it gives only an approximate solution. To realize fast query in a high-dimensional vector space, it is important to reduce the dimensionality and/or the size of the target data set. LSH can be seen as one method of reducing the portion of the data set that must be examined. Based on these preliminary results, we first propose an image query system based on the Gaussian mixture model and PCA (principal component analysis). We then show that if the distribution of the data set can be described as "clusters", fast query becomes possible. If the data set is made up of several clusters, recognizing the appropriate cluster for each input vector is the key to fast and reliable search. For that purpose, we have to learn classifiers from examples. Since we cannot expect a classifier to be 100% accurate, it is desirable to obtain classifiers that produce multiple hypotheses. In our report, we describe a method that extends DAG-SVM to produce multiple hypotheses.
|
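The comparison between brute-force search and LSH described in the abstract can be illustrated with a minimal sketch. The abstract does not say which LSH variant the report used, so the random-hyperplane scheme below is an assumption chosen for brevity, and every name in it (`HyperplaneLSH`, `brute_force_nn`, `n_bits`, `n_tables`) is hypothetical. Hyperplane hashing groups vectors by angular closeness; the candidates it returns are then re-ranked by Euclidean distance, so the result is approximate and may differ from the exact nearest neighbor.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_nn(data, query):
    """Exact nearest neighbor: scan every stored vector."""
    return min(range(len(data)), key=lambda i: sq_dist(data[i], query))


class HyperplaneLSH:
    """Illustrative LSH index (assumed variant, not the report's own).

    Each table hashes a vector to the tuple of signs of its dot products
    with n_bits random hyperplanes; nearby vectors tend to share a key.
    """

    def __init__(self, dim, n_bits=8, n_tables=10, seed=0):
        rng = random.Random(seed)
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.data = []

    @staticmethod
    def _key(planes, v):
        return tuple(sum(p * x for p, x in zip(plane, v)) >= 0.0 for plane in planes)

    def add(self, v):
        idx = len(self.data)
        self.data.append(v)
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, v)].append(idx)

    def candidates(self, q):
        """Indices sharing a bucket with q in at least one table."""
        found = set()
        for planes, table in zip(self.planes, self.tables):
            found.update(table.get(self._key(planes, q), []))
        return found

    def query(self, q):
        """Approximate nearest neighbor: best candidate, or None if no bucket matches."""
        cands = self.candidates(q)
        if not cands:
            return None
        return min(cands, key=lambda i: sq_dist(self.data[i], q))


if __name__ == "__main__":
    rng = random.Random(1)
    data = [[rng.gauss(0.0, 1.0) for _ in range(32)] for _ in range(500)]
    index = HyperplaneLSH(dim=32)
    for v in data:
        index.add(v)
    query = [x + rng.gauss(0.0, 0.05) for x in data[123]]
    print("exact:", brute_force_nn(data, query))
    print("LSH:", index.query(query), "candidates examined:", len(index.candidates(query)))
```

The point of the sketch is the one made in the abstract: the index examines only the vectors that share a bucket with the query rather than the whole data set, at the cost of possibly missing the true nearest neighbor. Increasing `n_tables` raises the chance of finding it, while increasing `n_bits` shrinks the candidate set.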
Report
(4 results)
Research Products
(4 results)