2014 Fiscal Year Research-status Report
Automatic Colorectal Cancer Diagnosis System Using Active Learning
Project/Area Number | 25330337 |
Research Institution | Hiroshima University |
Principal Investigator | Raytchev Bisser, Hiroshima University, Graduate School of Engineering, Assistant Professor (00531922) |
Project Period (FY) | 2013-04-01 – 2016-03-31 |
Keywords | Automatic cancer diagnosis system / Active learning / Colorectal cancer / Ensemble methods |
Outline of Annual Research Achievements |
In developing computerized systems for the automatic diagnosis of cancer from images, it is very important to have a sufficiently large set of training images that exhibits the whole range of variation observed across the different types of cancer. However, obtaining such data sets is often difficult and impractical, even when huge volumes of raw data are readily available, mainly because producing labeled samples in a form suitable for machine learning requires costly time and effort from busy medical experts. The aim of this research project is to investigate active learning and semi-supervised learning methods that could drastically reduce the number of labeled training images required for cancer diagnosis while still yielding training data sets of very good quality. These methods are being developed in a concrete setting: the detection of colorectal cancer from Narrow Band Imaging (NBI) images obtained through colorectal endoscopy.
|
Current Status of Research Progress |
2: On the whole, research has progressed more than originally planned.
Reason
At this stage we have successfully developed the basic ingredients of the automatic diagnosis method: local context features at the feature extraction level and a randomized decision forest at the classification level. The local context features are computed from a texton map, from which texture and local context information is extracted in the area surrounding each pixel. The features are very high-dimensional (infinite in principle) and therefore very discriminative. Combined with their huge number and the ability of random forests to handle such data efficiently without over-fitting, this enables us to achieve very good accuracy from a very small number of training images. Additionally, because it provides local pixel-level classification, the resulting method is much more general and does not depend on the concrete configuration of patterns available in the training images. Because the method operates locally, it is also much better suited to video data, which makes it applicable to more realistic diagnostic scenarios.
|
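The report does not specify how the local context features are computed. As a minimal sketch under stated assumptions, one common construction measures the fraction of pixels carrying a given texton ID inside a rectangle displaced from the pixel of interest. Each feature is then a randomly drawn (offset, size, texton) triple, so the pool of candidate features is effectively unbounded, and a randomized forest would only sample a small subset of it at each split. The function names, the parameter ranges, and the toy texton map below are all hypothetical and are not taken from the project.

```python
import random


def texton_fraction(texton_map, row, col, offset, size, texton):
    """Fraction of pixels with ID `texton` inside a rectangle of shape `size`
    whose top-left corner is displaced by `offset` from pixel (row, col).
    Parts of the rectangle that fall outside the image are ignored."""
    height, width = len(texton_map), len(texton_map[0])
    top, left = row + offset[0], col + offset[1]
    count = total = 0
    for r in range(top, top + size[0]):
        for c in range(left, left + size[1]):
            if 0 <= r < height and 0 <= c < width:
                total += 1
                if texton_map[r][c] == texton:
                    count += 1
    return count / total if total else 0.0


def random_feature(num_textons, max_offset, max_size, rng):
    """Draw one (offset, size, texton) triple from the candidate pool."""
    offset = (rng.randint(-max_offset, max_offset),
              rng.randint(-max_offset, max_offset))
    size = (rng.randint(1, max_size), rng.randint(1, max_size))
    return offset, size, rng.randrange(num_textons)


def pixel_descriptor(texton_map, row, col, features):
    """Evaluate a list of sampled features at one pixel."""
    return [texton_fraction(texton_map, row, col, o, s, t)
            for (o, s, t) in features]


# Hypothetical 4x4 texton map with two texton IDs (0 and 1).
toy_map = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
rng = random.Random(0)
features = [random_feature(2, 2, 3, rng) for _ in range(5)]
descriptor = pixel_descriptor(toy_map, 1, 1, features)
```

In a full pipeline, descriptors of this kind would be computed for every labeled pixel and passed to a random forest classifier; that step is omitted here.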
Strategy for Future Research Activity |
More work is still needed on integrating and smoothing the pixel-level classification results into an image-level result, i.e. on developing a more sophisticated scheme than the one presently used to determine the image-level label. A further reduction of the number of labeled training images required (while keeping high recognition accuracy) through a combination of active and semi-supervised learning also needs to be investigated. Additionally, tests on different datasets are planned to confirm the generality of the proposed approach.
|
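The report describes neither the image-level labeling scheme currently in use nor the active learning criterion. The sketch below illustrates one simple baseline for each, purely as an assumption: an image is labeled positive when a large enough fraction of its pixels is classified positive, and the images whose scores lie closest to 0.5 are proposed to the expert for labeling (uncertainty sampling). The function names, thresholds, and image scores are hypothetical.

```python
def image_label(pixel_probs, pixel_threshold=0.5, area_threshold=0.3):
    """Return (label, positive_fraction) for one image.

    `pixel_probs` is a 2D list of per-pixel positive-class probabilities.
    Both thresholds are illustrative values, not taken from the report."""
    flat = [p for row in pixel_probs for p in row]
    positive_fraction = sum(p >= pixel_threshold for p in flat) / len(flat)
    return int(positive_fraction >= area_threshold), positive_fraction


def most_uncertain(image_scores, budget):
    """Return the IDs of the `budget` images whose score is closest to 0.5,
    i.e. the ones the current classifier is least sure about."""
    ranked = sorted(image_scores, key=lambda name: abs(image_scores[name] - 0.5))
    return ranked[:budget]


# Hypothetical per-pixel probabilities for a 2x2 image.
probs = [[0.9, 0.8],
         [0.2, 0.1]]
label, fraction = image_label(probs)

# Hypothetical image-level scores for four unlabeled images.
scores = {"img_a": 0.95, "img_b": 0.52, "img_c": 0.10, "img_d": 0.40}
queries = most_uncertain(scores, budget=2)
```

A more sophisticated scheme of the kind the report calls for might replace the fixed area threshold with spatial smoothing of the pixel-level output before a decision is made.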