Zero-shot recognition of generic objects
Project/Area Number | 19K24344 |
Research Category | Grant-in-Aid for Research Activity Start-up |
Allocation Type | Multi-year Fund |
Review Section |
1001:Information science, computer engineering, and related fields
|
Research Institution | Kobe University |
Principal Investigator |
Project Period (FY) | 2019-08-30 – 2022-03-31 |
Project Status | Granted (Fiscal Year 2020) |
Budget Amount | ¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000) |
Fiscal Year 2020: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2019: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Keywords | Zero-Shot Learning / Self-Supervised Learning / Visual Representation / Feature Extraction / Semantic representation / Resource Efficiency / CNN / Deep learning / Computational efficiency / Computer vision / Object recognition |
Outline of Research at the Start |
This study focuses on deriving new principles for optimization and semantic feature learning applied to generic object recognition. On the optimization front, we will focus on improving the computational and algorithmic efficiency of training deep learning models, in order to expand our search space by enabling quicker iteration over different architectural designs. On the semantic learning front, we aim to achieve a better understanding of the visual features that can be derived from semantic data, which we believe to be the key missing element enabling practical Zero-Shot recognition.
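As a toy illustration of the zero-shot setting described above, an unseen class can be recognized by matching a visual feature against per-class semantic embeddings (e.g. attribute vectors) with cosine similarity. All class names, attribute values, and the feature vector below are hypothetical, not from the project:

```python
import numpy as np

def zero_shot_predict(visual_feature, class_embeddings):
    """Return the class whose semantic embedding has the highest
    cosine similarity with the visual feature."""
    v = visual_feature / np.linalg.norm(visual_feature)
    names = list(class_embeddings)
    E = np.stack([class_embeddings[n] for n in names])
    E = E / np.linalg.norm(E, axis=1, keepdims=True)  # row-normalize
    scores = E @ v                                    # cosine similarities
    return names[int(np.argmax(scores))]

# Hand-made attribute vectors (hypothetical): [striped, four-legged, flying]
classes = {
    "zebra": np.array([1.0, 1.0, 0.0]),
    "eagle": np.array([0.0, 0.0, 1.0]),
    "horse": np.array([0.0, 1.0, 0.0]),
}
feature = np.array([0.9, 0.8, 0.1])         # extracted visual feature (toy)
print(zero_shot_predict(feature, classes))  # → zebra
```

No "zebra" images are needed at training time; only its attribute vector, which is the essence of the zero-shot recipe.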
Outline of Annual Research Achievements |
In this academic year, our research efforts were focused along two axes. Along the first axis, self-supervised visual representations of the Generic Object ZSL (GOZ) Dataset images were proposed and compared with traditional supervised representations on the GOZ benchmark. The self-supervised representations tend to perform better on the standard zero-shot learning task, but do not match the supervised representations in the generalized zero-shot learning setting. A promising research question we have identified is how to close this gap: supervised representations cluster tightly and perform well on the training classes, whereas the more scattered self-supervised representations retain higher accuracy on the unseen test classes. The second axis concerns the computational efficiency of Convolutional Neural Network (CNN) training. Indeed, training CNNs on ImageNet-scale datasets is computationally very expensive, which hinders the investigation of different training and fine-tuning strategies. Towards that end, we have focused our efforts on reducing both the amount of computation and the memory footprint of CNN training, in order to enable larger-batch training and hence shorter training times.
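A back-of-the-envelope sketch of why reducing the memory footprint enables larger batches: if activation storage dominates GPU memory, halving the bytes per value (e.g. 16-bit instead of 32-bit storage) roughly doubles the batch that fits. All sizes below are hypothetical, and the model deliberately ignores weights and optimizer state:

```python
def max_batch_size(memory_budget_bytes, activations_per_sample, bytes_per_value):
    """Largest batch whose activation storage fits the budget
    (toy model: counts activation memory only)."""
    return memory_budget_bytes // (activations_per_sample * bytes_per_value)

BUDGET = 16 * 1024**3   # 16 GiB of GPU memory (hypothetical)
ACTS = 25_000_000       # activation values stored per sample (hypothetical)

fp32_batch = max_batch_size(BUDGET, ACTS, 4)  # 4 bytes per float32
fp16_batch = max_batch_size(BUDGET, ACTS, 2)  # 2 bytes per float16
print(fp32_batch, fp16_batch)                 # 171 343
```

The same budget fits roughly twice the batch at half the precision, which is one route (among others, such as recomputing activations) to the shorter training times mentioned above.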
Current Status of Research Progress |
3: Progress in research has been slightly delayed.
Reason
Zero-Shot Learning and Self-Supervised Learning of visual representations are two topics at the frontier of computer vision research. Catching up with recent research on self-supervision while evaluating its efficiency for Zero-Shot Learning has been a very time-consuming task.
Strategy for Future Research Activity |
In our final year of study, we plan to publish the results of our investigation, including a publication on GPU memory optimization and self-supervised visual representation learning.
Report (2 results)
Research Products (3 results)