Budget Amount |
¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000)
Fiscal Year 2025: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2024: ¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000)
|
Outline of Research at the Start |
Object detection and image captioning are related tasks, but each can recognize and describe objects that fall outside the scope of the other. This research pursues a more comprehensive and cohesive understanding of visual content by unifying the two tasks in a generative setting. We aim to develop a vision-language knowledge-base method that detects and describes not only the objects in the training dataset but also novel objects not seen during training.
|