Budget Amount |
¥16,250,000 (Direct Cost: ¥12,500,000, Indirect Cost: ¥3,750,000)
Fiscal Year 2020: ¥3,510,000 (Direct Cost: ¥2,700,000, Indirect Cost: ¥810,000)
Fiscal Year 2019: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2018: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2017: ¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000)
|
Outline of Final Research Achievements |
Building an advanced retrieval system or an intelligent knowledge extraction system that handles large collections of video content requires adequate semantic annotation of that content. Toward this ultimate goal, this study investigated fundamental technologies that combine vision and language processing. More specifically, we developed effective yet efficient scene graph generation systems and an action captioning system. Empirical results show that the resulting systems generally outperformed the systems they were compared against. The scene graph generation systems yield information structures suited to computer processing, while the action captioning system yields output suited to human consumption.
|