Query-by-Singing music information Retrieval system supporting various singing style
Project/Area Number |
18K11321
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 60080:Database-related
|
Research Institution | Osaka Institute of Technology |
Principal Investigator |
|
Project Period (FY) |
2018-04-01 – 2023-03-31
|
Project Status |
Completed (Fiscal Year 2022)
|
Budget Amount |
¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2021: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2020: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2019: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2018: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
|
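Keywords | Music retrieval / Singing voice recognition / Retrieval robust to lyric errors / Misremembered lyrics / Query-by-Singing / Music retrieval system / Integration of retrieval scores / Large-vocabulary language model / Retrieval by phoneme sequence / Lyric recognition / Onomatopoeic singing |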
Outline of Final Research Achievements |
In this research, we developed several component technologies aimed at building a music retrieval system that takes singing voice as input. First, we developed a method for recognizing singing voice with high accuracy. Since a note in singing generally corresponds to one mora, the method uses note boundary information to improve accuracy. Next, we developed a retrieval method that is robust to errors in lyrics. For recognition errors, we used phoneme sequences instead of word sequences; for human memory errors, we improved retrieval accuracy by reflecting error tendencies in the retrieval score. Finally, we studied a method for combining the retrieval results obtained from melody and from lyrics. We tried to improve accuracy by matching the retrieval positions, but did not achieve the expected results.
|
Academic Significance and Societal Importance of the Research Achievements |
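This research made a major contribution to improving the accuracy of singing voice recognition. Singing voice has long been known to be difficult to recognize, but the only countermeasures proposed so far amounted to adapting the acoustic model and the language model. This research introduced the new idea of using note boundary times and substantially improved recognition accuracy. It also showed for the first time that silent intervals can be inserted at arbitrary positions in singing voice, and that this was a major cause of degraded recognition performance. A further significant contribution is that, in lyric-based music retrieval, we addressed not only recognition errors but also human memory errors, and improved retrieval accuracy by handling them appropriately.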
|
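The outline above mentions retrieving songs from phoneme sequences rather than word sequences and reflecting error tendencies in the retrieval score. As an illustration only, the following is a minimal sketch of one generic way to do this: a weighted edit distance that lets the query match any contiguous part of a song's lyric phonemes. It is not the project's actual method. The function names, the toy database, and the substitution cost table `sub_cost` are all hypothetical and introduced here for illustration.

```python
def phoneme_distance(query, lyric, sub_cost=None):
    """Smallest edit cost of aligning `query` to any contiguous part of `lyric`.

    `sub_cost` is a hypothetical table mapping (query_phoneme, lyric_phoneme)
    to a substitution cost below 1.0 for confusions assumed to be common.
    """
    sub_cost = sub_cost or {}
    # Row 0 is all zeros, so a match may start at any position in the lyric.
    prev = [0.0] * (len(lyric) + 1)
    for i, q in enumerate(query, start=1):
        curr = [float(i)] + [0.0] * len(lyric)
        for j, p in enumerate(lyric, start=1):
            sub = 0.0 if q == p else sub_cost.get((q, p), 1.0)
            curr[j] = min(
                prev[j - 1] + sub,  # match or substitution
                prev[j] + 1.0,      # query phoneme with no counterpart in the lyric
                curr[j - 1] + 1.0,  # lyric phoneme skipped by the query
            )
        prev = curr
    # Taking the minimum over the last row lets the match end anywhere.
    return min(prev)


def rank_songs(query, database, sub_cost=None):
    """Return (distance, title) pairs sorted from best to worst match."""
    scored = [
        (phoneme_distance(query, phonemes, sub_cost), title)
        for title, phonemes in database.items()
    ]
    return sorted(scored)


if __name__ == "__main__":
    # Toy data: phoneme strings are made up for this example.
    database = {
        "song_a": "k a e r u n o u t a g a".split(),
        "song_b": "s a k u r a s a k u r a".split(),
    }
    query = "k a e r u n o u d a".split()  # "t" mis-recognized as "d"
    print(rank_songs(query, database, sub_cost={("d", "t"): 0.3}))
```

Lowering the cost of specific substitutions is one simple way to encode an error tendency in the score; the value 0.3 is an arbitrary placeholder, and in practice such costs would have to be estimated from observed recognition or memory errors.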
Report
(6 results)
Research Products
(7 results)