2019 Fiscal Year Annual Research Report

A Computational Model of Music Understanding Based on Statistical Grammar and Constructive Semantics

Research Project

Project/Area Number	16H01744
Research Institution	Japan Advanced Institute of Science and Technology
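Principal Investigator	Satoshi Tojo (東条敏), Japan Advanced Institute of Science and Technology, Graduate School of Advanced Science and Technology, Professor (90272989)
Co-Investigator(Kenkyū-buntansha)	Tetsuro Kitahara (北原鉄朗), Nihon University, College of Humanities and Sciences, Associate Professor (00454710); Kazuyoshi Yoshii (吉井和佳), Kyoto University, Graduate School of Informatics, Associate Professor (20510001); Keiji Hirata (平田圭二), Future University Hakodate, School of Systems Information Science, Professor (30396121); Masatoshi Hamanaka (浜中雅俊), RIKEN, Center for Advanced Intelligence Project, Team Leader (30451686); Katashi Nagao (長尾確), Nagoya University, Graduate School of Informatics, Professor (70343209); Hidefumi Ohmura (大村英史), Tokyo University of Science, Faculty of Science and Technology, Department of Information Sciences, Assistant Professor (90645277); Masaki Matsubara (松原正樹), University of Tsukuba, Faculty of Library, Information and Media Science, Assistant Professor (90714494)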
Project Period (FY)	2016-04-01 – 2021-03-31
Keywords	Music information processing / Tree structure / Time-series processing / Grammar theory / Harmonic analysis
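Outline of Annual Research Achievements	For grammar discovery, we continue to apply statistical machine-learning methods alongside analytical methods such as GTTM (Generative Theory of Tonal Music); in FY2019 we accumulated music-structure-tree analyses made by musicians to serve as a basis for comparison. Most notably, for syntax-tree generation we arrived at a model that builds the tree incrementally as the piece progresses, and we proposed a tree-generation method consistent with our cognitive model. We also proposed annotating these trees with logical symbols that express predictions. Using logical symbols accords with constructive semantics, that is, the idea that integrating the information of subtrees forms the gestalt of the whole. We also advanced research on automatic music transcription systems that, by taking a music language model into account, output musically plausible scores. Specifically, for automatic transcription of singing voice and drums, we attempted to model the complex structure inherent in note sequences and drum patterns with deep generative models. In addition, we realized generation of bass parts from harmony using hidden Markov models, a rule-based automatic arrangement system, and related systems. These music models can be extended to more general cognitive models. In FY2019 we developed a system that generates Gaussian-based pitch distributions in a pitch lattice space from the pitch information of existing pieces, and uses them to analyze those pieces and generate similar ones. We also proposed a musical representation of multiple time-series data that takes into account auditory gestalt, one of the characteristics of human cognition. These methods were also applied to media other than music: we collected and analyzed various kinds of meta-information on the time series of utterances in meetings and extracted the important utterances in discussions. Furthermore, using machine learning, we devised a method in which a neural network with oscillatory activation functions learns and generates arbitrary signals, and conducted a feasibility study of it.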
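The HMM-based generation of a bass part from harmony can be illustrated with a minimal sketch. It assumes, which the report does not state, that the hidden states are bass notes and the observations are chord symbols, and it decodes the most probable bass line with the Viterbi algorithm. All names (`viterbi`, `NOTES`, `bass_line`) and all probabilities are hypothetical toy values chosen only for illustration.

```python
import math
from typing import Dict, List


def viterbi(chords: List[str], notes: List[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the most probable bass-note sequence for the observed chords."""
    # score[t][n]: best log-probability of any path ending in note n at step t
    score = [{n: math.log(start_p[n]) + math.log(emit_p[n][chords[0]])
              for n in notes}]
    back = []  # back[t-1][n]: best predecessor of note n at step t
    for chord in chords[1:]:
        prev = score[-1]
        col, ptr = {}, {}
        for n in notes:
            best_prev = max(notes,
                            key=lambda p: prev[p] + math.log(trans_p[p][n]))
            col[n] = (prev[best_prev] + math.log(trans_p[best_prev][n])
                      + math.log(emit_p[n][chord]))
            ptr[n] = best_prev
        score.append(col)
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    last = max(notes, key=lambda n: score[-1][n])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy model (assumption): three bass notes, three chords with the same names.
NOTES = ["C", "F", "G"]
start_p = {n: 1 / 3 for n in NOTES}
# A bass note tends slightly to repeat itself.
trans_p = {p: {n: (0.4 if p == n else 0.3) for n in NOTES} for p in NOTES}
# A bass note is most likely heard under the chord built on it.
emit_p = {n: {c: (0.8 if c == n else 0.1) for c in NOTES} for n in NOTES}

bass_line = viterbi(["C", "F", "G", "C"], NOTES, start_p, trans_p, emit_p)
print(bass_line)
```

With these toy values the emission probabilities dominate, so the decoded bass line follows the chord roots. A trained model would estimate the three probability tables from data instead.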
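Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: Making the incremental construction model concrete for tree-structure generation of musical pieces is an important advance for this project. Rather than passing the whole piece to the parser at once, the incremental construction model builds the syntax tree over time as the music is heard, as in human music cognition; it is a first step toward integrating grammar theory with cognitive models. Moreover, using logical symbols allowed the integrated information of subtrees to be reflected compositionally in the meaning of the whole. We also developed a Web-based tool that facilitates tree-structure analysis of music and built a system that demonstrates uses of musical structure. We have long obtained results on verifying the similarity of tree structures; we examined an approach that represents trees probabilistically, defines similarity as a distance between probability distributions, and defines a relative pseudo-complement. For grammar acquisition we draw on machine-learning results: for several automatic composition and arrangement tasks, we formulated and implemented systems using an approach suited to each task, such as probabilistic models or rule-based methods. Because the formulation differs from task to task, however, a formulation from a more unified viewpoint is desirable. We also made progress on techniques that use deep generative models to learn, without supervision, the grammar rules inherent in large amounts of score data. In building cognitive models, we analyzed existing pieces and generated Gaussian-based probability distributions in lattice spaces of pitch and rhythm. As an extension of tree-based cognitive music theory, we worked on formalizing short-term memory and dynamic listening models. As an application of these cognitive models we analyzed meeting records: we developed a system called the Meeting Recorder, operated it for one year, and measured facial features and heart rate in addition to what was said during meetings. At the same time we assigned mental-state labels such as "concentration" and "confusion" to the speakers.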
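The idea of incremental construction, consuming events one at a time rather than parsing the whole piece at once, can be sketched as follows. This is only an illustration under stated assumptions and not the project's model: the merge rule (two adjacent subtrees of equal duration are joined, and the more salient child becomes the head) and all names (`Node`, `IncrementalParser`, `bracket`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    head: str                      # pitch name of the most salient event in the span
    salience: float                # toy importance score of the head
    duration: int                  # span length in beats
    left: Optional["Node"] = None
    right: Optional["Node"] = None


class IncrementalParser:
    """Keeps a stack of partial trees and updates it after every event."""

    def __init__(self) -> None:
        self.stack: List[Node] = []

    def feed(self, pitch: str, salience: float, duration: int = 1) -> List[Node]:
        """Consume one event and return the current partial trees."""
        self.stack.append(Node(pitch, salience, duration))
        # Toy rule (assumption): merge the two rightmost subtrees whenever
        # they span the same duration; the more salient child is the head.
        while (len(self.stack) >= 2
               and self.stack[-1].duration == self.stack[-2].duration):
            right = self.stack.pop()
            left = self.stack.pop()
            winner = left if left.salience >= right.salience else right
            self.stack.append(Node(winner.head, winner.salience,
                                   left.duration + right.duration, left, right))
        return list(self.stack)


def bracket(node: Node) -> str:
    """Render a tree as a bracketed string, e.g. (C: C D)."""
    if node.left is None or node.right is None:
        return node.head
    return f"({node.head}: {bracket(node.left)} {bracket(node.right)})"


parser = IncrementalParser()
snapshots = []
for pitch, salience in [("C", 3.0), ("D", 1.0), ("E", 2.0), ("G", 1.0)]:
    trees = parser.feed(pitch, salience)
    snapshots.append([bracket(t) for t in trees])

for step, view in enumerate(snapshots, start=1):
    print(step, view)
```

Each snapshot shows what the listener-side parser holds after that event. The unattached subtrees on the stack are where prediction annotations of the kind described above could be placed.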
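Strategy for Future Research Activity	Because FY2020 is the final year of the research plan, we set the following concrete goals, pursued from the standpoints of tree construction and linguistic insight. First, for tree construction, we will implement incremental tree building along the time series and show that the tree's predictions agree with cognitive musical expectation. To this end, we will annotate the tree's predictions with logical symbols. We will also continue to build up the score database and improve the quality of composition and arrangement using more modern methods such as LSTM. By raising the accuracy of tree-structure estimation and refining the similarity-verification algorithm, we will realize a reliable music retrieval system. To facilitate tree-structure analysis of music, we will build a system that acquires structure trees automatically using deep learning. We will deepen the technique of unsupervised learning of grammar rules from large amounts of score data using deep generative models, and also work on modeling the other musical elements (key, chord, and beat) comprehensively and jointly. To understand human music cognition, we will generate Gaussian-based probability distributions in the lattice spaces of pitch and rhythm, relate them to expectation, and formalize musical expectation on the basis of IR theory and related theories. As an extension of tree-based cognitive music theory, we will work on formalizing short-term memory and dynamic listening models and, to evaluate the formalization, examine by a computational approach whether the models have cognitive reality. To show that tree structures are applicable to time-series data, we will carry out discourse-structure understanding, assigning mental-state labels together with utterances in meetings and clarifying their relation to the importance of utterances. We will host the international conference International Symposium on Computer Music Multidisciplinary Research (CMMR2020) in Japan in the autumn and present these results in a special session. We will also invite leading researchers in this field from abroad to evaluate the work.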

Research Products
(45 results)

Int'l Joint Research (2 results) Journal Article (5 results) (of which Peer Reviewed: 4 results, Open Access: 2 results) Presentation (36 results) (of which Int'l Joint Research: 30 results, Invited: 1 result) Book (2 results)

[Int'l Joint Research] Digital and Cognitive Musicology Lab/EPFL/Lausanne (Switzerland)
- Country Name
  SWITZERLAND
- Counterpart Institution
  Digital and Cognitive Musicology Lab/EPFL/Lausanne
[Int'l Joint Research] Universidad de Alicante/Alicante (Spain)
- Country Name
  SPAIN
- Counterpart Institution
  Universidad de Alicante/Alicante
[Journal Article] Statistical Learning and Estimation of Piano Fingering (2020)
- Author(s)
  Eita Nakamura, Yasuyuki Saito, Kazuyoshi Yoshii
- Journal Title
  Information Sciences
  Volume: 517 Pages: 68-85
- DOI
  10.1016/j.ins.2019.12.068
- Peer Reviewed
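[Journal Article] From Syntactic Parsing of Language to Structural Analysis of Music (言語の構文解析から音楽の構造分析へ) (2020)
- Author(s)
  平田圭二, 東条敏
- Journal Title
  Journal of Music Perception and Cognition (音楽知覚認知研究)
  Volume: 25 Pages: 29-39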
[Journal Article] A Mobile Application That Promotes Reading through Game-like Presentation (ゲーム風演出で読書を促進するモバイルアプリケーション) (2019)
- Author(s)
  草野有沙, 西由佳梨, 北原鉄朗
- Journal Title
  IPSJ Journal (情報処理学会論文誌)
  Volume: 60 Pages: 1978-1982
- Peer Reviewed
[Journal Article] HamoKara: A System that Enables Amateur Singers to Practice Backing Vocals for Karaoke (2019)
- Author(s)
  Mina Shiraishi, Kozue Ogasawara, and Tetsuro Kitahara
- Journal Title
  Journal of Information Processing
  Volume: 27 Pages: 683-692
- DOI
  10.2197/ipsjjip.27.683
- Peer Reviewed / Open Access
[Journal Article] A Non-notewise Melody Editing Method for Supporting Musically Untrained People's Music Composition (2019)
- Author(s)
  Yusuke Tsuchiya, Tetsuro Kitahara
- Journal Title
  Journal of Creative Music Systems
  Volume: 3 Pages: 1-25
- DOI
  10.5920/jcms.624
- Peer Reviewed / Open Access
[Presentation] Audio-guided Video Interpolation via Human Pose Features (2020)
- Author(s)
  Takayuki Nakatsuka, Masatoshi Hamanaka, Shigeo Morishima
- Organizer
  15th International Conference on Computer Vision Theory and Applications
- Int'l Joint Research
[Presentation] Reading Students’ Multiple Mental States in Conversation from Facial and Heart Rate Cues (2020)
- Author(s)
  Shimeng Peng, Shigeki Ohira and Katashi Nagao
- Organizer
  12th International Conference on Computer Supported Education (CSEDU 2020)
- Int'l Joint Research
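[Presentation] A Score-Following System That Adapts to Skill Differences by Integrating Multimodal Information (マルチモーダル情報の統合により技能差に適応する楽譜追跡システム) (2020)
- Author(s)
  能登楓, 竹川佳成, 平田圭二
- Organizer
  IPSJ Special Interest Group on Music and Computer (SIGMUS)
[Presentation] Proposal of a Piano Learning Support System That Predicts Learner Proficiency (学習者の熟達度を予測するピアノ学習支援システムの提案) (2020)
- Author(s)
  松井遼太, 竹川佳成, 平田圭二, 柳沢豊
- Organizer
  IPSJ Special Interest Group on Music and Computer (SIGMUS)
[Presentation] Automatic Estimation of Vibrato Parameters in Synthesized Voice (合成音声におけるヴィブラートのパラメータ自動推定) (2020)
- Author(s)
  田中瑞穂, 竹川佳成, 平田圭二
- Organizer
  IPSJ Special Interest Group on Music and Computer (SIGMUS)
[Presentation] Construction of a Musical Tension Model by Principal Component Regression and Identification of Features (主成分回帰による音楽的緊張モデルの構築と特徴量の同定) (2020)
- Author(s)
  樋口梨花, 竹川佳成, 平田圭二
- Organizer
  IPSJ Special Interest Group on Music and Computer (SIGMUS)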
[Presentation] Development of Agents that Create Melodies based on Estimating Gaussian Functions in the Pitch Space of Consonance (2020)
- Author(s)
  Hidefumi Ohmura, Takuro Shibayama, Keiji Hirata, and Satoshi Tojo
- Organizer
  HAMT, 12th International Conference on Agents and Artificial Intelligence
- Int'l Joint Research
[Presentation] Progressive Training in Recurrent Neural Networks for Chord Progression Modeling (2020)
- Author(s)
  Trung-Kien Vu, Teeradaj Racharak, Satoshi Tojo, Nguyen Ha Thanh, Nguyen Le Minh
- Organizer
  12th International Conference on Agents and Artificial Intelligence
- Int'l Joint Research
[Presentation] Generating Walking Bass Lines with HMM (2019)
- Author(s)
  Ayumi Shiga and Tetsuro Kitahara
- Organizer
  The 14th International Symposium on Computer Music Multidisciplinary Research (CMMR 2019)
- Int'l Joint Research
[Presentation] An Investigation towards Verbally Controllable Equalizer for Singing Voices (2019)
- Author(s)
  Seiya Masuda, Eriko Aiba, and Tetsuro Kitahara
- Organizer
  The 5th Workshop on Intelligent Music Production (WIMP 2019)
- Int'l Joint Research
[Presentation] Statistical Music Structure Analysis Based on a Homogeneity-, Repetitiveness-, and Regularity-Aware Hierarchical Hidden Semi-Markov Model (2019)
- Author(s)
  Go Shibata, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii
- Organizer
  International Society for Music Information Retrieval Conference (ISMIR)
- Int'l Joint Research
[Presentation] Blending Acoustic and Language Model Predictions for Automatic Music Transcription (2019)
- Author(s)
  Adrien Ycart, Andrew McLeod, Emmanouil Benetos, Kazuyoshi Yoshii
- Organizer
  International Society for Music Information Retrieval Conference (ISMIR)
- Int'l Joint Research
[Presentation] End-to-End Melody Note Transcription Based on a Beat-Synchronous Attention Mechanism (2019)
- Author(s)
  Ryo Nishikimi, Eita Nakamura, Masataka Goto, Kazuyoshi Yoshii
- Organizer
  IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- Int'l Joint Research
[Presentation] Joint Singing Pitch Estimation and Voice Separation Based on a Neural Harmonic Structure Renderer (2019)
- Author(s)
  Tomoyasu Nakano, Kazuyoshi Yoshii, Yiming Wu, Ryo Nishikimi, Kin Wah Edward Lin, Masataka Goto
- Organizer
  IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- Int'l Joint Research
[Presentation] Multi-Step Chord Sequence Prediction Based on Aggregated Multi-Scale Encoder-Decoder Networks (2019)
- Author(s)
  Tristan Carsault, Andrew McLeod, Philippe Esling, Jerome Nika, Eita Nakamura, Kazuyoshi Yoshii
- Organizer
  IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
- Int'l Joint Research
[Presentation] Automatic Chord Estimation Based on a Frame-wise Convolutional Recurrent Neural Network with Non-Aligned Annotations (2019)
- Author(s)
  Yiming Wu, Tristan Carsault, Kazuyoshi Yoshii
- Organizer
  European Signal Processing Conference (EUSIPCO)
- Int'l Joint Research
[Presentation] Automatic Singing Transcription Based on Encoder-Decoder Recurrent Neural Networks with a Weakly-Supervised Attention Mechanism (2019)
- Author(s)
  Ryo Nishikimi, Eita Nakamura, Satoru Fukayama, Masataka Goto, Kazuyoshi Yoshii
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Int'l Joint Research
[Presentation] Joint Transcription of Lead, Bass, and Rhythm Guitars Based on a Factorial Hidden Semi-Markov Model (2019)
- Author(s)
  Kentaro Shibata, Ryo Nishikimi, Satoru Fukayama, Masataka Goto, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Int'l Joint Research
[Presentation] Bayesian Drum Transcription Based on Nonnegative Matrix Factor Decomposition with a Deep Score Prior (2019)
- Author(s)
  Shun Ueda, Kentaro Shibata, Yusuke Wada, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Int'l Joint Research
[Presentation] Unsupervised Melody Style Conversion (2019)
- Author(s)
  Eita Nakamura, Kentaro Shibata, Ryo Nishikimi, Kazuyoshi Yoshii
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Int'l Joint Research
[Presentation] Improved Metrical Alignment of MIDI Performance Based on a Repetition-Aware Online-Adapted Grammar (2019)
- Author(s)
  Andrew McLeod, Eita Nakamura, Kazuyoshi Yoshii
- Organizer
  IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Int'l Joint Research
[Presentation] Melody Slot Machine (2019)
- Author(s)
  Masatoshi Hamanaka, Takayuki Nakatsuka, Shigeo Morishima
- Organizer
  ACM SIGGRAPH 2019 Emerging Technologies ET-245
- Int'l Joint Research
[Presentation] Melody Slot Machine: A Controllable Holographic Virtual Performer (2019)
- Author(s)
  Masatoshi Hamanaka
- Organizer
  Proceedings of the 27th ACM International Conference on Multimedia (MM’19)
- Int'l Joint Research
[Presentation] Melody Slot Machine: Melody Morphing by Using Time-span Tree of GTTM (2019)
- Author(s)
  Masatoshi Hamanaka
- Organizer
  International Computer Music Conference (ICMC2019)
- Int'l Joint Research
[Presentation] Proposal of an Annotation Method for Integrating Musical Technique Knowledge Using a GTTM Time-Span Tree (2019)
- Author(s)
  Nami Iino, Mayumi Shimada, Takuichi Nishimura, Hideki Takeda, Masatoshi Hamanaka
- Organizer
  Proceedings of the 25th International Conference on MultiMedia Modeling (MMM2019)
- Int'l Joint Research
[Presentation] Discussion-skill Analytics with Acoustic, Linguistic and Psychophysiological Data (2019)
- Author(s)
  Katashi Nagao, Kosuke Okamoto, Shimeng Peng, Shigeki Ohira
- Organizer
  11th International Conference on Knowledge Discovery and Information Retrieval (KDIR 2019)
- Int'l Joint Research
[Presentation] AI-Powered Education: Smart Learning Environment with Large Interactive Displays (2019)
- Author(s)
  Katashi Nagao
- Organizer
  International Display Workshops 2019
- Int'l Joint Research / Invited
[Presentation] Feasibility Study of Deep Frequency Modulation Synthesis (2019)
- Author(s)
  Keiji Hirata, Masatoshi Hamanaka, Satoshi Tojo
- Organizer
  Proceedings of the 14th International Symposium on Computer Music Multidisciplinary Research (CMMR 2019)
- Int'l Joint Research
[Presentation] Adaptive Score-Following System by Integrating Gaze Information (2019)
- Author(s)
  Kaede Noto, Yoshinari Takegawa, and Keiji Hirata
- Organizer
  Proceedings of 16th Sound and Music Computing Conference (SMC 2019)
- Int'l Joint Research
[Presentation] New Implementation Method for Generalized Frequency Modulation Synthesizer (2019)
- Author(s)
  Keiji Hirata
- Organizer
  The 20th International Society for Music Information Retrieval Conference (ISMIR 2019)
- Int'l Joint Research
[Presentation] Auditory Gestalt Formation for Exploring Dynamic Triggering Earthquakes (2019)
- Author(s)
  Matsubara, M., Uchide, T. and Morimoto, Y.
- Organizer
  14th International Symposium on Computer Music Multidisciplinary Research (CMMR2019)
- Int'l Joint Research
[Presentation] Modal Logic for Tonal Music (2019)
- Author(s)
  Satoshi Tojo
- Organizer
  14th International Symposium on Computer Music Multidisciplinary Research (CMMR2019)
- Int'l Joint Research
[Presentation] Chord Function Identification with Modulation Detection Based on HMM (2019)
- Author(s)
  Yui Uehara, Eita Nakamura, and Satoshi Tojo
- Organizer
  14th International Symposium on Computer Music Multidisciplinary Research (CMMR2019)
- Int'l Joint Research
[Presentation] Music Temperaments Evaluation Based on Triads (2019)
- Author(s)
  Tong Meihui and Satoshi Tojo
- Organizer
  The 16th Sound and Music Computing Conference
- Int'l Joint Research
[Presentation] Chord Function Identification with Modulation Detection Based on HMM (2019)
- Author(s)
  Yui Uehara, Eita Nakamura, and Satoshi Tojo
- Organizer
  IPSJ Special Interest Group on Music and Computer (SIGMUS)
[Presentation] Jazz harmony analysis based on Tonal Pitch Space (2019)
- Author(s)
  Hiroyuki Yamamoto, Satoshi Tojo
- Organizer
  IPSJ Special Interest Group on Music and Computer (SIGMUS)
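[Book] Artificial Intelligence Creates Music (人工知能が音楽を創る) (2019)
- Author(s)
  David Cope, 平田圭二 (supervising translator), 今井慎太郎, 大村英史, 東条敏 (translators)
- Total Pages
  443
- Publisher
  音楽之友社
- ISBN
  978-4-276-21413-2
[Book] Encyclopedia of Artificial Intelligence (人工知能事典) (2019)
- Author(s)
  中島秀之 et al.
- Total Pages
  384
- Publisher
  近代科学社
- ISBN
  978-4-7649-0604-4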
