Voice Conversion for People with Articulation Disorders

Research Project

Project/Area Number 14J04514
Research Category

Grant-in-Aid for JSPS Fellows

Allocation Type Single-year Grants
Section Domestic
Research Field Perceptual information processing
Research Institution Kobe University

Principal Investigator

Ryo Aihara  Kobe University, Graduate School of System Informatics, JSPS Research Fellow (DC1)

Project Period (FY) 2014-04-25 – 2017-03-31
Project Status Completed (Fiscal Year 2016)
Budget Amount
¥2,800,000 (Direct Cost: ¥2,800,000)
Fiscal Year 2016: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2015: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2014: ¥1,000,000 (Direct Cost: ¥1,000,000)
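Keywords Voice conversion / Support for people with disabilities / Discriminative training / Speech rhythm / Duration / Articulation disorder / Cerebral palsy / Athetosis / Speaker-independent / Speech support / Welfare for people with disabilities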
Outline of Annual Research Achievements

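Voice conversion is a technique that transforms one speaker's voice so that it sounds as if a different person were speaking. The goal of this research is to use voice conversion to make the unclear speech of people with articulation disorders caused by athetoid cerebral palsy easier to understand. Because voice conversion is a speech-to-speech system that requires no text recognition, it is considered easy to use even for people with speech impairments who also have limited movement in their limbs. This fiscal year, aiming to improve conversion accuracy, we worked on two tasks: "discriminative training" and "speech rhythm conversion."
Phoneme ambiguity has been identified as one cause of the unclear speech of people with articulation disorders. A phoneme is regarded as the smallest separable unit of speech. Because the articulatory organs of people with articulation disorders, such as the mouth and tongue, are impaired, their speech tends to be more ambiguous than that of unimpaired speakers. In the proposed method, we introduced a phoneme-discriminating model into the algorithm we have been developing for voice conversion for people with articulation disorders, improving it so that speech is converted more clearly. These results were presented at INTERSPEECH 2016, one of the world's largest international conferences on speech signal processing.
A characteristic of speech by people with articulation disorders is that it is unnaturally drawn out. Whereas the speech rhythm of unimpaired speakers is basically constant, that of people with articulation disorders varies greatly depending on the surrounding phonemes and the speaker's physical condition. This variation in speech rhythm has been one of the factors that make their speech hard to understand. There have been few examples of speech rhythm conversion so far; in voice conversion systems in particular, the input speaker's rhythm has almost always been used unchanged. We therefore proposed a new feature for converting speech rhythm and succeeded in bringing the rhythm closer to that of unimpaired speakers. These results were presented at meetings of the Acoustical Society of Japan and the Institute of Electronics, Information and Communication Engineers (IEICE), and a paper is currently under submission to INTERSPEECH 2017.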

Research Progress Status

Not entered because FY2016 is the final year.

Strategy for Future Research Activity

Not entered because FY2016 is the final year.

Report

(3 results)
  • 2016 Annual Research Report
  • 2015 Annual Research Report
  • 2014 Annual Research Report

Research Products

(45 results)

All Journal Article (10 results) (of which Peer Reviewed: 8 results,  Open Access: 2 results) Presentation (34 results) (of which Int'l Joint Research: 10 results) Book (1 results)

  • [Journal Article] Multiple Non-negative Matrix Factorization for Many-to-many Voice Conversion (2016)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Journal Title

      IEEE/ACM Transactions on Audio, Speech, and Language Processing

      Volume: 24 Issue: 7 Pages: 1175-1184

    • DOI

      10.1109/taslp.2016.2522643

    • Related Report
      2016 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Multiple Non-negative Matrix Factorization for Many-to-many Voice Conversion (2016)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Journal Title

      IEEE/ACM Trans. on Audio, Speech, and Language Processing

      Volume: PP Pages: 1-10

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Individuality-Preserving Voice Conversion for Articulation Disorders Using Phoneme-Categorized Exemplars (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Journal Title

      ACM Trans. on Accessible Computing; Special Issue on Speech and Language Processing for AT

      Volume: 6

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss (2015)

    • Author(s)
      Yuki Takashima, Yasuhiro Kakihara, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Nobuyuki Mitani, Kiyohiro Omori, Kaoru Nakazono
    • Journal Title

      IPSJ Trans. on Computer Vision and Applications

      Volume: 7 Pages: 64-68

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Multimodal voice conversion based on non-negative matrix factorization (2015)

    • Author(s)
      Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Journal Title

      EURASIP Journal on Audio, Speech, and Music Processing

      Volume: 2015:24 Issue: 1 Pages: 1-9

    • DOI

      10.1186/s13636-015-0067-4

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Small-parallel exemplar-based voice conversion in noisy environments using affine non-negative matrix factorization (2015)

    • Author(s)
      Ryo Aihara, Takao Fujii, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    • Journal Title

      EURASIP Journal on Audio, Speech, and Music Processing

      Volume: 2015:32 Issue: 1 Pages: 1-9

    • DOI

      10.1186/s13636-015-0075-4

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Individuality-preserving Voice Conversion for Articulation Disorders Using Phoneme-categorized Exemplars (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi and Yasuo Ariki
    • Journal Title

      Transactions on Accessible Computing

      Volume: TBD

    • Related Report
      2014 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Noise-Robust Voice Conversion Based on Sparse Spectral Mapping Using Non-negative Matrix Factorization (2014)

    • Author(s)
      Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi and Yasuo Ariki
    • Journal Title

      IEICE Transactions on Information and Systems

      Volume: E97-D Pages: 1411-1418

    • Related Report
      2014 Annual Research Report
    • Peer Reviewed
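  • [Journal Article] Individuality-Preserving Voice Conversion for People with Articulation Disorders Using Sparse Dictionary Learning (2014)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Journal Title

      IEICE Technical Report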

      Volume: 91 Pages: 39-44

    • NAID

      40020156739

    • Related Report
      2014 Annual Research Report
  • [Journal Article] Many-to-one Voice Conversion Using Multiple Non-negative Matrix Factorization (2014)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Journal Title

      IEICE Technical Report

      Volume: 114 Pages: 75-80

    • NAID

      110009850921

    • Related Report
      2014 Annual Research Report
  • [Presentation] Visual-to-Speech Conversion Based on Maximum Likelihood Estimation (2017)

    • Author(s)
      羅里奈
    • Organizer
      MVA2017, The Fifteenth IAPR International Conference on Machine Vision Applications
    • Place of Presentation
      Nagoya University, Nagoya, Japan
    • Year and Date
      2017-05-08
    • Related Report
      2016 Annual Research Report
    • Int'l Joint Research
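  • [Presentation] Phoneme-Discriminative Features for Voice Conversion (2017)

    • Author(s)
      Ryo Aihara
    • Organizer
      Acoustical Society of Japan (ASJ) 2017 Spring Meeting
    • Place of Presentation
      Meiji University, Kanagawa, Japan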
    • Year and Date
      2017-03-09
    • Related Report
      2016 Annual Research Report
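  • [Presentation] Speech Generation from Lip Motion Images Based on Maximum Likelihood Conversion (2017)

    • Author(s)
      羅里奈
    • Organizer
      Acoustical Society of Japan (ASJ) 2017 Spring Meeting
    • Place of Presentation
      Meiji University, Kanagawa, Japan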
    • Year and Date
      2017-03-09
    • Related Report
      2016 Annual Research Report
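  • [Presentation] Statistical Voice Conversion Including Duration for People with Articulation Disorders (2017)

    • Author(s)
      Ryo Aihara
    • Organizer
      IEICE Technical Committee on Speech (SP)
    • Place of Presentation
      Okinawa Industry Support Center, Okinawa, Japan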
    • Year and Date
      2017-03-01
    • Related Report
      2016 Annual Research Report
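  • [Presentation] Parallel Dictionary Learning Using Graph Embedding for Voice Conversion Based on Non-negative Matrix Factorization (2016)

    • Author(s)
      Ryo Aihara
    • Organizer
      Acoustical Society of Japan (ASJ) 2016 Autumn Meeting
    • Place of Presentation
      University of Toyama, Toyama, Japan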
    • Year and Date
      2016-09-14
    • Related Report
      2016 Annual Research Report
  • [Presentation] A Study of Voice Conversion Using Complex NMF (2016)

    • Author(s)
      李権俊
    • Organizer
      Acoustical Society of Japan (ASJ) 2016 Autumn Meeting
    • Place of Presentation
      University of Toyama, Toyama, Japan
    • Year and Date
      2016-09-14
    • Related Report
      2016 Annual Research Report
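  • [Presentation] A Study of Image Features in Multimodal Voice Conversion Using Non-negative Matrix Factorization (2016)

    • Author(s)
      羅里奈
    • Organizer
      Acoustical Society of Japan (ASJ) 2016 Autumn Meeting
    • Place of Presentation
      University of Toyama, Toyama, Japan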
    • Year and Date
      2016-09-14
    • Related Report
      2016 Annual Research Report
  • [Presentation] Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-embedded Non-negative Matrix Factorization (2016)

    • Author(s)
      Ryo Aihara
    • Organizer
      INTERSPEECH2016
    • Place of Presentation
      Hyatt Regency, San Francisco, USA
    • Year and Date
      2016-09-08
    • Related Report
      2016 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss (2016)

    • Author(s)
      Yuki Takashima
    • Organizer
      INTERSPEECH2016
    • Place of Presentation
      Hyatt Regency, San Francisco, USA
    • Year and Date
      2016-09-08
    • Related Report
      2016 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-embedded Non-negative Matrix Factorization (2016)

    • Author(s)
      Ryo Aihara
    • Organizer
      IEICE Technical Committee on Speech (SP)
    • Place of Presentation
      Kyoto University, Kyoto, Japan
    • Year and Date
      2016-08-24
    • Related Report
      2016 Annual Research Report
  • [Presentation] SEMI-NON-NEGATIVE MATRIX FACTORIZATION USING ALTERNATING DIRECTION METHOD OF MULTIPLIERS FOR VOICE CONVERSION (2016)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki
    • Organizer
      IEEE ICASSP 2016
    • Place of Presentation
      Shanghai, China
    • Year and Date
      2016-03-20
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Dysarthric Speech Modification Using Parallel Utterance Based on Non-negative Temporal Decomposition (2016)

    • Author(s)
      Ryo Aihara
    • Organizer
      SLPAT 2016, 7th Workshop on Speech and Language Processing for Assistive Technologies
    • Place of Presentation
      San Francisco, USA
    • Related Report
      2016 Annual Research Report
    • Int'l Joint Research
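  • [Presentation] Parallel Dictionary Learning for Voice Conversion Using the Alternating Direction Method of Multipliers (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      17th Spoken Language Symposium
    • Place of Presentation
      Nagoya Institute of Technology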
    • Year and Date
      2015-12-02
    • Related Report
      2015 Annual Research Report
  • [Presentation] Individuality-Preserving Voice Conversion for Articulation Disorders Using Phoneme-Categorized Exemplars (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki
    • Organizer
      The 17th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2015)
    • Place of Presentation
      Lisbon, Portugal
    • Year and Date
      2015-10-26
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
  • [Presentation] MANY-TO-ONE VOICE CONVERSION USING EXEMPLAR-BASED SPARSE REPRESENTATION (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki
    • Organizer
      IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA2015)
    • Place of Presentation
      New Paltz, USA
    • Year and Date
      2015-10-18
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Exemplar-based Voice Conversion for Arbitrary Speakers (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      IEICE Technical Committee on Speech (SP)
    • Place of Presentation
      Kobe University
    • Year and Date
      2015-10-15
    • Related Report
      2015 Annual Research Report
  • [Presentation] LIP-TO-SPEECH SYNTHESIS USING LOCALITY-CONSTRAINT NON-NEGATIVE MATRIX FACTORIZATION (2015)

    • Author(s)
      Ryo AIHARA, Kenta MASAKA, Tetsuya TAKIGUCHI, Yasuo ARIKI
    • Organizer
      The First International Workshop on Machine Learning in Spoken Language Processing (MLSLP)
    • Place of Presentation
      Aizu-Wakamatsu, Japan
    • Year and Date
      2015-09-19
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
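  • [Presentation] Many-to-many Voice Conversion Based on Multiple Non-negative Matrix Factorization (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      Acoustical Society of Japan (ASJ) 2015 Autumn Meeting
    • Place of Presentation
      University of Aizu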
    • Year and Date
      2015-09-16
    • Related Report
      2015 Annual Research Report
  • [Presentation] Many-to-many Voice Conversion Based on Multiple Non-negative Matrix Factorization (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki
    • Organizer
      INTERSPEECH 2015
    • Place of Presentation
      Dresden, Germany
    • Year and Date
      2015-09-06
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
  • [Presentation] NOISE-ROBUST VOICE CONVERSION USING A SMALL PARALLEL DATA BASED ON NON-NEGATIVE MATRIX FACTORIZATION (2015)

    • Author(s)
      Ryo Aihara, Takao Fujii, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      The 23rd European Signal Processing Conference (EUSIPCO)
    • Place of Presentation
      Nice, France
    • Year and Date
      2015-08-31
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
  • [Presentation] ACTIVITY-MAPPING NON-NEGATIVE MATRIX FACTORIZATION FOR EXEMPLAR-BASED VOICE CONVERSION (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      ICASSP2015
    • Place of Presentation
      Brisbane, Australia
    • Year and Date
      2015-04-21 – 2015-04-24
    • Related Report
      2014 Annual Research Report
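  • [Presentation] Many-to-one Voice Conversion Based on Multiple Non-negative Matrix Factorization (2015)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      Acoustical Society of Japan (ASJ) 2015 Spring Meeting
    • Place of Presentation
      Chuo University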
    • Year and Date
      2015-03-16 – 2015-03-18
    • Related Report
      2014 Annual Research Report
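  • [Presentation] Voice Conversion in Noisy Environments by Non-negative Matrix Factorization Using a Small Amount of Parallel Data (2015)

    • Author(s)
      Takao Fujii, Ryo Aihara, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      Acoustical Society of Japan (ASJ) 2015 Spring Meeting
    • Place of Presentation
      Chuo University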
    • Year and Date
      2015-03-16 – 2015-03-18
    • Related Report
      2014 Annual Research Report
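  • [Presentation] Speech Generation from Lip Motion Images Based on Non-negative Matrix Factorization (2015)

    • Author(s)
      Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      Acoustical Society of Japan (ASJ) 2015 Spring Meeting
    • Place of Presentation
      Chuo University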
    • Year and Date
      2015-03-16 – 2015-03-18
    • Related Report
      2014 Annual Research Report
  • [Presentation] Exemplar-based Emotional Voice Conversion Using Non-negative Matrix Factorization (2014)

    • Author(s)
      Ryo AIHARA, Reina UEDA, Tetsuya TAKIGUCHI, Yasuo ARIKI
    • Organizer
      APSIPA2014
    • Place of Presentation
      Siem Reap, Cambodia
    • Year and Date
      2014-12-09 – 2014-12-12
    • Related Report
      2014 Annual Research Report
  • [Presentation] Multimodal Exemplar-based Voice Conversion using Lip Features in Noisy Environments (2014)

    • Author(s)
      Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      Interspeech2014
    • Place of Presentation
      Singapore
    • Year and Date
      2014-09-14 – 2014-09-18
    • Related Report
      2014 Annual Research Report
  • [Presentation] Error Correction of Automatic Speech Recognition Based on Normalized Web Distance (2014)

    • Author(s)
      E. Byambakhishig, K. Tanaka, R. Aihara, T. Nakashika, T. Takiguchi, Y. Ariki
    • Organizer
      Interspeech2014
    • Place of Presentation
      Singapore
    • Year and Date
      2014-09-14 – 2014-09-18
    • Related Report
      2014 Annual Research Report
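  • [Presentation] Voice Conversion Using Non-negative Matrix Factorization with Activity Mapping (2014)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      Acoustical Society of Japan (ASJ) 2014 Autumn Meeting
    • Place of Presentation
      Hokkai-Gakuen University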
    • Year and Date
      2014-09-03 – 2014-09-05
    • Related Report
      2014 Annual Research Report
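  • [Presentation] Voice Conversion in Noisy Environments by NMF Using Speaker Adaptation (2014)

    • Author(s)
      Takao Fujii, Ryo Aihara, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      Acoustical Society of Japan (ASJ) 2014 Autumn Meeting
    • Place of Presentation
      Hokkai-Gakuen University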
    • Year and Date
      2014-09-03 – 2014-09-05
    • Related Report
      2014 Annual Research Report
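  • [Presentation] Multimodal NMF Voice Conversion Using High-Speed Camera Images (2014)

    • Author(s)
      Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      Acoustical Society of Japan (ASJ) 2014 Autumn Meeting
    • Place of Presentation
      Hokkai-Gakuen University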
    • Year and Date
      2014-09-03 – 2014-09-05
    • Related Report
      2014 Annual Research Report
  • [Presentation] Individuality-preserving Voice Conversion for Articulation Disorders Using Dictionary Selective Non-negative Matrix Factorization (2014)

    • Author(s)
      Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      SLPAT 2014, 5th Workshop on Speech and Language Processing for Assistive Technologies
    • Place of Presentation
      Baltimore, U.S.
    • Year and Date
      2014-06-26
    • Related Report
      2014 Annual Research Report
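  • [Presentation] A Method for Correcting Speech Recognition Errors Using Normalized Web Distance (2014)

    • Author(s)
      E. Byambakhishig, K. Tanaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    • Organizer
      28th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI)
    • Place of Presentation
      Ehime Prefectural Convention Hall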
    • Year and Date
      2014-05-12 – 2014-05-15
    • Related Report
      2014 Annual Research Report
  • [Presentation] VOICE CONVERSION BASED ON NON-NEGATIVE MATRIX FACTORIZATION USING PHONEME-CATEGORIZED DICTIONARY (2014)

    • Author(s)
      Ryo AIHARA, Toru NAKASHIKA, Tetsuya TAKIGUCHI, Yasuo ARIKI
    • Organizer
      ICASSP2014
    • Place of Presentation
      Florence, Italy
    • Year and Date
      2014-05-04 – 2014-05-09
    • Related Report
      2014 Annual Research Report
  • [Presentation] MULTIMODAL VOICE CONVERSION USING NON-NEGATIVE MATRIX FACTORIZATION IN NOISY ENVIRONMENTS (2014)

    • Author(s)
      Kenta MASAKA, Ryo AIHARA, Tetsuya TAKIGUCHI, Yasuo ARIKI
    • Organizer
      ICASSP2014
    • Place of Presentation
      Florence, Italy
    • Year and Date
      2014-05-04 – 2014-05-09
    • Related Report
      2014 Annual Research Report
  • [Book] Computer and Information Science (2016)

    • Author(s)
      Roger Lee (Editor), Ryo Aihara, Kenta Masaka, Tetsuya Takiguchi, Yasuo Ariki
    • Total Pages
      181
    • Publisher
      Springer International Publishing
    • Related Report
      2016 Annual Research Report

Published: 2015-01-22   Modified: 2024-03-26  
