A perceptual model of speech based on real-time speaker adaptation

Research Project

Project/Area Number	21700282
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Cognitive science
Research Institution	Tohoku Institute of Technology (2010-2011); Tohoku University (2009)
Principal Investigator	ITO Masashi, Tohoku Institute of Technology, Department of Intelligent Electronics, Lecturer (00436164)
Project Period (FY)	2009 – 2011
Project Status	Completed (Fiscal Year 2011)
Budget Amount	¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000); Fiscal Year 2011: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2010: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000); Fiscal Year 2009: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Keywords	speech / speaker adaptation / cognitive model / perceptual experiment / speech perception / formant / vowel / sinusoidal model / phonetics / cognitive science
Research Abstract	Perceptual experiments indicated that speakers uttering different vowels could be correctly identified with an accuracy of more than 80%. Based on an analysis of speech signals from 632 speakers, a new analysis method was proposed that builds on the sinusoidal representation of speech signals. Further, the cosine expansion of speech spectra and quadratic combinations of the resulting coefficients were shown to be effective features for vowel perception. This result supports the hypothesis that perceptual features for vowels might be extracted by a two-step synaptic combination in the auditory periphery.
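
As a rough illustration of the kind of feature named in the abstract, the sketch below expands a sampled log spectrum in cosine basis functions and then forms pairwise products of the coefficients. It is not the project's actual method: the basis (a DCT-II style cosine series), the scaling, and the function names `cosine_expansion` and `quadratic_features` are all assumptions made for this example.

```python
import math


def cosine_expansion(log_spectrum, n_coeffs):
    """Return the first n_coeffs cosine-series coefficients of a sampled spectrum.

    Assumption: a DCT-II style basis cos(pi * k * (i + 0.5) / N), scaled so that
    a spectrum equal to one basis function yields a coefficient of 1.
    """
    n = len(log_spectrum)
    coeffs = []
    for k in range(n_coeffs):
        total = sum(
            value * math.cos(math.pi * k * (i + 0.5) / n)
            for i, value in enumerate(log_spectrum)
        )
        scale = 1.0 / n if k == 0 else 2.0 / n
        coeffs.append(scale * total)
    return coeffs


def quadratic_features(coeffs):
    """Return all products c_i * c_j with i <= j (a hypothetical feature set)."""
    size = len(coeffs)
    return [coeffs[i] * coeffs[j] for i in range(size) for j in range(i, size)]


# Toy spectrum: a constant level plus one cosine component (not real speech data).
toy_spectrum = [2.0 + math.cos(math.pi * (i + 0.5) / 64) for i in range(64)]
toy_coeffs = cosine_expansion(toy_spectrum, 3)
toy_features = quadratic_features(toy_coeffs)
```

With three coefficients the quadratic stage yields six features. How many coefficients to keep, and which products matter perceptually, are questions for the project reports rather than this sketch.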

Report

(4 results)

2011 Annual Research Report; Final Research Report (PDF)
2010 Annual Research Report
2009 Annual Research Report

Research Products
(19 results)

Journal Article (4 results, of which 1 peer reviewed); Presentation (15 results)

[Journal Article] A sinusoidal model of voiced speech based on local rate-of-change transformation and time-axis transformation (2010)
- Author(s)
  伊藤仁, 伊藤彰則
- Journal Title
  電子情報通信学会論文誌 (IEICE Transactions, Japanese Edition)
  Volume: J93-D(9), Pages: 1745-1754
- NAID
  110007700688
- Related Report
  2011 Final Research Report 2010 Annual Research Report
[Journal Article] Source-filter separation for nonstationary voiced speech based on sinusoidal representation (2010)
- Author(s)
  Masashi Ito, Keiji Ohara, Akinori Ito, Masafumi Yano
- Journal Title
  Acoustical Science and Technology
  Volume: 31(2), Pages: 181-184
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] A source-filter separation for non-stationary voiced speech based on sinusoidal representation (2009)
- Author(s)
  Ito, M., Ohara, K., Ito, A., and Yano, M.
- Journal Title
  Acoustical Science and Technology
  Volume: 31(2), Pages: 181-184
- Related Report
  2011 Final Research Report
[Journal Article] A sinusoidal model of voiced speech based on local rate-of-change transformation (2009)
- Author(s)
  伊藤仁, 伊藤彰則
- Journal Title
  Proceedings of the 8th Forum on Information Technology (第8回情報科学技術フォーラム講演論文集)
  Volume: 2, Pages: 43-48
- Related Report
  2011 Final Research Report
[Presentation] Formant analysis of vowels using cepstral coefficients (2012)
- Author(s)
  伊藤仁
- Organizer
  Acoustical Society of Japan
- Place of Presentation
  Tokyo Institute of Technology (Tokyo)
- Year and Date
  2012-03-15
- Related Report
  2011 Annual Research Report
[Presentation] Formant analysis of vowels using cepstral coefficients (2012)
- Author(s)
  伊藤仁, 蒔苗久則
- Organizer
  Proceedings of the Acoustical Society of Japan 2012 Spring Meeting
- Related Report
  2011 Final Research Report
[Presentation] Effect of vowel phonemic identity on speaker recognition (2011)
- Author(s)
  岩佐尚輝, 亀井大陸, 伊藤仁
- Organizer
  Tohoku Region Young Researchers' Research Presentation Meeting
- Place of Presentation
  Sendai, Miyagi
- Year and Date
  2011-03-13
- Related Report
  2010 Annual Research Report
[Presentation] A study of a singing-voice generation system using upper-limb movement information (2011)
- Author(s)
  伊藤仁, 伊藤貴徳, 庄子卓志
- Organizer
  Acoustical Society of Japan 2011 Spring Meeting
- Place of Presentation
  Waseda University, Tokyo
- Year and Date
  2011-03-11
- Related Report
  2010 Annual Research Report
[Presentation] Effect of vowel phonemic identity on speaker recognition (2011)
- Author(s)
  岩佐尚輝, 亀井大陸, 伊藤仁
- Organizer
  2011 (Heisei 23) Tohoku Region Young Researchers' Research Presentation Meeting
- Related Report
  2011 Final Research Report
[Presentation] An effect of formant amplitude in vowel perception (2010)
- Author(s)
  Ito, M., Ohara, K., Ito, A., Yano, M.
- Organizer
  Interspeech 2010
- Place of Presentation
  Makuhari, Chiba
- Year and Date
  2010-09-29
- Related Report
  2010 Annual Research Report
[Presentation] A study of a vowel perception model integrating formants and whole-spectral shape (2010)
- Author(s)
  伊藤仁, 小原桂二, 伊藤彰則, 矢野雅文
- Organizer
  Acoustical Society of Japan 2010 Spring Meeting
- Place of Presentation
  Chofu
- Year and Date
  2010-03-08
- Related Report
  2009 Annual Research Report
[Presentation] Effects of formant peaks and spectral tilt on vowel perception (2010)
- Author(s)
  小原桂二, 伊藤仁, 矢野雅文
- Organizer
  Acoustical Society of Japan 2010 Spring Meeting
- Place of Presentation
  Chofu
- Year and Date
  2010-03-08
- Related Report
  2009 Annual Research Report
[Presentation] A study of a vowel perception model integrating formants and whole-spectral shape (2010)
- Author(s)
  伊藤仁, 小原桂二, 伊藤彰則, 矢野雅文
- Organizer
  Proceedings of the Acoustical Society of Japan 2010 Spring Meeting
- Related Report
  2011 Final Research Report
[Presentation] Effects of formant peaks and spectral tilt on vowel perception (2010)
- Author(s)
  小原桂二, 伊藤仁, 矢野雅文
- Organizer
  Proceedings of the Acoustical Society of Japan 2010 Spring Meeting
- Related Report
  2011 Final Research Report
[Presentation] An effect of formant amplitude in vowel perception (2010)
- Author(s)
  Ito, M., Ohara, K., Ito, A., and Yano, M.
- Organizer
  Interspeech 2010
- Place of Presentation
  Makuhari
- Related Report
  2011 Final Research Report
[Presentation] Measurement of the directivity of speech using a microphone array (2009)
- Author(s)
  伊藤仁, 伊藤彰則, 矢野雅文
- Organizer
  Acoustical Society of Japan 2009 Autumn Meeting
- Place of Presentation
  Koriyama
- Year and Date
  2009-09-15
- Related Report
  2009 Annual Research Report
[Presentation] Relative importance of formant and whole-spectral cues for vowel perception (2009)
- Author(s)
  伊藤仁, 小原桂二, 伊藤彰則, 矢野雅文
- Organizer
  Interspeech 2009
- Place of Presentation
  Brighton (UK)
- Year and Date
  2009-09-06
- Related Report
  2009 Annual Research Report
[Presentation] Relative importance of formant and whole-spectral cues for vowel perception (2009)
- Author(s)
  Ito, M., Ohara, K., Ito, A. and Yano, M.
- Organizer
  Interspeech 2009
- Place of Presentation
  Brighton
- Related Report
  2011 Final Research Report
[Presentation] Acoustic characteristics of continuous vowels based on a whole-spectral shape model (2009)
- Author(s)
  伊藤仁, 伊藤彰則, 矢野雅文
- Organizer
  Proceedings of the Acoustical Society of Japan 2009 Spring Meeting
- Related Report
  2011 Final Research Report
