
Annotation and Computer Processing of Language Resources in Non-Latin Scripts and Phonetic Transcription

Research Project

Project/Area Number 15202008
Research Category

Grant-in-Aid for Scientific Research (A)

Allocation Type Single-year Grants
Section General
Research Field Linguistics
Research Institution The University of Tokyo

Principal Investigator

MATSUMURA Kazuto  The University of Tokyo, Graduate School of Humanities and Sociology, Professor (40165866)

Co-Investigator (Kenkyū-buntansha)

FUKUI Rei  The University of Tokyo, Graduate School of Humanities and Sociology, Associate Professor (50199189)
TAKIZAWA Naohiro  Nagoya University, Graduate School of International Development, Professor (60252285)
YAMADA Hisanari  Otaru University of Commerce, Center for Language Studies, Associate Professor (60345246)
CHIBA Shoju  Reitaku University, College of Foreign Studies, Associate Professor (70337723)
HATANO Toshie  The University of Tokyo, Graduate School of Humanities and Sociology, Research Associate (40376520)
Project Period (FY) 2003 – 2005
Project Status Completed (Fiscal Year 2005)
Budget Amount
¥20,930,000 (Direct Cost: ¥16,100,000, Indirect Cost: ¥4,830,000)
Fiscal Year 2005: ¥7,150,000 (Direct Cost: ¥5,500,000, Indirect Cost: ¥1,650,000)
Fiscal Year 2004: ¥6,760,000 (Direct Cost: ¥5,200,000, Indirect Cost: ¥1,560,000)
Fiscal Year 2003: ¥7,020,000 (Direct Cost: ¥5,400,000, Indirect Cost: ¥1,620,000)
Keywords phonetic alphabet / Cyrillic / corpus / endangered language / markup / multilingual computing / language resources / Unicode / computer / XML / font
Research Abstract

The main objective of this three-year project was to digitize linguistic resources of endangered and minority languages of Russia and neighboring countries. Most of the resources digitized in the course of the project are texts written in some variety of the Cyrillic script or phonetic transcriptions of recorded speech. The languages concerned were Avar (Daghestan), Itelmen (Kamchatka), and the Uralic languages Estonian, Mari, and Vepsian. Each digitized text or linguistic document is encoded in UTF-8, currently the most common Unicode encoding method. For this purpose, a high-quality OpenType font named JLOT-Fluralic was created in collaboration with Finnish colleagues; it contains all the glyphs of the Uralic Phonetic Alphabet (UPA) as well as the Cyrillic characters defined in the Unicode Standard 4.0. An open hands-on seminar on the XML markup of texts was held. Most of the linguistic documents created in the project were given XML markup and converted into well-formed XML documents. Digitized linguistic resources constitute a major part of the linguistic documentation of endangered and minority languages. To obtain a concrete picture of minority-language communities, project members visited active community members and researchers in local speech communities: the city of Hitoyoshi, Kumamoto Prefecture (Kuma dialect), and Naha, Ginowan, and Itoman in Okinawa Prefecture (Okinawan).
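
The encoding and markup steps described in the abstract can be illustrated with a minimal Python sketch. It is not the project's actual tooling: the function names, the `text` and `s` element names, the `lang` attribute, and the sample string are all assumptions made for illustration. It shows a line of Cyrillic text being serialized as a UTF-8 XML document and then checked for well-formedness only (no schema validation).

```python
import xml.etree.ElementTree as ET


def to_utf8_xml(text: str, lang: str) -> bytes:
    """Wrap one line of text in a minimal XML document serialized as UTF-8."""
    # Element and attribute names are hypothetical, not the project's schema.
    root = ET.Element("text", {"lang": lang})
    sentence = ET.SubElement(root, "s")
    sentence.text = text
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def is_well_formed(data: bytes) -> bool:
    """Return True if the bytes parse as XML (well-formedness only)."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Placeholder string with Cyrillic letters outside the basic Russian alphabet;
# it is not taken from the project's data.
sample = "Кириллица: ӱ ӧ ӓ"
document = to_utf8_xml(sample, "und")  # "und" = undetermined language

print(is_well_formed(document))                      # parse check on the generated document
print(is_well_formed(b"<text><s>unclosed</text>"))   # mismatched tags fail the check
```

Extended Cyrillic letters such as ӱ occupy two bytes each in UTF-8, so byte length and character count differ for such texts.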

Report

(4 results)
  • 2005 Annual Research Report   Final Research Report Summary
  • 2004 Annual Research Report
  • 2003 Annual Research Report
  • Research Products

    (52 results)

All Journal Article (40 results) Book (8 results) Publications (4 results)

  • [Journal Article] Digitization of Mari linguistic resources (2006)

    • Author(s)
      Kazuto Matsumura
    • Journal Title

      Uralica 14

      Pages: 45-56

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
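  • [Journal Article] How to use Aozora Bunko as linguistic corpora: building and utilizing metadata for diachronic and sociolinguistic study (2006)

    • Author(s)
      Shoju Chiba
    • Journal Title

      Proceedings of the 12th Annual Meeting of the Association for Natural Language Processing

      Pages: 915-918

    • Description
      From the Final Research Report Summary (Japanese)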
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Digitization of Mari linguistic resources (2006)

    • Author(s)
      Kazuto Matsumura
    • Journal Title

      Uralica 14

      Pages: 45-56

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] How to use Aozora Bunko as linguistic corpora: building and utilizing metadata for diachronic and sociolinguistic study (2006)

    • Author(s)
      Shoju Chiba
    • Journal Title

      Proceedings of the 12th Annual Meeting of the Association for Natural Language Processing

      Pages: 915-918

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Digitization of Mari linguistic resources (2006)

    • Author(s)
      Kazuto Matsumura
    • Journal Title

      Uralica 14 (in press)

    • Related Report
      2005 Annual Research Report
  • [Journal Article] A corpus-based study of the 'haven't NP' pattern in American English (2005)

    • Author(s)
      Naohiro Takizawa
    • Journal Title

      Aspects of English Negation

      Pages: 159-171

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
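  • [Journal Article] Corpora and Linguistic Studies (2005)

    • Author(s)
      Naohiro Takizawa
    • Journal Title

      Journal of Japanese Language Education Association 32

      Pages: 3-20

    • Description
      From the Final Research Report Summary (Japanese)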
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Itelmen Text 1 (2005)

    • Author(s)
      Chikako Ono
    • Journal Title

      Languages of the North Pacific Rim 12

      Pages: 81-88

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
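  • [Journal Article] Language policy in Kalmykia in the 1930s (2005)

    • Author(s)
      Yukiyasu Arai
    • Journal Title

      Bulletin of the Japan Association for Mongolian Studies 35

      Pages: 41-56

    • Description
      From the Final Research Report Summary (Japanese)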
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Language policy in Buryatia in the 1930s (2005)

    • Author(s)
      Yukiyasu Arai
    • Journal Title

      Slavic Studies 52

      Pages: 145-176

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A corpus-based study of the 'haven't NP' pattern in American English (2005)

    • Author(s)
      Naohiro Takizawa
    • Journal Title

      Aspects of English Negation

      Pages: 159-171

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Corpora and Linguistic Studies (2005)

    • Author(s)
      Naohiro Takizawa
    • Journal Title

      Journal of Japanese Language Education Association 32

      Pages: 3-20

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Itelmen Text 1 (2005)

    • Author(s)
      Chikako Ono
    • Journal Title

      Languages of the North Pacific Rim 12

      Pages: 81-88

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A corpus-based study of the 'haven't NP' pattern in American English (2005)

    • Author(s)
      Naohiro Takizawa
    • Journal Title

      Aspects of English Negation (Iyeiri, Yoko (ed.))

      Pages: 159-171

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Digitization of Mari linguistic resources (2005)

    • Author(s)
      Kazuto Matsumura
    • Journal Title

      Uralica 14 (in press)

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Corpora and Linguistic Studies (2005)

    • Author(s)
      Naohiro Takizawa
    • Journal Title

      Journal of Japanese Language Education Association 31 (in press)

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Mapping (in)direct causation: a corpus-based approach to the Finnish causative constructions (2005)

    • Author(s)
      Shoju Chiba
    • Journal Title

      Tohoku University Linguistics Journal 11 (in press)

    • Related Report
      2004 Annual Research Report
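  • [Journal Article] A Corpus-Based Description of Peripheral Linguistic Phenomena: With Special Reference to the SOV Construction in Present-Day English (2004)

    • Author(s)
      Naohiro Takizawa
    • Journal Title

      English Corpus Studies 11

      Pages: 153-167

    • Description
      From the Final Research Report Summary (Japanese)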
    • Related Report
      2005 Final Research Report Summary
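  • [Journal Article] Tools for building morphologically annotated corpora of Russian (2004)

    • Author(s)
      Hisanari Yamada
    • Journal Title

      Bulletin of the Japan Association for the Study of Russian Language and Literature 36

      Pages: 111-118

    • NAID

      110001247122

    • Description
      From the Final Research Report Summary (Japanese)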
    • Related Report
      2005 Final Research Report Summary 2004 Annual Research Report
  • [Journal Article] The Cognitive Unit of Segmentation for Speech in Japanese (2004)

    • Author(s)
      Toshie Hatano
    • Journal Title

      The 18th International Congress on Acoustics

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Itelmen verb stem: morphological features and syntactic structure of Intransitive and Transitive (2004)

    • Author(s)
      Chikako Ono
    • Journal Title

      Languages of the North Pacific Rim 9

      Pages: 169-177

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Itelmen labial-velar fricative and approximant (2004)

    • Author(s)
      Chikako Ono
    • Journal Title

      Languages of the North Pacific Rim 11

      Pages: 79-90

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Corpus-Based Description of Peripheral Linguistic Phenomena: With Special Reference to the SOV Construction in Present-Day English (2004)

    • Author(s)
      Naohiro Takizawa
    • Journal Title

      English Corpus Studies 11

      Pages: 153-167

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Tools for building morphologically annotated corpora of Russian (2004)

    • Author(s)
      Hisanari Yamada
    • Journal Title

      Bulletin of the Japan Association for the Study of Russian Language and Literature 36

      Pages: 111-118

    • NAID

      110001247122

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] The cognitive unit of segmentation for speech in Japanese (2004)

    • Author(s)
      Toshie Hatano
    • Journal Title

      The 18th International Congress on Acoustics

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Itelmen verb stem: morphological features and syntactic structure of Intransitive and Transitive (2004)

    • Author(s)
      Chikako Ono
    • Journal Title

      Languages of the North Pacific Rim 9

      Pages: 169-177

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Itelmen labial-velar fricative and approximant (2004)

    • Author(s)
      Chikako Ono
    • Journal Title

      Languages of the North Pacific Rim 11

      Pages: 79-90

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Pitch accent systems in Korean (2003)

    • Author(s)
      Rei Fukui
    • Journal Title

      Proceedings of the Symposium: Cross-linguistic Studies of Tonal Phenomena. ILCAA

      Pages: 275-286

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
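  • [Journal Article] Language use of the indigenous minority peoples in the north-eastern part of Russia (2003)

    • Author(s)
      Chikako Ono
    • Journal Title

      Language and Society 7

      Pages: 63-87

    • Description
      From the Final Research Report Summary (Japanese)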
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] To Live in the Nature of Kamchatka: Itelmen (2003)

    • Author(s)
      Chikako Ono
    • Journal Title

      Field notes on Northern Languages: 18 languages and cultures

      Pages: 119-134

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Pitch accent systems in Korean (2003)

    • Author(s)
      Rei Fukui
    • Journal Title

      Proceedings of the Symposium: Cross-linguistic Studies of Tonal Phenomena. ILCAA

      Pages: 275-286

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Language use of the indigenous minority peoples in the north-eastern part of Russia (2003)

    • Author(s)
      Chikako Ono
    • Journal Title

      Language and Society 7

      Pages: 63-87

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] To Live in the Nature of Kamchatka: Itelmen (2003)

    • Author(s)
      Chikako Ono
    • Journal Title

      Field notes on Northern Languages: 18 languages and cultures

      Pages: 119-134

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] The Kalmyk language (2002)

    • Author(s)
      Yukiyasu Arai
    • Journal Title

      Bulletin of the Japan Association for Mongolian Studies 32

      Pages: 13-27

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Mapping (in)direct causation: a corpus-based approach to the Finnish causative constructions

    • Author(s)
      Shoju Chiba
    • Journal Title

      Tohoku University Linguistics Journal (in press)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
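  • [Journal Article] Structured electronic data and corpus-based research: XML as (coming) technological core for linguists

    • Author(s)
      Shoju Chiba
    • Journal Title

      Reitaku University Journal 82 (in press)

    • NAID

      110007326528

    • Description
      From the Final Research Report Summary (Japanese)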
    • Related Report
      2005 Final Research Report Summary
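  • [Journal Article] Classification of Itelmen verb stems and their derivation system

    • Author(s)
      Chikako Ono
    • Journal Title

      Languages of the North Pacific Rim 13 (in press)

    • Description
      From the Final Research Report Summary (Japanese)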
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Mapping (in)direct causation: a corpus-based approach to the Finnish causative constructions

    • Author(s)
      Shoju Chiba
    • Journal Title

      Tohoku University Linguistics Journal (in press)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Structured electronic data and corpus-based research: XML as (coming) technological core for linguists

    • Author(s)
      Shoju Chiba
    • Journal Title

      Reitaku University Journal 82 (in press)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Classification of Itelmen verb stems and their derivation system

    • Author(s)
      Chikako Ono
    • Journal Title

      Languages of the North Pacific Rim 13 (in press)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Book] A Corpus Is Indeed Informative! (2006)

    • Author(s)
      Naohiro Takizawa
    • Total Pages
      207
    • Publisher
      Shogakukan
    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
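  • [Book] Integration and separation of languages: focusing on the interrelations among the language policies of Mongolia, Buryatia, and Kalmykia in the 1920s-1940s (2006)

    • Author(s)
      Yukiyasu Arai
    • Total Pages
      251
    • Publisher
      Sangensha
    • Description
      From the Final Research Report Summary (Japanese)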
    • Related Report
      2005 Final Research Report Summary
  • [Book] A Corpus Is Indeed Informative! (2006)

    • Author(s)
      Naohiro Takizawa
    • Total Pages
      207
    • Publisher
      Shogakukan
    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
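  • [Book] An Introduction to Multilingual Computing for the Humanities: Handling Multilingual Texts with Windows XP (2005)

    • Author(s)
      Shoju Chiba
    • Total Pages
      80
    • Publisher
      Linguistic Research Center, Reitaku University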
    • Related Report
      2004 Annual Research Report
  • [Book] Linguistics: An Introduction, 2nd Edition (2004)

    • Author(s)
      Kazuto Matsumura
    • Total Pages
      272
    • Publisher
      University of Tokyo Press
    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      2005 Final Research Report Summary
  • [Book] Linguistics: An Introduction, 2nd Edition (2004)

    • Author(s)
      Kazuto Matsumura
    • Total Pages
      272
    • Publisher
      University of Tokyo Press
    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
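  • [Book] An Introduction to Multilingual Computing for the Humanities: Handling Multilingual Texts with Windows XP

    • Author(s)
      Shoju Chiba
    • Publisher
      Linguistic Research Center, Reitaku University (in press)
    • Description
      From the Final Research Report Summary (Japanese)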
    • Related Report
      2005 Final Research Report Summary
  • [Book] An Introduction to Multilingual Computing for the Humanities: Handling Multilingual Texts with Windows XP

    • Author(s)
      Shoju Chiba
    • Publisher
      Linguistic Research Center, Reitaku University (in press)
    • Description
      From the Final Research Report Summary (English)
    • Related Report
      2005 Final Research Report Summary
  • [Publications] MATSUMURA Kazuto: "The Activities of ICHEL and its Future Prospects." UNESCO. (in press).

    • Related Report
      2003 Annual Research Report
  • [Publications] FUKUI Rei: "Pitch accent systems in Korean." Kaji S. (ed.) Proceedings of the Symposium: Cross-linguistic Studies of Tonal Phenomena. ILCAA. 275-286 (2003)

    • Related Report
      2003 Annual Research Report
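  • [Publications] SUZUKI Reiji, ONO Chikako, MATSUMURA Kazuto: "Unicode Tools for Field Linguists (with CD-ROM)." Faculty of Informatics, Osaka Gakuin University, "Languages of the Pacific Rim". v+33 (2003)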

    • Related Report
      2003 Annual Research Report
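  • [Publications] CHIBA Shoju: "Toward Building Corpora of Endangered Languages." Faculty of Informatics, Osaka Gakuin University, "Languages of the Pacific Rim". iv+119 (2003)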

    • Related Report
      2003 Annual Research Report

Published: 2003-04-01   Modified: 2016-04-21  
