2011 Fiscal Year Final Research Report

Building a Native/Non-native English Language Technical Paper Corpus from Web and its Release and Application

Research Project

Project/Area Number	20320082
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Foreign language education
Research Institution	Kyushu University
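Principal Investigator	TOMIURA Yoichi Kyushu University, Faculty of Information Science and Electrical Engineering, Professor (10217523)
Co-Investigator(Kenkyū-buntansha)	TANAKA Shosaku Ritsumeikan University, College of Letters, Associate Professor (00325549); GOTO Kazuaki Setsunan University, Faculty of Foreign Studies, Lecturer (90397662); HAYAMA Megumi Dokkyo University, Faculty of Foreign Languages, Associate Professor (60409555); ANDO Nahoko Kyushu University, Graduate School, Faculty of Law, Research Fellow (50380655); SHIBATA Masahiro Kyushu University, Research Institute for Information Technology, Academic Researcher (00452813)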
Project Period (FY)	2008 – 2011
Keywords	Corpus / Web / English quality judgment / Hypothesis testing / English composition support / English education / Copyright
Research Abstract	We developed a method for collecting English-language technical papers from private web pages using a web search engine, and a statistical method for estimating the English quality of a document from the characteristics of its part-of-speech sequences. Using these methods, we developed a system that builds a large-scale corpus of English-language technical papers from the Web, annotated with English-quality information for each paper. We also investigated copyright issues and other points to consider when building a corpus from the Web and releasing it.

Research Products
(13 results)

All Journal Article (7 results) (of which Peer Reviewed: 6 results) Presentation (5 results) Remarks (1 result)

[Journal Article] Extraction of Alternative Candidates for Unnatural Adjective-Noun Co-occurrence Construction of English (2011)
- Author(s)
  M. Shibata, T. Funatsu, Y. Tomiura
- Journal Title
  Procedia - Social and Behavioral Sciences
  Volume: Vol.27 Pages: 32-41
- Peer Reviewed
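[Journal Article] A Method for Building an English Scientific Paper Corpus with Quality Information Sourced from the Web (2011)
- Author(s)
  Shosaku Tanaka, Masahiro Shibata, Yoichi Tomiura
- Journal Title
  English Corpus Studies
  Volume: Vol.18 Pages: 61-71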
- Peer Reviewed
[Journal Article] Language Information Processing Infrastructure for Web Corpora (2010)
- Author(s)
  Shosaku Tanaka
- Journal Title
  English Corpus Studies
  Volume: Vol.18 Pages: 97-111
- Peer Reviewed
[Journal Article] Information Analysis under Copyright Law (2010)
- Author(s)
  Nahoko Ando
- Journal Title
  Journal of the Japanese Society for Artificial Intelligence
  Volume: Vol.25 Pages: 634-652
[Journal Article] Identification among Similar Languages Using Statistical Hypothesis Testing (2009)
- Author(s)
  M. Shibata, Y. Tomiura, T. Mizuta
- Journal Title
  Proc. of Pacific Association for Computational Linguistics
  Pages: 47-52
- Peer Reviewed
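[Journal Article] Discriminating the Nativeness of English Documents Based on Hypothesis Testing (2009)
- Author(s)
  Yoichi Tomiura, Sayaka Aoki, Masahiro Shibata, 行野顕正
- Journal Title
  Journal of Natural Language Processing
  Volume: Vol.16 Pages: 23-46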
- Peer Reviewed
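[Journal Article] An English Scientific Paper Corpus with Quality Information Sourced from the Web: Corpus Construction and Copyright
- Author(s)
  Shosaku Tanaka, Nahoko Ando, Yoichi Tomiura
- Journal Title
  English Corpus Studies
  Volume: Vol.19 (in press) Pages: 31-41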
- Peer Reviewed
[Presentation] Extraction of Alternative Candidates for Unnatural Adjective-Noun Co-occurrence Construction of English (2011)
- Author(s)
  M. Shibata, T. Funatsu, Y. Tomiura
- Organizer
  Pacific Association for Computational Linguistics (PACLING '11)
- Place of Presentation
  Malaysia
- Year and Date
  2011-07-19
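[Presentation] Classification and Evaluation of English Scientific Papers Using Random Forests (2011)
- Author(s)
  Yuichiro Kobayashi, Shosaku Tanaka, Yoichi Tomiura
- Organizer
  90th Meeting of the IPSJ Special Interest Group on Computers and the Humanities
- Place of Presentation
  Doshisha University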
- Year and Date
  2011-05-21
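[Presentation] Language Information Processing Infrastructure for Web Corpora (2010)
- Author(s)
  Shosaku Tanaka
- Organizer
  Symposium, 35th Conference of the Japan Association for English Corpus Studies
- Place of Presentation
  University of Hyogo (Hyogo Prefecture)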
- Year and Date
  2010-04-24
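[Presentation] Building an English Scientific Paper Corpus from the Web: An Examination from Technical-Methodological and Legal Perspectives (2009)
- Author(s)
  Shosaku Tanaka
- Organizer
  34th Conference of the Japan Association for English Corpus Studies
- Place of Presentation
  Aoyama Gakuin University (Tokyo)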
- Year and Date
  2009-10-03
[Presentation] Identification among Similar Languages Using Statistical Hypothesis Testing (2009)
- Author(s)
  M. Shibata, Y. Tomiura, T. Mizuta
- Organizer
  Pacific Association for Computational Linguistics (PACLING '09)
- Place of Presentation
  Hokkaido University
- Year and Date
  2009-09-01
[Remarks]
- URL
  http://nlp.inf.kyushu-u.ac.jp
