2018 Fiscal Year Research-status Report
Modelling Linguistic Practices for Learners of German: A Data-driven Approach to Speech Act Sets and Speech Act Sequences
Project/Area Number |
18K00850
|
Research Institution | Waseda University |
Principal Investigator |
SCHARLOTH J. Waseda University, Faculty of International Research and Education, Professor (70585786)
|
Project Period (FY) |
2018-04-01 – 2021-03-31
|
Keywords | Pragmatics / Language Learning / Corpus Linguistics / Speech Acts / Basic Vocabulary / German |
Outline of Annual Research Achievements |
(1) Pre-processing: During the first three months of the research project, the database of speech act patterns was retrieved and transformed from an outdated format into XML and JSON. Furthermore, all 244 subclasses of speech act types were reviewed and annotated with the positions of related slots. (2) Preparation for visualization: During the following two months, various JavaScript libraries for visualizing and interacting with taxonomies were reviewed and tested. (3) Corpus building: The corpora used for the analysis were updated and extended; they now comprise 1 billion words from newspapers and online discussion boards. (4) Data analysis: The subsequent seven months were dedicated to developing and testing software that computes the distribution of patterned and routinized realizations of certain speech act types as well as the most common fillers for their slots. The size of the search windows and the degree of attraction between elements were crucial parameters. (5) Dissemination: The principal investigator gave two presentations at international conferences (German Society for Applied Linguistics, Society for Linguistics and Cultural Studies). Moreover, he organized an international conference at Waseda University on April 1 and 2, 2019, to discuss theoretical aspects of research on speech acts, with a focus on polite and invective speech acts. The conference featured 15 presentations, 6 of them by scholars from abroad.
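The role of the two parameters named under (4) can be illustrated with a minimal sketch. It assumes whitespace tokenization and uses pointwise mutual information (PMI) as the measure of attraction; the report does not say which measure or window size the project's software uses, and the function name `window_pmi` and the toy sentences are invented for illustration.

```python
import math
from collections import Counter

def window_pmi(tokens, window=3):
    """Score ordered word pairs (left, right) that co-occur within `window` tokens.

    PMI is only one possible measure of attraction (an assumption here).
    """
    unigrams = Counter(tokens)
    pairs = Counter()
    for i, left in enumerate(tokens):
        # Look ahead at most `window` tokens to the right of position i.
        for right in tokens[i + 1 : i + 1 + window]:
            pairs[(left, right)] += 1
    n_tokens = len(tokens)
    n_pairs = sum(pairs.values())
    scores = {}
    for (left, right), count in pairs.items():
        p_pair = count / n_pairs
        p_left = unigrams[left] / n_tokens
        p_right = unigrams[right] / n_tokens
        # Positive values: the pair occurs more often than chance predicts.
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores

# Toy input (invented sentences, not project data).
tokens = "vielen dank für ihre hilfe . vielen dank für die antwort .".split()
scores = window_pmi(tokens, window=2)
```

Widening `window` admits more distant pairs such as ("vielen", "für"), and a threshold on the score then decides which pairs count as attracted; both choices change which realizations and fillers are reported.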
|
Current Status of Research Progress |
2: Research has progressed on the whole more than it was originally planned.
Reason
The progress of the project is in line with the original research plan. The following milestones have been reached: (1) the database of speech act patterns has been processed successfully, (2) the corpus building has been completed, (3) a powerful JavaScript library for visualization has been identified in D3.js, (4) the data analysis software has been developed and tested on the whole dataset, and (5) international scholarly exchange has been initiated. The following challenges occurred: (1) Some of the patterns tend to produce overly greedy search patterns, which leads to an overestimation of the frequency of their usage. This has highlighted the necessity of a more context-sensitive modeling of those patterns. (2) Developing algorithms for a reliable identification of filler patterns has turned out to be a demanding and time-consuming task.
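The overestimation problem can be illustrated with a small sketch. It assumes that the patterns behave like regular expressions with an open slot, which the report does not state; the thanking formula, the slot limit of three tokens, and the example utterances are hypothetical.

```python
import re

# Hypothetical thanking formula with one open slot.
# Unbounded slot: ".*" may swallow any amount of material, including
# a sentence boundary.
GREEDY = re.compile(r"\bich möchte\b.*\bbedanken\b", re.IGNORECASE)
# Bounded slot: at most three tokens, none containing sentence punctuation.
BOUNDED = re.compile(
    r"\bich möchte\b(?:\s+[^\s.!?]+){0,3}\s+bedanken\b", re.IGNORECASE
)

utterances = [
    "Ich möchte mich herzlich bedanken.",
    "Ich möchte heute nicht kommen. Er wollte sich nur bedanken.",
]

# Count utterances in which each pattern finds at least one match.
greedy_hits = sum(bool(GREEDY.search(u)) for u in utterances)
bounded_hits = sum(bool(BOUNDED.search(u)) for u in utterances)
```

The unbounded pattern counts both utterances, although only the first one realizes the thanking formula; the bounded pattern counts one. Limiting slot length and respecting sentence boundaries is one simple form of context sensitivity, not necessarily the one pursued in the project.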
|
Strategy for Future Research Activity |
In the months ahead, the principal investigator plans to (1) refine the context-sensitive algorithms for speech act patterns that currently result in overly greedy search patterns, (2) refine the algorithms for detecting fillers and the boundaries of filler phrases, (3) compute pattern co-occurrences within single utterances, and (4) compute pattern co-occurrences between subsequent utterances. Two international conference presentations and a research stay abroad are scheduled for the upcoming academic year.
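Steps (3) and (4) could be sketched as follows, assuming that each utterance has already been reduced to the list of speech act patterns detected in it. The function name `cooccurrences` and the labels in the example are hypothetical and are not taken from the project's taxonomy.

```python
from collections import Counter
from itertools import combinations, product

def cooccurrences(utterances):
    """Count pattern pairs within utterances and across adjacent utterances.

    `utterances` is a list of lists of pattern labels, in dialogue order.
    """
    within = Counter()
    between = Counter()
    for labels in utterances:
        # Unordered pairs of distinct patterns found in the same utterance.
        for a, b in combinations(sorted(set(labels)), 2):
            within[(a, b)] += 1
    for prev, nxt in zip(utterances, utterances[1:]):
        # Ordered pairs: a pattern in one utterance, a pattern in the next.
        for a, b in product(set(prev), set(nxt)):
            between[(a, b)] += 1
    return within, between

# Toy thread with invented labels (not project data).
thread = [
    ["GREETING", "REQUEST"],
    ["GREETING", "REFUSAL", "APOLOGY"],
    ["THANKS"],
]
within, between = cooccurrences(thread)
```

Pairs within an utterance are stored unordered (alphabetically sorted), which corresponds to speech act sets; pairs between utterances keep their order, which corresponds to speech act sequences such as a request followed by a refusal.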
|
Causes of Carryover |
In fiscal year 2019, I am planning to participate in an international conference (250,000 Yen) and to visit a research institution in Germany or Switzerland for around two weeks (400,000 Yen). The aim is to discuss methods and findings with international scholars.
The costs for both trips will amount to a total of 650,000 Yen, of which 500,000 Yen will be covered by the costs assigned in the financial plan for year 2 and the remaining 150,000 Yen by the carry-over from the previous year.
|
Research Products
(11 results)