2005 Fiscal Year Final Research Report Summary
Development of a Coding Supporting System for Responses to Open-ended Questions with Applying Natural Language Processing Technique
Project/Area Number |
16530341
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Sociology
|
Research Institution | Keiai University |
Principal Investigator |
TAKAHASHI Kazuko Keiai University, International Studies, Assistant Professor (30211337)
|
Co-Investigator (Kenkyū-buntansha) |
TAKAMURA Hiroya Tokyo Institute of Technology, Precision and Intelligence Laboratory, Research Assistant (80361773)
|
Project Period (FY) |
2004 – 2005
|
Keywords | open-ended question / coding support / machine learning / support vector machines / natural language processing / class membership probability / classification score / NANACO system |
Research Abstract |
The main result of this research is that we have nearly completed the NANACO system, which supports coders performing occupation coding in social surveys. In occupation coding, responses to open-ended questions about respondents' occupations have to be classified into one of nearly 200 occupation codes for statistical analysis; this is time-consuming, and responses are sometimes misclassified. First, the NANACO system presents on the coder's monitor five candidate occupation codes, assigned automatically by machine learning (support vector machines) and hand-crafted rules. Second, it presents information about respondents, such as age, sex, and educational background, which helps coders understand the responses. Third, it provides various functions that make the coder's work easier, such as a view window for a "dictionary" that defines each occupation code. Finally, it saves the data, including the occupation codes set by coders, as a comma-separated values (CSV) file. Furthermore, coders who used the NANACO system in social surveys such as the JGSS (Japanese General Social Surveys) and the SSM (Social Stratification and Social Mobility) survey asked us to supply a measure of confidence for the first-ranked candidate to support their decisions. We therefore proposed a new method for estimating such a measure of confidence, the class membership probability, using not only the first-ranked class score but also the second-ranked class score from a classifier. We showed that the proposed method was more effective than previous methods. In future work, we will incorporate class membership probabilities into the NANACO system. The NANACO system should be readily extensible to responses to any open-ended question that has predefined categories.
|
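The idea of estimating a class membership probability from both the first-ranked and second-ranked classifier scores can be illustrated with a minimal sketch. This is not the method proposed in the project, whose details the summary does not give; it substitutes a plain logistic regression on the two scores. The function names, the training loop settings, and the toy data are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_confidence_model(samples, epochs=2000, lr=0.1):
    """Fit a logistic regression by batch gradient descent.

    samples: list of (first_score, second_score, is_correct), where
    is_correct is 1 if the first-ranked code was the right one, else 0.
    Returns the weights (w1, w2, b).
    """
    w1 = w2 = b = 0.0
    n = len(samples)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for s1, s2, y in samples:
            err = sigmoid(w1 * s1 + w2 * s2 + b) - y
            g1 += err * s1
            g2 += err * s2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b

def membership_probability(model, first_score, second_score):
    """Estimated probability that the first-ranked candidate is correct."""
    w1, w2, b = model
    return sigmoid(w1 * first_score + w2 * second_score + b)

# Hypothetical held-out data: (first-ranked score, second-ranked score, correct?)
samples = [
    (2.0, -1.0, 1), (1.5, -0.5, 1), (1.8, 0.2, 1), (1.2, -0.8, 1),
    (0.6, 0.5, 0), (0.4, 0.3, 0), (1.0, 0.9, 0), (0.3, 0.1, 0),
]
model = fit_confidence_model(samples)

# Same first-ranked score, different runner-up scores.
p_clear = membership_probability(model, 1.5, -1.0)
p_close = membership_probability(model, 1.5, 1.4)
print(p_clear > p_close)
```

In this sketch, a first-ranked score that is closely followed by the second-ranked score receives a lower estimated probability than the same score with a distant runner-up, which is the kind of signal a coder could use when deciding whether to accept the first candidate.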
Research Products
(14 results)