2015 Fiscal Year Final Research Report
Safety Text De-identification
Project/Area Number |
25540096
|
Research Category |
Grant-in-Aid for Challenging Exploratory Research
|
Allocation Type | Multi-year Fund |
Research Field |
Intelligent informatics
|
Research Institution | Nara Institute of Science and Technology (2015); Kyoto University (2013-2014) |
Principal Investigator |
Aramaki Eiji, Nara Institute of Science and Technology, Institute for Research Initiatives, Specially Appointed Associate Professor (70401073)
|
Co-Investigator (Kenkyū-buntansha) |
Morita Mizuki, Okayama University, University Hospital, Associate Professor (00519316)
|
Project Period (FY) |
2013-04-01 – 2016-03-31
|
Keywords | Medical informatics / Natural language processing |
Outline of Final Research Achievements |
The exponential growth in the amount of text data calls for methods of text de-identification. Most existing de-identification methods detect named entities, such as person names, location names, and IDs, in the text and remove them. However, named-entity-based methods fail when an expression that is not a named entity conveys personal information. To address this problem, this study proposes a new de-identification method that removes document-specific expressions. With this method, every expression that remains in a document also appears in at least k-1 other documents.
|
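The idea in the outline above can be illustrated with a minimal Python sketch. It is not the project's implementation: it assumes that an "expression" is a single whitespace-separated token, that an expression is document-specific when it occurs in fewer than k documents of the collection, and that removal means substitution with a placeholder. The names `document_frequency`, `deidentify`, and `MASK` are hypothetical and chosen only for this example.

```python
from collections import Counter

MASK = "[REMOVED]"  # hypothetical placeholder, not taken from the report


def document_frequency(docs):
    """Count, for each token, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): count each token once per document
    return df


def deidentify(docs, k):
    """Mask every token that occurs in fewer than k documents.

    Assumption: a token kept in a document occurs in that document and
    in at least k-1 others, which is how this sketch reads the
    guarantee stated in the outline.
    """
    df = document_frequency(docs)
    return [[tok if df[tok] >= k else MASK for tok in tokens]
            for tokens in docs]


# Toy collection, invented purely for illustration.
docs = [
    "patient reports mild fever".split(),
    "patient reports severe cough".split(),
    "patient plays cello in the city orchestra".split(),
]
result = deidentify(docs, k=2)
for tokens in result:
    print(" ".join(tokens))
```

In the toy collection, "plays cello in the city orchestra" contains no named entity yet could single out one person; because each of those tokens occurs in only one document, the sketch masks them for k = 2. A practical system would likely work with multi-word expressions and a much larger reference collection, which this sketch does not attempt.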
Free Research Field |
Natural language processing
|