2020 Fiscal Year Final Research Report
Do the Word Embeddings of Medical Terms by Word2Vec Quantitatively Represent the Mathematical Distance between Diseases?
Project/Area Number |
19K16941
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 52010:General internal medicine-related
|
Research Institution | Chiba University |
Principal Investigator |
Yokokawa Daiki Chiba University, University Hospital, Specially Appointed Assistant Professor (80779869)
|
Project Period (FY) |
2019-04-01 – 2021-03-31
|
Keywords | medical records / natural language processing / distributed representation / embedding vectors / inter-disease distance / inter-symptom distance / Word2Vec / Doc2Vec |
Outline of Final Research Achievements |
In this study, we used Word2Vec and Doc2Vec, two neural network-based embedding techniques, to obtain distributed representations of words and sentences from the medical records (free-text data in electronic medical records) of patients who visited the Department of General Medicine, Chiba University Hospital, from 2013 to 2019. For Word2Vec, a corpus of 10,578,020 words was used to learn word-to-word proximity and relationships. As a result, "consultation" and "referral," "cough" and "nasal discharge," and "hay fever" and "allergic rhinitis" were identified as pairs of similar words. We also attempted to predict diagnosis names with Doc2Vec by pairing the embedding vectors of medical records with their diagnosis names, but the accuracy was only 50%.
|
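To illustrate how word-to-word proximity can be learned from running text, the sketch below generates the (center, context) training pairs used by the skip-gram variant of Word2Vec. The report does not state whether skip-gram or CBOW was used, nor the window size, so both are assumptions here; the function name `skipgram_pairs` and the token list are hypothetical and not taken from the study's records.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for tokens within `window` positions of each other."""
    pairs = []
    for i, center in enumerate(tokens):
        # Clamp the context window to the bounds of the token list.
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical tokenized fragment of a clinical note (illustrative only).
tokens = ["patient", "reports", "cough", "and", "nasal", "discharge"]
pairs = skipgram_pairs(tokens, window=2)

for center, context in pairs:
    print(center, "->", context)
```

Words that repeatedly appear in each other's windows, such as "cough" and "nasal" in this toy fragment, are pushed toward nearby vectors during training, which is one plausible reason co-occurring symptoms end up as similar words.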
Free Research Field |
Natural language processing, diagnostic reasoning
|
Academic Significance and Societal Importance of the Research Achievements |
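If diseases and symptoms can be expressed mathematically as vectors, and the results can be interpreted as correct for Japanese medical terminology, then the similarity between diseases can be expressed and interpreted as the distance (closeness or remoteness) between them. Physicians picture the distance between diseases when making a clinical diagnosis, but until now this has depended heavily on each physician's experience. If the distance between diseases can be expressed quantitatively, as numbers, using the vectors obtained in our research, it may become possible to build a system that helps physicians avoid failing to recall a candidate diagnosis. Because physicians are human, tragic misdiagnoses cannot be entirely avoided. Reducing misdiagnosis is a major goal of ours, and in the future it may be supported not only by individual effort but also at the system level.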
|
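One common way to turn embedding vectors into a similarity between terms is cosine similarity. The report does not say which distance measure was used, so this is an assumption. The following minimal sketch ranks the terms closest to a query term; the three-dimensional vectors and the helper names `cosine_similarity` and `nearest_terms` are invented for illustration and are not results from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_terms(query, vectors, top_n=2):
    """Rank all other terms by cosine similarity to `query`, highest first."""
    scores = [
        (term, cosine_similarity(vectors[query], vec))
        for term, vec in vectors.items()
        if term != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]


# Made-up toy vectors (NOT from the study); real embeddings have many more dimensions.
toy_vectors = {
    "hay fever": [0.9, 0.1, 0.0],
    "allergic rhinitis": [0.8, 0.2, 0.1],
    "cough": [0.1, 0.9, 0.2],
    "referral": [0.0, 0.1, 0.9],
}

ranked = nearest_terms("hay fever", toy_vectors)
for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

A support system of the kind described above could, under these assumptions, present the top-ranked neighbors of a working diagnosis as reminders of conditions that should also be considered.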