Development of a new data cleaning method for questionnaires used in large cohorts

Research Project

Project/Area Number 18K10099
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 58030: Hygiene and public health-related: excluding laboratory approach
Research Institution Tohoku University

Principal Investigator

MAKINO Satoshi  Tohoku University, Tohoku Medical Megabank Organization, Assistant Professor (30423403)

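Co-Investigator (Kenkyū-buntansha) TAMIYA Gen  Tohoku University, Tohoku Medical Megabank Organization, Professor (10317745)
SAKURAI Rieko  Tohoku University, Tohoku Medical Megabank Organization, Part-time Lecturer (50794541)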
Project Period (FY) 2018-04-01 – 2021-03-31
Project Status Completed (Fiscal Year 2020)
Budget Amount
¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2020: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2019: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2018: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Keywords cohort study / data cleaning / outlier detection
Outline of Final Research Achievements

In large-scale genomic cohort studies, errors are inevitable despite careful study planning and implementation and the use of error-prevention measures, and these errors may significantly affect study results. Manually cleaning a large number of questionnaires, however, is not feasible. We therefore developed and implemented a method that automates the detection of candidate errors and improves detection accuracy. The method uses a statistical model that extends principal component analysis (PCA) by incorporating known information when detecting outliers in the data.
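
As a rough illustration of PCA-based outlier screening in general, the following minimal Python sketch scores each respondent by how poorly a low-dimensional PCA reconstruction fits their answers. It uses plain PCA on synthetic data, not the extended model developed in this project, and does not incorporate known information. The function names `pca_outlier_scores` and `flag_candidates`, the median-plus-MAD threshold, and the simulated questionnaire are all assumptions made for the example.

```python
import numpy as np


def pca_outlier_scores(X, n_components):
    """Per-row reconstruction error after projecting onto the top principal axes."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axes = vt[:n_components]
    reconstructed = (centered @ axes.T) @ axes
    residual = centered - reconstructed
    return np.sqrt((residual ** 2).sum(axis=1))


def flag_candidates(scores, k=3.0):
    """Flag rows whose score exceeds median + k * scaled MAD (an assumed rule)."""
    median = np.median(scores)
    mad = np.median(np.abs(scores - median))
    return scores > median + k * 1.4826 * mad


# Synthetic "questionnaire": 200 respondents, 5 items driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
answers = latent @ np.ones((1, 5)) + 0.1 * rng.normal(size=(200, 5))
# Inject one response whose pattern contradicts the shared factor.
answers[0] = [3.0, -3.0, 3.0, -3.0, 3.0]

scores = pca_outlier_scores(answers, n_components=1)
flags = flag_candidates(scores)
print("highest-scoring row:", int(np.argmax(scores)))
print("number of flagged candidates:", int(flags.sum()))
```

Flagged rows are only candidates for review. Each injected value here lies within a plausible range for a single item, so a per-item range check would miss it, whereas the pattern across items does not fit the dominant component.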

Academic Significance and Societal Importance of the Research Achievements

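Although the importance of data cleaning is recognized, and not only in large-scale cohort studies, no method had gained worldwide consensus. Most large cohorts overseas are touchscreen-based, so the error rate at data entry is thought to be low, yet they detect only simple mismatches and differences in data format. Because this research uses differences in patterns for error detection, it can also be applied to cleaning longitudinal questionnaire data and data across family members, which had been practically impossible until now.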

Report

(4 results)
  • 2020 Annual Research Report
  • Final Research Report (PDF)
  • 2019 Research-status Report
  • 2018 Research-status Report
  • Research Products

    (6 results)

All 2020 2019 2018

All Journal Article (5 results) (of which Peer Reviewed: 5 results,  Open Access: 3 results) Presentation (1 results)

  • [Journal Article] Clustering by phenotype and genome-wide association study in autism (2020)

    • Author(s)
      Narita Akira, Nagai Masato, Mizuno Satoshi, Ogishima Soichi et al.
    • Journal Title

      Translational Psychiatry

      Volume: 10 Issue: 1 Pages: 290-290

    • DOI

      10.1038/s41398-020-00951-x

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Genome-wide association study identifies new loci for albuminuria in the Japanese population (2020)

    • Author(s)
      Hiroshi Okuda, Koji Okamoto, Michiaki Abe, Kota Ishizawa, Satoshi Makino, Osamu Tanabe, Junichi Sugawara, Atsushi Hozawa, Kozo Tanno, Makoto Sasaki, Gen Tamiya, Masayuki Yamamoto, Sadayoshi Ito, Tadashi Ishii
    • Journal Title

      Clinical and Experimental Nephrology

      Volume: In press Issue: 8 Pages: 1-9

    • DOI

      10.1007/s10157-020-01884-x

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] 3.5KJPNv2: an allele frequency panel of 3552 Japanese individuals including the X chromosome (2019)

    • Author(s)
      Tadaka S. et al., Koshiba S. et al., Kinoshita K. (25 authors in total; this researcher is the 14th)
    • Journal Title

      Hum Genome Var

      Volume: 6 Issue: 1 Pages: 28-28

    • DOI

      10.1038/s41439-019-0059-5

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Outlier detection for questionnaire data in biobanks (2019)

    • Author(s)
      Sakurai Rieko, Ueki Masao, Makino Satoshi, Hozawa Atsushi, Kuriyama Shinichi, Takai-Igarashi Takako, Kinoshita Kengo, Yamamoto Masayuki, Tamiya Gen
    • Journal Title

      International Journal of Epidemiology

      Volume: In press Issue: 4 Pages: 1305-1315

    • DOI

      10.1093/ije/dyz012

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] Goodness-of-fit test for the parametric proportional hazard regression model with interval-censored data (2018)

    • Author(s)
      Sakurai Rieko, Hattori Satoshi
    • Journal Title

      Biostatistics & Epidemiology

      Volume: 2 Issue: 1 Pages: 115-131

    • DOI

      10.1080/24709360.2018.1529347

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
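  • [Presentation] Outlier detection for questionnaire data in biobanks (2020)

    • Author(s)
      Sakurai Rieko
    • Organizer
      The 2nd Annual Meeting of the Japanese Association for Medical Artificial Intelligence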
    • Related Report
      2019 Research-status Report

Published: 2018-04-23   Modified: 2022-01-27  
