Development and Applications of Knowledge Discovery System for Data Analysis
Grant-in-Aid for Scientific Research (C)
|Research Institution||Kwansei Gakuin University|
HIGA Mayumi Kwansei Gakuin University, Information Processing Research Center, Professor (90103134)
OKADA Takashi Kwansei Gakuin University, Information Processing Research Center, Professor (00103135)
|Project Fiscal Year
1995 – 1997
Completed (Fiscal Year 1997)
|Budget Amount
¥2,000,000 (Direct Cost : ¥2,000,000)
Fiscal Year 1997 : ¥600,000 (Direct Cost : ¥600,000)
Fiscal Year 1996 : ¥600,000 (Direct Cost : ¥600,000)
Fiscal Year 1995 : ¥800,000 (Direct Cost : ¥800,000)
|Keywords||knowledge discovery / data mining / graph structure / time series data / syntactic tree / data analysis / decision tree / visualization / rule discovery / scatter plot / multivariate analysis|
This research project had two objectives. The first was to establish the usefulness of machine-learning-based data mining methodology in data analysis. The second was to develop a new data mining method for the analysis of structured objects. The main results are summarized as follows.
I. Evaluation of data mining software for ordinary table-structured datasets
(1) Two datasets, disease types from symptoms and bioactivity from chemical structure descriptors, were examined with IDIS and Datalogic/R. The results showed that these software packages are useful as tools for exploratory data analysis.
(2) The Dlx software was developed to provide data visualization for the rules derived by Datalogic/R.
(3) Sentences written by three famous novelists were examined. The derived rules clearly explained the characteristics of each novelist's sentences.
(4) The analysis of user profile datasets for multimedia equipment yielded some new rules that could not be obtained by the usual statistical methods.
(5) A financial time series dataset was examined. The results showed the need to develop a new methodology for handling structured data.
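The kind of rule derivation evaluated in items (1) to (5) can be illustrated with a minimal sketch. It is not the algorithm used by IDIS or Datalogic/R, which this summary does not describe. It only assumes that a rule has the form "attribute = value implies class" and that a rule is kept when it covers enough rows (support) and is right often enough (confidence). The function name, the thresholds, and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict

def single_condition_rules(rows, target, min_support=2, min_confidence=0.8):
    """Return rules 'attribute = value -> target = label' meeting both thresholds."""
    # (attribute, value) -> counts of the target labels seen with that pair
    counts = defaultdict(Counter)
    for row in rows:
        for attribute, value in row.items():
            if attribute != target:
                counts[(attribute, value)][row[target]] += 1

    rules = []
    for (attribute, value), label_counts in counts.items():
        label, hits = label_counts.most_common(1)[0]
        support = sum(label_counts.values())   # rows matching the condition
        confidence = hits / support            # share of those rows with the label
        if support >= min_support and confidence >= min_confidence:
            rules.append((attribute, value, label, support, confidence))
    # Most reliable rules first, then rules covering more rows.
    return sorted(rules, key=lambda r: (-r[4], -r[3]))

# Hypothetical toy records, invented purely for illustration.
records = [
    {"fever": "yes", "cough": "yes", "disease": "A"},
    {"fever": "yes", "cough": "no", "disease": "A"},
    {"fever": "no", "cough": "yes", "disease": "B"},
    {"fever": "no", "cough": "no", "disease": "B"},
    {"fever": "yes", "cough": "yes", "disease": "A"},
]
rules = single_condition_rules(records, target="disease")
for attribute, value, label, support, confidence in rules:
    print(f"{attribute} = {value} -> disease = {label} "
          f"(support {support}, confidence {confidence:.2f})")
```

Rules of this flat form only see one table cell at a time, which is one way to read the limitation noted in item (5): a time series or a tree has structure that a single attribute-value test cannot express.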
II. Development of data mining software for structured data objects
(6) A theoretical analysis of how to handle syntactic trees was performed, leading to a new data analysis framework based on decision trees.
(7) This framework was implemented as SYKD (SYntactic tree analysis by Knowledge Discovery), a Windows application written in C++.
(8) Syntactic trees in the EDR corpus were analyzed with SYKD. Several new pieces of structural knowledge were obtained about the usage of case particles in Japanese sentences.
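To make the idea of a decision-tree analysis over syntactic trees more concrete, here is a minimal sketch of a single split step. It is not SYKD's method, whose details are not given in this summary. The sketch assumes that each tree is a nested tuple `(label, child, ...)`, that the candidate tests are "does the tree contain this parent-child edge?", and that the best test is the one with the highest information gain. All names, the tree encoding, and the toy examples are hypothetical.

```python
import math
from collections import Counter

def edges(tree):
    """Collect (parent label, child label) pairs from a nested-tuple tree."""
    label, children = tree[0], tree[1:]
    found = set()
    for child in children:
        found.add((label, child[0]))
        found |= edges(child)
    return found

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def best_structural_split(examples):
    """Return the edge whose presence/absence gives the largest information gain."""
    labels = [label for _, label in examples]
    base = entropy(labels)
    edge_sets = [edges(tree) for tree, _ in examples]
    best_edge, best_gain = None, 0.0
    for edge in sorted(set().union(*edge_sets)):
        present = [l for s, l in zip(edge_sets, labels) if edge in s]
        absent = [l for s, l in zip(edge_sets, labels) if edge not in s]
        if not present or not absent:
            continue  # the edge does not separate the examples at all
        remainder = (len(present) * entropy(present)
                     + len(absent) * entropy(absent)) / len(labels)
        gain = base - remainder
        if gain > best_gain:
            best_edge, best_gain = edge, gain
    return best_edge, best_gain

# Hypothetical toy trees with placeholder class labels "A" and "B";
# they are invented for illustration and make no linguistic claim.
examples = [
    (("S", ("NP", ("N",)), ("VP", ("V",))), "A"),
    (("S", ("NP", ("N",)), ("VP", ("V",))), "A"),
    (("S", ("VP", ("NP", ("N",)), ("V",))), "B"),
    (("S", ("VP", ("NP", ("N",)), ("V",))), "B"),
]
split_edge, split_gain = best_structural_split(examples)
print(split_edge, split_gain)
```

Applying the same step recursively to the "present" and "absent" groups would grow a full decision tree whose tests are structural properties of the parse trees rather than table columns.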
Research Output (14 results)