Research Abstract
During the first phase of the project, we studied the average-case complexity of Lange and Wiehagen's (1991) pattern language learning algorithm, which learns the class of all pattern languages in the limit from positive data. We developed an average-case model of the total learning time, i.e., the overall time the algorithm takes until it converges. The following results were obtained for this algorithm. Let $A = \{0, 1, \ldots\}$ be any finite alphabet. For every pattern $\pi$ containing $\kappa$ different variables, the total learning time is $O(|\pi|^2 \log_{|A|}(|A| + \kappa))$ in the best case and unbounded in the worst case. With respect to the uniform distribution, the expected total learning time is $O(2^{\kappa} \kappa^2 |A| |\pi|^2 \log_{|A|}(\kappa |A|))$. However, most of the framework developed is distribution independent. Subsequently, using refined techniques, we improved the result to $O(2^{\kappa} \kappa^2 |\pi|^2 \log_{|A|}(\kappa |A|))$, which is qualitatively different: because $\log_{|A|}(\kappa |A|) = 1 + \log_{|A|} \kappa$, the bound on the expected total learning time decreases as the alphabet size increases. The results have numerous consequences for, and applications to, other pattern language learning algorithms, e.g., for learning pattern languages in the query model and from good examples. In the second phase, we intensively studied the learnability of one-variable pattern languages from positive data and obtained three new algorithms. The first algorithm is a considerable and uniform improvement over Angluin's (1980) algorithm: it reduces the time to compute descriptive updates from $O(\eta^4 \log \eta)$ to $O(\eta^2 \log \eta)$ for inputs of size $\eta$. This algorithm can also be parallelized effectively, and a slight modification of it learns all one-variable pattern languages within Angluin's (1988) query model from superset queries only. Finally, we succeeded in designing the first one-variable pattern language learning algorithm whose expected total learning time is $O(\ell^2 \log \ell)$, provided the positive data are drawn randomly with respect to a probability distribution with expected string length $\ell$. Thus, the expected total learning time of our algorithm is provably larger than the best known update time by only a constant factor, although the total learning time remains unbounded in the worst case. A further paper analyzing the total learning time of Haussler's (1987) prediction algorithm for learning conjunctive concepts is in preparation.
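To make the object of the first-phase analysis concrete, here is a minimal Python sketch of the update rule of Lange and Wiehagen's (1991) learner as it is commonly described in the literature: keep the hypothesis if the new string is longer, replace it if the new string is shorter, and merge position by position if the lengths agree. The function names (`union`, `lw_update`, `learn`, `show`) and the encoding of patterns as lists (characters for constants, integers for variable indices) are assumptions of this sketch and do not come from the abstract.

```python
from typing import Dict, Iterable, List, Optional, Tuple, Union

# Hypothetical encoding: a str is a constant, an int is a variable index.
Symbol = Union[str, int]
Pattern = List[Symbol]


def union(h: Pattern, s: str) -> Pattern:
    """Merge pattern h with an equal-length string s, position by position."""
    if len(h) != len(s):
        raise ValueError("union is only defined for equal lengths")
    seen: Dict[Tuple[Symbol, str], int] = {}  # (h[i], s[i]) -> variable index
    out: Pattern = []
    for a, b in zip(h, s):
        if a == b:
            out.append(a)  # both agree on a constant: keep it
        else:
            # Disagreement: reuse the variable for an identical pair,
            # otherwise introduce the next fresh variable.
            if (a, b) not in seen:
                seen[(a, b)] = len(seen)
            out.append(seen[(a, b)])
    return out


def lw_update(h: Optional[Pattern], s: str) -> Pattern:
    """One hypothesis update on a new positive example s."""
    if h is None or len(s) < len(h):
        return list(s)      # first example, or a shorter string: restart from s
    if len(h) < len(s):
        return h            # longer strings carry no information for this rule
    return union(h, s)      # equal length: generalize


def learn(examples: Iterable[str]) -> Optional[Pattern]:
    """Feed the examples in order and return the final hypothesis."""
    h: Optional[Pattern] = None
    for s in examples:
        h = lw_update(h, s)
    return h


def show(p: Pattern) -> str:
    """Render a pattern with variables written as x0, x1, ..."""
    return " ".join(c if isinstance(c, str) else f"x{c}" for c in p)


if __name__ == "__main__":
    # Positive examples of the (non-erasing) pattern language of x0 1 x0.
    print(show(learn(["01101", "010", "111"])))  # prints: x0 1 x0
```

Each update touches only the current hypothesis and the newest string, so the cost of a single update is small; the total learning time studied above depends on how many examples arrive before the shortest strings that determine the pattern have been seen.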