This article presents a combination of unsupervised and supervised learning techniques for generating word segmentation rules from a list of words. First, a bias for word segmentation is introduced, and a simple genetic algorithm searches for the segmentation that corresponds to the best bias value. In the second phase, the segmentation obtained from the genetic algorithm is used as input for two inductive logic programming algorithms, namely FOIDL and CLOG. The result is a logic program that can be used to segment unseen words. The learnt program contains affixes that are characteristic of the given language and can be used in other morphology tasks.
Published - 1998
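
The first phase, a genetic algorithm searching for the segmentation with the best bias value, can be illustrated with a minimal sketch. The abstract does not define the bias, so the code below assumes one for illustration only: each word is cut once into a prefix and a suffix, and the cost is the total number of characters in the set of distinct prefixes plus the set of distinct suffixes (lower is better). The function names (`lexicon_cost`, `evolve`, `segment`), the selection scheme, and all parameter values are hypothetical choices, not taken from the article.

```python
import random

def lexicon_cost(words, cuts):
    """Assumed bias: characters in distinct prefixes plus distinct suffixes."""
    prefixes = {w[:c] for w, c in zip(words, cuts)}
    suffixes = {w[c:] for w, c in zip(words, cuts)}
    return sum(map(len, prefixes)) + sum(map(len, suffixes))

def evolve(words, pop_size=40, generations=200, mutation_rate=0.1, seed=0):
    """Search for one cut position per word that minimises lexicon_cost."""
    rng = random.Random(seed)

    def random_cuts():
        return [rng.randint(0, len(w)) for w in words]

    # Include the trivial segmentation (whole word, empty suffix) so the
    # result can never be worse than leaving every word unsegmented.
    population = [[len(w) for w in words]]
    population += [random_cuts() for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=lambda cuts: lexicon_cost(words, cuts))
        survivors = population[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            point = rng.randint(0, len(words))   # one-point crossover
            child = a[:point] + b[point:]
            for i, w in enumerate(words):        # mutation: re-draw a cut position
                if rng.random() < mutation_rate:
                    child[i] = rng.randint(0, len(w))
            children.append(child)
        population = survivors + children

    return min(population, key=lambda cuts: lexicon_cost(words, cuts))

def segment(words, cuts):
    """Turn cut positions into (prefix, suffix) pairs."""
    return [(w[:c], w[c:]) for w, c in zip(words, cuts)]

words = ["walked", "walking", "talked", "talking", "jumped", "jumping"]
best = evolve(words)
print(segment(words, best), lexicon_cost(words, best))
```

Under this assumed bias, cuts that let many words share the same prefixes and suffixes shrink both sets, which is why recurring affixes tend to be favoured. The resulting (prefix, suffix) pairs are the kind of segmented examples that the second, supervised phase would take as input.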