## LSA Report

### Refine

#### Keywords

- CAPlan (2)
- Abstraction (1)
- Expertensysteme (1)
- Genetic Algorithm (1)
- Instance-based Learning (1)
- Nearest-Neighbor Classification (1)
- Problem Solvers (1)
- Smalltalk (1)
- Topology Preserving Networks (1)
- automated computer learning (1)

- 97,5E
- Using Prolog with Smalltalk-based Problem Solvers for Planning and Design Tasks (1999)
- Rules are an important knowledge representation formalism in constructive problem solving. On the other hand, object orientation is an essential key technology for maintaining large knowledge bases as well as software applications. Trying to take advantage of the benefits of both paradigms, we integrated Prolog and Smalltalk to build a common base architecture for problem solving. This approach has proven to be useful in the development of two knowledge-based systems for planning and configuration design (CAPlan and Idax). Both applications use Prolog as an efficient computational source for the evaluation of knowledge represented as rules.

- 96,2E
- Optimized Nearest-Neighbor Classifiers Using Generated Instances (1996)
- We present a novel approach to classification, based on a tight coupling of instancebased learning and a genetic algorithm. In contrast to the usual instance-based learning setting, we do not rely on (parts of) the given training set as the basis of a nearestneighbor classifier, but we try to employ artificially generated instances as concept prototypes. The extremely hard problem of finding an appropriate set of concept prototypes is tackled by a genetic search procedure with the classification accuracy on the given training set as evaluation criterion for the genetic fitness measure. Experiments with artificial datasets show that - due to the ability to find concise and accurate concept descriptions that contain few, but typical instances - this classification approach is considerably robust against noise, untypical training instances and irrelevant attributes. These favorable (theoretical) properties are corroborated using a number of hard real-world classification problems.

- 96,6E
- On Learning and Co-learning of Minimal Programs (1999)
- Freivalds, Karpinski and Smith [8] explored a special type of learning in the limit: identification of an unknown concept (function) by eliminating (erasing) all but one possible hypothesis (this type of learning is called co-learning). The motivation behind learning by erasing lies in the process of human and automated computer learning: often we can discard incorrect solutions much easier than to come up with the correct one. In Gödel numberings any learnable family can be learned by an erasing strategy. In this paper we concentrate on co-learning minimal programs. We show that co-learning of minimal programs, as originally defined is significantly weaker than learning minimal programs in Gödel numberings. In order to enhance the learning power

- 97,4E
- Investigating the Use of nearest-neighbour Interpolation for Cancer Research (1997)
- We investigate in how far interpolation mechanisms based on the nearest-neighbor rule (NNR) can support cancer research. The main objective is to usethe NNR to predict the likelihood of tumorigenesis based on given risk factors.By using a genetic algorithm to optimize the parameters of the nearest-neighbourprediction, the performance of this interpolation method can be improved sub-stantially. Furthermore, it is possible to detect risk factors which are hardly ornot relevant to tumorigenesis. Our preliminary studies demonstrate that NNR-based interpolation is a simple tool that nevertheless has enough potential to beseriously considered for cancer research or related research.

- 96,1E
- Interpreting Topology Preserving Networks (1999)
- In this report, we first propose a dichotomy of topology preserving network models based on the degree to which the structure of a network is determined by the given task. We then look closer at one of those groups and investigate the information that is contained in the graph structure of a topology preserving neural network. The task we have in mind is the usage of the network's topology for the retrieval of nearest neighbors of a neuron or a query, as it is of importance, e.g., in medical diagnosis systems. In general considerations, we propose certain properties of the structure and formulate the respective expectable results of network interpretation. From the results we conclude that both topology preservation as well as neuron distribution are highly influential for the network semantics. After a short survey on hierarchical models for data analysis, we propose a new network model that fits both needs. This so called SplitNet model dynamically constructs a hierarchically structured network that provides interpretability by neuron distribution, network topology and hierarchy of the network layers. We present empirical results for this new model and demonstrate its application in the medical domain of nerve lesion diagnosis. Further, we explain a view how the interpretation of the hierarchy in models like SplitNet can be understood in the context of integration of symbolic and connectionist learning.

- 96,7E
- Intelligente Suche im Internet mit Lernenden Systemen (1996)
- Der Trend zu einer immer stärkeren Kopplung von Systemen bei gleichzeitiger Dezentralisierung durch Vernetzung hat dazu geführt, daß Computernutzern auf Wunsch enorme Datenmengen zur Verfügung stehen, die sich einer sinnvollen Bearbeitung durch den Nutzer allein völlig entziehen. Unterschiedliche Repräsentationsformalismen für Informationen, Mehrdeutigkeiten, Redundanz sowie eingeschränkte Verfügbarkeit sowohl von Informationen als auch von Rechenleistung machen konventionelle Suchverfahren unanwendbar. Stattdessen werden Suchverfahren und Programme benötigt, die sich intelligent an unterschiedliche Formalismen anpassen, ihre Handlungen ständig evaluieren und fähig sind, ihre Benutzer individuell zu unterstützen. Schlagwörter wie Knowbots, Search-Engines oder Data-Miningsind deshalb zur Zeit in aller Munde. Ein umfassendes Buch, das die hinter diesen und ähnlichen Schlagwörtern verborgenen Ideen und Konzepte präsentiert, existiert jedoch zur Zeit noch nicht. Dies war für uns die Motivation, das Thema "Intelligente Suche im Internet mit Lernenden Systemen" in einem Seminar zu behandeln. Wir haben damit ein Forschungsgebiet aufgegriffen, das sowohl für alle am LSA beteiligten Gruppen von Interesse ist, aber darüber hinaus aktuell von vielen Seiten aufmerksam beobachtet wird. Daher haben wir uns entschlossen, die Ausarbeitungen, die im Rahmen dieses Seminars von den TeilmehmerInnen erstellt wurden, durch den vorliegenden Bericht einer breiteren Öffentlichkeit zugänglich zu machen.

- 97,3E
- Flexible Proof-Replay with Heuristics (1997)
- We present a general framework for developing search heuristics for au-tomated theorem provers. This framework allows for the construction ofheuristics that are on the one hand able to replay (parts of) a given prooffound in the past but are on the other hand flexible enough to deviate fromthe given proof path in order to solve similar proof problems. We substanti-ate the abstract framework by the presentation of three distinct techniquesfor learning appropriate search heuristics based on soADcalled features. Wedemonstrate the usefulness of these techniques in the area of equational de-duction. Comparisons with the renowned theorem prover Otter validatethe applicability and strength of our approach.

- 98,2E
- Extending Problem Specifications for Plan-Space Planners (1999)
- Problem specifications for classical planners based on a STRIPS-like representation typically consist of an initial situation and a partially defined goal state. Hierarchical planning approaches, e.g., Hierarchical Task Network (HTN) Planning, have not only richer representations for actions but also for the representation of planning problems. The latter are defined by giving an initial state and an initial task network in which the goals can be ordered with respect to each other. However, studies with a specification of the domain of process planning for the plan-space planner CAPlan (an extension of SNLP) have shown that even without hierarchical domain representation typical properties called goal orderings can be identified in this domain that allow more efficient and correct case retrieval strategies for the case-based planner CAPlan/CbC. Motivated by that, this report describes an extension of the classical problem specifications for plan-space planners like SNLP and descendants. These extended problem specifications allow to define a partial order on the planning goals which can interpreted as an order in which the solution plan should achieve the goals. These goal ordering can theoretically and empirically be shown to improve planning performance not only for case-based but also for generative planning. As a second but different way we show how goal orderings can be used to address the control problem of partial order planners. These improvements can be best understood with a refinement of Barrett's and Weld's extended taxonomy of subgoal collections.

- 95,8E
- Exploiting past proof experience (1999)
- We are going to present two methods that allow to exploit previous expe-rience in the area of automated deduction. The first method adapts (learns)the parameters of a heuristic employed for controlling the application of infer-ence rules in order to find a known proof with as little redundant search effortas possible. Adaptation is accomplished by a genetic algorithm. A heuristiclearned that way can then be profitably used to solve similar problems. Thesecond method attempts to re-enact a known proof in a flexible manner in orderto solve an unknown problem whose proof is believed to lie in (close) vicinity.The experimental results obtained with an equational theorem prover show thatthese methods not only allow for impressive speed-ups, but also make it possibleto handle problems that were out of reach before.

- 96,9E
- Experiments in the Automatic Selection of Problem-solving Strategies (1999)
- We present an approach to automating the selection of search-guiding heuris-tics that control the search conducted by a problem solver. The approach centerson representing problems with feature vectors that are vectors of numerical val-ues. Thus, similarity between problems can be determined by using a distancemeasure on feature vectors. Given a database of problems, each problem beingassociated with the heuristic that was used to solve it, heuristics to be employedto solve a novel problem are suggested in correspondence with the similaritybetween the novel problem and problems of the database.Our approach is strongly connected with instance-based learning and nearest-neighbor classification and therefore possesses incremental learning capabilities.In experimental studies it has proven to be a viable tool for achieving the finaland crucial missing piece of automation of problem solving - namely selecting anappropriate search-guiding heuristic - in a flexible way.This work was supported by the Deutsche Forschungsgemeinschaft (DFG).