Aggregation and Concept Complexity in Relational Learning

  • Claudia Perlich
  • Foster Provost

Traditional, flat-table learning algorithms have been scrutinized for many years from many angles, and are well understood with respect to their applicability and expressive power.  Much less can be said about relational learning approaches.  One main reason is the lack of clarity of the space of possible relational concepts.

We believe that a workshop that brings together people with a variety of perspectives on relational learning represents a unique opportunity to develop a formal characterization of relational concepts. Such formalization would contribute to research on relational learning in a number of ways.

1. It would provide a framework for the theoretical analysis and comparison of relational learners with respect to their ability to express and learn certain concepts.

2. It could thereby provide guidance in the choice of a method for a particular domain.

3. It would be a valuable tool for addressing the issue of model and representation complexity.

4. It would help to clarify the problem domain that a practitioner/research effort is targeting.

5. It would help to specify the scope of generalization of scientific claims.

We argue for a hierarchy of increasingly more general relational concept classes, conveniently placing standard, single-table (“propositional”) learning algorithms as learning concepts in the most specialized class.  Some other, more general, cases of relational concepts have been examined already, for example attribute hierarchies (Almuallim et al., 1995).  The most general (and complex) class would capture global concepts that include the entire relational structure and all object attributes.

We expect learners developed for more specialized concept classes to apply broadly within the class, but of course to be suboptimal (biased) when a more general class is required.  Nevertheless, it may be that the biased learning method actually performs better than an unbiased counterpart, for some problems.  This is in direct analogy to what has been observed time and time again in machine learning.  For example, linear models (including naïve Bayes) learn a more restricted concept class than does tree induction.  However, they are extremely useful. In some situations, linear models are preferable to tree induction even for problems where the true, underlying model is represented better by a tree (Domingos & Pazzani, 1996; Perlich, Simonoff, & Provost, 2003).  Relational learning systems (e.g., ILP systems) often perform suboptimally on purely propositional tasks, even when in principle they are capable of representing the true concept.

Furthermore, given the extreme complexity of higher level relational concept classes, it is likely that modeling approaches designed for lower levels will have broader expressive power within the lower-level class.  For example, ILP systems can represent very complex relational concepts.  However, they cannot take full advantage of statistical properties like relational auto correlation (Jensen and Neville 2000), which can be extremely useful for modeling in relational domains.

A well-designed concept-class hierarchy will also facilitate a “bi-directional search” approach to relational learning research.  Although we are not aware of it being framed as such, this type of approach already is being taken.  Some researchers are asking how ILP systems can be extended to deal more robustly with more specialized concept classes, and recently there has been a surge of interest in generalizing propositional algorithms (e.g., Bayesian networks, decision trees, logistic regression, and naïve Bayes).  A concept-class hierarchy will clarify and to some degree quantify the nature of extensions of propositional learning methods.  For example, these extensions of propositional methods typically cannot learn concepts as general as can ILP systems.  However, they may be more robust at learning concepts at the lower level.  Research results along these lines are much more satisfying than saying “my relational Foo algorithm is better than FOIL on the domains I’ve tested.”