Aggregation-Based Feature Invention and Relational Concept Classes

Claudia Perlich
Foster Provost

Venue: Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003)
2003
Type: Selected Conference Paper
Acceptance rate: 13%

Model induction from relational data requires aggregation of the values of attributes of related entities. This paper makes three contributions to the study of relational learning. (1) It presents a hierarchy of relational concepts of increasing complexity, using relational schema characteristics such as cardinality, and derives classes of aggregation operators that are needed to learn these concepts. (2) Expanding one level of the hierarchy, it introduces new aggregation operators that model the distributions of the values to be aggregated and (for classification problems) the differences in these distributions by class. (3) It demonstrates empirically on a noisy business domain that more-complex aggregation methods can increase generalization performance. Constructing features using target-dependent aggregations can transform relational prediction tasks so that well-understood feature-vector-based modeling algorithms can be applied successfully.
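
As a rough illustration of the contrast the abstract draws, the sketch below computes simple aggregates (count, mode) and a target-dependent aggregate for a bag of related values. Everything here is an assumption made for illustration: the toy customer/purchase data, the function names, and the overlap measure are hypothetical and are not the operators defined in the paper. The class reference distributions are built from training entities only, so that the label of the entity being described does not leak into its own features.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: target entities (customers) with class labels,
# and a one-to-many related table of (customer_id, product_category).
labels = {"c1": 1, "c2": 0, "c3": 1, "c4": 0}
purchases = [
    ("c1", "book"), ("c1", "book"), ("c1", "music"),
    ("c2", "garden"), ("c2", "garden"),
    ("c3", "book"), ("c3", "music"),
    ("c4", "garden"), ("c4", "book"),
]

def bag_per_entity(rows):
    """Group related values into one bag (multiset) per target entity."""
    bags = defaultdict(Counter)
    for key, value in rows:
        bags[key][value] += 1
    return bags

def simple_features(bag):
    """Target-independent aggregates: bag size and most frequent value."""
    return {"count": sum(bag.values()), "mode": bag.most_common(1)[0][0]}

def normalize(counter):
    """Turn raw counts into a relative-frequency distribution."""
    total = sum(counter.values())
    return {k: v / total for k, v in counter.items()}

def class_reference(bags, labels, train_ids):
    """Pool the bags of training entities by class and normalize each pool."""
    pooled = defaultdict(Counter)
    for eid in train_ids:
        pooled[labels[eid]].update(bags[eid])
    return {cls: normalize(c) for cls, c in pooled.items()}

def overlap(p, q):
    """Histogram overlap: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(p.get(k, 0.0), q.get(k, 0.0)) for k in set(p) | set(q))

def target_dependent_features(bag, reference):
    """One feature per class: how closely the bag matches that class's pool."""
    p = normalize(bag)
    return {f"overlap_class_{cls}": overlap(p, q) for cls, q in reference.items()}

bags = bag_per_entity(purchases)
reference = class_reference(bags, labels, train_ids=["c1", "c2"])
feats_c3 = target_dependent_features(bags["c3"], reference)
```

The resulting dictionaries can be concatenated into an ordinary feature vector per entity and passed to any feature-vector-based learner.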
