Decision-Centric Active Learning of Binary-Outcome Models

  • Foster Provost
  • Maytal Saar-Tsechansky

It can be expensive to acquire the data that businesses need for data-driven predictive modeling, for example, to model consumer preferences in order to optimize targeting. Prior research has introduced “active-learning” policies for identifying data that are particularly useful for model induction, with the goal of decreasing the statistical error for a given acquisition cost (error-centric approaches). However, predictive models are used as part of a decision-making process, and costly improvements in model accuracy do not always result in better decisions. This paper introduces a new approach to active data acquisition that specifically targets decision making. The new decision-centric approach departs from traditional active learning by emphasizing acquisitions that are more likely to affect decision making. We describe two different types of decision-centric techniques. We then compare various data-acquisition techniques using direct-marketing data. We demonstrate that strategies for reducing statistical error can be wasteful in a decision-making context, and show that one decision-centric technique in particular can improve targeting decisions significantly. We also show that this method is robust in the face of decreasing quality of utility estimates, eventually converging to uniform random sampling, and that it can be extended to situations where different data acquisitions have different costs. The results suggest that businesses should consider modifying their strategies for acquiring information through normal business transactions. For example, a firm such as Amazon.com that models consumer preferences for customized marketing may accelerate learning by proactively offering recommendations, not merely to induce immediate sales but also to improve future recommendations.
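
To make the decision-centric idea concrete, the following is a minimal illustrative sketch and not the specific techniques evaluated in the paper. It assumes a simple targeting rule (target a consumer when the estimated response probability times the profit exceeds the contact cost) and gives higher acquisition weight to candidates whose current estimate lies close to that decision threshold, since new labels there are more likely to change a decision. The function names, the linear weighting scheme, and the `floor` parameter are all hypothetical choices made for illustration.

```python
import random


def decision_threshold(cost, profit):
    """Assumed targeting rule: target when p * profit > cost, i.e. p > cost / profit."""
    return cost / profit


def decision_centric_weights(probs, threshold, floor=1e-6):
    """Hypothetical weighting: larger weight the closer an estimate is to the threshold.

    The distance is scaled by the largest possible distance from the threshold,
    so raw scores fall in [0, 1]; `floor` keeps every candidate selectable.
    """
    max_dist = max(threshold, 1.0 - threshold)
    raw = [max(1.0 - abs(p - threshold) / max_dist, floor) for p in probs]
    total = sum(raw)
    return [w / total for w in raw]


def sample_acquisitions(probs, threshold, k, seed=0):
    """Draw k distinct candidate indices, with probability proportional to weight."""
    rng = random.Random(seed)
    weights = decision_centric_weights(probs, threshold)
    candidates = list(range(len(probs)))
    chosen = []
    for _ in range(min(k, len(candidates))):
        # Pick a position among the remaining candidates, then remove it
        # so that the same candidate is never acquired twice.
        pos = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pos))
        weights.pop(pos)
    return chosen


# Illustrative numbers only: a contact cost of 1 and a profit of 10 per response.
threshold = decision_threshold(cost=1.0, profit=10.0)
estimated_probs = [0.02, 0.09, 0.50, 0.95]
weights = decision_centric_weights(estimated_probs, threshold)
picked = sample_acquisitions(estimated_probs, threshold, k=2)
```

Sampling in proportion to the weights, rather than always taking the candidates nearest the threshold, is a deliberate choice in this sketch: it keeps some exploration, which matters when the estimates that define the weights are themselves unreliable.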