The WoRLD: Knowledge Discovery from Multiple Distributed Databases

  • John Aronis
  • Bruce Buchanan
  • Venkateswarlu Kolluri
  • Foster Provost

Inductive machine learning offers techniques for discovering new knowledge from business, medical, and scientific databases. Most techniques assume that all the information relevant for discovery has been gathered and assembled into a single table or database. With multiple databases, it is possible to combine features from several perspectives and thus move beyond the confines of an ontology fixed by the designers of a single database. We introduce WoRLD (“Worldwide Relational Learning Daemon”), a system that uses spreading activation to enable inductive learning from multiple tables in multiple databases distributed across the network. We describe the paradigm and the system, provide demonstrations on synthetic data sets, and then replicate two real-world successes of automated discovery.