OWL-Miner: Concept Induction in OWL Knowledge Bases
Date
2018
Authors
Ratcliffe, David
Abstract
The Resource Description Framework (RDF) and Web Ontology Language (OWL) have been widely used in recent years, and automated methods for the analysis of data and knowledge directly within these formalisms are of current interest. Concept induction is a technique for discovering descriptions of data, such as inducing OWL class expressions to describe RDF data. These class expressions capture patterns in the data which can be used to characterise interesting clusters or to act as classification rules over unseen data. The semantics of OWL is underpinned by Description Logics (DLs), a family of expressive and decidable fragments of first-order logic.
Recently, methods of concept induction which are well studied in the field of Inductive Logic Programming have been applied to the related formalism of DLs. These methods have been developed for a number of purposes, including unsupervised clustering and supervised classification. Refinement-based search is a concept induction technique which structures the search space of DL concept/OWL class expressions and progressively generalises or specialises candidate concepts to cover example data, as guided by quality criteria such as accuracy. However, the current state of the art in this area is limited in that such methods were not primarily designed to scale over large RDF/OWL knowledge bases; do not support class languages as expressive as OWL2-DL; or are limited to one purpose, such as learning OWL classes for integration into ontologies. Our work addresses these limitations by increasing the efficiency of these learning methods whilst permitting a concept language up to the expressivity of OWL2-DL classes. We describe methods which support both classification (predictive induction) and subgroup discovery (descriptive induction), which, in this context, are fundamentally related.
We have implemented our methods as a system called OWL-Miner and show by evaluation that our methods outperform state-of-the-art systems for DL learning in both the quality of solutions found and the speed with which they are computed. Furthermore, we achieve the best-ever ten-fold cross-validation accuracy results on the long-standing benchmark problem of carcinogenesis. Finally, we present a case study on ongoing work in the application of OWL-Miner to a real-world problem directed at improving the efficiency of biological macromolecular crystallisation.
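
The refinement-based search described in the abstract can be illustrated with a toy sketch. This is not OWL-Miner's algorithm or API: it assumes a hypothetical four-individual knowledge base, restricts concepts to conjunctions of atomic classes, uses a simple downward refinement operator that adds one atom at a time, and checks coverage by set inclusion rather than by a DL reasoner. All names (KB, covers, refine, search) are invented for illustration.

```python
import heapq
from itertools import count

# Toy knowledge base (hypothetical): individual -> atomic classes it belongs to.
KB = {
    "a": {"Compound", "HasNitro", "Aromatic"},
    "b": {"Compound", "HasNitro"},
    "c": {"Compound", "Aromatic"},
    "d": {"Compound"},
}
ATOMS = sorted(set().union(*KB.values()))


def covers(concept, individual):
    """A concept is a conjunction of atomic classes; the empty conjunction is Top."""
    return concept <= KB[individual]


def accuracy(concept, positives, negatives):
    """Fraction of examples classified correctly by membership in the concept."""
    true_pos = sum(covers(concept, i) for i in positives)
    true_neg = sum(not covers(concept, i) for i in negatives)
    return (true_pos + true_neg) / (len(positives) + len(negatives))


def refine(concept):
    """Downward refinement: specialise by conjoining one more atomic class."""
    return [concept | {atom} for atom in ATOMS if atom not in concept]


def search(positives, negatives, max_expansions=100):
    """Best-first search from Top, always expanding the most accurate candidate."""
    start = frozenset()
    best, best_score = start, accuracy(start, positives, negatives)
    tie = count()  # tie-breaker so the heap never compares frozensets
    frontier = [(-best_score, next(tie), start)]
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, concept = heapq.heappop(frontier)
        for candidate in refine(concept):
            if candidate in seen:
                continue
            seen.add(candidate)
            score = accuracy(candidate, positives, negatives)
            if score > best_score:
                best, best_score = candidate, score
            heapq.heappush(frontier, (-score, next(tie), candidate))
    return best, best_score


best, score = search(positives=["a", "b"], negatives=["c", "d"])
print(sorted(best), score)
```

On this toy data the search specialises Top to the single-atom concept HasNitro, which separates the positive examples from the negative ones. A system working over real OWL class expressions would need a richer refinement operator (for example, over role restrictions) and a reasoner to compute coverage.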
Keywords
OWL, RDF, RDFS, DL, description logic, machine learning, subgroup discovery, concept learning, concept induction, knowledge base, semantic web
Type
Thesis (PhD)