|
Keynote Presentation
|
"Mining Frequent Conjunctive Queries in Star Schemas"
By Professor Dominique Laurent, ETIS, Univ. Cergy
Pontoise, France
|
 |
Bio:
Dominique Laurent received his doctoral degree in 1987 and
then his Habilitation degree in 1994 from the University
of Orléans (France). In 1988-1996, he was Assistant Professor
in the University of Orléans, and then, Professor in
the University of Tours (France) from September 1996 until
September 2003. Since then, he is Professor at the University
of Cergy-Pontoise (France), where he is the head of the
Department of Computer Science and the leader of the
database group of the laboratory ETIS (UMR CNRS). He is the
co-author of more than 60 papers published in
major international journals and conferences related to
information systems and databases. His research interests
include database theory, deductive databases, data mining,
data integration, data warehousing and OLAP queries.
Abstract:
In this
talk we address the issue of mining frequent queries in a
relational database (i.e., all queries whose answers
have a cardinality greater than a fixed threshold), a problem
known to be intractable even for conjunctive queries.
We restrict our attention to all relevant conjunctive
projection-selection-join queries in a star schemas, that is
those
queries in which the join involves the fact table. In this
particular case, it is shown that the underlying key and
foreign-key constraints can be used to define a query
pre-ordering with respect to which the support measure is
anti-monotonic. Moreover, we argue that significant
computational optimizations are possible, based on the fact
that
this pre-ordering induces an equivalence relation for which
all equivalent queries have the same support.
In the first part of the talk, we motivate our work and show
that the knowledge of frequent queries allows to discover
not only association rules, but also unknown constraints
holding on the database (such as functional and inclusion
dependencies as well as conditional functional dependencies).
Then, we provide the basic definitions and properties
of the considered query pre-ordering and its induced
equivalence relation.
In the second part of the talk, we provide algorithms for the
computation of frequent queries, and we focus on the
computational aspects. We show that the complexity of our
algorithms is linear in the size of the database to be mined.
We report on experiments showing the efficiency of our method.
Back
|