Keynote Speakers
"Binary Data Mining"
by Professor Václav Snášel, Vice-Dean for Research
and Science, VSB-Technical University of Ostrava, Czech Republic
Abstract
Binary data occupy a special place in the
domain of data analysis. Analysis of binary data sets,
however, generally leads to NP-complete or NP-hard problems.
Consequently, the focus here is on effective heuristics
for reducing the problem size.
Matrix factorization, or factor analysis, is an important
task in the analysis of high-dimensional real-world
data. There are several well-known methods and
algorithms for the factorization of real-valued data, but many
application areas, including information retrieval, pattern
recognition and data mining, require processing of binary
rather than real-valued data. Unfortunately, the methods used for
real matrix factorization fail in the latter case. In this
paper we introduce the background for binary matrix
factorization.
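As a rough illustration of why real-valued methods do not carry over directly, the sketch below compares the ordinary matrix product with a Boolean matrix product, in which addition is replaced by logical OR. The abstract does not spell out the formulation used in the talk, so this Boolean-product view is an assumption, and the matrices `usage` and `basis` are hypothetical toy data.

```python
def boolean_product(a, b):
    """Boolean matrix product: entry (i, j) is OR over k of (a[i][k] AND b[k][j])."""
    inner, cols = len(b), len(b[0])
    return [
        [int(any(row[k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for row in a
    ]


def real_product(a, b):
    """Ordinary matrix product over the integers, for comparison."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


# Hypothetical toy data: 3 objects described by 2 overlapping binary factors.
usage = [[1, 0], [0, 1], [1, 1]]   # objects x factors
basis = [[1, 1, 0], [0, 1, 1]]     # factors x attributes

boolean_result = boolean_product(usage, basis)  # stays within {0, 1}
real_result = real_product(usage, basis)        # overlap produces a 2

print(boolean_result)
print(real_result)
```

Where the two factors overlap, the ordinary product yields an entry outside {0, 1}, whereas the Boolean product keeps the reconstruction binary.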
In order to perform object recognition of any kind,
it is necessary to learn representations of the
underlying characteristic components. Such components
correspond to object parts, or features. The data sets involved
may comprise discrete attributes, such as those from
market basket analysis, information retrieval, and
bioinformatics, as well as continuous attributes, such as
those in scientific simulations, astrophysical
measurements, and sensor networks.
Feature extraction, when applied to binary data sets,
serves many research and application fields, such as
association rule mining, market basket analysis, and discovery
of regulation patterns in DNA microarray experiments.
The so-called bars problem is used as the benchmark: a set of
artificial signals, each generated as a Boolean sum of a given
number of bars, is analyzed by these methods. Here we will
concentrate on the case of black-and-white pictures of
bar combinations represented as binary vectors, so
complex feature extraction methods are unnecessary.
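The bars benchmark described above can be sketched as follows. This is a minimal generator under stated assumptions: a square grid, one horizontal and one vertical bar per row and column, and a fixed number of distinct bars chosen uniformly at random per signal. The function names, grid size and sampling scheme are illustrative choices, not details taken from the talk.

```python
import random


def make_bars(size):
    """Return the 2 * size basic bars of a size x size grid as flat binary vectors."""
    cells = size * size
    bars = []
    for r in range(size):   # horizontal bars: all cells in row r
        bars.append([1 if i // size == r else 0 for i in range(cells)])
    for c in range(size):   # vertical bars: all cells in column c
        bars.append([1 if i % size == c else 0 for i in range(cells)])
    return bars


def boolean_sum(vectors, length):
    """Element-wise OR of binary vectors (the Boolean sum)."""
    result = [0] * length
    for vec in vectors:
        result = [x | y for x, y in zip(result, vec)]
    return result


def generate_signals(size, count, bars_per_signal, seed=0):
    """Generate `count` signals, each the Boolean sum of distinct random bars."""
    rng = random.Random(seed)
    bars = make_bars(size)
    return [
        boolean_sum(rng.sample(bars, bars_per_signal), size * size)
        for _ in range(count)
    ]


signals = generate_signals(size=4, count=5, bars_per_signal=2)
for signal in signals:
    print(signal)
```

Each printed vector is a flattened 4 x 4 black-and-white picture. A factorization method is then judged by whether it recovers the individual bars from such mixtures.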
Many applications in computer and system science involve
the analysis of large-scale and often high-dimensional data.
When dealing with such extensive information collections,
it is usually computationally very expensive to perform
some operations on the raw form of the data. Therefore,
suitable methods that approximate the data in lower
dimensions or with lower rank are needed. In the
following, we focus on the factorization of
high-dimensional binary data or high-order binary
tensors.
Bio
Prof. RNDr. Václav Snášel, CSc., received
the Ph.D. degree in Algebra from Masaryk University,
Brno, in 1991. He is currently a full professor at the
Faculty of Electrical Engineering and Computer Science,
VSB-Technical University of Ostrava. Since 2001 he has been a
visiting scientist at the Institute of Computer Science,
Academy of Sciences of the Czech Republic. Since 2003 he has been
Vice-Dean for Research and Science at the Faculty of
Electrical Engineering and Computer Science.
V. Snášel has published more than 350 papers and books on
data modelling, ontology, knowledge management, databases,
multimedia, information retrieval, neural networks, data
compression and file organization. His research interests
also include information retrieval, semistructured data,
evolutionary computing and indexing methods. He has supervised
many Ph.D. students, including students from outside the Czech
Republic (Jordan, Yemen, Slovakia, Ukraine and Vietnam).
According to the Erdös Number Project, his Erdös number is
3. He is a member of IEEE, ACM, SIAM and AMS.
He participated in the MS-2000 Mössbauer spectrometer
project for the Russian space mission "MARS-96"; see
http://www.mossp2000.com/index.html
V. Snášel is Editor-in-Chief of the following journals:
International Journal of Grid and Utility Computing (IJGUC)
International Journal of Autonomic Computing (IJAC)
He has recently served as a Chair, International Chair and
member of program committees of a number of international
conferences, e.g. Sofsem (Springer), ADBIS (Springer),
ICDIM (IEEE), ECIR (Springer), ISDA (IEEE), CISIM (IEEE),
NAFIPS (IEEE), Co-Chair, Intelligent Web Interaction
Workshop 2007, Silicon Valley, USA (IEEE/ACM), CIS
(IEEE), Chair, DEXA - ETID 2007, 2008 (IEEE), ICCIT 2008
(IEEE), ICCSA 2008 (Springer), General Chair, CISIM 2008
(IEEE), SMCia 2008 (IEEE), IAS 2008 (IEEE), AWIC 2008
(Springer), Chair, DEXA - ETID 2008 (IEEE), International
Chair, ISDA 2008 (IEEE), HIS 2008 (IEEE), ADBIS 2008
(Springer), etc.
He is a member of the scientific boards of the Faculty of
Electrical Engineering and Computer Science, VSB-Technical
University of Ostrava, Czech Republic; the Faculty of Science,
Palacký University, Olomouc, Czech Republic; and the Faculty of
Informatics and Statistics, University of Economics,
Prague, Czech Republic.