MLDM / ICDM Medal, Meissen Porcelain, the White Gold of King August the Strong of Saxony
Gottfried Wilhelm von Leibniz, the great mathematician and son of Leipzig, was watching over us during our event in Machine Learning and Data Mining in Pattern Recognition (MLDM 2007). He can be proud of what we have achieved in this area so far. We had a great research program this year. This was the fifth MLDM in Pattern Recognition event held in Leipzig (www.mldm.de). Today, there are many international meetings carrying the title machine learning and data mining, whose topics are text mining, knowledge discovery, and applications. From the very first event, this meeting has focused on aspects of machine learning and data mining in pattern recognition problems. We planned to reorganize classical and well-established pattern recognition paradigms from the viewpoints of machine learning and data mining. Although it was a challenging program in the late 1990s, the idea has provided new starting points in pattern recognition and has influenced other areas such as cognitive computer vision. For this edition, the Program Committee received 258 submissions from 37 countries (see Fig. 1). Handling this high number of papers was a big challenge for the reviewers. Every paper was thoroughly reviewed, and all authors received a detailed report on their submitted work.
Covers the state of the art in machine learning and data mining
More than 900 pages with online files and updates
Includes papers on medical, biological, and environmental data mining
Invited Talk.- Data Clustering: User's Dilemma.
Classification.- On Concentration of Discrete Distributions with Applications to Supervised Learning of Classifiers.- Comparison of a Novel Combined ECOC Strategy with Different Multiclass Algorithms Together with Parameter Optimization Methods.- Multi-source Data Modelling: Integrating Related Data to Improve Model Performance.- An Empirical Comparison of Ideal and Empirical ROC-Based Reject Rules.- Outlier Detection with Kernel Density Functions.- Generic Probability Density Function Reconstruction for Randomization in Privacy-Preserving Data Mining.- An Incremental Fuzzy Decision Tree Classification Method for Mining Data Streams.- On the Combination of Locally Optimal Pairwise Classifiers.
Feature Selection, Extraction and Dimensionality Reduction.- An Agent-Based Approach to the Multiple-Objective Selection of Reference Vectors.- On Applying Dimension Reduction for Multi-labeled Problems.- Nonlinear Feature Selection by Relevance Feature Vector Machine.- Affine Feature Extraction: A Generalization of the Fukunaga-Koontz Transformation.
Clustering.- A Bounded Index for Cluster Validity.- Varying Density Spatial Clustering Based on a Hierarchical Tree.- Kernel MDL to Determine the Number of Clusters.- Critical Scale for Unsupervised Cluster Discovery.- Minimum Information Loss Cluster Analysis for Categorical Data.- A Clustering Algorithm Based on Generalized Stars.
Support Vector Machine.- Evolving Committees of Support Vector Machines.- Choosing the Kernel Parameters for the Directed Acyclic Graph Support Vector Machines.- Data Selection Using SASH Trees for Support Vector Machines.- Dynamic Distance-Based Active Learning with SVM.
Transductive Inference.- Off-Line Learning with Transductive Confidence Machines: An Empirical Evaluation.- Transductive Learning from Relational Data.
Association Rule Mining.- A Novel Rule Ordering Approach in Classification Association Rule Mining.- Distributed and Shared Memory Algorithm for Parallel Mining of Association Rules.
Mining Spam, Newsgroups, Blogs.- Analyzing the Performance of Spam Filtering Methods When Dimensionality of Input Vector Changes.- Blog Mining for the Fortune 500.- A Link-Based Rank of Postings in Newsgroup.
Intrusion Detection and Networks.- A Comparative Study of Unsupervised Machine Learning and Data Mining Techniques for Intrusion Detection.- Long Tail Attributes of Knowledge Worker Intranet Interactions.- A Case-Based Approach to Anomaly Intrusion Detection.- Sensing Attacks in Computers Networks with Hidden Markov Models.
Frequent and Common Item Set Mining.- FIDS: Monitoring Frequent Items over Distributed Data Streams.- Mining Maximal Frequent Itemsets in Data Streams Based on FP-Tree.- CCIC: Consistent Common Itemsets Classifier.
Mining Marketing Data.- Development of an Agreement Metric Based Upon the RAND Index for the Evaluation of Dimensionality Reduction Techniques, with Applications to Mapping Customer Data.- A Sequential Hybrid Forecasting System for Demand Prediction.- A Unified View of Objective Interestingness Measures.- Comparing State-of-the-Art Collaborative Filtering Systems.
Structural Data Mining.- Reducing the Dimensionality of Vector Space Embeddings of Graphs.- PE-PUC: A Graph Based PU-Learning Approach for Text Classification.- Efficient Subsequence Matching Using the Longest Common Subsequence with a Dual Match Index.- A Direct Measure for the Efficacy of Bayesian Network Structures Learned from Data.
Image Mining.- A New Combined Fractal Scale Descriptor for Gait Sequence.- Palmprint Recognition by Applying Wavelet Subband Representation and Kernel PCA.- A Filter-Refinement Scheme for 3D Model Retrieval Based on Sorted Extended Gaussian Image Histogram.- Fast-Maneuvering Target Seeking Based on Double-Action Q-Learning.- Mining Frequent Trajectories of Moving Objects for Location Prediction.- Categorizing Evolved CoreWar Warriors Using EM and Attribute Evaluation.- Restricted Sequential Floating Search Applied to Object Selection.- Color Reduction Using the Combination of the Kohonen Self-Organized Feature Map and the Gustafson-Kessel Fuzzy Algorithm.- A Hybrid Algorithm Based on Evolution Strategies and Instance-Based Learning, Used in Two-Dimensional Fitting of Brightness Profiles in Galaxy Images.- Gait Recognition by Applying Multiple Projections and Kernel PCA.
Medical, Biological, and Environmental Data Mining.- A Machine Learning Approach to Test Data Generation: A Case Study in Evaluation of Gene Finders.- Discovering Plausible Explanations of Carcinogenecity in Chemical Compounds.- One Lead ECG Based Personal Identification with Feature Subspace Ensembles.- Classification of Breast Masses in Mammogram Images Using Ripley's K Function and Support Vector Machine.- Selection of Experts for the Design of Multiple Biometric Systems.- Multi-agent System Approach to React to Sudden Environmental Changes.- Equivalence Learning in Protein Classification.
Text and Document Mining.- Statistical Identification of Key Phrases for Text Classification.- Probabilistic Model for Structured Document Mapping.- Application of Fractal Theory for On-Line and Off-Line Farsi Digit Recognition.- Hybrid Learning of Ontology Classes.- Discovering Relations Among Entities from XML Documents.