PROTOTYPE SELECTION FOR DYNAMIC CLASSIFIER AND ENSEMBLE SELECTION
DYNAMIC CLASSIFIER AND ENSEMBLE SELECTION REVIEW:
Dynamic selection techniques consist, based on a pool of classifiers C, in finding a single classifier ci, or an ensemble of classifiers C′, containing the most competent classifiers to predict the class label of a specific instance, xj. Recent works in classifier and ensemble selection have shown a preference for dynamic over static ensemble selection, especially in dealing with ill-defined problems, i.e., when the dataset is small and there is not enough data to train a strong classifier with many parameters to learn [16]. In addition, due to insufficient training data, the distribution of the training data may not adequately represent the true distribution of the problem. Consequently, the classifiers cannot learn the separation between the classes in such cases.
The rationale behind dynamic ensemble selection techniques resides in the observation that not every classifier in the pool is an expert in classifying all unknown samples. Each base classifier is an expert in a different local region of the feature space [27]. Moreover, different patterns are associated with distinct degrees of difficulty. It is therefore reasonable to assume that only a few base classifiers can predict the correct class label.
Early works in dynamic selection started with the selection of a single classifier rather than an EoC. In such techniques, only the classifier that attained the highest competence level is used for the classification of the given test sample. These techniques are called dynamic classifier selection (DCS). The local classifier accuracy (LCA) [22] and the multiple classifier behavior (MCB) [21] are examples of DCS techniques. However, given that selecting only one classifier can be very error-prone, some researchers decided to select a subset of the pool of classifiers C, containing all classifiers that attained a certain competence level, rather than a single model. Such techniques are called dynamic ensemble selection (DES). Examples of DES techniques are the K-Nearest Oracles (KNORA) [14], K-Nearest Output Profiles (KNOP) [16], Dynamic Overproduce-and-Choose (DOCS) [15], and the method based on the Randomized Reference Classifier (DES-RRC).
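To make the DCS/DES distinction concrete, the following Python sketch (assuming scikit-learn-style classifiers; the names pool, X_val, y_val, and the neighborhood size k are illustrative, not taken from the cited papers) contrasts single-classifier selection based on local accuracy with a KNORA-Eliminate-style ensemble selection rule:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def region_of_competence(x, X_val, k):
        """Indices of the k validation samples nearest to the query x."""
        nn = NearestNeighbors(n_neighbors=k).fit(X_val)
        return nn.kneighbors(x.reshape(1, -1), return_distance=False)[0]

    def dcs_select(pool, x, X_val, y_val, k=7):
        """DCS: return the single classifier with the highest local accuracy."""
        idx = region_of_competence(x, X_val, k)
        accuracies = [clf.score(X_val[idx], y_val[idx]) for clf in pool]
        return pool[int(np.argmax(accuracies))]

    def knora_e_select(pool, x, X_val, y_val, k=7):
        """DES (KNORA-Eliminate style): keep every classifier that correctly
        labels all k neighbors; if none qualifies, shrink the neighborhood."""
        while k > 0:
            idx = region_of_competence(x, X_val, k)
            ensemble = [clf for clf in pool
                        if np.all(clf.predict(X_val[idx]) == y_val[idx])]
            if ensemble:
                return ensemble
            k -= 1
        return list(pool)  # degenerate case: fall back to the whole pool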
Individual-based measures:
Ranking:
Classifier rank was one of the first criteria proposed for estimating the competence level of base classifiers in dynamic selection. The rank of a single base classifier ci can be estimated simply as the number of consecutive correctly classified samples. The classifier that correctly classifies the greatest number of consecutive samples drawn from the validation data is considered to have the highest competence level, or “rank”. Alternative ranking techniques based on mutual information were also proposed [38] but, because of their complexity and because they were defined to work only with the Nearest Neighbor (NN) as base classifier, recent work has preferred the simplified ranking method.
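As a sketch of the simplified ranking criterion, the function below counts how many of the validation samples closest to the query a classifier labels correctly before its first mistake; ordering the samples by Euclidean distance to the query and capping the scan at k samples are assumptions of this illustration, not prescriptions from [38]:

    import numpy as np

    def simplified_rank(clf, x, X_val, y_val, k=30):
        """Simplified classifier rank: the number of consecutive correct
        predictions over the validation samples nearest to the query x."""
        order = np.argsort(np.linalg.norm(X_val - x, axis=1))[:k]
        correct = clf.predict(X_val[order]) == y_val[order]
        rank = 0
        for ok in correct:
            if not ok:  # stop at the first misclassified sample
                break
            rank += 1
        return rank

The classifier attaining the largest rank would then be the one selected to classify xj.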
Local Accuracy:
Classifier accuracy is the most commonly used criterion for dynamic classifier and ensemble selection techniques [22; 14; 20; 30; 23; 28; 38; 29; 19]. Techniques based on local accuracy first compute the region of competence, θj, of the given test sample xj. The region of competence can be defined either on the training set [22] or on the validation set [14].
Based on the samples belonging to the region of competence, θj, different means have been proposed for estimating the local accuracy of a base classifier. For example, the Overall Local Accuracy (OLA) [22] technique uses the accuracy of the base classifier over the whole region of competence as the criterion for measuring its level of competence. The classifier that obtains the highest accuracy rate is considered the most competent. The Local Classifier Accuracy (LCA) [22] computes the performance of the base classifier in relation to a specific class label. The Modified Local Accuracy (MLA) [29] works similarly to the LCA technique, with the only difference being that each sample belonging to the region of competence is weighted by its Euclidean distance to the query instance. As such, instances from the region of competence that are closer to the test sample have a higher degree of influence when computing the performance of the base classifier. Moreover, variations of the OLA and LCA techniques using a priori and a posteriori probabilities were proposed by Didaci et al. [44] for obtaining more precise estimates of the competence level of a base classifier.
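The three criteria can be summarized in code. The sketch below assumes the region of competence (X_roc, y_roc) has already been computed, e.g., with the k-NN helper sketched earlier, and uses inverse Euclidean distance as the MLA weighting scheme; this is one plausible reading of [22; 29], not their reference implementations:

    import numpy as np

    def ola(clf, x, X_roc, y_roc):
        """Overall Local Accuracy: accuracy over the whole region of
        competence (x is unused; kept only for a uniform signature)."""
        return np.mean(clf.predict(X_roc) == y_roc)

    def lca(clf, x, X_roc, y_roc):
        """Local Classifier Accuracy: accuracy restricted to the neighbors
        whose true label equals the class assigned to the query x."""
        predicted = clf.predict(x.reshape(1, -1))[0]
        mask = y_roc == predicted
        if not mask.any():
            return 0.0
        return np.mean(clf.predict(X_roc[mask]) == y_roc[mask])

    def mla(clf, x, X_roc, y_roc):
        """Modified Local Accuracy: like LCA, but each neighbor is weighted
        by its distance to the query (closer neighbors weigh more)."""
        predicted = clf.predict(x.reshape(1, -1))[0]
        mask = y_roc == predicted
        if not mask.any():
            return 0.0
        dist = np.linalg.norm(X_roc[mask] - x, axis=1)
        weights = 1.0 / (dist + 1e-12)  # inverse-distance weighting (assumed)
        correct = clf.predict(X_roc[mask]) == y_roc[mask]
        return float(np.sum(weights * correct) / np.sum(weights))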
The difference between these techniques lies in how they utilize the local accuracy information in order to measure the level of competence of a base classifier. The main problem with these techniques is their dependence on the definition of the region of competence, often performed via K-NN or clustering techniques. The dynamic selection technique is likely to commit errors when there is a high degree of overlap between the classes [20]. As reported in [14], using the local accuracy information alone is not sufficient to achieve results close to the Oracle. Moreover, any difference between the distributions of the validation and test datasets may negatively affect the system performance.
META-DES: A DYNAMIC ENSEMBLE SELECTION FRAMEWORK USING META-LEARNING:
Classifier competence for dynamic selection:
Classifier competence defines how much we trust an expert, given a classification task. The notion of competence is used extensively in the field of machine learning as a way of selecting, from the plethora of different classification models, the one that best fits the given problem. Let C = {c1, ..., cM} be the pool of classifiers, where M is the size of the pool, and ci a base classifier belonging to C. The goal of dynamic selection is to find an ensemble of classifiers C′ ⊂ C containing the best classifiers to classify a given test sample xj. This is different from static selection, where the ensemble of classifiers C′ is selected during the training phase, considering the global performance of the base classifiers over a validation dataset [10; 11; 12; 13].
Nevertheless, the key issue in dynamic selection is how to measure the competence of a base classifier ci for the classification of a given query sample xj. In the literature, we can observe three categories of competence measures: classifier accuracy over a local region, i.e., a region of the feature space surrounding the query instance xj; decision templates [57], i.e., techniques that work in the decision space (the space defined by the outputs of the base classifiers); and the extent of consensus or confidence. The three categories are described in the following subsections.
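Whatever the competence measure, dynamic selection follows the same template: compute the region of competence, score every base classifier, select C′, and combine it. The sketch below makes that template explicit; it is an assumed, generic formulation (not the META-DES algorithm itself), where competence_fn stands for any local criterion such as the OLA/LCA/MLA functions sketched earlier, and threshold and k are illustrative parameters:

    import numpy as np
    from collections import Counter

    def des_predict(pool, x, X_dsel, y_dsel, competence_fn, threshold=0.5, k=7):
        """Generic DES template: score each base classifier on the region of
        competence of x, keep those above a competence threshold (the
        ensemble C'), and combine the selected classifiers by majority vote."""
        # Region of competence: the k samples of DSEL closest to the query.
        idx = np.argsort(np.linalg.norm(X_dsel - x, axis=1))[:k]
        scores = [competence_fn(clf, x, X_dsel[idx], y_dsel[idx]) for clf in pool]
        selected = [clf for clf, s in zip(pool, scores) if s >= threshold]
        if not selected:  # no classifier qualifies: fall back to the best one
            selected = [pool[int(np.argmax(scores))]]
        votes = [clf.predict(x.reshape(1, -1))[0] for clf in selected]
        return Counter(votes).most_common(1)[0][0]

Note that, under these assumptions, setting threshold to 1.0 with an accuracy-based criterion approximates a KNORA-Eliminate-like behavior, while keeping only the argmax of the scores reduces the template to DCS.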
Table of Contents
INTRODUCTION
CHAPTER 1 DYNAMIC CLASSIFIER AND ENSEMBLE SELECTION REVIEW
1.1 Individual-based measures
1.1.1 Ranking
1.1.2 Local Accuracy
1.1.3 Oracle
1.1.4 Probabilistic
1.1.5 Behavior
1.2 Group based measures
1.2.1 Diversity
1.2.2 Ambiguity
1.2.3 Data handling
CHAPTER 2 META-DES: A DYNAMIC ENSEMBLE SELECTION FRAMEWORK USING META-LEARNING
2.1 Introduction
2.2 Classifier competence for dynamic selection
2.2.1 Classifier accuracy over a local region
2.2.2 Decision Templates
2.2.3 Extent of Consensus or Confidence
2.3 The Proposed Framework: META-DES
2.3.1 Problem definition
2.3.2 The proposed META-DES
2.3.2.1 Overproduction
2.3.2.2 Meta-training
2.3.2.3 Generalization Phase
2.4 Experiments
2.4.1 Datasets
2.4.2 Experimental Protocol
2.4.3 Parameters Setting
2.4.3.1 The effect of the parameter hC
2.4.3.2 The effect of the parameter Kp
2.4.4 Comparison with the state-of-the-art dynamic selection techniques
2.5 Conclusion
CHAPTER 3 A DEEP ANALYSIS OF THE META-DES FRAMEWORK FOR DYNAMIC SELECTION OF ENSEMBLE OF CLASSIFIERS
3.1 Introduction
3.2 Why does dynamic selection of linear classifiers work?
3.3 The META-DES Framework
3.3.1 Overproduction
3.3.2 Meta-Training
3.3.2.1 Sample selection
3.3.2.2 Meta-feature extraction
3.3.2.3 Training
3.3.3 Generalization
3.4 Why does the META-DES work: A Step-by-step example
3.4.1 The P2 Problem
3.4.2 Overproduction
3.4.3 Meta-training: Sample Selection
3.4.4 Classification
3.5 Further Analysis
3.5.1 The Effect of the Pool Size
3.5.2 The effect of the size of the dynamic selection dataset (DSEL)
3.5.3 Results of static combination techniques
3.5.4 Single classifier models
3.6 Conclusion
3.7 Appendix
3.7.1 Plotting decision boundaries
3.7.2 Ensemble Generation
3.7.3 Sample Selection Mechanism: consensus threshold hc
3.7.4 Size of the dynamic selection dataset (DSEL)
CHAPTER 4 META-DES.ORACLE: META-LEARNING AND FEATURE SELECTION FOR DYNAMIC ENSEMBLE SELECTION
4.1 Introduction
4.2 Related Works
4.2.1 Dynamic selection
4.2.2 Feature selection using Binary Particle Swarm Optimization (BPSO)
4.3 The META-DES.Oracle
4.3.1 Overproduction
4.3.2 Meta-training Phase
4.3.2.1 Sample Selection
4.3.2.2 Meta-Feature Selection Using Binary Particle Swarm Optimization (BPSO)
4.3.3 Generalization Phase
4.4 Meta-Feature Extraction
4.4.1 Local Accuracy Meta-Features
4.4.1.1 Overall Local Accuracy: fOverall
4.4.1.2 Conditional Local Accuracy: fcond
4.4.1.3 Neighbors’ hard classification: fHard
4.4.2 Ambiguity
4.4.2.1 Classifier’s confidence: fConf
4.4.2.2 Ambiguity: fAmb
4.4.3 Probabilistic Meta-Features
4.4.3.1 Posterior probability: fProb
4.4.3.2 Logarithmic: fLog
4.4.3.3 Entropy: fEnt
4.4.3.4 Minimal difference: fMD
4.4.3.5 Kullback-Leibler Divergence: fKL
4.4.3.6 Exponential: fExp
4.4.3.7 Randomized Reference Classifier: fPRC
4.4.4 Behavior meta-features
4.4.4.1 Output profiles classification: fOP
4.4.5 Ranking Meta-Features
4.4.5.1 Simplified classifier rank: fRank
4.4.5.2 Classifier rank OP: fRankOP
4.5 Case study using synthetic data
4.6 Experiments
4.6.1 Datasets
4.6.2 Experimental protocol
4.6.3 Analysis of the selected meta-features
4.6.4 Comparative study
4.6.5 Comparison with the state-of-the-art DES techniques
4.6.6 Comparison with Static techniques
4.7 Conclusion
CHAPTER 5 PROTOTYPE SELECTION FOR DYNAMIC CLASSIFIER AND ENSEMBLE SELECTION
5.1 Introduction
5.2 Proposed method
5.2.1 Edited Nearest Neighbor (ENN)
5.2.2 K-Nearest Neighbor with Local Adaptive Distance
5.2.3 Case study
5.3 Experiments
5.3.1 Dynamic selection methods
5.3.2 Datasets
5.3.3 Comparison between different scenarios
5.3.4 Comparison between DES techniques
5.3.5 Discussion
5.4 Conclusion
GENERAL CONCLUSION
APPENDIX I ON META-LEARNING FOR DYNAMIC ENSEMBLE SELECTION
APPENDIX II META-DES.H: A DYNAMIC ENSEMBLE SELECTION TECHNIQUE USING META-LEARNING AND A DYNAMIC WEIGHTING APPROACH
APPENDIX III FEATURE REPRESENTATION SELECTION BASED ON CLASSIFIER PROJECTION SPACE AND ORACLE ANALYSIS
APPENDIX IV ANALYZING DYNAMIC ENSEMBLE SELECTION TECHNIQUES USING DISSIMILARITY ANALYSIS
BIBLIOGRAPHY