DDD: A NEW ENSEMBLE APPROACH FOR DEALING WITH CONCEPT DRIFT

Online learning algorithms often have to operate in the presence of concept drifts. A recent study revealed that different diversity levels in an ensemble of learning machines are required in order to maintain high generalization on both old and new concepts. Inspired by this study and based on a further study of diversity with different strategies to deal with drifts, we propose a new online ensemble learning approach called Diversity for Dealing with Drifts (DDD). DDD maintains ensembles with different diversity levels and is able to attain better accuracy than other approaches. Furthermore, it is very robust, outperforming other drift handling approaches in terms of accuracy when there are false positive drift detections. In all the experimental comparisons we have carried out, DDD performed at least as well as other drift handling approaches under various conditions, with very few exceptions.

Existing System:
We adopt the definition that online learning algorithms process each training example once “on arrival,” without the need for storage or reprocessing. In this way, they take as input a single training example as well as a hypothesis and output an updated hypothesis. We consider online learning as a particular case of incremental learning.
The latter term refers to learning machines that also model continuous processes, but process incoming data in chunks instead of handling each training example separately.
Ensembles of classifiers have been successfully used to improve the accuracy of single classifiers in online and incremental learning. However, online environments are often nonstationary, and the variables to be predicted by the learning machine may change with time (concept drift).
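To make the online setting concrete, the sketch below uses a minimal online perceptron (a hypothetical stand-in for the base learners discussed here, not part of the original system) that tests and then trains on each example exactly once, on arrival, over a stream whose target concept flips halfway through. Accuracy is high before the drift, drops when the concept changes, and recovers as the learner adapts:

```python
import random

class OnlinePerceptron:
    """Minimal online learner: processes each example once, on arrival."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        s = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if s >= 0 else 0

    def update(self, x, y):
        err = y - self.predict(x)  # -1, 0, or +1
        if err:
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err

rng = random.Random(42)
model = OnlinePerceptron(n_features=1)
correct_before = correct_after = 0

for t in range(1000):
    x = [rng.random()]
    # Concept drift at t = 500: the target relationship flips.
    y = (1 if x[0] > 0.5 else 0) if t < 500 else (0 if x[0] > 0.5 else 1)
    pred = model.predict(x)  # test first ...
    model.update(x, y)       # ... then train (no storage or reprocessing)
    if t < 500 and pred == y:
        correct_before += 1
    elif t >= 800 and pred == y:  # accuracy after time to recover
        correct_after += 1

acc_before = correct_before / 500
acc_recovered = correct_after / 200
```

An incremental (chunk-based) learner would instead buffer examples and retrain on each chunk; the online setting above never stores past examples.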

Proposed System:

We propose a new online ensemble learning approach to handle concept drifts called Diversity for Dealing with Drifts (DDD). The approach aims at better exploiting diversity to handle drifts, being more robust to false alarms (false positive drift detections), and having faster recovery from drifts.
In this way, it achieves improved accuracy in the presence of drifts while maintaining good accuracy in their absence.
Experiments with artificial and real-world data show that DDD usually obtains similar or better accuracy than Early Drift Detection Method (EDDM) and better accuracy than Dynamic Weighted Majority (DWM).
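The diversity levels that DDD maintains can be illustrated with online bagging, in which each ensemble member trains on each incoming example Poisson(λ) times, and a smaller λ yields a more diverse ensemble. The sketch below is a simplified illustration of that mechanism, not the full DDD algorithm; the perceptron base learner, member count, and λ values are illustrative assumptions:

```python
import math
import random

def poisson(lam, rng):
    """Sample from Poisson(lam) using Knuth's method."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

class Perceptron:
    """Simple base learner (illustrative choice)."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        s = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if s >= 0 else 0

    def update(self, x, y):
        err = y - self.predict(x)
        if err:
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err

class OnlineBagging:
    """Online bagging: each member trains on each example Poisson(lam)
    times.  lam = 1 mimics offline bagging (low diversity); a smaller lam
    means members see fewer examples, producing a more diverse ensemble."""
    def __init__(self, n_members, n_features, lam, seed=0):
        self.members = [Perceptron(n_features) for _ in range(n_members)]
        self.lam = lam
        self.rng = random.Random(seed)

    def predict(self, x):
        votes = sum(m.predict(x) for m in self.members)
        return 1 if 2 * votes >= len(self.members) else 0

    def update(self, x, y):
        for m in self.members:
            for _ in range(poisson(self.lam, self.rng)):
                m.update(x, y)

# Train a low-diversity (lam = 1.0) ensemble on a simple threshold concept.
rng = random.Random(7)
ensemble = OnlineBagging(n_members=5, n_features=1, lam=1.0)
for _ in range(500):
    x = [rng.random()]
    ensemble.update(x, 1 if x[0] > 0.5 else 0)
```

A high-diversity ensemble would use the same machinery with a small λ (e.g., 0.1), so each member sees a very different subsample of the stream.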

Software Requirements:

Framework        : .Net
Front End        : ASP.Net
Language         : C#.Net
Back End         : SQL Server
Operating System : Windows XP
Hardware Requirements:
RAM       : 512 MB
Hard Disk : 80 GB
Processor : Pentium IV
FUTURE ENHANCEMENT:

Future work includes experiments using a parameter to control the maximum number of time steps during which four ensembles are maintained, further investigation of the performance on skewed data sets, and extension of DDD to better deal with recurrent and predictable drifts. The analysis shows that different diversity levels obtain the best prequential accuracy depending on the type of drift. It also shows that it is possible to use information learned from the old concept to aid the learning of the new concept, by training ensembles that learned the old concept with high diversity, using low diversity on the new concept. Such ensembles are able to outperform new ensembles created from scratch after the beginning of the drift, especially when the drift has low severity and high speed, and soon after the beginning of medium or low speed drifts.
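The prequential accuracy mentioned above is computed in test-then-train fashion: each arriving example is first used to test the model and only then to train it. A minimal sketch, assuming a trivial majority-class model purely for illustration:

```python
class MajorityClass:
    """Trivial online model: predicts the most frequent label seen so far."""
    def __init__(self):
        self.counts = {0: 0, 1: 0}

    def predict(self, x):
        return 1 if self.counts[1] > self.counts[0] else 0

    def update(self, x, y):
        self.counts[y] += 1

def prequential_accuracy(model, stream, alpha=1.0):
    """Test-then-train evaluation: each example tests the model before the
    model learns from it.  alpha < 1 adds a fading factor that weights
    recent examples more heavily, which is useful under concept drift."""
    hits = total = 0.0
    for x, y in stream:
        hits = alpha * hits + (1.0 if model.predict(x) == y else 0.0)
        total = alpha * total + 1.0
        model.update(x, y)
    return hits / total

# Nine examples of class 1 followed by one of class 0: the model misses the
# first example (no data yet) and the last (label change), hitting 8 of 10.
stream = [([0.0], 1)] * 9 + [([0.0], 0)]
acc = prequential_accuracy(MajorityClass(), stream)
```

Because every example is evaluated before being learned, prequential accuracy reflects how quickly a learner tracks a changing concept, which is why it is the natural metric for comparing diversity levels under drift.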
