Semisupervised Biased Maximum Margin Analysis for Interactive Image Retrieval
ABSTRACT:
With many potential practical applications, content-based image retrieval (CBIR) has attracted substantial attention during the past few years. A variety of relevance feedback (RF) schemes have been developed as a powerful tool to bridge the semantic gap between low-level visual features and high-level semantic concepts, and thus to improve the performance of CBIR systems. Among various RF approaches, support-vector-machine (SVM)-based RF is one of the most popular techniques in CBIR. Despite the success, directly using SVM as an RF scheme has two main drawbacks. First, it treats the positive and negative feedbacks equally, which is not appropriate since the two groups of training feedbacks have distinct properties. Second, most of the SVM-based RF techniques do not take into account the unlabeled samples, although they are very helpful in constructing a good classifier. To explore solutions to overcome these two drawbacks, in this paper, we propose a biased maximum margin analysis (BMMA) and a semisupervised BMMA (SemiBMMA) for integrating the distinct properties of feedbacks and utilizing the information of unlabeled samples for SVM-based RF schemes. The BMMA differentiates positive feedbacks from negative ones based on local analysis, whereas the SemiBMMA can effectively integrate information of unlabeled samples by introducing a Laplacian regularizer to the BMMA. We formally formulate this problem into a general subspace learning task and then propose an automatic approach of determining the dimensionality of the embedded subspace for RF. Extensive experiments on a large real-world image database demonstrate that the proposed scheme combined with the SVM RF can significantly improve the performance of CBIR systems.
EXISTING SYSTEM:
A variety of relevance feedback (RF) schemes have been developed as a powerful tool to bridge the semantic gap between low-level visual features and high-level semantic concepts, and thus to improve the performance of CBIR systems. Among various RF approaches, support-vector-machine (SVM)-based RF is one of the most popular techniques in CBIR.
DISADVANTAGES OF EXISTING SYSTEM:
Despite the success, directly using SVM as an RF scheme has two main drawbacks. First, it treats the positive and negative feedbacks equally, which is not appropriate since the two groups of training feedbacks have distinct properties. Second, most of the SVM-based RF techniques do not take into account the unlabeled samples, although they are very helpful in constructing a good classifier.
The low-level features captured from the images may not accurately characterize the high-level semantic concepts.
PROPOSED SYSTEM:
The proposed scheme is mainly based on the following:
1) The effectiveness of treating positive examples and negative examples unequally;
2) The significance of the optimal subspace or feature subset in interactive CBIR;
3) The success of graph embedding in characterizing intrinsic geometric properties of the data set in high-dimensional space;
4) The convenience of the graph-embedding framework in constructing semi-supervised learning techniques.
ADVANTAGES OF PROPOSED SYSTEM:
To address the two aforementioned problems, we propose a biased maximum margin analysis (BMMA) and a semisupervised BMMA (SemiBMMA) for the traditional SVM RF schemes, based on the graph-embedding framework. With the incorporation of BMMA, labeled positive feedbacks are mapped as close together as possible, whereas labeled negative feedbacks are separated from labeled positive feedbacks by a maximum margin in the reduced subspace.
The traditional SVM combined with BMMA can better model the RF process and reduce the performance degradation caused by the distinct properties of the two groups of feedbacks. The SemiBMMA can incorporate the information of unlabeled samples into the RF and effectively alleviate the overfitting problem caused by the small number of labeled training samples.
To show the effectiveness of the proposed scheme combined with the SVM RF, we will compare it with the traditional SVM RF and some other relevant existing techniques for RF on a real-world image collection.
Experimental results demonstrate that the proposed scheme can significantly improve the performance of the SVM RF for image retrieval.
MODULES:
- Training and Indexing Module
- Graph-Embedding Framework
- Features Extraction Based on Different Methods
- Visualization of the Retrieval Results
- Experiments on a Large-Scale Image Database
- Experiments on a Small-Scale Image Database
MODULES DESCRIPTION:
Training and Indexing Module
In this module, we index and train the system. The whole set of images is indexed to make the search efficient and less time consuming. Without an index, a search takes longer because it has to scan the whole disk space. Indexing is done using an implementation of the Document Builder Interface. A simple approach is to use the Document Builder Factory, which creates Document Builder instances for all available features as well as popular combinations of features (e.g., all JPEG features or all available features).
In a content-based image retrieval (CBIR) system, target images are sorted by feature similarity with respect to the query. For indexing, we propose clustering the feature set obtained from the CBIR system. The clustering first randomly selects k of the objects, each of which initially represents a cluster mean or center. Each of the remaining objects is then assigned to the cluster to which it is most similar, based on the distance between the object and the cluster mean. The new mean of each cluster is then computed.
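The clustering steps described above (random initial centers, nearest-mean assignment, mean update) match the standard k-means procedure. The following is a minimal sketch under that assumption; the function names, the squared Euclidean distance, the iteration cap, and the fixed random seed are illustrative choices and are not taken from the project itself.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def k_means(features, k, max_iterations=100, seed=0):
    """Cluster feature vectors; returns (centers, assignments)."""
    rng = random.Random(seed)
    # Step 1: randomly select k objects as the initial cluster centers.
    centers = [list(c) for c in rng.sample(features, k)]
    assignments = None
    for _ in range(max_iterations):
        # Step 2: assign every object to the cluster with the nearest center.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(f, centers[j]))
            for f in features
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the centers are final
        assignments = new_assignments
        # Step 3: recompute the mean of each non-empty cluster.
        for j in range(k):
            members = [f for f, a in zip(features, assignments) if a == j]
            if members:
                centers[j] = mean_vector(members)
    return centers, assignments
```

In an indexing setting, the returned assignments could be stored next to each image so that a query is first compared against the k centers and only then against the members of the closest cluster.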
Graph-Embedding Framework
In order to describe our proposed approach clearly, we first review the graph-embedding framework. Generally, for a classification problem, the sample set can be represented as a matrix $X = [x_1, x_2, \dots, x_N] \in \mathbb{R}^{m \times N}$, where $N$ indicates the total number of samples and $m$ is the feature dimensionality. Let $G = \{X, W\}$ be an undirected similarity graph, which is called an intrinsic graph, with vertex set $X$ and similarity matrix $W \in \mathbb{R}^{N \times N}$. The similarity matrix $W$ is real and symmetric, and measures the similarity between a pair of vertices; $W$ can be formed using various similarity criteria. The corresponding diagonal matrix $D$ and the Laplacian matrix $L$ of graph $G$ can be defined as $L = D - W$, with $D_{ii} = \sum_{j \neq i} W_{ij}$. Graph embedding of graph $G$ is defined as an algorithm to determine the low-dimensional vector representations $Y = [y_1, y_2, \dots, y_N] \in \mathbb{R}^{l \times N}$ of the vertex set $X$, where $l$ is lower than $m$. The column vector $y_i$ is the embedding vector for vertex $x_i$, which preserves the similarities between pairs of vertices in the original high-dimensional space. Then, in order to characterize the difference between pairs of vertices in the original high-dimensional space, a penalty graph $G^p = \{X, W^p\}$ is also defined, where the vertices are the same as those of $G$, but the edge weight matrix $W^p$ corresponds to the similarity characteristics that are to be suppressed in the low-dimensional feature space. For a dimensionality reduction problem, direct graph embedding requires an intrinsic graph, whereas a penalty graph is not a necessary input.
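As an illustration of the intrinsic graph, the sketch below builds a symmetric similarity matrix W and the corresponding Laplacian L = D - W, where D is the diagonal matrix whose i-th entry is the sum of the off-diagonal weights in row i. The Gaussian (heat-kernel) similarity and the `sigma` parameter are assumptions made for this example only; the text merely states that the similarity matrix can be formed using various criteria.

```python
import math


def gaussian_similarity(samples, sigma=1.0):
    """Symmetric similarity matrix W with zero diagonal (assumed heat kernel)."""
    n = len(samples)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            weight = math.exp(-d2 / (2.0 * sigma ** 2))
            W[i][j] = W[j][i] = weight  # undirected graph: W is symmetric
    return W


def graph_laplacian(W):
    """Laplacian L = D - W, where D[i][i] sums the off-diagonal weights of row i."""
    n = len(W)
    degrees = [sum(W[i][j] for j in range(n) if j != i) for i in range(n)]
    return [
        [(degrees[i] if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]
```

With a zero-diagonal W, every row of the resulting Laplacian sums to zero, which is a quick sanity check when wiring such a matrix into an embedding routine.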
Features Extraction Based on Different Methods
Six experiments are conducted to compare the BMMA with the traditional LDA, the BDA method, and a graph-embedding approach, i.e., MFA, in finding the most discriminative directions. We plot the directions that correspond to the largest eigenvalue of the decomposed matrices for LDA, BDA, MFA, and BMMA, respectively. From these examples, we can clearly see that LDA finds the best discriminative direction when the data from each class are distributed as Gaussians with similar covariance matrices. Biased toward the positive samples, BDA finds a direction along which the positive samples are well separated from the negative samples when the positive samples have a Gaussian distribution, but it may fail when the distribution of the positive samples is more complicated. Also biased toward the positive samples, the BMMA method finds the most discriminative direction in all six experiments based on local analysis, since it does not make any assumptions about the distributions of the positive and negative samples. It should be noted that BMMA is a linear method; therefore, we only give comparison results for the aforementioned linear methods.
Visualization of the Retrieval Results
In the previous subsections, we have presented some statistically quantitative results of the proposed scheme. Here, we show the visualization of retrieval results.
In the experiments, we randomly select some images (e.g., bobsled, cloud, cat, and car) as the queries and perform the RF process based on the ground truth. For each query image, we run four RF iterations. In each RF iteration, we randomly select some relevant and irrelevant images as positive and negative feedbacks from the first screen, which contains 20 images in total. About four positive and four negative feedbacks are selected. We choose them according to the ground truth of the images, i.e., whether they share the same concept with the query image or not. The query images are given as the first image of each row. We show the top ten images of the initial results without feedback and of SemiBMMA SVM after four feedback iterations, respectively, and incorrect results are highlighted by green boxes. The results show that our proposed scheme can significantly improve the performance of the system. For the first, second, and fourth query images, our system produces ten relevant images out of the top ten retrieved images. For the third query image, our system produces nine relevant images out of the top ten retrieved images. Therefore, SemiBMMA SVM can effectively detect the homogeneous concept shared by the positive samples and hence improve the performance of the retrieval system.
Experiments on a Large-Scale Image Database
Here, we evaluate the performance of the proposed scheme on a real-world image database. We use the precision–scope curve, the precision rate, and the standard deviation to evaluate the effectiveness of the image retrieval algorithms. The scope is specified by the number of top-ranked images presented to the user. The precision is the major evaluation criterion, which evaluates the effectiveness of the algorithms. The precision–scope curve describes the precision at various scopes and gives an overall performance evaluation of the approaches. The precision rate is the ratio of the number of relevant images retrieved to the number of top retrieved images, which emphasizes the precision at a particular value of scope. The standard deviation describes the stability of different algorithms. Therefore, the precision evaluates the effectiveness of a given algorithm, and the corresponding standard deviation evaluates the robustness of the algorithm.
We designed a slightly different feedback scheme to model the real-world retrieval process. In a real image retrieval system, a query image is usually not in the image database. To simulate such an environment, we use fivefold cross validation to evaluate the algorithms. More precisely, we divide the whole image database into five subsets of equal size, so that each subset contains 20% of the images of each category. At each run of cross validation, one subset is selected as the query set, and the other four subsets are used as the database for retrieval. Then, 400 query samples are randomly selected from the query subset, and the RF is automatically performed by the system. For each query image, the system retrieves and ranks the images in the database, and nine RF iterations are automatically executed.
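The evaluation measures described above can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the ranked result list is assumed to be given as booleans (True for a relevant image), and the population standard deviation is an assumed choice, since the text does not say which variant is used.

```python
import statistics


def precision_at_scope(ranked_relevance, scope):
    """Fraction of relevant images among the top `scope` retrieved images."""
    top = ranked_relevance[:scope]
    return sum(1 for relevant in top if relevant) / scope


def precision_scope_curve(ranked_relevance, scopes):
    """Precision evaluated at each scope value, e.g. [10, 20, 30]."""
    return [precision_at_scope(ranked_relevance, s) for s in scopes]


def summarize(precisions):
    """Mean precision (effectiveness) and standard deviation (stability)
    over a set of queries; population standard deviation is assumed."""
    return statistics.mean(precisions), statistics.pstdev(precisions)
```

For example, a ranked list whose first five entries are relevant, relevant, irrelevant, relevant, irrelevant has a precision of 0.75 at scope 4; collecting that value over all queries and passing the list to `summarize` gives the two numbers reported per algorithm.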
Experiments on a Small-Scale Image Database
In order to show how effective the proposed BMMA combined with SVM is in dealing with the asymmetric properties of feedback samples, the first evaluation experiment is executed on a small-scale database, which includes 3899 images in 30 different categories. We use all 3899 images in the 30 categories as queries. Some example categories used in the experiments are illustrated in the referenced paper. To avoid the potential problem caused by unequal numbers of positive and negative feedbacks, we select an equal number of positive and negative feedbacks here. In practice, the first five query-relevant images and the first five irrelevant images among the top 20 retrieved images of the previous iteration are automatically selected as positive and negative feedbacks, respectively.
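The automatic feedback selection described above can be sketched as below. The function name, the `is_relevant` callback (standing in for the ground-truth category check), and the parameter names are hypothetical; only the values 20 and 5 come from the text.

```python
def select_feedback(ranked_ids, is_relevant, screen_size=20, per_class=5):
    """Pick simulated feedbacks from the first screen of retrieved images.

    ranked_ids  -- image identifiers ordered by the current ranking
    is_relevant -- callable returning True if an image shares the query concept
    Returns (positives, negatives), each with at most `per_class` identifiers.
    """
    screen = ranked_ids[:screen_size]
    # First `per_class` relevant images on the screen, in ranking order.
    positives = [i for i in screen if is_relevant(i)][:per_class]
    # First `per_class` irrelevant images on the screen, in ranking order.
    negatives = [i for i in screen if not is_relevant(i)][:per_class]
    return positives, negatives
```

If the first screen holds fewer than five relevant (or irrelevant) images, this sketch simply returns fewer feedbacks of that kind; how the original experiments handle that case is not stated.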
HARDWARE REQUIREMENTS
- PROCESSOR : PENTIUM 4 CPU 2.40 GHz
- RAM : 128 MB
- HARD DISK : 40 GB
- KEYBOARD : STANDARD
- MONITOR : 15”
SOFTWARE REQUIREMENTS
- FRONT END : NET, C#.NET
- BACKEND : SQL SERVER 2005
- TOOL : VISUAL STUDIO 2008
- OPERATING SYSTEM : WINDOWS XP
- DOCUMENTATION : MS-OFFICE 2007
REFERENCE:
Lining Zhang, Student Member, IEEE, Lipo Wang, Senior Member, IEEE, and Weisi Lin, Senior Member, IEEE, “Semisupervised Biased Maximum Margin Analysis for Interactive Image Retrieval”, IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 4, APRIL 2012.