I'm trying to implement a Content-Based Image Retrieval system for a small image dataset. For now, I'm using 1k images (40 categories) from [Caltech101][1].
This is the system workflow:
1. For each image `img`, compute the SIFT descriptors (using `cv::SIFT` with default parameters) and save the descriptor matrix in `std::vector<cv::Mat> descriptors`.
2. Compute a [Gaussian Mixture Model][2] from all the descriptors computed before, using the VLFeat implementation [VlGMM][3] with k-means as the initialization algorithm (again, using the [VLFeat implementation][4]).
3. For each `img`, compute the corresponding Fisher vector using the GMM obtained before, one vector per dataset image.
4. Given the query `q`, compute its SIFT descriptors and its Fisher vector (using the same GMM as before).
5. Compute the Euclidean distance between `q`'s Fisher vector and each `img`'s Fisher vector from the dataset.
6. Return the top `k` images, according to the distances obtained in step 5.
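Steps 5 and 6 reduce to a linear nearest-neighbour scan in Fisher-vector space. Below is a minimal, library-free sketch of that ranking step; the function names (`squaredDistance`, `topK`) and the use of `std::vector<float>` for a Fisher vector are my own illustrative choices, not code from the question:

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Squared Euclidean distance between two Fisher vectors of equal length.
// Squaring preserves the ranking, so the sqrt can be skipped for retrieval.
double squaredDistance(const std::vector<float>& a, const std::vector<float>& b)
{
    double d = 0.0;
    for (size_t j = 0; j < a.size(); j++) {
        double diff = static_cast<double>(a[j]) - b[j];
        d += diff * diff;
    }
    return d;
}

// Return the indices of the k dataset images closest to the query.
std::vector<size_t> topK(const std::vector<float>& query,
                         const std::vector<std::vector<float>>& dataset,
                         size_t k)
{
    std::vector<size_t> idx(dataset.size());
    std::iota(idx.begin(), idx.end(), 0);   // 0, 1, ..., n-1
    k = std::min(k, idx.size());
    // Only the first k positions need to be fully sorted.
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
        [&](size_t i, size_t j) {
            return squaredDistance(query, dataset[i]) <
                   squaredDistance(query, dataset[j]);
        });
    idx.resize(k);
    return idx;
}
```

With Fisher vectors of dimension `2*K*D` (K Gaussians, D = 128 for SIFT) and only 1k images, this brute-force scan is cheap; approximate indexing only becomes worthwhile at much larger dataset sizes.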
This is the code for steps 2, 3 and 5, which are the most important ones:
    vl_size totalElements = totalKeypoints * dimension;
    float *data = new float[totalElements];
    size_t counter = 0;
    //save into data all descriptor matrices (tested, it works)
    for(size_t i = 0; i < descriptors.size(); i++){
        // ... copy the i-th descriptor matrix into data, advancing counter ...
    }
    // ... train the GMM on data (vl_gmm_new / vl_gmm_cluster) ...
    std::vector<float*> codes(n); //n is the dataset size
    for(size_t i = 0; i < n; i++){
        // ... compute the Fisher vector of image i with vl_fisher_encode ...
    }