Department of Computer Science and Engineering, Michigan State University
Biometrics Research Group

Projects

Current Projects

Large Scale Kernel-Based Data Clustering

Kernel-based clustering algorithms achieve better performance on real-world data than their linear counterparts, but they pose two important challenges: (i) they scale poorly in run time and memory, since both grow quadratically with the number of data instances, which makes them inefficient for large data sets; and (ii) their performance depends critically on the choice of kernel function. In this project, we aim to develop efficient schemes that reduce the complexity of these clustering algorithms and learn appropriate kernel functions from the training data. In the first phase of the project, we employ randomization to speed up kernel-based clustering and reduce its memory requirements. We evaluate the efficiency of our techniques in the domains of object categorization and document clustering.
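As a rough illustration of how randomization can avoid the quadratic cost, the sketch below restricts the cluster centers to the span of m randomly sampled points. Only an n x m block of the kernel matrix is then computed and stored, instead of the full n x n matrix. This is a minimal sketch under stated assumptions, not necessarily the exact procedure used in the project: the RBF kernel, the small ridge term added for numerical stability, the random initialization, and the names `rbf_kernel` and `sampled_kernel_kmeans` are all illustrative choices.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel matrix between the rows of X and the rows of Y (assumed kernel)."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def sampled_kernel_kmeans(X, n_clusters, n_samples, gamma=1.0,
                          n_iter=20, ridge=1e-8, seed=0):
    """Kernel k-means with centers restricted to the span of sampled points.

    Hypothetical helper: needs O(n * m) kernel entries for m = n_samples.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=n_samples, replace=False)

    K_B = rbf_kernel(X, X[idx], gamma)   # n x m: all points vs. sampled points
    K_hat = K_B[idx]                     # m x m: sampled points vs. themselves
    K_reg = K_hat + ridge * np.eye(n_samples)  # ridge is an assumption for stability

    labels = rng.integers(n_clusters, size=n)  # random initial assignment
    for _ in range(n_iter):
        # Row-normalized membership matrix (k x n); an empty cluster keeps
        # an all-zero row, i.e. its center sits at the feature-space origin.
        U = np.zeros((n_clusters, n))
        U[labels, np.arange(n)] = 1.0
        U /= np.maximum(U.sum(axis=1, keepdims=True), 1.0)

        # Center coefficients over the sampled points (k x m).
        alpha = np.linalg.solve(K_reg, (U @ K_B).T).T

        # Squared distance to each center, dropping the per-point constant
        # K(x_i, x_i), which does not affect the argmin.
        cross = alpha @ K_B.T                                   # k x n
        center_norm = np.einsum("km,mj,kj->k", alpha, K_hat, alpha)
        dist = center_norm[:, None] - 2.0 * cross

        new_labels = dist.argmin(axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Calling `sampled_kernel_kmeans(X, n_clusters=2, n_samples=20, gamma=0.5)` on an n x d array `X` returns one integer label per row. The sample size trades accuracy against cost: a larger `n_samples` gives a richer span for the centers but increases both the memory for the n x m block and the time spent solving the m x m system.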

R. Chitta, R. Jin, T. C. Havens, and A. K. Jain, "Approximate Kernel k-means: solution to Large Scale Kernel Clustering", KDD, San Diego, CA, August 21-24, 2011 (To Appear).

T. C. Havens, R. Chitta, A. K. Jain, and R. Jin, "Speedup of Fuzzy and Possibilistic Kernel c-Means for Large-Scale Clustering", Proc. IEEE Int. Conf. Fuzzy Systems, Taipei, Taiwan, June 27-30, 2011 (To Appear).

 
