Validating clustering for gene expression data bioinformatics
In 2003, Dembele and Kastner described a modified fuzzy c-means algorithm applied to genomic data, which automatically selects the fuzziness parameter.
Finally, the use of nonnegative matrix factorization (NMF) was introduced in 2004 by Brunet et al.
Finally, we comment on the application of clustering to genomics in section 6. Although the ability of clustering algorithms to make inferences has been addressed to some extent, a mathematical foundation for clustering has been provided only very recently [19, 20]. In this paper we will cover a mathematical model of clustering and review learning in section 2. Although used for many years in the context of gene expression microarray data, clustering has remained highly problematic [2, 12, 17]. Some criticisms raise the question as to whether clustering can be used for scientific knowledge: how may one judge the relative worth of clustering algorithms unless the assessment is based on their inference capabilities?
A clustering algorithm receives a set of vectors and groups them based on a cost criterion or some other optimization rule.
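To make the idea of grouping vectors under a cost criterion concrete, the following is a minimal sketch using k-means (Lloyd's iteration), chosen here purely as an illustration and not as the method advocated in the text. The cost is assumed to be the within-cluster sum of squared Euclidean distances, and the function names `kmeans`, `sq_dist`, and `mean`, as well as the toy data in the usage note, are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(vectors, k, iters=100, seed=0):
    """Group vectors into k clusters by Lloyd's iteration.

    Assumed cost criterion: the sum of squared distances from each
    vector to the centroid of the cluster it is assigned to.
    Returns (labels, centroids, cost).
    """
    rng = random.Random(seed)
    # Initialise centroids with k distinct vectors drawn from the data.
    centroids = rng.sample(vectors, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean(members)
    cost = sum(sq_dist(v, centroids[lab]) for v, lab in zip(vectors, labels))
    return labels, centroids, cost
```

As a usage example, calling `kmeans(points, 2)` on six two-dimensional points forming two well-separated groups of three should place each group under its own label. Note that a low cost on the data at hand says nothing by itself about how well the clusters reflect the underlying process, which is the inference question raised above.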