A Population Background for Nonparametric Density-Based Clustering (Statistics Seminar)
Seminar/Forum
Despite the popularity of clustering, some of its theoretical aspects have received relatively little investigation. One of the main reasons for this lack of theoretical results is surely that, whereas for other statistical problems such as regression or classification the theoretical population goal is clearly defined, for some clustering methodologies it is difficult to specify the population goal that data-based clustering algorithms should try to approximate. In this talk, José Chacon will investigate the theoretical foundations of the branch known as density-based clustering, where clusters are understood as regions of high density. The focus will be on two main objectives: first, to provide an explicit formulation of the ideal population goal of density-based clustering (here, differential topology plays a crucial role by means of Morse theory); and second, to present two new loss functions, applicable in fact to any clustering methodology, to evaluate the performance of a data-based clustering algorithm with respect to the ideal population goal. In particular, it is shown that only mild conditions on a sequence of density estimators are needed to ensure that the sequence of clusterings that they induce is consistent.
Presenter

José Chacon, University of Extremadura