Application of a Genetic Algorithm to Variable Selection in Fuzzy Clustering

Loading...
Thumbnail Image

Date

2004

Journal Title

Journal ISSN

Volume Title

Publisher

Universität Dortmund

Abstract

In order to group the observations of a data set into a given number of clusters, an optimal subset out of a greater number of explanatory variables is to be selected. The problem is approached by maximizing a quality measure under certain restrictions that are supposed to keep the subset most representative of the whole data. The restrictions may either be set manually, or generated from the data. A genetic optimization algorithm is developed to solve this problem. The procedure is then applied to a data set describing features of sub-districts of the city of Dortmund, Germany, to detect different social milieus and investigate the variables making up the differences between these.
In order to group the observations of a data set into a given number of clusters, an ‘optimal’ subset out of a greater number of explanatory variables is to be selected. The problem is approached by maximizing a quality measure under certain restrictions that are supposed to keep the subset most representative of the whole data. The restrictions may either be set manually, or generated from the data. A genetic optimization algorithm is developed to solve this problem. The procedure is then applied to a data set describing features of sub-districts of the city of Dortmund, Germany, to detect different social milieus and investigate the variables making up the differences between these.

Description

Table of contents

Keywords

Citation