Measuring and resource-efficient optimization of clustering quality

dc.contributor.advisor: Schubert, Erich
dc.contributor.author: Lenssen, Lars Filipp
dc.contributor.referee: Zimek, Arthur
dc.date.accepted: 2025-03-18
dc.date.accessioned: 2025-04-17T11:24:11Z
dc.date.available: 2025-04-17T11:24:11Z
dc.date.issued: 2025
dc.description.abstract [en]: Cluster analysis is a fundamental task in exploratory data mining, widely used to uncover hidden structures within datasets across various fields. Its applications range from identifying subgroups in gene expression data for disease research to segmenting customer bases in industry. Over time, a diverse range of clustering methods has been developed to handle the complex structure of different data domains. Despite these advances, key challenges remain, particularly in evaluating the quality of clustering results and optimizing the performance of clustering algorithms. This thesis presents the Average Medoid Silhouette (AMS), an improved version of the Average Silhouette Width (ASW), and introduces the FastMSC and FasterMSC algorithms, which optimize the AMS directly. The DynMSC algorithm is also proposed to simplify determining the optimal number of clusters. For categorical data, the Average Relative Entropy Score (ARES) and Minimum Relative Entropy Contrast (MREC) are introduced, forming the basis of the CatRED algorithm, an agglomerative hierarchical method applied in information systems research.
dc.identifier.uri: http://hdl.handle.net/2003/43678
dc.identifier.uri: http://dx.doi.org/10.17877/DE290R-25451
dc.language.iso: en
dc.subject [en]: Cluster analysis
dc.subject [en]: Clustering quality
dc.subject [en]: Clustering evaluation
dc.subject.ddc: 004
dc.subject.rswk [de]: Cluster-Analyse
dc.title [en]: Measuring and resource-efficient optimization of clustering quality
dc.type: Text
dc.type.publicationtype: PhDThesis
dcterms.accessRights: open access
eldorado.secondarypublication: false
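
The abstract names the Average Medoid Silhouette (AMS) without defining it. As a rough illustration only, the sketch below assumes the commonly cited definition: for each point, take the distance d1 to its nearest medoid and d2 to its second-nearest medoid, score the point as 1 - d1/d2, and average over all points. The function name `average_medoid_silhouette`, the use of Euclidean distance, and the convention that a point with d2 = 0 scores 1 are assumptions made for this example; this is not the thesis's FastMSC, FasterMSC, or DynMSC implementation.

```python
from math import dist  # Euclidean distance, Python 3.8+


def average_medoid_silhouette(points, medoids):
    """Average of 1 - d1/d2 over all points (hypothetical helper).

    d1 and d2 are the distances from a point to its nearest and
    second-nearest medoid. Assumed convention: if d2 == 0 (and
    therefore d1 == 0), the point scores 1.
    """
    if len(medoids) < 2:
        raise ValueError("at least two medoids are required")
    if not points:
        raise ValueError("points must not be empty")

    total = 0.0
    for p in points:
        # Sort the distances to all medoids; the two smallest are d1 and d2.
        d1, d2 = sorted(dist(p, m) for m in medoids)[:2]
        total += (1.0 - d1 / d2) if d2 > 0 else 1.0
    return total / len(points)


# Two well-separated groups, each with one medoid.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
medoids = [(0.0, 0.0), (10.0, 0.0)]
ams = average_medoid_silhouette(points, medoids)
```

Under this definition the score lies in [0, 1]: values near 1 mean each point is much closer to its own medoid than to the next one, and a point equidistant from its two nearest medoids contributes 0.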

Files

Original bundle
Name: Dissertation_Lenssen.pdf
Size: 3.76 MB
Format: Adobe Portable Document Format
Description: DNB
License bundle
Name: license.txt
Size: 4.82 KB
Format: Item-specific license agreed upon to submission