Authors: Krivosija, Amer
Title: On clustering and related problems on curves under the Fréchet distance
Language (ISO): en
Abstract: Sensor measurements can be represented as points in Rd. Ordered by the time-stamps of these measurements, they yield a time series, that can be interpreted as a polygonal curve in the d-dimensional ambient space. The number of the vertices is called complexity of the curve. In this thesis we study several fundamental computational tasks on curves: clustering, sim- plification, and embedding, under the Fr´echet distance, which is a popular distance measure for curves, in its continuous and discrete version. We focus on curves in one-dimensional ambient space R. We study the problem of clustering of the curves in R under the Fr´echet distance, in particular, the following variations of the well-known k-center and k- median problems. Given is a set P of n curves in R, each of complexity at most m. Our goal is to find k curves in R, not necessarily from P , called cluster centers and that each has complexity at most R. In the (k, R)-center problem, the maximum distance of an element of P to its nearest cluster center is minimized. In the (k, R)-median problem, the sum of these distances is minimized. We show that both problems are NP-hard under both versions of the Fr´echet distance, if k is part of the input. Under the continuous Fr´echet distance, we give (1 + ε)-approximation algorithms for both (k, R)-center and (k, R)-median problem, with running time near-linear in the input size for constant ε, k and R. Our techniques yield constant-factor approximation algorithms for the observed problems under the discrete Fr´echet distance. To obtain the (1 + ε)-approximation algorithms for the clustering prob- lems under the continuous Fr´echet distance, we develop a new simplification technique on one-dimensional curve, called δ-signature. The signatures al- ways exist, and we can compute them efficiently. We also study the problem of embedding of the Fr´echet distance into space R. We show that, in the worst case and under reasonable assumptions, the discrete Fr´echet distance between two polygonal curves of complexity m in Rd, where 2 ≤ d ≤ 7, degrades by a factor linear in m with constant probability, when the curves are projected onto a randomly chosen line. We show upper and lower bounds on the distortion. Sensor measurements can also define a discrete distribution over possi- ble locations of a point in Rd. Then, the input consists of n probabilistic points. We study the probabilistic 1-center problem in Euclidean space Rd, also known as the probabilistic smallest enclosing ball (pSEB) problem. To improve the best existing algorithm for the pSEB problem by reducing its exponential dependence on the dimension to linear, we study the determinis- tic set median problem, that generalizes both the 1-center and the 1-median problems. We present a (1 + ε)-approximation algorithm for the set median problem, using a novel combination of sampling techniques and stochastic subgradient descent. Our (1 + ε)-approximation algorithm for the pSEB problem takes linear time in d and n, making the pSEB algorithm applicable to shape fitting problems in Hilbert spaces of unbounded dimension using kernel functions. We present an exemplary application by extending the support vector data description (SVDD) shape fitting method to the probabilistic case.
Subject Headings: Clustering
Fréchet distance
Probabilistic points
Subject Headings (RSWK): Cluster-Theorie
Probabilistischer Raum
Issue Date: 2021
Appears in Collections:LS 02 Komplexitätstheorie und Effiziente Algorithmen

Files in This Item:
File Description SizeFormat 
Dissertation.pdfDNB2.49 MBAdobe PDFView/Open

This item is protected by original copyright

Items in Eldorado are protected by copyright, with all rights reserved, unless otherwise indicated.