Authors: | Krivosija, Amer |
Title: | On clustering and related problems on curves under the Fréchet distance |
Language (ISO): | en |
Abstract: | Sensor measurements can be represented as points in Rd. Ordered by the time-stamps of these measurements, they yield a time series, that can be interpreted as a polygonal curve in the d-dimensional ambient space. The number of the vertices is called complexity of the curve. In this thesis we study several fundamental computational tasks on curves: clustering, sim- pliﬁcation, and embedding, under the Fr´echet distance, which is a popular distance measure for curves, in its continuous and discrete version. We focus on curves in one-dimensional ambient space R. We study the problem of clustering of the curves in R under the Fr´echet distance, in particular, the following variations of the well-known k-center and k- median problems. Given is a set P of n curves in R, each of complexity at most m. Our goal is to ﬁnd k curves in R, not necessarily from P , called cluster centers and that each has complexity at most R. In the (k, R)-center problem, the maximum distance of an element of P to its nearest cluster center is minimized. In the (k, R)-median problem, the sum of these distances is minimized. We show that both problems are NP-hard under both versions of the Fr´echet distance, if k is part of the input. Under the continuous Fr´echet distance, we give (1 + ε)-approximation algorithms for both (k, R)-center and (k, R)-median problem, with running time near-linear in the input size for constant ε, k and R. Our techniques yield constant-factor approximation algorithms for the observed problems under the discrete Fr´echet distance. To obtain the (1 + ε)-approximation algorithms for the clustering prob- lems under the continuous Fr´echet distance, we develop a new simpliﬁcation technique on one-dimensional curve, called δ-signature. The signatures al- ways exist, and we can compute them eﬃciently. We also study the problem of embedding of the Fr´echet distance into space R. We show that, in the worst case and under reasonable assumptions, the discrete Fr´echet distance between two polygonal curves of complexity m in Rd, where 2 ≤ d ≤ 7, degrades by a factor linear in m with constant probability, when the curves are projected onto a randomly chosen line. We show upper and lower bounds on the distortion. Sensor measurements can also deﬁne a discrete distribution over possi- ble locations of a point in Rd. Then, the input consists of n probabilistic points. We study the probabilistic 1-center problem in Euclidean space Rd, also known as the probabilistic smallest enclosing ball (pSEB) problem. To improve the best existing algorithm for the pSEB problem by reducing its exponential dependence on the dimension to linear, we study the determinis- tic set median problem, that generalizes both the 1-center and the 1-median problems. We present a (1 + ε)-approximation algorithm for the set median problem, using a novel combination of sampling techniques and stochastic subgradient descent. Our (1 + ε)-approximation algorithm for the pSEB problem takes linear time in d and n, making the pSEB algorithm applicable to shape ﬁtting problems in Hilbert spaces of unbounded dimension using kernel functions. We present an exemplary application by extending the support vector data description (SVDD) shape ﬁtting method to the probabilistic case. |
Subject Headings: | Clustering Curves Fréchet distance Probabilistic points |
Subject Headings (RSWK): | Cluster-Theorie Kurve Fréchet-Metrik Probabilistischer Raum |
URI: | http://hdl.handle.net/2003/40183 http://dx.doi.org/10.17877/DE290R-22055 |
Issue Date: | 2021 |
Appears in Collections: | LS 02 Komplexitätstheorie und Effiziente Algorithmen |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Dissertation.pdf | DNB | 2.49 MB | Adobe PDF | View/Open |
This item is protected by original copyright |
Items in Eldorado are protected by copyright, with all rights reserved, unless otherwise indicated.