LS 02 Komplexitätstheorie und Effiziente Algorithmen

Recent Submissions

  • Item
    The rank of sparse random matrices
    (2022-04-23) Coja-Oghlan, Amin; Ergür, Alperen A.; Gao, Pu; Hetterich, Samuel; Rolvien, Maurice
    We determine the asymptotic normalized rank of a random matrix A over an arbitrary field with prescribed numbers of nonzero entries in each row and column. As an application we obtain a formula for the rate of low-density parity check codes. This formula vindicates a conjecture of Lelarge (2013). The proofs are based on coupling arguments and a novel random perturbation, applicable to any matrix, that diminishes the number of short linear relations.
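    A small computation makes the notion of normalized rank concrete. The following minimal sketch (our illustration, not the paper's coupling argument, and specialized to the field GF(2)) computes the exact rank of a sparse random 0/1 matrix via an XOR basis; the row weight 3 is an arbitrary choice.

      import random

      def gf2_rank(rows):
          # rows: one int per row, encoding the 0/1 entries as a bitmask.
          basis = {}                      # leading-bit position -> basis row
          for x in rows:
              while x:
                  b = x.bit_length() - 1  # position of the current leading bit
                  if b not in basis:
                      basis[b] = x        # x is independent of the basis so far
                      break
                  x ^= basis[b]           # cancel the leading bit and retry
          return len(basis)

      # Empirical normalized rank of an n x n matrix with 3 ones per row.
      n = 1000
      rows = [sum(1 << j for j in random.sample(range(n), 3)) for _ in range(n)]
      print(gf2_rank(rows) / n)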
  • Item
    On clustering and related problems on curves under the Fréchet distance
    (2021) Krivosija, Amer; Sohler, Christian; Driemel, Anne; Schubert, Erich
    Sensor measurements can be represented as points in R^d. Ordered by their time stamps, they yield a time series that can be interpreted as a polygonal curve in the d-dimensional ambient space; the number of vertices is called the complexity of the curve. In this thesis we study several fundamental computational tasks on curves: clustering, simplification, and embedding, under the Fréchet distance, a popular distance measure for curves, in its continuous and discrete versions. We focus on curves in the one-dimensional ambient space R. We study clustering of curves in R under the Fréchet distance, in particular the following variations of the well-known k-center and k-median problems. Given a set P of n curves in R, each of complexity at most m, the goal is to find k curves in R, not necessarily from P, called cluster centers, each of complexity at most ℓ. In the (k, ℓ)-center problem, the maximum distance of an element of P to its nearest cluster center is minimized; in the (k, ℓ)-median problem, the sum of these distances is minimized. We show that both problems are NP-hard under both versions of the Fréchet distance if k is part of the input. Under the continuous Fréchet distance, we give (1 + ε)-approximation algorithms for both the (k, ℓ)-center and the (k, ℓ)-median problem, with running time near-linear in the input size for constant ε, k and ℓ. Our techniques yield constant-factor approximation algorithms for the considered problems under the discrete Fréchet distance. To obtain the (1 + ε)-approximation algorithms for the clustering problems under the continuous Fréchet distance, we develop a new simplification technique for one-dimensional curves, called the δ-signature. Signatures always exist, and we can compute them efficiently. We also study the problem of embedding the Fréchet distance into the real line R. We show that, in the worst case and under reasonable assumptions, the discrete Fréchet distance between two polygonal curves of complexity m in R^d, where 2 ≤ d ≤ 7, degrades by a factor linear in m with constant probability when the curves are projected onto a randomly chosen line, and we show upper and lower bounds on the distortion. Sensor measurements can also define a discrete distribution over possible locations of a point in R^d; the input then consists of n probabilistic points. We study the probabilistic 1-center problem in Euclidean space R^d, also known as the probabilistic smallest enclosing ball (pSEB) problem. To improve the best existing algorithm for the pSEB problem by reducing its exponential dependence on the dimension to linear, we study the deterministic set median problem, which generalizes both the 1-center and the 1-median problem. We present a (1 + ε)-approximation algorithm for the set median problem, using a novel combination of sampling techniques and stochastic subgradient descent. Our (1 + ε)-approximation algorithm for the pSEB problem takes time linear in d and n, making it applicable to shape fitting problems in Hilbert spaces of unbounded dimension using kernel functions. We present an exemplary application by extending the support vector data description (SVDD) shape fitting method to the probabilistic case.
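    For reference, the discrete Fréchet distance used above can be evaluated exactly by the classical O(nm) dynamic program of Eiter and Mannila. The sketch below is a generic illustration (not one of the thesis's algorithms) and handles curves in any dimension, given as lists of coordinate tuples.

      import math

      def discrete_frechet(P, Q):
          # ca[i][j] = discrete Frechet distance between P[:i+1] and Q[:j+1].
          n, m = len(P), len(Q)
          ca = [[0.0] * m for _ in range(n)]
          for i in range(n):
              for j in range(m):
                  d = math.dist(P[i], Q[j])
                  if i == 0 and j == 0:
                      ca[i][j] = d
                  elif i == 0:
                      ca[i][j] = max(ca[0][j - 1], d)
                  elif j == 0:
                      ca[i][j] = max(ca[i - 1][0], d)
                  else:
                      # Advance on P, on Q, or on both; keep the best option.
                      ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1],
                                         ca[i][j - 1]), d)
          return ca[n - 1][m - 1]

      # One-dimensional curves are just lists of 1-tuples:
      print(discrete_frechet([(0,), (2,), (1,)], [(0,), (1,)]))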
  • Item
    On the relation between structured d-DNNFs and SDDs
    (2020-08-17) Bollig, Beate; Farenholtz, Martin
    Structured d-DNNFs and SDDs are restricted negation normal form circuits used in knowledge compilation as target languages into which propositional theories are compiled. Structuredness is imposed by so-called vtrees; by definition, SDDs are restricted structured d-DNNFs. Beame and Liew (2015) as well as Bova and Szeider (2017) raised the question whether structured d-DNNFs are really more general than SDDs w.r.t. polynomial-size representations (measured in the number of Boolean variables on which the represented functions are defined). The main result of the paper is a proof that a function can be represented by SDDs of polynomial size if the function and its complement have polynomial-size structured d-DNNFs that respect the same vtree.
  • Item
    Property testing of graphs and the role of neighborhood distributions
    (2020) Fichtenberger, Hendrik; Sohler, Christian; Czumaj, Artur
    Property testing considers decision problems in the regime of sublinear complexity. Most classical decision problems require at least linear time complexity in order to read the whole input. Hence, decision problems are relaxed by introducing a gap between “yes” and “no” instances: A property tester for a property Π (e.g., planarity) is a randomized algorithm with constant error probability that accepts objects that have Π (planar graphs) and that rejects objects that have linear edit distance to any object from Π (graphs with a linear number of crossing edges in every planar embedding). For property testers, locality is a natural and crucial concept because they cannot obtain a global view of their input. In this thesis, we investigate property testing in graphs and how testers leverage the information contained in the neighborhoods of randomly sampled vertices: We provide some structural insights regarding properties with constant testing complexity in graphs with bounded (maximum vertex) degree and a connection between testers with constant complexity for general graphs and testers with logarithmic space complexity for random-order streams. We also present testers for some minor-freeness properties and a tester for conductance in the distributed CONGEST model.
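    The neighborhood information such testers rely on can be made explicit: in a bounded-degree graph, the local view of a random vertex is its k-disc, the subgraph induced by all vertices within distance k. The sketch below (an illustration of the concept, not a tester from the thesis) samples k-discs by breadth-first search; a real tester would compare the isomorphism types of these discs against the expected distribution.

      import random
      from collections import deque

      def sample_k_discs(adj, k, samples):
          # adj: dict mapping each vertex to its list of neighbors.
          discs = []
          vertices = list(adj)
          for _ in range(samples):
              v = random.choice(vertices)
              dist, frontier = {v: 0}, deque([v])
              while frontier:                 # BFS up to depth k
                  u = frontier.popleft()
                  if dist[u] == k:
                      continue
                  for w in adj[u]:
                      if w not in dist:
                          dist[w] = dist[u] + 1
                          frontier.append(w)
              discs.append(frozenset(dist))   # vertex set of the sampled k-disc
          return discs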
  • Item
    On large-scale probabilistic and statistical data analysis
    (2018) Munteanu, Alexander; Sohler, Christian; Ickstadt, Katja
    In this manuscript we develop and apply modern algorithmic data reduction techniques to tackle scalability issues and enable statistical data analysis of massive data sets. Our algorithms follow a general scheme, where a reduction technique is applied to the large-scale data to obtain a small summary of sublinear size to which a classical algorithm is applied. The techniques for obtaining these summaries depend on the problem that we want to solve. The size of the summaries is usually parametrized by an approximation parameter, expressing the trade-off between efficiency and accuracy. In some cases the data can be reduced to a size that has no or only negligible dependency on the initial number of data items. However, for other problems it turns out that sublinear summaries do not exist in the worst case. In such situations, we exploit statistical or geometric relaxations to obtain useful sublinear summaries under certain mildness assumptions. We present, in particular, the data reduction methods called coresets and subspace embeddings, and several algorithmic techniques to construct these via random projections and sampling.
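    A coreset or subspace-embedding pipeline typically starts from a random projection. As a minimal, self-contained example (one standard instance of the random-projection technique, not a construction from the manuscript), a Gaussian Johnson-Lindenstrauss sketch compresses n points in R^d to O(log n / eps^2) dimensions while preserving pairwise distances up to a factor 1 ± eps with high probability; the constant 8 below is a common heuristic choice.

      import numpy as np

      def jl_sketch(X, eps=0.1, seed=0):
          # X: (n, d) array of points; returns an (n, k) sketch.
          n, d = X.shape
          k = int(np.ceil(8 * np.log(n) / eps ** 2))
          rng = np.random.default_rng(seed)
          G = rng.standard_normal((d, k)) / np.sqrt(k)  # scaled Gaussian matrix
          return X @ G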
  • Item
    On algorithms for large-scale graph and clustering problems
    (2017) Schwiegelshohn, Chris; Sohler, Christian; Leonardi, Stefano
    This thesis is concerned with algorithmic methods of modern data analysis. Two overarching topics are treated: data stream algorithms with compression properties and approximation algorithms for clustering. Data stream algorithms process a data set sequentially and aim to determine properties of the data set (approximately) without storing the entire data set. Clustering denotes the partitioning of a data set into groups. The first problem considered is matching in graphs. Here the data set consists of a sequence of edge insertions and deletions, and the task is to determine the size of a so-called maximum matching as accurately as possible. An algorithm is presented that, under the assumption that the matching has size at most k, determines the exact size using k² units of memory. This algorithm can further be used to obtain a constant-factor approximation of the matching size in planar graphs. In addition, lower bounds on the required memory are established, and a reduction from weighted to unweighted matching is given. Next, data stream algorithms for similarity search are considered, where the task is to find, among n given sets, the pairs of high similarity in near-linear time; here the Jaccard index |A ∩ B|/|A ∪ B| is the similarity measure for two sets A and B. The thesis describes a data structure that achieves this for the first time in dynamic data streams with small memory consumption, using random numbers with only 2-wise independence, which allows a very efficient implementation. The third problem lies at the interface of the two topics of this thesis and concerns the k-center clustering problem in data streams with a sliding window. The task is to find k centers such that the maximum distance of any point to its nearest center is minimized. The results are a 6-approximation algorithm for arbitrary k and an optimal 4-approximation algorithm for k = 2. The developed techniques can also be applied to the diameter problem and yield an optimal algorithm for it. Subsequently, clustering problems with respect to the Jaccard distance are analyzed. Here, given a collection N of subsets of a universe U, the task is to find a subset C that minimizes max_{X ∈ N} (1 − |X ∩ C|/|X ∪ C|). It is shown that solving the problem exactly is NP-hard, but that a PTAS exists. Finally, the widely used local search heuristic for k-median and k-means clustering is examined. Although these problems are hard to solve exactly or even approximately in general, they are considered relatively tractable in practice, which suggests that the hardness results rest on pathological inputs. Motivated by this discrepancy, there have been attempts to characterize practically relevant data sets. For three of the most important such characterizations, the behavior of a local search heuristic is analyzed, with the result that local search computes optimal or near-optimal clusterings in these cases.
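    The Jaccard index driving the similarity search can be estimated with min-wise hashing. The sketch below uses the same kind of pairwise-independent linear hash functions mentioned in the abstract, although it works over static sets rather than dynamic streams (the thesis's data structure is considerably more involved); with this hash family the collision rate only approximates the Jaccard index.

      import random

      P = (1 << 61) - 1                 # Mersenne prime for modular hashing

      def make_hash():
          # h(x) = (a*x + b) mod P is a pairwise-independent family.
          a, b = random.randrange(1, P), random.randrange(P)
          return lambda x: (a * x + b) % P

      def minhash_signature(S, hashes):
          return [min(h(x) for x in S) for h in hashes]

      def estimate_jaccard(sig_a, sig_b):
          # Collision rate of the minima estimates |A ∩ B| / |A ∪ B|.
          return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

      hashes = [make_hash() for _ in range(200)]
      A, B = set(range(100)), set(range(50, 150))
      print(estimate_jaccard(minhash_signature(A, hashes),
                             minhash_signature(B, hashes)))   # ~ 1/3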
  • Item
    On graph algorithms for large-scale graphs
    (2015) Bury, Marc; Bollig, Beate; Sauerhoff, Martin
    The demands on algorithms have changed fundamentally in recent years: the size of the data to be processed grows faster than the available computing speed. New algorithms for very large graphs, such as social networks, computer networks, or state transition graphs, have therefore been developed to cope with ever-growing data. This thesis deals with two approaches to this problem. Implicit algorithms use a lossless compression of the data to reduce its size and work directly on the compressed data to solve optimization problems. Graphs are represented here by the characteristic function of their edge set, which can in turn be represented by Ordered Binary Decision Diagrams (OBDDs), a well-known data structure for Boolean functions. In this thesis we develop new techniques for determining the OBDD size of graphs, apply them to several classes of graphs, and thereby obtain (almost) optimal bounds on the OBDD sizes. Small input OBDDs are essential for fast processing, but we also need algorithms that avoid large intermediate results during execution. To this end we develop algorithms for certain graph classes that exploit the vertex encodings used for the OBDD-size results. In addition, we lay the groundwork for randomized OBDD-based algorithms by investigating which kind of randomness is suitable here and how algorithms can be designed with it. In the course of this, we present two randomized algorithms that outperform their deterministic counterparts in an experimental evaluation. Data stream algorithms are another approach to processing large graphs. In this model, the graph is given as a stream of edge insertions, and the algorithms have only a limited amount of memory at their disposal. Solutions to graph optimization problems often require size linear in the number of vertices, which implies a trivial lower bound on streaming algorithms for these problems. Computing a matching is such an example, one that has recently attracted much attention in the streaming community. A matching is a set of edges such that no two edges share a vertex. If we are only interested in the size or, in the case of weighted graphs, the weight of a matching, this linear lower bound can be broken. We focus on dynamic data streams, where edges can also be deleted. We reduce the problem of estimating the weight of an optimal weighted matching to the problem of approximating the matching size, at the cost of a small loss in the approximation guarantee. Furthermore, we present the first dynamic streaming algorithm that approximates the size of matchings in locally sparse graphs. For small approximation factors, we prove a lower bound on the space required by streaming algorithms that approximate the matching size.
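    The implicit representation described above starts from the characteristic function of the edge set: vertices are encoded in binary, and chi_E(x, y) = 1 iff the encoded pair is an edge. The toy sketch below uses a plain Python set where an implicit algorithm would use an OBDD package, but it shows the interface such algorithms work with (our illustration, not code from the thesis).

      def edge_characteristic(edges):
          # Returns chi_E over binary-encoded vertex pairs.
          edge_set = set(edges)
          def chi(x_bits, y_bits):
              u = int("".join(map(str, x_bits)), 2)   # decode vertex numbers
              v = int("".join(map(str, y_bits)), 2)
              return (u, v) in edge_set
          return chi

      chi = edge_characteristic([(0, 1), (1, 2)])
      print(chi((0, 0), (0, 1)))   # True: (0, 1) is an edge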
  • Item
    Coresets and streaming algorithms for the k-means problem and related clustering objectives
    (2014) Schmidt, Melanie; Sohler, Christian; Blömer, Johannes
    The k-means problem seeks a clustering that minimizes the sum of squared errors cost function: For input points P from the Euclidean space R^d and any solution consisting of k centers from R^d, the cost is the sum of the squared distances of each point to its closest center. This thesis studies concepts used for large input point sets. For inputs with many points, the term coreset refers to a reduced version with fewer but weighted points. For inputs with high-dimensional points, dimensionality reduction is used to reduce the number of dimensions. In both cases, the reduced version has to maintain the cost function up to an epsilon-fraction for all choices of k centers. We study coreset constructions and dimensionality reductions for the k-means problem. Further, we develop coreset constructions in the data stream model. Here, the data is so large that it should only be read once and cannot be stored in main memory. The input might even arrive as a stream of points in an arbitrary order. Thus, a data stream algorithm has to continuously process the input while it arrives and can only store small summaries. In the second part of the thesis, the obtained results are extended to related clustering objectives. Projective clustering minimizes the squared distances to k subspaces instead of k points. Kernel k-means is an extension of k-means to scenarios where the target clustering is not linearly separable. In addition to extensions to these objectives, we study coreset constructions for a probabilistic clustering problem where input points are given as distributions over a finite set of locations.
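    The guarantee a coreset must satisfy is easy to state in code: for every set C of k centers, the weighted cost of the coreset has to be within a 1 ± eps factor of the cost of the full input. A minimal NumPy evaluation of the cost function (our illustration of the definition, not a coreset construction from the thesis):

      import numpy as np

      def kmeans_cost(P, C, w=None):
          # P: (n, d) points, C: (k, d) centers, w: optional point weights.
          d2 = ((P[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)  # (n, k)
          closest = d2.min(axis=1)        # squared distance to nearest center
          return float(closest.sum() if w is None else (w * closest).sum())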
  • Item
    Property testing in gradbeschränkten gerichteten Graphen unter Nichtsichtbarkeit eingehender Kanten
    (2013-12-06) Hellweg, Frank; Sohler, Christian; Westermann, Matthias
    This thesis studies property testing in directed bounded-degree graphs. Property testing is a technique for solving decision problems in which a sublinear running time is sought by relaxing the problem and using randomization. Bounded-degree graphs are the standard model for studying property testing in sparse graphs; it is assumed that the input graph is given in adjacency-list representation and that the length of the adjacency lists is bounded by a constant D. In this thesis we assume that, in directed graphs, only the outgoing edges are contained in the adjacency lists. The resulting severe restriction on graph exploration makes problems considerably harder to solve than their counterparts in undirected graphs; Bender and Ron showed this in 2002 for strong connectivity, for example. This thesis presents property-testing algorithms in the above model for three problems: strong connectivity, subgraph-freeness with the special case of 3-star-freeness, and the spanner property in geometric directed graphs.
  • Item
    Theoretical foundations of artificial immune systems
    (2011-07-21) Zarges, Christine; Jansen, Thomas; Rudolph, Günter
    Artificial immune systems (AIS) are a special class of biologically inspired algorithms, which are based on the immune system of vertebrates. The field constitutes a relatively new and emerging area of research in Computational Intelligence that has achieved various promising results in different areas of application, e.g., learning, classification, anomaly detection, and (function) optimization. An increasing and often stated problem of the field is the lack of a theoretical basis for AIS, as most work so far has concentrated on the direct application of immune principles. In this thesis, we concentrate on optimization applications of AIS. With respect to this application area, previous work mainly covers convergence analysis. To the best of our knowledge this thesis constitutes the first rigorous runtime analysis of immune-inspired operators and thus adds substantially to the demanded theoretical foundation of AIS. We consider two very common aspects of AIS. On the one hand, we provide a theoretical analysis of different hypermutation operators frequently employed in AIS. On the other hand, we examine a popular diversity mechanism named aging. We compare our findings with corresponding results from the analysis of other nature-inspired randomized search heuristics, in particular evolutionary algorithms. Moreover, we focus on the practical implications of our theoretical results in order to bridge the gap between theory and practice, deriving guidelines for parameter settings and pointing out typical situations where certain concepts seem promising. These analyses contribute to the understanding of how AIS actually work and of the applications in which they outperform other randomized search heuristics.
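    As a taste of the operators analyzed, static hypermutation flips every bit of a candidate solution independently with a mutation rate far above the 1/n that is standard in evolutionary algorithms. The sketch below shows one such variant (an illustration only; the AIS literature also uses, e.g., inversely fitness-proportional rates).

      import random

      def hypermutation(x, rate):
          # Flip each bit independently with the (high) probability `rate`.
          return [b ^ (random.random() < rate) for b in x]

      x = [0] * 20
      print(hypermutation(x, 0.5))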
  • Item
    Non-uniform Sampling in Clustering and Streaming
    (2011-03-22) Monemizadeh, Morteza; Sohler, Christian; Schweikardt, Nicole
  • Item
    Approximation Techniques for Facility Location and Their Applications in Metric Embeddings
    (2011-02-01) Lammersen, Christiane; Sohler, Christian; Meyer auf der Heide, Friedhelm
    This thesis addresses the development of geometric approximation algorithms for huge datasets and is subdivided into two parts. The first part deals with algorithms for facility location problems, and the second part is concerned with the problem of computing compact representations of finite metric spaces. Facility location problems belong to the most studied problems in combinatorial optimization and operations research. In the facility location variants considered in this thesis, the input consists of a set of points where each point is a client as well as a potential location for a facility. Each client has to be served by a facility. However, connecting a client incurs connection costs, and opening or maintaining a facility causes so-called opening costs. The goal is to open a subset of the input points as facilities such that the total cost of the system is minimized.
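    The objective is easy to state in code. Below is a minimal evaluation of a candidate solution under a metric variant with uniform opening costs (an illustration of the objective, not an algorithm from the thesis):

      import math

      def facility_cost(points, open_facilities, opening_cost):
          # open_facilities: indices of the points opened as facilities.
          total = opening_cost * len(open_facilities)
          for p in points:                          # every client connects to
              total += min(math.dist(p, points[i])  # its nearest open facility
                           for i in open_facilities)
          return total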
  • Item
    Information Complexity and Data Stream Algorithms for Basic Problems
    (2010-12-09) Gronemeier, André; Sauerhoff, Martin; Sohler, Christian
    Data stream algorithms obtain their input as a stream of data elements that have to be processed immediately as they arrive using only a very limited amount of memory. They solve a new class of algorithmic problems that emerged recently with the growing importance of computer networks and the ever-increasing size of the data sets that are processed algorithmically. In this thesis, data stream algorithms for basic problems under extreme space restrictions are developed, namely counting and random sampling. Then we apply these algorithms to improve the space complexity of the celebrated data stream algorithm for the computation of frequency moments by Alon, Matias, and Szegedy for very long data streams. Lower bounds on the space complexity of data stream algorithms are usually proved by using communication complexity arguments. Information complexity is a related field that applies Shannon's information theory to obtain lower bounds on the communication complexity of functions. The development of information complexity is closely linked to the recent interest in data stream algorithms since important parts of this theory have been developed to prove a lower bound on the space complexity of data stream algorithms for the frequency moments. In this thesis we prove an optimal lower bound on the multi-party information complexity of the disjointness function, the underlying communication problem in the proof of the lower bound on the space complexity of data stream algorithms for the frequency moments. Additionally, we generalize and simplify known lower bounds on the one-way communication complexity of the index function by using information complexity, and we present the first attempt to apply information complexity to multi-party one-way protocols in the number-on-the-forehead model by Chandra, Furst, and Lipton.
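    The core of the Alon-Matias-Szegedy estimator for the second frequency moment F2 = sum_i f_i^2 fits in a few lines: maintain Z = sum_i s(i)·f_i for random signs s(i) in {-1, +1}; then E[Z²] = F2. The sketch below stores the signs explicitly and therefore has no space advantage; the actual algorithm generates them from a 4-wise independent hash function, which is where the space savings and the lower-bound story above come in.

      import random

      def ams_f2(stream, trials=64):
          estimates = []
          for t in range(trials):
              rng = random.Random(t)          # fresh signs for each trial
              sign, z = {}, 0
              for item in stream:
                  if item not in sign:
                      sign[item] = rng.choice((-1, 1))
                  z += sign[item]             # z = sum_i s(i) * f_i
              estimates.append(z * z)         # unbiased estimator of F2
          return sum(estimates) / trials

      print(ams_f2(["a", "a", "b", "c", "a"]))   # true F2 = 9 + 1 + 1 = 11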
  • Item
    Zur Analyse der Optimierungszeit randomisierter Suchheuristiken für kombinatorische Probleme
    (2010-09-23) Thyssen, Christian; Jansen, Thomas; Rudolph, Günter
  • Item
    Algorithms for regression and classification
    (2009-03-12) Nunkesser, Robin; Jansen, Thomas; Fried, Roland
    Regression and classification are statistical techniques that may be used to extract rules and patterns out of data sets. Analyzing the involved algorithms comprises interdisciplinary research that offers interesting problems for statisticians and computer scientists alike. The focus of this thesis is on robust regression and classification in genetic association studies. In the context of robust regression, new exact algorithms and results for robust online scale estimation with the estimators Qn and Sn and for robust linear regression in the plane with the estimator least quartile difference (LQD) are presented. Additionally, an evolutionary computation algorithm for robust regression with different estimators in higher dimensions is devised. These estimators include the widely used least median of squares (LMS) and least trimmed squares (LTS). For classification in genetic association studies, this thesis describes a Genetic Programming algorithm that outperforms the standard approaches on the considered data sets. It is able to identify interesting genetic factors not found before in a data set on sporadic breast cancer and to handle larger data sets than the compared methods. In addition, it is extendible to further application fields.
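    For orientation, the Qn scale estimator targeted by the exact online algorithms is, by the definition of Rousseeuw and Croux, an order statistic of the pairwise distances. A naive O(n² log n) reference implementation follows (the thesis's algorithms are far more efficient; the constant 2.2219 makes Qn consistent at the Gaussian distribution):

      def qn_scale(x):
          n = len(x)
          h = n // 2 + 1
          k = h * (h - 1) // 2             # take the k-th smallest distance
          dists = sorted(abs(x[i] - x[j])
                         for i in range(n) for j in range(i + 1, n))
          return 2.2219 * dists[k - 1]

      print(qn_scale([1.0, 2.0, 2.5, 4.0, 100.0]))  # robust to the outlier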
  • Item
    Computational complexity of evolutionary algorithms, hybridizations, and swarm intelligence
    (2008-12-23) Sudholt, Dirk; Jansen, Thomas; Rudolph, Günter
    Bio-inspired randomized search heuristics such as evolutionary algorithms, hybridizations with local search, and swarm intelligence are very popular among practitioners as they can be applied in case the problem is not well understood or when there is not enough knowledge, time, or expertise to design problem-specific algorithms. Evolutionary algorithms simulate the natural evolution of species by iteratively applying evolutionary operators such as mutation, recombination, and selection to a set of solutions for a given problem. A recent trend is to hybridize evolutionary algorithms with local search to refine newly constructed solutions by hill climbing. Swarm intelligence comprises ant colony optimization as well as particle swarm optimization. These modern search paradigms rely on the collective intelligence of many single agents to find good solutions for the problem at hand. Many empirical studies demonstrate the usefulness of these heuristics for a large variety of problems, but a thorough understanding is still far away. We regard these algorithms from the perspective of theoretical computer science and analyze the random time these heuristics need to optimize pseudo-Boolean problems. This is done in a mathematically rigorous sense, using tools known from the analysis of randomized algorithms, and it leads to asymptotic bounds on their computational complexity. This approach has been followed successfully for evolutionary algorithms, but the theory of hybrid algorithms and swarm intelligence is still in its very infancy. Our results shed light on the asymptotic performance of these heuristics, increase our understanding of their dynamic behavior, and contribute to a rigorous theoretical foundation of randomized search heuristics.
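    The object of such a runtime analysis is typically as simple as the following (1+1) EA for pseudo-Boolean maximization, shown here on OneMax, for which the expected optimization time is known to be Theta(n log n). The code is a generic illustration, not tied to a particular result of the thesis.

      import random

      def one_plus_one_ea(f, n, budget=200000):
          x = [random.randrange(2) for _ in range(n)]
          fx = f(x)
          for step in range(1, budget + 1):
              # Standard bit mutation: flip each bit with probability 1/n.
              y = [b ^ (random.random() < 1 / n) for b in x]
              fy = f(y)
              if fy >= fx:                  # plus-selection keeps the better one
                  x, fx = y, fy
              if fx == n:                   # OneMax optimum reached
                  return step
          return budget

      onemax = lambda x: sum(x)
      print(one_plus_one_ea(onemax, 100))   # number of steps to the optimum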
  • Item
    Computational aspects of combinatorial pricing problems
    (2007-11-26) Briest, Patrick; Krysta, Piotr; Wegener, Ingo
    Combinatorial pricing encompasses a wide range of natural optimization problems that arise in the computation of revenue-maximizing pricing schemes for a given set of goods, as well as in the design of profit-maximizing auctions in strategic settings. We consider the computational side of several different multi-product and network pricing problems and, as most of the problems in this area are NP-hard, we focus on the design of approximation algorithms and corresponding inapproximability results. In the unit-demand multi-product pricing problem it is assumed that each consumer has different budgets for the products she is interested in and purchases a single product out of her set of alternatives. Depending on how consumers choose their products once prices are fixed, we distinguish the min-buying, max-buying and rank-buying models, in which consumers select the affordable product with smallest price, highest price or highest rank according to some predefined preference list, respectively. We prove that the max-buying model allows for constant approximation guarantees, and this is true even in the case of limited product supply. For the min-buying model we prove inapproximability beyond the known logarithmic guarantees under standard complexity-theoretic assumptions. Surprisingly, this result even extends to the case of pricing with a price ladder constraint, i.e., a predefined relative order on the product prices. Furthermore, similar results can be shown for the uniform-budget version of the problem, which corresponds to a special case of the unit-demand envy-free pricing problem, under an assumption about the average-case hardness of refuting random 3SAT instances. Introducing the notion of stochastic selection rules, we show that among a large class of selection rules based on the order of product prices the max-buying model is in fact the only one allowing for sub-logarithmic approximation guarantees. In the single-minded pricing problem each consumer is interested in a single set of products, which she purchases if the sum of prices does not exceed her budget. It turns out that our results on envy-free unit-demand pricing can be extended to this scenario and yield inapproximability results for ratios expressed in terms of the number of distinct products, thereby complementing existing hardness results. On the algorithmic side, we present an algorithm with an approximation guarantee that depends only on the maximum size of the sets and the number of requests per product. Our algorithm's ratio matches previously known results in the worst case but has significantly better provable performance guarantees on sparse problem instances. Viewing single-minded pricing as a network pricing problem in which we assign prices to edges and consumers want to purchase paths in the network, it is proven that the problem remains APX-hard even on extremely sparse instances. For the special case of pricing on a line with paths that are nested, we design an FPTAS and prove NP-hardness. In a Stackelberg network pricing game a so-called leader sets the prices on a subset of the edges of a network; the remaining edges have associated fixed costs. Once prices are fixed, one or more followers purchase min-cost subnetworks according to their requirements and pay the leader for all pricable edges contained in their networks.
We extend the analysis of the known single-price algorithm, which assigns the same price to all pricable edges, from cases in which the feasible subnetworks of a follower form the basis of a matroid to the general case, thus, obtaining logarithmic approximation guarantees for general Stackelberg games. We then consider a special 2-player game in which the follower buys a min-cost vertex cover in a bipartite graph and the leader sets prices on a subset of the vertices. We prove that this problem is polynomial time solvable in some cases and allows for constant approximation guarantees in general. Finally, we point out that results on unit-demand and single-minded pricing yield several strong inapproximability results for Stackelberg pricing games with multiple followers.
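    The single-price idea mentioned above is simple to state: offer everything at the same price, and only prices equal to some consumer's budget need to be tried. The sketch below shows it for the plain setting of a single good and budgets b_1, ..., b_n (our simplification for illustration; the thesis analyzes the scheme over matroid and general follower structures).

      def best_single_price(budgets):
          # Revenue of a uniform price p: p times the number of budgets >= p.
          best_price, best_revenue = 0, 0
          for p in sorted(set(budgets)):
              revenue = p * sum(b >= p for b in budgets)
              if revenue > best_revenue:
                  best_price, best_revenue = p, revenue
          return best_price, best_revenue

      print(best_single_price([1, 3, 3, 7]))   # price 3, revenue 9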
  • Item
    Design und Analyse randomisierter Suchheuristiken
    (2007-03-06) Storch, Tobias; Wegener, Ingo; Jansen, Thomas
    The design of algorithms and their analysis are among the fundamental tasks of computer science, and computing the optima of a function is one of the most important problems, in theory as in practice. In this dissertation, various randomized search heuristics, among them evolutionary algorithms, are analyzed theoretically for the optimization of pseudo-Boolean functions. First, the effects of choosing plus- or comma-selection on the runtime of a simple evolutionary algorithm are considered for different offspring population sizes. For small offspring populations, comma-selection is entirely unsuitable even for optimizing simple functions, whereas for large offspring populations plus- and comma-selection behave very similarly. For offspring populations that are neither small nor large, a substantial efficiency gain of comma- over plus-selection is proved by example. Next, the effects of the parent population size on the runtime of a simple evolutionary algorithm are considered. Besides a comparison of the impact of diversity on the efficiency of optimizing simple functions, a strong hierarchy result for the parent population size is shown: already increasing the population by one can turn total inefficiency into efficiency. Furthermore, we consider in detail the effects of transformations of the function to be optimized, both for the various components and classes of randomized search heuristics and for black-box complexity. In this respect, a classification of all search heuristics and transformations is given and made concrete for simple algorithms. Finally, simple randomized search heuristics for approximating and optimizing the clique problem are investigated, not only exhaustively for arbitrary graphs but also for special classes of graphs. Semi-random graphs play a special role here: in a very general form, the substantial efficiency gain of a randomized search heuristic achieved by using a large population is proved. For planar and random planar graphs, finally, a consideration of the black-box complexity proves the optimality of a Metropolis algorithm for computing a maximum clique.
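    Plus- and comma-selection differ in a single line: whether the parent competes with its offspring. A minimal sketch of one generation of a (1,λ) and a (1+λ) EA for maximization (a generic illustration of the two selection schemes, not code from the dissertation):

      import random

      def comma_step(parent, f, lam, mutate):
          # (1,lambda): the best offspring replaces the parent unconditionally,
          # even if all offspring are worse than the parent.
          offspring = [mutate(parent) for _ in range(lam)]
          return max(offspring, key=f)

      def plus_step(parent, f, lam, mutate):
          # (1+lambda): the parent competes and survives unless it is beaten.
          offspring = [mutate(parent) for _ in range(lam)]
          return max(offspring + [parent], key=f)

      def mutate(x):
          # Standard bit mutation with rate 1/n.
          return [b ^ (random.random() < 1 / len(x)) for b in x]

      parent = [0] * 50
      print(sum(plus_step(parent, sum, 8, mutate)))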
  • Item
    Probabilistic analysis of evolution strategies using isotropic mutations
    (2007-02-07) Jägersküpper, Jens; Wegener, Ingo; Jansen, Thomas
    This dissertation deals with optimization in high-dimensional Euclidean space, namely with a particular type of direct-search methods known as Evolution Strategies (ESs). Evolution Strategies mimic natural evolution, in particular mutation, in order to "evolve" an approximate solution. As this dissertation focuses on a theoretical investigation of ESs in the way randomized approximation algorithms are analyzed in theoretical computer science (rather than by means of convergence theory or dynamical-system theory), very basic and simple ESs are considered: the only search operator applied is the so-called isotropic mutation, in which a new candidate solution is obtained by adding to the current candidate solution a random vector whose distribution is spherically symmetric. General lower bounds are proved on the number of steps/isotropic mutations necessary to reduce the approximation error in the search space, where the focus is on how the number of optimization steps depends on (and scales with) the dimensionality of the search space. These lower bounds hold independently of the function to be optimized and for large classes of ESs. Moreover, for several concrete optimization scenarios in which certain ESs optimize a unimodal function, upper bounds on the number of optimization steps are proved.
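    Isotropic mutation has a one-line implementation: add a standard Gaussian vector scaled by a step size sigma, since the multivariate standard normal distribution is spherically symmetric. The sketch below embeds it in a (1+1) ES with the classical 1/5-success rule for step-size adaptation (one common choice for illustration; the dissertation's lower bounds hold for much larger classes of ESs).

      import random

      def isotropic_mutation(x, sigma):
          # Spherically symmetric offset: scaled standard Gaussian vector.
          return [xi + sigma * random.gauss(0.0, 1.0) for xi in x]

      def one_plus_one_es(f, x, sigma=1.0, steps=5000):
          fx, successes = f(x), 0
          for t in range(1, steps + 1):
              y = isotropic_mutation(x, sigma)
              fy = f(y)
              if fy < fx:                      # minimization
                  x, fx, successes = y, fy, successes + 1
              if t % 50 == 0:                  # 1/5 rule: grow sigma if more
                  sigma *= 1.22 if successes > 10 else 0.82  # than 1/5 succeed
                  successes = 0
          return x, fx

      sphere = lambda x: sum(xi * xi for xi in x)
      print(one_plus_one_es(sphere, [5.0] * 10)[1])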
  • Item
    Algorithmik und Komplexität OBDD-repräsentierter Graphen
    (2006-12-07) Sawitzki, Daniel; Wegener, Ingo; Sauerhoff, Martin
    Ordered Binary Decision Diagrams (OBDDs) are successfully used as a data structure for the compact representation of Boolean functions in many practical application areas. Very large graphs, too, are often represented implicitly by Boolean functions and OBDDs in areas such as CAD and model checking. This dissertation deals with fundamental graph-theoretic problems on OBDD-represented graphs and explores the possibilities of corresponding OBDD-based algorithms. On the one hand, new algorithms are presented and their properties are analyzed with respect to the design goal of sublinear heuristics. On the other hand, limits of the approach are demonstrated through complexity-theoretic hardness results and concrete lower bounds.