Reasoning about distributed relational data and query evaluation

Geck, Gaetano

Reasoning about distributed relational data and query evaluation

Dateien

Diss-GaetanoGeck.pdf (1.17 MB)

Datum

2019

Autor:innen

Geck, Gaetano

Zusammenfassung

Large data sets are often stored distributedly to increase the reliability of systems and the efficiency of query evaluation in them. While some query operators -- like selections and projections -- are intrinsically conform with parallel evaluation, others -- like joins -- demand specific distribution patterns. For relational databases, a common approach to evaluate queries in parallel relies on the use of rather simple distribution patterns for binary joins and the computation of the query result according to some query plan, operator by operator. Often, this requires the redistribution of large intermediate results (possibly larger than the input and/or output) and thus may lead to unnecessary long processing times. Thus, especially in the last decade, more elaborate distribution patterns that depend on the whole query have been studied and shown to allow more efficient query evaluation in several cases by reducing the amount of communication between servers. Ameloot et al. have described a setting where query evaluation is studied for a broad range of distribution patterns. Their work focuses on problems to decide whether a query can be evaluated correctly under a given distribution pattern. More particularly, they have considered two problems: "parallel correctness", where the pattern is specified explicitly, and "parallel-correctness transfer", where the pattern is known to be appropriate for another query. This thesis comprises the author's contributions to the complexity-theoretical investigation of these problems for conjunctive queries (and extensions thereof). These contributions complement the main characterisations and some additional complexity results by Ameloot et al. Furthermore, this thesis contains some new characterisations for "polarised" queries. Via the characterisations, parallel correctness and parallel-correctness transfer can be translated into questions on the co-occurrences of certain facts, induced by the query, on some server. Such questions and others can be modelled by "distribution dependencies", a variant of the well-known tuple- and equality-generating dependencies. Modelling via these constraints allows a more general description of distribution patterns in distributed relational data. The third contribution of this thesis is the study of the implication problem for distribution dependencies, providing lower and upper bounds for some fragments.

Schlagwörter

Distributed database, Query evaluation, Distribution policy, Parallel correctness, Parallel soundness, Parallel completeness, Parallel-correctness transfer, Containment, Implication, Reasoning, Static analysis

Schlagwörter nach RSWK

Verteiltes Datenbanksystem, Abfrageverarbeitung, Verteilungsverfahren, Korrektheit, Vollständigkeit, Statische Analyse, Schlussfolgern

URI

http://hdl.handle.net/2003/39813
http://dx.doi.org/10.17877/DE290R-21704

Sammlungen

LS 01 Logik in der Informatik

Komplettanzeige

Reasoning about distributed relational data and query evaluation

Dateien

Datum

Autor:innen

Zeitschriftentitel

ISSN der Zeitschrift

Bandtitel

Verlag

Sonstige Titel

Zusammenfassung

Beschreibung

Inhaltsverzeichnis

Schlagwörter

Schlagwörter nach RSWK

Zitierform

URI

Sammlungen

Befürwortung

Review

Ergänzt durch

Referenziert von