Reasoning about distributed relational data and query evaluation

dc.contributor.advisorSchwentick, Thomas
dc.contributor.authorGeck, Gaetano
dc.contributor.refereeSegoufin, Luc
dc.date.accepted2019-11-19
dc.date.accessioned2020-11-11T10:22:02Z
dc.date.available2020-11-11T10:22:02Z
dc.date.issued2019
dc.description.abstractLarge data sets are often stored distributedly to increase the reliability of systems and the efficiency of query evaluation in them. While some query operators -- like selections and projections -- are intrinsically conform with parallel evaluation, others -- like joins -- demand specific distribution patterns. For relational databases, a common approach to evaluate queries in parallel relies on the use of rather simple distribution patterns for binary joins and the computation of the query result according to some query plan, operator by operator. Often, this requires the redistribution of large intermediate results (possibly larger than the input and/or output) and thus may lead to unnecessary long processing times. Thus, especially in the last decade, more elaborate distribution patterns that depend on the whole query have been studied and shown to allow more efficient query evaluation in several cases by reducing the amount of communication between servers. Ameloot et al. have described a setting where query evaluation is studied for a broad range of distribution patterns. Their work focuses on problems to decide whether a query can be evaluated correctly under a given distribution pattern. More particularly, they have considered two problems: "parallel correctness", where the pattern is specified explicitly, and "parallel-correctness transfer", where the pattern is known to be appropriate for another query. This thesis comprises the author's contributions to the complexity-theoretical investigation of these problems for conjunctive queries (and extensions thereof). These contributions complement the main characterisations and some additional complexity results by Ameloot et al. Furthermore, this thesis contains some new characterisations for "polarised" queries. Via the characterisations, parallel correctness and parallel-correctness transfer can be translated into questions on the co-occurrences of certain facts, induced by the query, on some server. Such questions and others can be modelled by "distribution dependencies", a variant of the well-known tuple- and equality-generating dependencies. Modelling via these constraints allows a more general description of distribution patterns in distributed relational data. The third contribution of this thesis is the study of the implication problem for distribution dependencies, providing lower and upper bounds for some fragments.en
dc.identifier.urihttp://hdl.handle.net/2003/39813
dc.identifier.urihttp://dx.doi.org/10.17877/DE290R-21704
dc.language.isoende
dc.subjectDistributed databaseen
dc.subjectQuery evaluationen
dc.subjectDistribution policyen
dc.subjectParallel correctnessen
dc.subjectParallel soundnessen
dc.subjectParallel completenessen
dc.subjectParallel-correctness transferen
dc.subjectContainmenten
dc.subjectImplicationen
dc.subjectReasoningen
dc.subjectStatic analysisen
dc.subject.ddc004
dc.subject.rswkVerteiltes Datenbanksystemde
dc.subject.rswkAbfrageverarbeitungde
dc.subject.rswkVerteilungsverfahrende
dc.subject.rswkKorrektheitde
dc.subject.rswkVollständigkeitde
dc.subject.rswkStatische Analysede
dc.subject.rswkSchlussfolgernde
dc.titleReasoning about distributed relational data and query evaluationen
dc.typeTextde
dc.type.publicationtypedoctoralThesisde
dcterms.accessRightsopen access
eldorado.secondarypublicationfalsede

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Diss-GaetanoGeck.pdf
Size:
1.17 MB
Format:
Adobe Portable Document Format
Description:
DNB
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
4.85 KB
Format:
Item-specific license agreed upon to submission
Description: