Query evaluation revised: parallel, distributed, via rewritings

dc.contributor.advisorSchwentick, Thomas
dc.contributor.authorSpinrath, Christopher
dc.contributor.refereePichler, Reinhard
dc.date.accepted2024-01-29
dc.date.accessioned2024-03-08T09:58:24Z
dc.date.available2024-03-08T09:58:24Z
dc.date.issued2024
dc.description.abstractThis is a thesis on query evaluation in parallel and distributed settings, and structurally simple rewritings. It consists of three parts. In the first part, we investigate the efficiency of constant-time parallel evaluation algorithms. That is, the number of required processors or, asymptotically equivalent, the work required to evaluate queries in constant time. It is known that relational algebra queries can be evaluated in constant time. However, work-efficiency has not been a focus, and indeed known evaluation algorithms yield huge (polynomial) work bounds. We establish work-efficient constant-time algorithms for several query classes: (free-connex) acyclic, semi-join algebra, and natural join queries; the latter in the worst-case framework. The second part is about deciding parallel-correctness of distributed evaluation strategies: Given a query and policies specifying how data is distributed and communicated among multiple servers, does the distributed evaluation yield the same result as the classical evaluation, for every database? Ketsman et al. proved that parallel-correctness for Datalog is undecidable; by reduction from the undecidable containment problem for Datalog. We show that parallel-correctness is already undecidable for monadic and frontier-guarded Datalog queries, for which containment is decidable. However, deciding parallel-correctness for frontier-guarded Datalog and constraint-based communication policies satisfying a certain property is 2ExpTime-complete. Furthermore, we obtain the same bounds for the parallel-boundedness problem, which asks whether the number of required communication rounds is bounded, over all databases. The third part is about structurally simple rewritings. The (classical) rewriting problem asks whether, for a given query and a set of views, there is a query, called rewriting, over the views that is equivalent to the given query. We study the variant of this problem for (subclasses of) conjunctive queries and views that asks for a structurally simple rewriting. We prove that, if the given query is acyclic, an acyclic rewriting exists if there is any rewriting at all. Analogous statements hold for free-connex acyclic, hierarchical, and q-hierarchical queries. Furthermore, we prove that the problem is NP-hard, even if the given query and the views are acyclic or hierarchical. It becomes tractable if the views are free-connex acyclic or q-hierarchical (and the arity of the database schema is bounded).de
dc.identifier.urihttp://hdl.handle.net/2003/42383
dc.identifier.urihttp://dx.doi.org/10.17877/DE290R-24219
dc.language.isoende
dc.rights.urihttps://creativecommons.org/licenses/by-nc-sa/4.0/de
dc.subjectQuery evaluationde
dc.subjectDatalogde
dc.subjectAcyclic conjunctive queriesde
dc.subjectFree-connex queriesde
dc.subjectHierarchical queriesde
dc.subjectPRAMde
dc.subjectMassively parallel communicationde
dc.subjectDistributed databasesde
dc.subjectParallel evaluationde
dc.subjectParallel-correctnessde
dc.subjectWork-efficientde
dc.subjectRewritingde
dc.subjectAcyclic rewritingde
dc.subjectComplexityde
dc.subjectNP-hardnessde
dc.subject.ddc004
dc.subject.rswkVerteiltes Datenbanksystemde
dc.subject.rswkKomplexitätde
dc.subject.rswkParallelisierungde
dc.subject.rswkAbfrageverarbeitungde
dc.subject.rswkLeistungsbewertungde
dc.titleQuery evaluation revised: parallel, distributed, via rewritingsde
dc.typeTextde
dc.type.publicationtypePhDThesisde
dcterms.accessRightsopen access
eldorado.secondarypublicationfalsede

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
2024-02-07-dissertation-spinrath-pub.pdf
Size:
1.86 MB
Format:
Adobe Portable Document Format
Description:
DNB
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
4.85 KB
Format:
Item-specific license agreed upon to submission
Description: