Statistical methods for complex data structures: change detection and goodness-of-fit testing for stochastic networks and an online monitoring framework for text data

dc.contributor.advisor: Jentsch, Carsten
dc.contributor.author: Flossdorf, Jonathan
dc.contributor.referee: Kreiss, Alexander
dc.contributor.referee: Fried, Roland
dc.date.accepted: 2024-12-11
dc.date.accessioned: 2025-01-08T12:44:53Z
dc.date.available: 2025-01-08T12:44:53Z
dc.date.issued: 2024
dc.description.abstract: Not only is the amount of data growing, but so is its complexity, which leads to more sophisticated types of data. A popular example is the representation of given structures as networks. Because network data, even in their simplest form, contain considerably more information than traditional data, they are methodologically challenging to handle and analyse, even more so as sample sizes increase or covariates are added. This cumulative dissertation addresses the challenges of descriptive and inferential statistics in the context of complex and high-dimensional structures, with a main focus on stochastic networks. Particular attention is paid to maintaining flexibility in practical application, i.e. imposing as few restrictions on the network structure as possible. As a first step, the peculiarities of network data and the question of how to characterise differences between given networks are considered, leading to a categorisation of potential change types in network structure. Further, the suitability of different network metrics, which characterise a network by a scalar value, is investigated in a comprehensive framework. These findings are then embedded in a change point detection scheme for dynamic networks. In a second work, these findings are extended to a multivariate setup where different metrics are used simultaneously to extract as much information as possible from the dynamic network. In this context, the interplay between multivariate metric sets and different types of parametric and non-parametric control charts is extensively discussed. Third, a novel class of goodness-of-fit tests is introduced, which includes a wide range of test statistics and makes it possible to decide whether an underlying network is generated by a homogeneous Erdős-Rényi model, which is typically used as a benchmark model due to its simplicity and ease of use, or by a more sophisticated alternative.
In the later stages of this thesis, attention turns to textual data, which, due to its complexity, poses similar challenges. A change detection scheme is developed that automatically monitors the evolution of topics identified by a dynamic topic modelling approach. To increase the flexibility and applicability of this procedure, it is further embedded in a rich visualisation scheme that gives the user advanced options for interpretation and deeper analysis. (en)
dc.identifier.uri: http://hdl.handle.net/2003/43290
dc.identifier.uri: http://dx.doi.org/10.17877/DE290R-25122
dc.language.iso: en
dc.subject: Change detection (en)
dc.subject: Complex data (en)
dc.subject: Dynamic networks (en)
dc.subject: Inferential statistics (en)
dc.subject: Statistical networks (en)
dc.subject: Text data (en)
dc.subject.ddc: 310
dc.subject.rswk: Dynamisches Netzwerk (de)
dc.subject.rswk: Inferenzstatistik (de)
dc.subject.rswk: Hochdimensionale Daten (de)
dc.subject.rswk: Text Mining (de)
dc.title: Statistical methods for complex data structures: change detection and goodness-of-fit testing for stochastic networks and an online monitoring framework for text data (en)
dc.type: Text
dc.type.publicationtype: PhDThesis
dcterms.accessRights: open access
eldorado.secondarypublication: false

Files

Original bundle
Name: Dissertation_Flossdorf.pdf
Size: 6.08 MB
Format: Adobe Portable Document Format
Description: DNB
License bundle
Name: license.txt
Size: 4.82 KB
Format: Item-specific license agreed upon to submission