Eldorado Community:
http://hdl.handle.net/2003/9
2018-12-12T15:52:04Z
A similarity measure for second order properties of non-stationary functional time series with applications to clustering and testing
http://hdl.handle.net/2003/37828
Title: A similarity measure for second order properties of non-stationary functional time series with applications to clustering and testing
Authors: van Delft, Anne; Dette, Holger
Abstract: Due to the surge of data storage techniques, the development of appropriate techniques to identify patterns and to extract knowledge from the resulting enormous data sets, which can be viewed as collections of dependent functional data, is of increasing interest in many scientific areas. We develop a similarity measure for spectral density operators of a collection of functional time series, which is based on the aggregation of Hilbert-Schmidt differences of the individual time-varying spectral density operators. Under fairly general conditions, the asymptotic properties of the corresponding estimator are derived and asymptotic normality is established. The introduced statistic lends itself naturally to quantifying (dis)-similarity between functional time series, which we subsequently exploit in order to build a spectral clustering algorithm. Our algorithm is the first of its kind in the analysis of non-stationary (functional) time series and makes it possible to discover particular patterns by grouping together ‘similar’ series into clusters, thereby reducing the complexity of the analysis considerably. The algorithm is simple to implement and computationally feasible. As a further application we provide a simple test for the hypothesis that the second order properties of two non-stationary functional time series coincide.
2018-12-04T08:35:05Z
Aliasing effects for random fields over spheres of arbitrary dimension
http://hdl.handle.net/2003/37827
Title: Aliasing effects for random fields over spheres of arbitrary dimension
Authors: Durastanti, Claudio; Patschkowski, Tim
Abstract: In this paper, aliasing effects are investigated for random fields defined on the d-dimensional
sphere S^d, and reconstructed from discrete samples. First, we introduce the concept of an aliasing function
on S^d. The aliasing function makes it possible to identify explicitly the aliases of a given harmonic coefficient in
the Fourier decomposition. Then, we exploit this tool to establish the aliases of the harmonic coefficients approximated by means of the quadrature procedure named spherical uniform sampling. Subsequently, we
study the consequences of the aliasing errors in the approximation of the angular power spectrum of an isotropic random field, the harmonic decomposition of its covariance function. Finally, we show that band-limited
random fields are alias-free, under the assumption of a sufficiently large number of nodes in the quadrature rule.
2018-12-04T08:32:55Z
Increased market transparency in Germany’s gasoline market: The death of rockets and feathers?
http://hdl.handle.net/2003/37826
Title: Increased market transparency in Germany’s gasoline market: The death of rockets and feathers?
Authors: Frondel, Manuel; Horvath, Marco; Vance, Colin; Kihm, Alexander
Abstract: Drawing on a consumer search model and a unique panel data set of daily
fuel prices covering over 5,000 fuel stations in Germany, this paper documents a
change in the price setting behavior of retail gas stations following the introduction of
a legally mandated on-line price portal. Prior to the introduction of the portal in 2013,
positive asymmetry is found on the basis of error correction models, with prices following
the “rockets and feathers” pattern documented in many commodity markets,
particularly in retail markets for fuels. In the aftermath of the portal’s introduction, by
contrast, negative asymmetry is observed: fuel price decreases in response to refinery
price decreases are stronger than fuel price increases due to refinery price increases.
This reversal in price pass-through, which is found among both branded and unbranded
stations, suggests welfare gains for consumers from increased market transparency.
2018-12-04T08:30:36Z
Statistical analysis of the lifetime of diamond impregnated tools for core drilling of concrete
http://hdl.handle.net/2003/37814
Title: Statistical analysis of the lifetime of diamond impregnated tools for core drilling of concrete
Authors: Malevich, Nadja; Müller, Christine H.; Kansteiner, Michael; Biermann, Dirk; Ferreira, Manuel; Tillmann, Wolfgang
Abstract: The lifetime of diamond impregnated tools for core drilling of concrete
is studied via the lifetimes of the single diamonds on the tool. Thereby, the number
of visible and active diamonds on the tool surface is determined by microscopical
inspections of the tool at given points in time. This leads to interval-censored lifetime
data if only the diamonds visible at the beginning are considered. If also the
lifetimes of diamonds appearing during the drilling process are included then the
lifetimes are doubly interval-censored. A statistical method is presented to analyse
the interval-censored data as well as the doubly interval-censored data. The method
is applied to three series of experiments which differ in the size of the diamonds
and the type of concrete. It turns out that the lifetimes of small diamonds used for
drilling into conventional concrete are much shorter than the lifetimes when using
large diamonds or high strength concrete.
2018-11-27T11:46:56Z
Detection of anomalous sequences in crack data of a bridge monitoring
http://hdl.handle.net/2003/37813
Title: Detection of anomalous sequences in crack data of a bridge monitoring
Authors: Abbas, Sermad; Fried, Roland; Heinrich, Jens; Horn, Melanie; Jakubzik, Mirko; Kohlenbach, Johanna; Maurer, Reinhard; Michels, Anne; Müller, Christine H.
Abstract: For estimating the remaining lifetime of old prestressed concrete bridges,
a monitoring of crack widths can be used. However, the time series of crack widths
show a strong variation mainly caused by temperature and traffic. Additionally, sequences
with extreme volatility appear where the cause is unknown. They are called
anomalous sequences in the following. We present and compare four methods which
aim to detect these anomalous sequences in the time series. Volatilities caused by
traffic should not be detected.
2018-11-27T11:45:06Z
Multiscale change point detection for dependent data
http://hdl.handle.net/2003/37806
Title: Multiscale change point detection for dependent data
Authors: Dette, Holger; Schüler, Theresa; Vetter, Mathias
Abstract: In this paper we study the theoretical properties of the simultaneous multiscale change
point estimator (SMUCE) proposed by Frick et al. (2014) in regression models with dependent
error processes. Empirical studies show that in this case the change point estimate
is inconsistent, but it is not known if alternatives suggested in the literature for correlated
data are consistent. We propose a modification of SMUCE scaling the basic statistic by
the long run variance of the error process, which is estimated by a difference-type variance
estimator calculated from local means from different blocks. For this modification we prove
model consistency for physical dependent error processes and illustrate the finite sample
performance by means of a simulation study.
2018-11-16T13:21:21Z
Panel cointegrating polynomial regressions: Group-mean fully modified OLS estimation and inference
http://hdl.handle.net/2003/37669
Title: Panel cointegrating polynomial regressions: Group-mean fully modified OLS estimation and inference
Authors: Wagner, Martin; Reichold, Karsten
Abstract: This paper considers group-mean fully modified OLS estimation for a panel of cointegrating
polynomial regressions, i.e., regressions that include an integrated process and its powers as
explanatory variables. The stationary errors are allowed to be serially correlated, the regressor
to be endogenous and, as usual in the nonstationary panel literature, we include individual
specific fixed effects. We consider a fixed cross-section dimension, asymptotics in the time
dimension only and show that the estimator allows for standard asymptotic inference in this
setting. Both in the simulations and in an illustrative application estimating environmental
Kuznets curves for carbon dioxide emissions we compare our group-mean estimator with the
pooled fully modified OLS estimator of de Jong and Wagner (2018).
2018-11-13T12:32:37Z
Consistency for the negative binomial regression with fixed covariate
http://hdl.handle.net/2003/37352
Title: Consistency for the negative binomial regression with fixed covariate
Authors: Weißbach, Rafael; Radloff, Lucas
Abstract: We model an overdispersed count as a dependent measurement, by means of
the Negative Binomial distribution. We consider quantitative regressors that
are fixed by design. The expectation of the dependent variable is assumed to
be a known function of a linear combination involving regressors and their coefficients. In the NB1-parametrization of the negative binomial distribution,
the variance is a linear function of the expectation, inflated by the dispersion
parameter, and the model is not a generalized linear model. We apply a general result of
Bradley and Gart (1962) to derive weak consistency and asymptotic normality of the maximum likelihood estimator for all parameters. To this end, we
show (i) how to bound the logarithmic density by a function that is linear
in the outcome of the dependent variable, independently of the parameter.
Furthermore (ii) the positive definiteness of the matrix related to the Fisher
information is shown with the Cauchy-Schwarz inequality.
2018-10-31T13:29:33Z
Using the extremal index for value-at-risk backtesting
http://hdl.handle.net/2003/37201
Title: Using the extremal index for value-at-risk backtesting
Authors: Bücher, Axel; Posch, Peter N.; Schmidtke, Philipp
Abstract: We introduce a set of new Value-at-Risk independence backtests by establishing a
connection between the independence property of Value-at-Risk forecasts and the
extremal index, a general measure of extremal clustering of stationary sequences.
We introduce a sequence of relative excess returns whose extremal index has to
be estimated. We compare our backtest to both popular and recent competitors
using Monte-Carlo simulations and find considerable power in many scenarios.
In an applied section we perform realistic out-of-sample forecasts with common
forecasting models and discuss advantages and pitfalls of our approach.
2018-10-19T14:45:07Z
Switching to green electricity: Spillover effects on household consumption
http://hdl.handle.net/2003/37200
Title: Switching to green electricity: Spillover effects on household consumption
Authors: Sommer, Stephan
Abstract: One way to reduce emissions from the consumption of electricity is switching to
green electricity suppliers. This paper identifies the determinants of adopting green electricity
and the effect on electricity consumption, using panel data on more than 9,000
households. To control for potential self-selection into green electricity tariffs, an endogenous
dummy treatment effects model is estimated. The results suggest that wealthier
and better-educated households are more likely to adopt green electricity. Moreover, we
find that switching to green electricity decreases electricity consumption and households
supplied by green electricity are less price-responsive. Consequently, enforcing higher
prices for conventional electricity might prove effective in reducing both greenhouse gas
emissions and electricity consumption at the household level.
2018-10-19T14:43:05Z
RISE Germany Internship: Applying Deep Learning Methods to the Search for Astrophysical Tau Neutrinos
http://hdl.handle.net/2003/37190
Title: RISE Germany Internship: Applying Deep Learning Methods to the Search for Astrophysical Tau Neutrinos
Authors: Martin, William
2018-10-12T12:28:22Z
Feature Selection for High-Dimensional Data with RapidMiner
http://hdl.handle.net/2003/37189
Title: Feature Selection for High-Dimensional Data with RapidMiner
Authors: Sangkyun, Lee; Schowe, Benjamin; Sivakumar, Viswanath; Morik, Katharina
Abstract: Feature selection is an important task in machine learning, reducing dimensionality of learning problems by selecting few relevant features without losing too much information. Focusing on smaller sets of features, we can learn simpler models from data that are easier to understand and to apply. In fact, simpler models are more robust to input noise and outliers, often leading to better prediction performance than the models trained in higher dimensions with all features. We implement several feature selection algorithms in an extension of RapidMiner that scale well with the number of features compared to the existing feature selection operators in RapidMiner.
2018-10-12T12:25:02Z
Energy-Efficient GPS-Based Positioning in the Android Operating System
http://hdl.handle.net/2003/37188
Title: Energy-Efficient GPS-Based Positioning in the Android Operating System
Authors: Streicher, Jochen; Spincyk, Olaf
Abstract: We present our ongoing collaborative work on EnDroid, an energy-efficient GPS-based positioning system for the Android Operating System. EnDroid is based on the EnTracked positioning system, developed at the University of Aarhus, Denmark. We describe the current prototypical state of our implementation and present our experiences and conclusions from preliminarily evaluating EnDroid on the Google Nexus One Smartphone. Although the preliminary results seem to support the approach, there are still several open questions, both at the application interface and at the hardware management level.
2018-10-12T12:23:41Z
Probabilistic Graphical Models in RapidMiner
http://hdl.handle.net/2003/37187
Title: Probabilistic Graphical Models in RapidMiner
Authors: Piatkowski, Nico
Abstract: This report describes the technical background and usage of the GraphMod plug-in for RapidMiner. The plug-in enables RapidMiner to load factor graphs and interpret Label and Attributes which are contained in an Example as assignments to random variables. A set of examples which belong to the same Batch is treated as an assignment to a whole factor graph. New operators allow the estimation of factor weights, the computation of the single-node marginal probability functions and the computation of the most probable assignment for each Labelnode with several methods. All algorithms are optimized for parallel execution on common multi-core processors and NVIDIA CUDA capable many-core processors (also known as Graphics Processing Units).
2018-10-12T12:22:11Z
Technical report for Collaborative Research Center SFB 876 - Graduate School
http://hdl.handle.net/2003/37186
Title: Technical report for Collaborative Research Center SFB 876 - Graduate School
Authors: Morik, Katharina; Rhode, Wolfgang
2018-10-12T09:18:51Z
Computing on High Performance Clusters with R: Packages BatchJobs and BatchExperiments
http://hdl.handle.net/2003/37185
Title: Computing on High Performance Clusters with R: Packages BatchJobs and BatchExperiments
Authors: Bischl, Bernd; Lang, Michel; Mersmann, Olaf; Rahnenführer, Jörg; Weihs, Claus
Abstract: Empirical analysis of statistical algorithms often demands time-consuming experiments which are best performed on high performance computing clusters. We present two R packages which greatly simplify working in batch computing environments. The package BatchJobs implements the basic objects and procedures to control a batch cluster within R. It is structured around cluster versions of the well-known higher order functions Map, Reduce and Filter from functional programming. An important feature is that the state of computation is persistently available in a database. The user can query the status of jobs and then continue working with a desired subset. The second package, BatchExperiments, is tailored for the still very general scenario of analyzing arbitrary algorithms on problem instances. It extends BatchJobs by letting the user define an array of jobs of the kind “apply algorithm A to problem instance P and store results”. It is possible to associate statistical designs with parameters of algorithms and problems and therefore to systematically study their influence on the results. In general our main contributions are: (a) Portability: Both packages use a clear and well-defined interface to the batch system which makes them applicable in most high-performance computing environments. (b) Reproducibility: Every computational part has an associated seed that the user can control to ensure reproducibility even when the underlying batch system changes. (c) Efficiency: Efficiently use batch computing clusters completely within R. (d) Abstraction and good software design: The code layers for algorithms, experiment definitions and execution are cleanly separated and enable the writing of readable and maintainable code.
2018-10-12T09:16:55Z
Technical report for Collaborative Research Center SFB 876 - Graduate School
http://hdl.handle.net/2003/37184
Title: Technical report for Collaborative Research Center SFB 876 - Graduate School
Authors: Morik, Katharina; Rhode, Wolfgang
2018-10-12T09:14:26Z
Optimization plugin for RapidMiner
http://hdl.handle.net/2003/37183
Title: Optimization plugin for RapidMiner
Authors: Umaashankar, Venkatesh; Sangkyun, Lee
Abstract: Optimization in general means selecting a best choice out of various alternatives, which reduces the cost or disadvantage of an objective. Optimization problems are very popular in fields such as economics, finance, logistics, etc. Optimization is a science of its own and machine learning or data mining is a diverse growing field which applies techniques from various other areas to find useful insights from data. Many machine learning problems can be modelled and solved as optimization problems, which means optimization already provides a set of well established methods and algorithms to solve machine learning problems. Due to the importance of optimization in machine learning, in recent times, machine learning researchers are contributing remarkable improvements in the field of optimization. We implement several popular optimization strategies and algorithms as a plugin for RapidMiner, which adds an optimization tool kit to the existing arsenal of operators in RapidMiner.
2018-10-12T09:12:51Z
The Streams Framework
http://hdl.handle.net/2003/37182
Title: The Streams Framework
Authors: Bockermann, Christian; Blom, Hendrik
Abstract: In this report, we present the streams library, a generic Java-based library for designing data stream processes. The streams library defines a simple abstraction layer for data processing and provides a small set of online algorithms for counting and classification. Moreover it integrates existing libraries such as MOA. Processes are defined in XML files following the semantics and ideas of well established tools like Ant, Maven or the Spring Framework. The streams library can be easily embedded into existing software, used as a standalone tool or be used to define compute graphs that are executed on other back end systems such as the Stormstream engine. This report reflects the status of the streams framework in version 0.9.6. As the framework is continuously enhanced, the report is extended along with it. The most recent version of this report is available online.
2018-10-12T09:11:13Z
Measuring the Power Consumption of Smartphones
http://hdl.handle.net/2003/37181
Title: Measuring the Power Consumption of Smartphones
Authors: Manning-Dahan, Tyler; Putzke, Markus; Wietfeld, Christian
Abstract: Smartphones are becoming a part of everyday life and as such, a better understanding of hardware and software power consumption is crucial to develop more efficient smartphones. In order to extend battery life, application developers and phone designers must become aware of the limitations of a phone’s CPU power, as well as the LCD display consumption and connectivity via WiFi, 3G, and GPS systems. We present power consumption measurements of an HTC Incredible S and compare these results to known analytical models. The evaluation shows that power consumption varies considerably between different types of smartphones and that well-known models underestimate the actual consumption. The results illustrate that touching the screen nearly doubles the power consumption, which is not captured by any analytical model. Moreover, we show how the transmitted packet size of WiFi and cellular communications affects the power consumption.
2018-10-12T09:08:46Z
Unimodal regression using Bernstein-Schoenberg-splines and penalties
http://hdl.handle.net/2003/37180
Title: Unimodal regression using Bernstein-Schoenberg-splines and penalties
Authors: Köllmann, Claudia; Bornkamp, Björn; Ickstadt, Katja
Abstract: Research in the field of nonparametric shape constrained regression has been intensive. However, only a few publications explicitly deal with unimodality although there is need for such methods in applications, for example, in dose-response analysis. In this paper we propose unimodal spline regression methods that make use of Bernstein-Schoenberg-splines and their shape preservation property. To achieve unimodal and smooth solutions we use penalized splines, and extend the penalized spline approach towards penalizing against general parametric functions, instead of using just difference penalties. For tuning parameter selection under a unimodality constraint a restricted maximum likelihood and an alternative Bayesian approach for unimodal regression are developed. We compare the proposed methodologies to other common approaches in a simulation study and apply them to a dose-response data set. All results suggest that the unimodality constraint or the combination of unimodality and a penalty can substantially improve estimation of the functional relationship.
2018-10-12T09:07:11Z
Preserving Confidentiality in Multiagent Systems - An Internship Project within the DAAD RISE Program
http://hdl.handle.net/2003/37179
Title: Preserving Confidentiality in Multiagent Systems - An Internship Project within the DAAD RISE Program
Authors: Dilger, Daniel; Krümpelmann, Patrick; Tadros, Cornelia
Abstract: RISE (Research Internships in Science and Engineering) is a summer internship program for undergraduate students from the United States, Canada and the UK organized by the DAAD (Deutscher Akademischer Austausch Dienst). Within the project A5 in the Collaborative Research Center SFB 876, we have planned and conducted an internship project in the RISE program that should support our research. Daniel Dilger was the intern and has been supervised by the PhD students Patrick Krümpelmann and Cornelia Tadros. The aim was to model an application scenario for our prototype implementation of a confidentiality preserving multiagent system and to run experiments with that prototype.
2018-10-12T09:05:30Z
Technical report for Collaborative Research Center SFB 876 - Graduate School
http://hdl.handle.net/2003/37178
Title: Technical report for Collaborative Research Center SFB 876 - Graduate School
Authors: Morik, Katharina; Rhode, Wolfgang
2018-10-12T08:47:15Z
RobPer: An R Package to Calculate Periodograms for Light Curves Based On Robust Regression
http://hdl.handle.net/2003/37177
Title: RobPer: An R Package to Calculate Periodograms for Light Curves Based On Robust Regression
Authors: Thieler, Anita Monika; Fried, Roland; Rathjens, Jonathan
Abstract: An important task in astroparticle physics is the detection of periodicities in irregularly sampled time series, called light curves. The classic Fourier periodogram cannot deal with irregular sampling and with the measurement accuracies that are typically given for each observation of a light curve. Hence, methods to fit periodic functions using weighted regression were developed in the past to calculate periodograms. We present the R Package RobPer which allows combining different periodic functions and regression techniques to calculate periodograms. Possible regression techniques are least squares, least absolute deviation, least trimmed, M-, S- and τ-regression. Measurement accuracies can be taken into account by including weights. Our periodogram function covers most of the approaches that have been tried earlier and provides new model-regression-combinations that have not been used before. To detect valid periods, we apply an outlier search on the periodogram instead of using fixed critical values that are theoretically only justified in case of least squares regression, independent periodogram bars and a null hypothesis allowing only normal white noise. This outlier search can be performed using RobPer as well. Finally, the package also includes a generator for artificial light curves, e.g., for simulation studies.
2018-10-12T08:44:24Z
Preprocessing of Affymetrix Exon Expression Arrays
http://hdl.handle.net/2003/37176
Title: Preprocessing of Affymetrix Exon Expression Arrays
Authors: Sangkyun, Lee; Schramm, Alexander
Abstract: The activity of genes can be captured by measuring the amount of messenger RNAs transcribed from the genes, or from their subunits called exons. In our study, we use the Affymetrix Human Exon ST v1.0 micro arrays to measure the activity of exons in Neuroblastoma cancer patients. The purpose is to discover a small number of genes or exons that play important roles in differentiating high-risk patients from low-risk counterparts. Although the technology has been improved for the past 15 years, array measurements still can be contaminated by various factors, including human error. Since the number of arrays is often only a few hundred, atypical errors can hardly be canceled by large numbers of normal arrays. In this article we describe how we filter out low-quality arrays in a principled way, so that we can obtain more reliable results in downstream analyses.
2018-10-12T08:39:44Z
A Survey of the Stream Processing Landscape
http://hdl.handle.net/2003/37175
Title: A Survey of the Stream Processing Landscape
Authors: Bockermann, Christian
Abstract: The continuous processing of streaming data has become an important aspect in many applications. Over the last years a variety of different streaming platforms has been developed and a number of open source frameworks is available for the implementation of streaming applications. In this report, we will survey the landscape of existing streaming platforms. Starting with an overview of the evolving developments in the recent past, we will discuss the requirements of modern streaming architectures and present the ways these are approached by the different frameworks.
2018-10-12T08:38:07Z
Random projections for Bayesian regression
http://hdl.handle.net/2003/37174
Title: Random projections for Bayesian regression
Authors: Geppert, Leo N.; Ickstadt, Katja; Munteanu, Alexander; Sohler, Christian
Abstract: This article introduces random projections applied as a data reduction technique for Bayesian regression analysis. We show sufficient conditions under which the entire d-dimensional distribution is preserved under random projections by reducing the number of data points from n to k in O(poly(d/epsilon)) in the case n >> d. Under mild assumptions, we prove that evaluating a Gaussian likelihood function based on the projected data instead of the original data yields a (1 + O(epsilon))-approximation in the l_2-Wasserstein distance. Our main result states that the posterior distribution of a Bayesian linear regression is approximated up to a small error depending on only an epsilon-fraction of its defining parameters when using either improper non-informative priors or arbitrary Gaussian priors. Our empirical evaluations involve different simulated settings of Bayesian linear regression. Our experiments underline that the proposed method is able to recover the regression model while considerably reducing the total run-time.
2018-10-12T08:35:55Z
Ressourcenbeschränkte Analyse von Ionenmobilitätsspektren mit dem Raspberry Pi
http://hdl.handle.net/2003/37173
Title: Ressourcenbeschränkte Analyse von Ionenmobilitätsspektren mit dem Raspberry Pi (Resource-constrained analysis of ion mobility spectra with the Raspberry Pi)
Authors: Egorov, Alexey; König, Alexander; Köppen, Marcel; Kühn, Henning; Kullack, Isabell; Kuthe, Elias; Mitkovska, Suzana; Niehage, Robert; Pawelko, Andreas; Sträßer, Manuel; Striewe, Christian; D'Addario, Marianna; Kopczynski, Dominik; Rahmann, Sven
Abstract: The composition of ambient or exhaled air can provide a great deal of information that can help, for example, to identify a disease or its cause. The molecules of the substances contained in the air each have different sizes and shapes, so that it is possible to separate them from one another and to determine how frequently they occur from deflections in an air measurement. These deflections are called peaks. Their detection is the subject of current research. Applications of such measurements range from the medical monitoring of patients in hospital to checking the ambient air of particular areas.
2018-10-12T08:34:17Z
Technical report for Collaborative Research Center SFB 876 - Graduate School
http://hdl.handle.net/2003/37172
Title: Technical report for Collaborative Research Center SFB 876 - Graduate School
Authors: Morik, Katharina; Rhode, Wolfgang
2018-10-12T08:30:12Z
Demixing empirical distribution functions
http://hdl.handle.net/2003/37171
Title: Demixing empirical distribution functions
Authors: Munteanu, Alexander; Wornowizki, Max
Abstract: We consider the two-sample homogeneity problem where the information contained in two samples is used to test the equality of the underlying distributions. For instance, in cases where one sample stems from a simulation procedure modelling the data generating process of the other sample consisting of observed data, a mere rejection of the null hypothesis is unsatisfactory. Instead, the data analyst would like to know how the simulation can be improved while changing it as little as possible. Based on the popular Kolmogorov-Smirnov test and a general nonparametric mixture model, we propose an algorithm which determines an appropriate correction distribution function describing how the simulation procedure can be corrected. It is constructed in such a way that complementing the simulation sample by a given proportion of observations sampled from the correction distribution does not lead to a rejection of the null hypothesis of equal distributions when the modified and the observed sample are compared. We prove our algorithm to run in linear time and evaluate it on simulated and real spectrometry data showing that it leads to intuitive results. We illustrate its practical performance considering runtime as well as accuracy in a real world scenario.
2018-10-12T08:28:28Z
Data Modeling of Ubiquitous System Software
http://hdl.handle.net/2003/37170
Title: Data Modeling of Ubiquitous System Software
Authors: Streicher, Jochen
Abstract: The multitude of events and internal data structures in complex modern system software are an excellent target for data analysis. The tools to collect the data range from low-level tracing frameworks to more sophisticated ones with specialized data collection and processing languages. However, these lack information on the relationship between different data sources and between currently and already collected data. We describe a formal data model that captures the structure of data streams in the system software as well as the relationships between them.
2018-10-12T08:26:55Z
Beyond unimodal regression: modelling multimodality with piecewise unimodal, mixture or additive regression
http://hdl.handle.net/2003/37169
Title: Beyond unimodal regression: modelling multimodality with piecewise unimodal, mixture or additive regression
Authors: Köllmann, Claudia; Ickstadt, Katja; Fried, Roland
Abstract: Research in the field of nonparametric shape constrained regression has been extensive and there is a need for such methods in various application areas, since shape constraints can reflect prior knowledge about the underlying relationship. It is, for example, often natural that some intensity first increases and then decreases over time, which can be described by a unimodal shape constraint. But the prior knowledge in different applications is also of increasing complexity and data shapes may vary from few to plenty of modes and from piecewise unimodal to superpositions of unimodal function courses. Thus, we go beyond unimodal regression in this report and capture multimodality by employing piecewise unimodal regression, mixture regression or additive regression models. We give an overview of the statistical methods, namely the unimodal spline regression approach by and its aforementioned extensions for use with multimodal data. The usefulness of the methods is demonstrated by applying them to data sets from three different application areas: breath gas analysis, marine biology and astroparticle physics. Though the three application areas are quite different, the proposed extensions of unimodal regression yield very helpful results in each of them. This encourages using the methodologies proposed here in many other areas of application as well.2018-10-12T08:25:06ZLogistic Regression in Datastreams
http://hdl.handle.net/2003/37168
Title: Logistic Regression in Datastreams
Authors: Schwiegelshohn, Chris; Sohler, Christian
Abstract: Learning from data streams is a well-researched task both in theory and practice. As remarked by Clarkson, Hazan and Woodruff, many classification problems cannot be very well solved in a streaming setting. For previous model assumptions, there exist simple, yet highly artificial lower bounds prohibiting space-efficient one-pass algorithms. At the same time, several classification algorithms are often successfully used in practice. To overcome this gap, we give a model relaxing the constraints that previously made classification impossible from a theoretical point of view and under these model assumptions provide the first (1 + epsilon)-approximate algorithms for sketching the objective values of logistic regression and perceptron classifiers in data streams.2018-10-12T08:23:06ZUnderstanding Where Your Classifier Does (Not) Work - the SCaPE Model Class for Exceptional Model Mining
http://hdl.handle.net/2003/37167
Title: Understanding Where Your Classifier Does (Not) Work - the SCaPE Model Class for Exceptional Model Mining
Authors: Duivesteijn, Wouter; Thaele, Julia
Abstract: FACT, the First G-APD Cherenkov Telescope, detects air showers induced by high-energy cosmic particles. It is desirable to classify a shower as being induced by a gamma ray or a background particle. Generally, it is nontrivial to get any feedback on the real-life training task, but we can attempt to understand how our classifier works by investigating its performance on Monte Carlo simulated data. To this end, in this paper we develop the SCaPE (Soft Classifier Performance Evaluation) model class for Exceptional Model Mining, which is a Local Pattern Mining framework devoted to highlighting unusual interplay between multiple targets. In our Monte Carlo simulated data, we take as targets the computed classifier probabilities and the binary column containing the ground truth: which kind of particle induced the corresponding shower. Using a newly developed quality measure based on ranking loss, the SCaPE model class highlights subspaces of the search space where the classifier performs particularly well or poorly. These subspaces arrive in terms of conditions on attributes of the data, hence they come in a language a domain expert understands, which should aid them in understanding where their classifier does (not) work. Additional experiments are carried out on nine UCI datasets. Found subgroups highlight subspaces whose difficulty for classification is corroborated by astrophysical interpretation, as well as subspaces that warrant further investigation.2018-10-12T08:21:08ZAngerona - A Multiagent Framework for Logic Based Agents with Application to Secrecy Preservation
http://hdl.handle.net/2003/37166
Title: Angerona - A Multiagent Framework for Logic Based Agents with Application to Secrecy Preservation
Authors: Krümpelmann, Patrick; Janus, Tim; Kern-Isberner, Gabriele2018-10-11T13:54:58ZUntersuchungen zur Analyse von deutschsprachigen Textdaten
http://hdl.handle.net/2003/37165
Title: Untersuchungen zur Analyse von deutschsprachigen Textdaten
Authors: Morik, Katharina; Jung, Alexander; Weckwerth, Jan; Rötner, Stefan; Hess, Sybille; Buschjäger, Sebastian; Pfahler, Lukas2018-10-11T13:53:10ZTechnical report for Collaborative Research Center SFB 876 - Graduate School
http://hdl.handle.net/2003/37164
Title: Technical report for Collaborative Research Center SFB 876 - Graduate School
Authors: Morik, Katharina; Rhode, Wolfgang2018-10-11T13:50:51ZPerformance Analysis for Parallel R Programs: Towards Efficient Resource Utilization
http://hdl.handle.net/2003/37163
Title: Performance Analysis for Parallel R Programs: Towards Efficient Resource Utilization
Authors: Kotthaus, Helena; Korb, Ingo; Marwedel, Peter
Abstract: Parallel computing is becoming more and more popular, since R is increasingly used to process large data sets. We therefore have improved traceR to allow for profiling parallel applications as well. TraceR can be used for common cases like parallelization on multiple cores or parallelization on multiple machines. For the parallel performance analysis we added measurements like CPU utilization of parallel tasks and measurements for analyzing the memory usage of parallel programs during execution. With our parallel performance analysis we concentrate on applications that are embarrassingly parallel, consisting of independent tasks. One example application which is embarrassingly parallel and also has a high resource utilization is model selection. Here the goal is to find the best machine learning algorithm configuration for building a model for the given data. Therefore one has to search through a huge model space. Since the gain from parallel execution can be negated if the memory requirements of all parallel processes exceed the capacity of the system, our profiling data can serve as a constraint to determine the degree of parallelism and also to guide distribution of parallel R applications. Our goal is to provide a resource-aware parallelization strategy. To develop such a strategy we first need to analyze the performance of parallel applications. In the following we therefore will describe different parallel example applications and show how traceR is applied to analyze parallel R applications.2018-10-11T13:48:34ZData Reduction for CORSIKA
http://hdl.handle.net/2003/37162
Title: Data Reduction for CORSIKA
Authors: Baack, Dominik
Abstract: For the analysis of measured data by experiments, simulated Monte Carlo data is essential. It is used to test the understanding of the experiment, for separation of signal and background and for reconstruction of real physical properties from observable parameters. With increasing size of the experiments, more and more simulated data is needed. To optimize the simulation and to reduce the huge amount of calculation time needed, two different methods exist. The first method is the low-level optimization of the source code. The second one is the reduction of the actually needed Monte Carlo data. This report focuses on the cosmic ray simulation CORSIKA, which simulates cosmic ray induced particle showers within the atmosphere. In the case of CORSIKA, big parts of the program are already optimized. Additionally, parts of the source code are only accessible in binary form, so the first method of optimization is nearly impossible. Therefore the preferred method here is the reduction of unnecessarily generated data. This report presents a modified and extended internal structure for CORSIKA, which is shown in Figure 2. The modifications can be divided into two modules: Dynamic Stack and Remote Control. Both have complementary approaches to reduce the number of simulation cycles needed and provide an easy API for customizations without assuming any of the CORSIKA code or structure.2018-10-11T13:44:45ZRISE Germany Internship: Application of Data Mining Methods on IceCube Event Reconstructions
http://hdl.handle.net/2003/37161
Title: RISE Germany Internship: Application of Data Mining Methods on IceCube Event Reconstructions
Authors: Bhasin, Srishti; Börner, Mathis
Abstract: In this report the results from a 3-month internship are presented. The goal of the internship was to apply data mining methods to low level IceCube data in order to reconstruct the particle energies. IceCube is a neutrino observatory located at the geographical South Pole, built with the aim of detecting high energy astrophysical neutrinos. The detector consists of 5160 photomultipliers, located 1.5-2.5 kilometers beneath the icecap, which detect Cherenkov light radiated by charged particle propagation through the ice. The reconstruction of detected events directly at the pole is challenging, due to heavy constraints on resources. Due to this, only rudimentary reconstructions are performed on-site. The final results are obtained months later, once the data has been transported from the detector. An effective and prompt reconstruction directly at the pole would open a lot of new possibilities for follow-up studies of detected events. The application of state-of-the-art data mining methods can help to obtain these reconstructions on-site.2018-10-11T13:42:44ZOnline Gauß-Prozesse zur Regression auf FPGAs
http://hdl.handle.net/2003/37160
Title: Online Gauß-Prozesse zur Regression auf FPGAs
Authors: Buschjäger, Sebastian
Abstract: FPGAs can be used as a fast and energy-efficient execution platform, which, however, offers no runtime environment for file abstractions or peripheral access. For this reason, the surrounding system must be designed in addition to the actual implementation. This system design has changed considerably with the third generation of available tool support for FPGAs, which leads to differences from the existing literature. The design procedure for the current generation of FPGAs and tools is presented first, and a suitable runtime environment for machine learning algorithms on the FPGA is then designed on that basis. The aim is a system architecture that is as modular and energy-efficient as possible, so that the architecture presented here can be readily applied in embedded systems and, thanks to the modularity of the system, the machine learning algorithm can easily be exchanged. Subsequently, an exemplary implementation of a Gaussian process on the FPGA demonstrates the embedding into the overall system, with emphasis placed on the highest possible speed of the hardware implementation. To the author's knowledge, the realization of an energy-efficient system architecture for different machine learning algorithms is new, since in the existing literature a new system is designed for each different algorithm. Furthermore, to the author's knowledge, the implementation of Gaussian processes on FPGAs is also new, which yields further differences from the existing literature.2018-10-11T13:40:05ZEasyTCGA: An R package for easy batch downloading of TCGA data from FireBrowse
http://hdl.handle.net/2003/37159
Title: EasyTCGA: An R package for easy batch downloading of TCGA data from FireBrowse
Authors: Kliewer, Viktoria; Sangkyun, Lee
Abstract: Many organizations deal with the investigation of cancer including the National Institutes of Health, USA. Genomics(CCG). The Cancer Genome Atlas (TCGA) is an establishment of the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) that has created maps of the key genomic changes in more than 30 cancer types. The aim of TCGA is to improve the ability to diagnose, treat and guard against cancer through genome analysis. TCGA provides a publicly available data set. The Broad Institute TCGA GDAC Firehose arranges this data set, which can be loaded directly using FireBrowse. FireBrowse allows simple and smart downloading and study of TCGA data and TCGA analyses. The data is downloaded as zip files. Mario Deng created an R client called FirebrowseR with the objective of getting the TCGA data from FireBrowse conveniently. The size of record sets to download is limited. EasyTCGA is an R package providing easy batch downloading of particular TCGA data from FireBrowse using FirebrowseR. The key advantage of EasyTCGA is that the whole available data set of interest is downloaded at once as a single data frame. The focus of this technical report is on the presentation of the R package EasyTCGA. For that reason, specific expressions and variables such as biological data are not explained. All relevant background information is available at the given URLs. EasyTCGA can download clinical data, sample-level log2 miRSeq and mRNASeq expression values, selected columns from the MAF (Mutation Annotation File) generated by MutSig and significantly mutated genes, as scored by MutSig.2018-10-11T13:27:16ZTechnical report for Collaborative Research Center SFB 876 - Graduate School
http://hdl.handle.net/2003/37158
Title: Technical report for Collaborative Research Center SFB 876 - Graduate School
Authors: Morik, Katharina; Rhode, Wolfgang2018-10-11T11:46:05ZPG594 -- Big Data
http://hdl.handle.net/2003/37157
Title: PG594 -- Big Data
Authors: Asmi, Mohamed; Bainczyk, Alexander; Bunse, Mirko; Gaidel, Dennis; May, Michael; Pfeiffer, Christian; Schieweck, Alexander; Schönberger, Lea; Stelzner, Karl; Sturm, David; Wiethoff, Carolin; Xu, Lili
Abstract: In today's world, the processing of large amounts of data is becoming ever more important. A multitude of technologies, frameworks and software solutions are used for this purpose, which were designed explicitly for the big data domain or can be ported to big data systems. The goal of this project group (PG) is to acquire expert knowledge of current tools and systems in the big data domain by working on a real scientific problem. From the winter semester 2015/2016 until the end of the summer semester 2016, this project group dealt with the processing and analysis of data from the First G-APD Cherenkov Telescope (FACT), operated by the Department of Physics on the island of La Palma. The telescope delivers data in the terabyte range every day, which must first be indexed and then processed efficiently using the cluster of Collaborative Research Center 876, so that in the best case this project group can support the work of the physicists with its results. How exactly this is to be done is examined in more detail on the following pages, beginning with the specific use case, taking into account the necessary domain-specific and technical foundations, and ending with the final results.2018-10-11T11:43:22ZRISE Germany Internship: Unfolding FACT Data
http://hdl.handle.net/2003/37156
Title: RISE Germany Internship: Unfolding FACT Data
Authors: Bieker, Jacob; Börner, Mathis; Brügge, Kai; Nöthe, Maximillian
Abstract: In this report the results from a 10-week internship are presented. The goal of the internship was to apply different unfolding approaches to conduct measurements of energy spectra from data acquired by FACT, the First G-APD Cherenkov Telescope. FACT is the first operational telescope of its kind, employing a camera equipped with silicon photomultipliers (G-APD aka SiPM) to primarily detect gamma rays. Improving the unfolding method can help with better interpretation of the data and more accurate physics results without the need for new equipment or more observations. The approaches tested during this internship range from simplistic matrix inversion to an improvement over the previous standard (TRUEE).2018-10-11T11:41:22ZAutomated Data Collection for Modelling Texas Instruments Ultra Low-Power Chargers
http://hdl.handle.net/2003/37155
Title: Automated Data Collection for Modelling Texas Instruments Ultra Low-Power Chargers
Authors: Masoudinejad, Mojtaba
Abstract: Some IoT designers develop their own ad-hoc conversion solution, designed specifically for their entity. However, having Maximum Power Point Tracking (MPPT), battery control, converter and switching logic would require a series of components. These devices will increase the initial cost and the overall energy loss overhead of this middleware between the EH and the storage. Nevertheless, these issues can be overcome by integrating all these elements and logic into one single chip. Currently, there are three Texas Instruments (TI) chips from the BQ255XX series and an ST (SPV1050) chip available off-the-shelf, specially designed for low energy environments. Among them, TI's BQ25505 and BQ25570 chips promise a better performance out of the box and are dominant in the market. Although multiple designers have used these chips in their IoT devices, no analytical study of them is available. Some basic information about these devices is available through their datasheets. However, for a reliable design and fast analysis of the overall energy performance of an IoT device, these chips have to be modelled.2018-10-11T11:39:45ZTechnical report for Collaborative Research Center SFB 876 - Graduate School
http://hdl.handle.net/2003/37154
Title: Technical report for Collaborative Research Center SFB 876 - Graduate School
Authors: Morik, Katharina; Rhode, Wolfgang2018-10-11T11:37:34ZA Power Model for DC-DC Boost Converters Operating in PFM Mode
http://hdl.handle.net/2003/37153
Title: A Power Model for DC-DC Boost Converters Operating in PFM Mode
Authors: Masoudinejad, Mojtaba
Abstract: The next generation of computing is going to be outside of the traditional stationary computing realm. In the future paradigm, many non-stationary objects around us sense and actuate on the environment while they are connected to each other via the internet. During the last few years, the number of these devices has been growing rapidly. This is causing an explosion of small computing platforms for commercial, consumer, and industrial use cases. The overall concept of IoT is based on the communication (mainly through the internet) between multiple entities which are generalised as things. Owing to the diversity of the application fields, a large number of entities are considered as things, from simple one-bit sensors to complex robots. Some concepts even consider human beings as entities within an IoT system. This leads to ambiguity in the definition of objects. Consequently, no unified definition for things is accepted among different communities. However, Cyber Physical Systems (CPS) as embedded devices with communication capabilities would fit into most (if not all) of them.2018-10-11T11:34:22ZMathematical modelling of the quality-based order assignment problem
http://hdl.handle.net/2003/37152
Title: Mathematical modelling of the quality-based order assignment problem
Authors: Schmitt, Jacqueline; Hahn, Florian; Deuse, Jochen
Abstract: The increasing global competition forces companies to reduce their production costs and increase the quality of their products at the same time. Due to individualized customer needs, there can be numerous customer requirements to the products that need to be fulfilled to ensure customer satisfaction. Therefore, many companies established a quality management (QM) system, which aims for continuous improvement of performance regarding system, process, and product quality. Basic concepts and requirements for QM systems can be found in the ISO 9000 standards series. A main principle hereby is the customer orientation so that individualized customer needs can be considered within the design of internal quality testing gates. Within this technical report we present two approaches to model the product to customer order assignment problem (PCO-AP) mathematically as a 0,1 assignment problem (0,1-AP) and generalized assignment problem (GAP).2018-10-11T11:32:41ZModel-Based Optimization of Subgroup Weights for Survival Analysis
http://hdl.handle.net/2003/37151
Title: Model-Based Optimization of Subgroup Weights for Survival Analysis
Authors: Richter, Jakob; Madjar, Katrin; Rahnenführer, Jörg
Abstract: Obtaining a reliable prediction model for a specific cancer subgroup or cohort is often difficult due to the limited number of samples and, in survival analysis, even more so due to potentially high censoring rates. Sometimes similar datasets are available for other patient subgroups with the same or a similar disease and treatment, e.g., from other clinical centers. Simple pooling of all subgroups can decrease the variance of the predicted parameters of the prediction models, but also increase the bias due to potentially high heterogeneity between the cohorts.
A promising compromise is to identify which subgroups are similar enough to the specific subgroup of interest and then include only these for model building.
Similarity here refers to the relationship between input and output in the prediction model, and not necessarily to the distributions of the input and output variables themselves.
Here, we propose a subgroup-based weighted likelihood approach and evaluate it on a set of lung cancer cohorts. When interested in a prediction model for a specific subgroup, then for every other subgroup, an individual weight determines the strength with which its observations enter into the likelihood-based optimization of the model parameters. A weight close to 0 indicates that a subgroup should be discarded, and a weight close to 1 indicates that the subgroup fully enters into the model building process.
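The weighting idea described above can be illustrated with a minimal sketch. It is not the authors' method: for brevity it swaps the Cox model for a constant-hazard (exponential) survival model, an assumption made here only because its weighted maximum likelihood estimate has a closed form. The function name `weighted_exponential_rate` and the toy data are hypothetical.

```python
from typing import Dict, Sequence


def weighted_exponential_rate(
    times: Sequence[float],
    events: Sequence[int],
    groups: Sequence[str],
    weights: Dict[str, float],
) -> float:
    """Hazard rate maximizing a subgroup-weighted exponential log-likelihood.

    Each observation i contributes w[g(i)] * (d_i * log(rate) - rate * t_i),
    where d_i is 1 for an observed event and 0 for a censored time.
    Setting the derivative to zero gives
        rate = sum(w * d) / sum(w * t).
    """
    weighted_events = sum(weights[g] * d for d, g in zip(events, groups))
    weighted_time = sum(weights[g] * t for t, g in zip(times, groups))
    if weighted_events <= 0 or weighted_time <= 0:
        raise ValueError("need positive weighted event count and follow-up time")
    return weighted_events / weighted_time


# Hypothetical toy data: subgroup "A" is the subgroup of interest.
times = [2.0, 4.0, 1.0, 3.0]
events = [1, 0, 1, 1]          # 0 marks a censored observation
groups = ["A", "A", "B", "B"]

rate_discard = weighted_exponential_rate(times, events, groups, {"A": 1.0, "B": 0.0})
rate_pooled = weighted_exponential_rate(times, events, groups, {"A": 1.0, "B": 1.0})
rate_partial = weighted_exponential_rate(times, events, groups, {"A": 1.0, "B": 0.5})
```

With weight 0 subgroup "B" drops out of the estimate entirely, with weight 1 the two subgroups are simply pooled, and intermediate weights interpolate between these two extremes.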
MBO (model based optimization) can be used to quickly find a good prediction model in the presence of a large number of hyperparameters to be tuned. Here, we use MBO to identify the best model for survival prediction in lung cancer subgroups, where besides the parameters of a Cox model additionally the individual values of the subgroup weights are optimized. Interestingly, often the resulting models with highest prediction quality are obtained for a mixed weight structure, i.e. both weights close to 0, weights close to 1, and medium weights are optimal, reflecting the similarity of the corresponding cancer subgroups.2018-10-11T11:30:33ZEfficient Track Reconstruction on Modern Hardware
http://hdl.handle.net/2003/37150
Title: Efficient Track Reconstruction on Modern Hardware
Authors: Lindemann, Thomas
Abstract: Particle physics has become a massively data-intensive discipline. Huge particle accelerators — such as the Large Hadron Collider (LHC) at CERN — produce vast amounts of experimental data — 4 TB/s in the case of the LHCb experiment at CERN — which often must be processed in real time. Named after the b-quark, LHCb is one of the four big experiments at CERN. The general scope is to explain the matter/anti-matter asymmetry. The main focus is the study of particle decays involving beauty and charm quarks.
In the LHCb Project, a continuous stream of hits is produced by the several stages of the LHCb detector. Given the low probability of observing an “interesting” collision, physicists produce a vast number of collision experiments in the hope of finding a few interesting ones. Thus, the event data have to be processed in real time, since current storage technology cannot store all collision events permanently. Analyzing these data volumes has become the key limitation of the domain: any improvement in analysis performance translates into better insights on the physics side.
In this report, we present the results of experiments from our current work with the HybridSeeding track reconstruction algorithm.2018-10-11T11:28:48ZPanel cointegrating polynomial regression analysis and the environmental Kuznets curve
http://hdl.handle.net/2003/37148
Title: Panel cointegrating polynomial regression analysis and the environmental Kuznets curve
Authors: de Jong, Robert M.; Wagner, Martin
Abstract: This paper develops a modified and a fully modified OLS estimator for a panel of cointegrating
polynomial regressions, i.e. regressions that include an integrated process and its powers
as explanatory variables. The stationary errors are allowed to be serially correlated and the
regressors are allowed to be endogenous and we allow for individual and time fixed effects. Inspired
by Phillips and Moon (1999) we consider a cross-sectional i.i.d. random linear process
framework. The modified OLS estimator utilizes the large cross-sectional dimension, which allows
an additive bias term to be consistently estimated and subtracted without the need to also transform
the dependent variable as required in fully modified OLS estimation. Both developed estimators
have zero mean Gaussian limiting distributions and thus allow for standard asymptotic inference.
Our illustrative application indicates that the developed methods are a potentially useful
addition to the toolkit of, not least, the environmental Kuznets curve literature.2018-10-10T13:23:00ZCombining uncertainty with uncertainty to get certainty? Efficiency analysis for regulation purposes
http://hdl.handle.net/2003/37146
Title: Combining uncertainty with uncertainty to get certainty? Efficiency analysis for regulation purposes
Authors: Andor, Mark; Parmeter, Christopher; Sommer, Stephan
Abstract: Data envelopment analysis (DEA) and stochastic frontier analysis (SFA),
as well as combinations thereof, are widely applied in incentive regulation
practice, where the assessment of efficiency plays a major role in regulation
design and benchmarking. Using a Monte Carlo simulation experiment,
this paper compares the performance of six alternative methods commonly
applied by regulators. Our results demonstrate that combination approaches,
such as taking the maximum or the mean over DEA and SFA efficiency
scores, have certain practical merits and might offer a useful alternative
to strict reliance on a singular method. In particular, the results highlight
that taking the maximum not only minimizes the risk of underestimation,
but can also improve the precision of efficiency estimation. Based on our results,
we give recommendations for the estimation of individual efficiencies
for regulation purposes and beyond.2018-10-10T13:17:53ZTesting relevant hypotheses in functional time series via self-normalization
http://hdl.handle.net/2003/37138
Title: Testing relevant hypotheses in functional time series via self-normalization
Authors: Dette, Holger; Kokot, Kevin; Volgushev, Stanislav
Abstract: In this paper we develop methodology for testing relevant hypotheses in a tuning-free
way. Our main focus is on functional time series, but extensions to other settings are also
discussed. Instead of testing for exact equality, for example for the equality of two mean
functions from two independent time series, we propose to test a relevant deviation under
the null hypothesis. In the two sample problem this means that an L2-distance between
the two mean functions is smaller than a pre-specified threshold. For such hypotheses
self-normalization, which was introduced by Shao (2010) and Shao and Zhang (2010) and
is commonly used to avoid the estimation of nuisance parameters, is not directly applicable.
We develop new self-normalized procedures for testing relevant hypotheses in the one
sample, two sample and change point problem and investigate their asymptotic properties.
Finite sample properties of the proposed tests are illustrated by means of a simulation study
and a data example.2018-10-05T12:05:01ZOptimal designs for inspection times of interval-censored data
http://hdl.handle.net/2003/37137
Title: Optimal designs for inspection times of interval-censored data
Authors: Malevich, Nadja; Müller, Christine H.
Abstract: We treat optimal equidistant and optimal non-equidistant inspection
times for interval-censored data with exponential distribution. We provide
in particular a recursive formula for calculating the optimal non-equidistant
inspection times which is similar to a formula for optimal spacing of quantiles
for asymptotically best linear estimates based on order statistics. This formula
provides an upper bound for the standardized Fisher information which
is reached for the optimal non-equidistant inspection times if the number of
inspections is converging to infinity. The same upper bound is also shown for
the optimal equidistant inspection times. Since optimal equidistant inspection
times are easier to calculate and easier to handle in practice, we study the
efficiency of optimal equidistant inspection times with respect to optimal non-equidistant
inspection times. Moreover, since the optimal inspection times are
only locally optimal, we provide also some results concerning maximin efficient
designs.2018-10-05T12:02:35ZOn second order conditions in the multivariate block maxima and peak over threshold method
http://hdl.handle.net/2003/37120
Title: On second order conditions in the multivariate block maxima and peak over threshold method
Authors: Bücher, Axel; Volgushev, Stanislav; Zou, Nan
Abstract: Second order conditions provide a natural framework for establishing asymptotic
results about estimators for tail related quantities. Such conditions are typically
tailored to the estimation principle at hand, and may be vastly different for estimators
based on the block maxima (BM) method or the peak-over-threshold (POT)
approach. In this paper we provide details on the relationship between typical second
order conditions for BM and POT methods in the multivariate case. We show that the
two conditions typically imply each other, but with a possibly different second order
parameter. The latter implies that, depending on the data generating process, one
of the two methods can attain faster convergence rates than the other. The class of
multivariate Archimax copulas is examined in detail; we find that this class contains
models for which the second order parameter is smaller for the BM method and vice
versa. The theory is illustrated by a small simulation study.2018-09-05T09:43:56ZThe Phillips unit root tests for polynomials of integrated processes
http://hdl.handle.net/2003/37119
Title: The Phillips unit root tests for polynomials of integrated processes
Authors: Stypka, Oliver; Wagner, Martin
Abstract: We show that the Phillips (1987) unit root tests have nuisance parameter free limiting
distributions when applied to polynomials of integrated processes driven by linear process errors.
This substantially generalizes a similar result of Wagner (2012) allowing only for serially
uncorrelated errors. The result is based on novel kernel weighted sum limit results involving powers
of integrated processes. These results also allow us to consider additional modifications of the
Phillips (1987) tests applicable to polynomials of integrated processes.2018-09-05T09:41:26ZDetecting deviations from second-order stationarity in locally stationary functional time series
http://hdl.handle.net/2003/37118
Title: Detecting deviations from second-order stationarity in locally stationary functional time series
Authors: Bücher, Axel; Dette, Holger; Heinrichs, Florian
Abstract: A time-domain test for the assumption of second order stationarity of a
functional time series is proposed. The test is based on combining individual cumulative
sum tests which are designed to be sensitive to changes in the mean, variance and
autocovariance operators, respectively. The combination of their dependent p-values
relies on a joint dependent block multiplier bootstrap of the individual test statistics.
Conditions under which the proposed combined testing procedure is asymptotically
valid under stationarity are provided. A procedure is proposed to automatically choose
the block length parameter needed for the construction of the bootstrap. The finite-sample
behavior of the proposed test is investigated in Monte Carlo experiments and
an illustration on a real data set is provided.
2018-09-05T09:38:49Z
The U. S. fracking boom: Impact on oil prices
http://hdl.handle.net/2003/37078
Title: The U. S. fracking boom: Impact on oil prices
Authors: Frondel, Manuel; Horvath, Marco
Abstract: As of late 2008, the steady decline of U. S. crude oil production over the last decades was reversed by the increased adoption of the hydraulic fracturing (“fracking”) technology. Adapting the supply-side model proposed by Kaufmann et al. (2004) to assess OPEC’s ability to influence real oil prices, this paper investigates the effect of the increase in U. S. oil production due to fracking on world oil prices. Among our key results obtained from (dynamic) OLS estimations, there is a statistically significant negative long-run relationship between increased U. S. oil production and oil prices.
2018-07-31T14:08:06Z
Foreign competition and executive compensation in the manufacturing industry – A comparison between Germany and the U.S.
http://hdl.handle.net/2003/37058
Title: Foreign competition and executive compensation in the manufacturing industry – A comparison between Germany and the U.S.
Authors: Dyballa, Katharina; Kraft, Kornelius
Abstract: In this study we use import penetration as a proxy for foreign competition in order to empirically analyze (1) the impact of foreign competition on managerial compensation, (2) differences in the impact between Germany and the U.S., and (3) whether the impact of import penetration is driven by implied efficiency effects. We use data from the manufacturing industry covering the period 1984-2010 for Germany and 1992-2011 for the U.S., and apply system GMM in order to solve potential endogeneity problems. It turns out that foreign competition leads to an increase in average per capita executive compensation in both countries. The impact of foreign competition on pay-performance sensitivity differs between the U.S. and Germany. A differentiation between imported intermediates (efficient sourcing strategy) and final inputs (competition) reveals that the impact of import penetration is not biased by efficiency effects.
2018-07-24T13:42:57Z
Optimal designs for frequentist model averaging
http://hdl.handle.net/2003/37014
Title: Optimal designs for frequentist model averaging
Authors: Alhorn, Kira; Schorning, Kirsten; Dette, Holger
Abstract: We consider the problem of designing experiments for the estimation of a target in
regression analysis if there is uncertainty about the parametric form of the regression
function. A new optimality criterion is proposed, which minimizes the asymptotic mean
squared error of the frequentist model averaging estimate by the choice of an experimental
design. Necessary conditions for the optimal solution of a locally and Bayesian optimal
design problem are established. The results are illustrated in several examples and it is
demonstrated that Bayesian optimal designs can yield a reduction of the mean squared
error of the model averaging estimator of up to 45%.
2018-07-16T12:03:35Z
The price response of residential electricity demand in Germany: A dynamic approach
http://hdl.handle.net/2003/36966
Title: The price response of residential electricity demand in Germany: A dynamic approach
Authors: Frondel, Manuel; Kussel, Gerhard; Sommer, Stephan
Abstract: Due to growing concerns about climate change, policy-makers from all
around the world establish measures, such as carbon taxes, to lower electricity demand
and energy consumption in general. Drawing on household panel data from
the German Residential Energy Consumption Survey (GRECS) that span over nine
years (2006-2014) and employing the sum of regulated price components as an instrument
for the likely endogenous electricity price, we gauge the response of residential
electricity demand to price increases on the basis of the dynamic Blundell-Bond estimator
to account for potential simultaneity and endogeneity problems, as well as
the Nickell bias. Estimating short- and long-run price elasticities of -0.44 and -0.66,
respectively, our results indicate that price measures may be effective in dampening
residential electricity consumption, particularly in the long run. Yet, we also find that
responses to price changes are very heterogeneous across household groups.
2018-07-09T14:49:29Z
On axiomizing and extending the quasi-arithmetic mean
http://hdl.handle.net/2003/36880
Title: On axiomizing and extending the quasi-arithmetic mean
Authors: Hansen, Maurice
Abstract: Quasi-arithmetic means contain many other mean value concepts
such as the arithmetic, the geometric or the harmonic mean as
special cases. Treating quasi-arithmetic means as sequences of mappings
from I^n into I (for some real interval I) this paper shows that under
mild additional conditions this mapping is uniquely determined by its
values on I^2. This extends a well-known result by Huntington [4] where
this claim is proven only for special cases.
2018-05-29T08:46:48Z
Simar and Wilson two-stage efficiency analysis for Stata
http://hdl.handle.net/2003/36879
Title: Simar and Wilson two-stage efficiency analysis for Stata
Authors: Badunenko, Oleg; Tauchmann, Harald
Abstract: When analyzing what determines the efficiency of production, regressing
efficiency scores estimated by DEA on explanatory variables has much intuitive
appeal. Simar and Wilson (2007) show that this naïve two-stage estimation
procedure suffers from severe flaws that render its results, and in particular
statistical inference based on them, questionable. At the same time they propose
a statistically grounded bootstrap-based two-stage estimator that eliminates the
above-mentioned weaknesses of its naïve predecessors and comes in two variants.
This article introduces the new Stata command simarwilson that implements
either variant of the suggested estimator in Stata. The command allows for various
options, and extends the original procedure in some respects. For instance, it
allows for analyzing both output- and input-oriented efficiency. To demonstrate
the capabilities of the new command simarwilson we use data from the Penn
World Tables and the Global Competitiveness Report by the World Economic
Forum to perform a cross-country empirical study about the importance of quality
of governance of a country for its efficiency of output production.
2018-05-25T13:51:06Z
Robust discrimination between long-range dependence and a change in mean
http://hdl.handle.net/2003/36842
Title: Robust discrimination between long-range dependence and a change in mean
Authors: Gerstenberger, Carina
Abstract: In this paper we introduce an outlier-robust Wilcoxon change-point testing procedure
for distinguishing between short-range dependent time series with a change in mean at an unknown
time and stationary long-range dependent time series. We establish the asymptotic
distribution of the test statistic under the null hypothesis for L1 near epoch dependent
processes and show its consistency under the alternative. The Wilcoxon-type testing procedure,
like the CUSUM-type testing procedure of Berkes, Horvath, Kokoszka and
Shao (2006), requires estimating the location of a possible change-point and then using
pre- and post-break subsamples to discriminate between short- and long-range dependence.
A simulation study examines the empirical size and power of the Wilcoxon-type testing
procedure in standard cases and with disturbances by outliers. It shows that in standard
cases the Wilcoxon-type testing procedure performs as well as the CUSUM-type testing
procedure but outperforms it in the presence of outliers.
2018-04-23T12:22:48Z
Deviations from triangular arbitrage parity in foreign exchange and bitcoin markets
http://hdl.handle.net/2003/36820
Title: Deviations from triangular arbitrage parity in foreign exchange and bitcoin markets
Authors: Reynolds, Julia; Sögner, Leopold; Wagner, Martin; Wied, Dominik
Abstract: This paper applies new econometric tools to monitor and detect so-called "financial market dislocations",
defined as periods in which substantial deviations from arbitrage parities take place. In particular,
we focus on deviations from the triangular arbitrage parity for exchange rate triplets. Due to
increasing media attention towards mispricing in the market for cryptocurrencies, we include the cryptocurrency Bitcoin in addition to fiat currencies. We do not find evidence for substantial deviations
from the triangular arbitrage parity when only traditional fiat currencies are concerned. However, we
document significant deviations from triangular arbitrage parities in the newer markets for Bitcoin.
2018-03-27T14:57:15Z
Efficient designs for the estimation of mixed and self carryover effects
http://hdl.handle.net/2003/36819
Title: Efficient designs for the estimation of mixed and self carryover effects
Authors: Kunert, Joachim; Mielke, Johanna
Abstract: Biosimilars are copies of biological medicines that are developed by a competitor
after the patent for the originator drug has expired. Extensive clinical trials are
required to show therapeutic equivalence between the biosimilar and its reference
product before a biosimilar can be sold on the market. However, even after more
than 10 years of experience with biosimilars in Europe, there is still some uncertainty
if the patients who are already taking the reference product can switch between
the biosimilar and its reference product. One convenient way to assess the impact
of switches is the analysis of mixed and self carryover effects: if the products are
switchable, there should not be any difference in the carryover effects. This paper
determines a series of simple designs which are highly efficient for the comparison
of the mixed and self carryover effects of two treatments. The proof of efficiency
is not straightforward because the information matrix of the efficient designs is not
completely symmetric.
2018-03-27T14:54:28Z
The nonparametric location-scale mixture cure model
http://hdl.handle.net/2003/36818
Title: The nonparametric location-scale mixture cure model
Authors: Chown, Justin; Heuchenne, Cédric; Van Keilegom, Ingrid
Abstract: We propose a completely nonparametric methodology to investigate location-scale modelling of two-component mixture cure models, where the responses of interest are only indirectly observable due to the presence of censoring and the presence of so-called long-term survivors that are always censored. We use covariate-localized nonparametric estimators, which depend on a bandwidth sequence, to propose an estimator of the error distribution function that has not been considered before in the literature. When this bandwidth belongs to a certain range of undersmoothing bandwidths, the asymptotic distribution of the proposed estimator of the error distribution function does not depend on this bandwidth, and this estimator is shown to be root-n consistent. This suggests that a computationally costly bandwidth selection procedure is unnecessary to obtain an effective estimator of the error distribution, and that a simpler rule-of-thumb approach can be used instead.
A simulation study investigates the finite sample properties of our approach, and the methodology is illustrated using data obtained to study the behavior of distant metastasis in lymph-node-negative breast cancer patients.
2018-03-27T14:52:08Z
Equity and the Willingness to Pay for Green Electricity: Evidence from Germany
http://hdl.handle.net/2003/36800
Title: Equity and the Willingness to Pay for Green Electricity: Evidence from Germany
Authors: Andor, Mark; Frondel, Manuel; Sommer, Stephan
Abstract: The production of electricity on the basis of renewable energy technologies is a
classic example of an impure public good. It is often discriminatively financed by industrial
and household consumers, such as in Germany, where the energy-intensive
sector benefits from far-reaching exemptions, while all other electricity consumers
are forced to bear a higher burden. Based on randomized information treatments
in a stated-choice experiment among about 11,000 German households, we explore
whether this coercive payment rule affects households’ willingness-to-pay (WTP) for
green electricity. Our central result is that reducing inequity by abolishing the exemption
for the energy-intensive industry raises households’ WTP, a finding that may
have high external validity.
2018-03-14T11:52:35Z
A study on the least square estimator of multiple isotonic regression function
http://hdl.handle.net/2003/36799
Title: A study on the least square estimator of multiple isotonic regression function
Authors: Bagchi, Pramita; Dhar, Subhra Sankar
Abstract: Consider the problem of pointwise estimation of f in a multiple isotonic regression model Z = f(X1, ..., Xd) + ε, where Z is the response variable, f is an unknown non-parametric regression function,
which is isotonic with respect to each component, and ε is the error term. In this article, we investigate
the behaviour of the least square estimator of f and establish its asymptotic properties. We generalize the
greatest convex minorant characterization of the isotonic regression estimator to the multivariate case and use
it to establish the asymptotic distribution of a properly normalized version of the estimator. Moreover, we
test whether the multiple isotonic regression function at a fixed point is larger (or smaller) than a specified
value based on this estimator, and the consistency of the test is established. The practicability of the
estimator and the test is shown on simulated and real data.
2018-03-14T11:50:47Z
On detecting changes in the jumps of arbitrary size of a time-continuous stochastic process
http://hdl.handle.net/2003/36786
Title: On detecting changes in the jumps of arbitrary size of a time-continuous stochastic process
Authors: Hoffmann, Michael
Abstract: This paper introduces test and estimation procedures for abrupt and gradual changes in
the entire jump behaviour of a discretely observed Ito semimartingale. In contrast to existing
work we analyse jumps of arbitrary size which are not restricted to a minimum height. Our
methods are based on weak convergence of a truncated sequential empirical distribution
function of the jump characteristic of the underlying Ito semimartingale. Critical values
for the new tests are obtained by a multiplier bootstrap approach and we investigate the
performance of the tests also under local alternatives. An extensive simulation study shows
the finite-sample properties of the new procedures.
2018-03-02T12:37:55Z
Universally optimal crossover designs for the estimation of mixed-carryover effects with an application to biosimilar development
http://hdl.handle.net/2003/36785
Title: Universally optimal crossover designs for the estimation of mixed-carryover effects with an application to biosimilar development
Authors: Mielke, Johanna; Kunert, Joachim
Abstract: Biosimilars are medical products that are developed as copies of already
established, large molecule drugs (biologics). For gaining approval, sponsors have to
confirm that the proposed biosimilar has the same efficacy and safety as the originator
product. In most cases, this comparability exercise also includes large clinical
trials conducted in patients. However, even with the evidence gained during the
clinical studies, there is still some uncertainty if patients who were already treated
with the originator can be switched to the biosimilar or if even multiple switches between
the biosimilar and the originator are acceptable. A simple way to address the
question of switchability is the estimation of so-called mixed and self-carryover effects,
which are carryover effects that not only depend on the treatment in the current
period, but also on the treatment in the previous period. In this paper, we determine
universally optimal designs for the estimation of mixed-carryover effects in a linear
model with treatment, period, subject and self-carryover as nuisance parameters.
2018-03-02T12:35:46Z
A likelihood ratio approach to sequential change point detection
http://hdl.handle.net/2003/36782
Title: A likelihood ratio approach to sequential change point detection
Authors: Dette, Holger; Gösmann, Josua
Abstract: In this paper we propose a new approach for sequential monitoring of a parameter
of a d-dimensional time series. We consider a closed-end method, which is motivated
by the likelihood ratio test principle, and compare the new method with two alternative
procedures. We also incorporate self-normalization such that estimation of the long-run
variance is not necessary. We prove that for a large class of testing problems the
new detection scheme has asymptotic level α and is consistent. The asymptotic theory
is illustrated for the important cases of monitoring a change in the mean, variance and
correlation. By means of a simulation study it is demonstrated that the new test performs
better than the currently available procedures for these problems.
2018-02-28T12:58:43Z
Change point analysis in non-stationary processes - a mass excess approach
http://hdl.handle.net/2003/36346
Title: Change point analysis in non-stationary processes - a mass excess approach
Authors: Dette, Holger; Wu, Weichi
Abstract: This paper considers the problem of testing if a sequence of means $(\mu_t)_{t=1,\ldots,n}$ of a non-stationary time series $(X_t)_{t=1,\ldots,n}$ is stable in the sense that the difference of the means $\mu_1$ and $\mu_t$ between the initial time $t = 1$ and any other time is smaller than a given level, that is $|\mu_1 - \mu_t| \le c$ for all $t = 1, \ldots, n$. A test for hypotheses of this type is developed using a bias corrected monotone rearranged local linear estimator and asymptotic normality of the corresponding test statistic is established. As the asymptotic variance depends on the location and order of the critical roots of the equation $|\mu_1 - \mu_t| = c$, a new bootstrap procedure is proposed to obtain critical values and its consistency is established. As a consequence we are able to quantitatively describe relevant deviations of a non-stationary sequence from its initial value. The results are illustrated by means of a simulation study and by analyzing data examples.
2018-02-01T11:52:48Z
Does financial compensation increase the acceptance of power lines? Evidence from Germany
http://hdl.handle.net/2003/36310
Title: Does financial compensation increase the acceptance of power lines? Evidence from Germany
Authors: Simora, Michael; Frondel, Manuel; Vance, Colin
Abstract: Although public support for renewable energy promotion in Germany is
strong, the required power line construction has incited a groundswell of opposition
from residents concerned about the impacts on their neighborhoods. This paper
evaluates a large randomized one-shot binary-choice experiment to examine the
effect of different compensation schemes on the acceptance of new power line construction.
Results reveal that community compensations have no bearing on the acceptance
level, whereas personal compensations have a negative effect. Two possible
channels through which financial compensation reduces the willingness-to-accept are
(1) crowding out of intrinsic motivation to support the construction project and (2) a
signaling effect that alerts residents to potential negative impacts of the power lines.
Both explanations call into question the efficacy of financial payments to decrease local
opposition.
2017-12-21T10:03:16Z
Predatory short sales and bailouts
http://hdl.handle.net/2003/36232
Title: Predatory short sales and bailouts
Authors: Kranz, Sebastian; Löffler, Gunter; Posch, Peter N.
Abstract: This paper extends the literature on predatory short selling and bailouts
through a joint analysis of the two. We consider a model with informed short
sales, as well as uninformed predatory short sales, which can trigger the inefficient
liquidation of a firm. We obtain several novel results: A government commitment
to bail out insolvent firms with positive probability can increase welfare because
it selectively deters predatory short selling without hampering desirable informed
short sales. Contrary to a common view, bailouts can be optimal ex ante but
undesirable ex post. Furthermore, bailouts in our model are a better policy tool
than short selling restrictions. Welfare gains from the bailout policy are unevenly
distributed: shareholders gain while taxpayers lose. Bailout taxes allow ex-ante
Pareto improvements.
2017-12-04T13:50:38Z
Bayesian optimal designs for dose-response curves with common parameters
http://hdl.handle.net/2003/36181
Title: Bayesian optimal designs for dose-response curves with common parameters
Authors: Schorning, Kirsten; Konstantinou, Maria
Abstract: The issue of determining not only an adequate dose but also a dosing frequency
of a drug arises frequently in Phase II clinical trials. This results in the comparison
of models which have some parameters in common. Planning such studies based on
Bayesian optimal designs offers robustness to our conclusions since these designs,
unlike locally optimal designs, are efficient even if the parameters are misspecified.
In this paper we develop approximate design theory for Bayesian D-optimality for
nonlinear regression models with common parameters and investigate the cases of
common location or common location and scale parameters separately. Analytical
characterisations of saturated Bayesian D-optimal designs are derived for frequently
used dose-response models and the advantages of our results are illustrated via a
numerical investigation.
2017-11-14T13:23:20Z
Der Wert von Versorgungssicherheit mit Strom: Evidenz für deutsche Haushalte
http://hdl.handle.net/2003/36180
Title: Der Wert von Versorgungssicherheit mit Strom: Evidenz für deutsche Haushalte
Authors: Frondel, Manuel; Sommer, Stephan
Abstract: Based on a survey of more than 5,000 heads of household, this article
examines how much they are willing to pay for security of electricity supply.
As an alternative to willingness to pay (WTP), respondents are also asked about their
willingness to forgo a certain degree of supply security in exchange for a
compensation payment (willingness to accept, WTA). In line with numerous empirical
studies, we find mean WTA values that are considerably higher than the mean WTP values
for avoiding an unannounced four-hour power outage. We attribute this discrepancy to
the fact that stated compensation demands for forgoing supply security tend to exceed
the actual value attached to security of electricity supply, whereas the stated
willingness to pay for it tends to be understated.
2017-11-14T13:22:11Z
A test for separability in covariance operators of random surfaces
http://hdl.handle.net/2003/36169
Title: A test for separability in covariance operators of random surfaces
Authors: Bagchi, Pramita; Dette, Holger
Abstract: The assumption of separability is a simplifying and very popular assumption in
the analysis of spatio-temporal or hypersurface data structures. It is often made in
situations where the covariance structure cannot be easily estimated, for example
because of a small sample size or because of computational storage problems. In
this paper we propose a new and very simple test to validate this assumption. Our
approach is based on a measure of separability which is zero in the case of separability
and positive otherwise. The measure can be estimated without calculating
the full non-separable covariance operator. We prove asymptotic normality of the
corresponding statistic with a limiting variance, which can easily be estimated from
the available data. As a consequence quantiles of the standard normal distribution
can be used to obtain critical values and the new test of separability is very easy to
implement. In particular, our approach requires neither projections on subspaces
generated by the eigenfunctions of the covariance operator, nor resampling
procedures to obtain critical values, nor distributional assumptions as recently used
by Aston et al. (2017) and Constantinou et al. (2017) to construct tests for separability.
We investigate the finite sample performance by means of a simulation study
and also provide a comparison with the currently available methodology. Finally,
the new procedure is illustrated by analyzing wind speed and temperature data.
2017-11-08T09:31:50Z
Optimal designs for regression with spherical data
http://hdl.handle.net/2003/36168
Title: Optimal designs for regression with spherical data
Authors: Dette, Holger; Konstantinou, Maria; Schorning, Kirsten; Gösmann, Josua
Abstract: In this paper optimal designs for regression problems with spherical predictors of
arbitrary dimension are considered. Our work is motivated by applications in material
sciences, where crystallographic textures such as the misorientation distribution
or the grain boundary distribution (depending on a four dimensional spherical predictor)
are represented by series of hyperspherical harmonics, which are estimated
from experimental or simulated data.
For this type of estimation problems we explicitly determine optimal designs with
respect to Kiefer's $\Phi_p$-criteria and a class of orthogonally invariant information criteria
recently introduced in the literature. In particular, we show that the uniform
distribution on the m-dimensional sphere is optimal and construct discrete and implementable
designs with the same information matrices as the continuous optimal
designs. Finally, we illustrate the advantages of the new designs for series estimation
by hyperspherical harmonics, which are symmetric with respect to the first and
second crystallographic point group.
2017-11-08T09:29:23Z
Functional data analysis in the Banach space of continuous functions
http://hdl.handle.net/2003/36129
Title: Functional data analysis in the Banach space of continuous functions
Authors: Dette, Holger; Kokot, Kevin; Aue, Alexander
Abstract: Functional data analysis is typically conducted within the L2-Hilbert space framework. There is by now a fully developed statistical toolbox allowing for the principled application of the functional data machinery to real-world problems, often based on dimension reduction techniques such as functional principal component analysis. At the same time, there have recently been a number of publications that sidestep dimension reduction steps and focus on a fully functional L2-methodology. This paper goes one step further and develops data analysis methodology for functional time series in the space of all continuous functions. The work is motivated by the fact that objects with rather different shapes may still have a small L2-distance and are therefore identified as similar when using an L2-metric. However, in applications it is often desirable to
use metrics reflecting the visualization of the curves in the statistical analysis. The methodological contributions are focused on developing two-sample and change-point tests as well as confidence bands, as these procedures appear to be conducive to the proposed setting. Particular interest is put on relevant differences; that is, on not trying to test for exact equality, but rather for pre-specified deviations under the null hypothesis.
The procedures are justified through large-sample theory. To ensure practicability, nonstandard bootstrap procedures are developed and investigated addressing particular features that arise in the problem of testing relevant hypotheses. The finite sample properties are explored through a simulation study and an application to annual temperature profiles.
2017-10-19T14:17:45Z
Optimal designs for enzyme inhibition kinetic models
http://hdl.handle.net/2003/36099
Title: Optimal designs for enzyme inhibition kinetic models
Authors: Schorning, Kirsten; Dette, Holger; Kettelhake, Katrin; Möller, Tilman
Abstract: In this paper we present a new method for determining optimal designs for enzyme
inhibition kinetic models, which are used to model the influence of the concentration of a
substrate and an inhibition on the velocity of a reaction. The approach uses a nonlinear
transformation of the vector of predictors such that the model in the new coordinates is
given by an incomplete response surface model. Although there exist no explicit solutions
of the optimal design problem for incomplete response surface models so far, the
corresponding design problem in the new coordinates is substantially more transparent, such
that explicit or numerical solutions can be determined more easily. The designs for the
original problem can finally be found by an inverse transformation of the optimal designs
determined for the response surface model. We illustrate the method by determining explicit
solutions for the D-optimal design and for the optimal design problem for estimating the
individual coefficients in a non-competitive enzyme inhibition kinetic model.
2017-09-15T13:15:48Z
Combining cumulative sum change-point detection tests for assessing the stationarity of univariate time series
http://hdl.handle.net/2003/36098
Title: Combining cumulative sum change-point detection tests for assessing the stationarity of univariate time series
Authors: Bücher, Axel; Fermanian, Jean-David; Kojadinovic, Ivan
Abstract: We derive tests of stationarity for continuous univariate time series by combining change-point
tests sensitive to changes in the contemporary distribution with tests sensitive to
changes in the serial dependence. Rank-based cumulative sum tests based on the empirical
distribution function and on the empirical autocopula at a given lag are considered first.
The combination of their dependent p-values relies on a joint dependent multiplier bootstrap
of the two underlying statistics. Conditions under which the proposed combined testing
procedure is asymptotically valid under stationarity are provided. After discussing the
choice of the maximum lag to investigate, extensions based on tests solely focusing on second-order
characteristics are proposed. The finite-sample behaviors of all the derived statistical
procedures are investigated in large-scale Monte Carlo experiments and illustrations on two
real data sets are provided. Extensions to multivariate time series are briefly discussed as
well.
2017-09-15T12:55:28Z
A nonparametric test for stationarity in functional time series
http://hdl.handle.net/2003/36083
Title: A nonparametric test for stationarity in functional time series
Authors: van Delft, Anne; Bagchi, Pramita; Characiejus, Vaidotas; Dette, Holger
Abstract: We propose a new measure for stationarity of a functional time series, which is based on an explicit representation of the L2-distance between the spectral density operator of a non-stationary process and its best (L2-)approximation by a spectral density operator corresponding to a stationary process. This distance can easily be estimated by sums of Hilbert-Schmidt inner products of periodogram operators (evaluated at different frequencies), and asymptotic normality of an appropriately standardised version of the estimator can be established for the corresponding estimate under the null hypothesis and alternative. As a
result we obtain confidence intervals for the discrepancy of the underlying process from a functional stationary process and a simple asymptotic frequency domain level α test (using the quantiles of the normal distribution) for the hypothesis of stationarity of functional time series. Moreover, the new methodology also allows testing precise hypotheses of the form “the functional time series is approximately stationary”, which means that the new measure of stationarity is smaller than a given threshold. Thus, in contrast to methods proposed in the literature, our approach also allows testing for “relevant” deviations from stationarity. We demonstrate in a small simulation study that the new method has very good finite sample properties and compare it with the currently available alternative procedures. Moreover, we apply our test to annual temperature curves.
2017-09-06T13:48:08Z
Behavioral economics and energy conservation - a systematic review of nonprice interventions and their causal effects
http://hdl.handle.net/2003/36037
Title: Behavioral economics and energy conservation - a systematic review of nonprice interventions and their causal effects
Authors: Andor, Mark; Fels, Katja
Abstract: Research from economics and psychology suggests that behavioral
interventions can be a powerful climate policy instrument. This paper
provides a systematic review of the existing empirical evidence on non-price
interventions targeting energy conservation behavior of private households.
Specifically, we analyze the four nudge-like interventions referred to as social
comparison, pre-commitment, goal setting and labeling in 38 international
studies comprising 91 treatments. This paper differs from previous systematic
reviews by solely focusing on studies that permit the identification of causal
effects. We find that all four interventions have the potential to significantly
reduce energy consumption of private households, yet effect sizes vary
immensely. We conclude by emphasizing the importance of impact
evaluations before rolling out behavioral policy interventions at scale.
2017-07-28T12:09:18Z
A note on conditional versus joint unconditional weak convergence in bootstrap consistency results
http://hdl.handle.net/2003/35989
Title: A note on conditional versus joint unconditional weak convergence in bootstrap consistency results
Authors: Bücher, Axel; Kojadinovic, Ivan
Abstract: The consistency of a bootstrap or resampling scheme is classically validated by weak convergence of conditional laws. However, when working with stochastic processes in the space of bounded functions and their weak convergence in the Hoffmann-Jørgensen sense, an obstacle occurs: due to possible non-measurability, neither laws nor conditional laws are well-defined. Starting from an equivalent formulation of weak convergence based on the bounded Lipschitz metric, a classical workaround is to formulate bootstrap consistency in terms of the latter distance between what might be called a conditional law of the (non-measurable) bootstrap process and the law of the limiting process. The main contribution of this note is to provide an equivalent formulation of bootstrap consistency in the space of bounded functions which is more intuitive and easier to work with. Essentially, the equivalent formulation consists of (unconditional) weak convergence of the original process jointly with an arbitrarily large number of bootstrap replicates. As a by-product, we provide two equivalent formulations of bootstrap consistency for R^d-valued statistics: the first in terms of (unconditional) weak convergence of the statistic jointly with its bootstrap replicates, the second in terms of convergence in probability of the empirical distribution function of the bootstrap replicates. Finally, the asymptotic validity of bootstrap-based confidence intervals and tests is briefly revisited, with particular emphasis on the Monte Carlo approximation of conditional quantiles, which is unavoidable in practice.
2017-06-10T12:42:56Z
Inference for heavy tailed stationary time series based on sliding blocks
http://hdl.handle.net/2003/35988
Title: Inference for heavy tailed stationary time series based on sliding blocks
Authors: Bücher, Axel; Segers, Johan
Abstract: The block maxima method in extreme value theory consists of fitting an extreme value distribution to a sample of block maxima extracted from a time series. Traditionally, the maxima are taken over disjoint blocks of observations. Alternatively, the blocks can be chosen to slide through the observation period, yielding a
larger number of overlapping blocks. Inference based on sliding blocks is found to be more efficient than inference based on disjoint blocks. The asymptotic variance of the maximum likelihood estimator of the Fréchet shape parameter is reduced by more than 18%. Interestingly, the amount of the efficiency gain is the same whatever the serial dependence of the underlying time series: as for disjoint blocks, the asymptotic
distribution depends on the serial dependence only through the sequence of scaling constants. The findings are illustrated by simulation experiments and are applied to the estimation of high return levels of the daily log-returns of the Standard & Poor's 500 stock market index.
2017-06-10T12:40:34Z
Die Gerechtigkeitslücke in der Verteilung der Kosten der Energiewende auf die privaten Haushalte
http://hdl.handle.net/2003/35978
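Title: Die Gerechtigkeitslücke in der Verteilung der Kosten der Energiewende auf die privaten Haushalte [The justice gap in the distribution of the costs of the energy transition among private households]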
Authors: Frondel, Manuel; Kutzschbauch, Ole; Sommer, Stephan; Traub, Stefan
Abstract: The energy transition (Energiewende) places increasing burdens on consumers. Relative to
their income, these burdens are heavier for low-income households
than for high-income households. The results of our empirical
survey of more than 11,000 households show, however, that respondents generally want
the costs of the energy transition to be shared in a way that places a comparatively
greater obligation on high-income households than on low-income
households. The justice gap that we identify on this basis between
the desired and the actual cost burden on households is likely to
widen further as the costs of the energy transition grow. In principle,
however, this gap could easily be closed, as suggested by the empirical
estimates presented in this article of households' willingness to pay
for the promotion of renewables, which are based on discrete-choice models.
Higher-income households could thus be called upon to contribute more to
financing the energy transition than they have so far, since according to our
estimates households in the upper income tercile show statistically
significantly higher approval of future increases in the EEG surcharge than
households in the lower income tercile.
2017-06-01T10:13:43Z
The speed of transition revisited
http://hdl.handle.net/2003/35942
Title: The speed of transition revisited
Authors: Naevdal, Eric; Wagner, Martin
Abstract: The speed of transition literature appears to have overlooked the fact that due to the
dynamic nature of the economy the post-transition economic performance influences optimal
behavior already during transition. We illustrate the implications of this neglect
using the well-known model of Aghion and Blanchard (1994, Section 6.4). The correct
solution differs in several respects from the "approximate" solution presented by Aghion
and Blanchard. First, unemployment is increasing up to a certain endogenous point in
time, when, second, the remaining state sector is closed down. This point in time can be
defined as the end of transition. The correct solution is based on transforming the problem
to a type of a dynamic optimization problem often encountered in resource economics: a
scrap value problem with free terminal time.
2017-05-02T12:17:22Z
Consequentiality and the Willingness-To-Pay for Renewables: Evidence from Germany
http://hdl.handle.net/2003/35937
Title: Consequentiality and the Willingness-To-Pay for Renewables: Evidence from Germany
Authors: Andor, Mark A.; Frondel, Manuel; Horvath, Marco
Abstract: Based on hypothetical responses originating from a large-scale survey among
about 7,000 German households, this study investigates the discrepancy in
willingness-to-pay (WTP) estimates for green electricity across discrete-choice and open-ended valuation
formats, thereby accounting for perceived consequentiality: respondents self-select
into two groups distinguished by their belief in the consequentiality of their
answers for policy making. Recognizing that consequentiality status and WTP might
be jointly influenced by unobservable factors, we employ a switching regression model
that accounts for the potential endogeneity of respondents’ belief in consequences and,
hence, biases from sample selectivity. Contrasting with the received literature, we find
WTP bids that tend to be higher among those respondents who obtained questions
in the open-ended format, rather than single binary choice questions. This difference
shrinks, however, when focusing on individuals who perceive the survey as politically
consequential.
2017-04-25T07:44:39Z
Relevant change points in high dimensional time series
http://hdl.handle.net/2003/35934
Title: Relevant change points in high dimensional time series
Authors: Dette, Holger; Gösmann, Josua
2017-04-19T13:32:46Z
Sequential detection of parameter changes in dynamic conditional correlation models
http://hdl.handle.net/2003/35914
Title: Sequential detection of parameter changes in dynamic conditional correlation models
Authors: Pape, Katharina; Galeano, Pedro; Wied, Dominik
Abstract: A multivariate monitoring procedure is presented to detect changes in the parameter vector of
the dynamic conditional correlation model proposed by Robert Engle in 2002. The benefit of
the proposed procedure is that it can be used to detect changes in both the conditional and
unconditional variance as well as in the correlation structure of the model. The detector is based
on quasi log likelihood scores. More precisely, standardized derivatives of quasi log likelihood
contributions of points in the monitoring period are evaluated at parameter estimates calculated
from a historical period. The null hypothesis of a constant parameter vector is rejected if these
standardized terms differ too much from those that were expected under the assumption of a
constant parameter vector. Under appropriate assumptions on moments and the structure of
the parameter space, limit results are derived both under the null hypothesis and under alternatives. In a
simulation study, size and power properties of the procedure are examined in various scenarios.
2017-04-06T11:27:08Z
Fourier analysis of serial dependence measures
http://hdl.handle.net/2003/35853
Title: Fourier analysis of serial dependence measures
Authors: Van Hecke, Ria; Volgushev, Stanislav; Dette, Holger
Abstract: Classical spectral analysis is based on the discrete Fourier transform of the auto-covariances.
In this paper we investigate the asymptotic properties of new frequency domain methods where the auto-covariances in the spectral density are replaced by alternative dependence measures which can be estimated by U-statistics. An interesting example is given by
Kendall's τ, for which the limiting variance exhibits a surprising behavior.
2017-03-15T11:40:32Z
Cointegration in singular ARMA models
http://hdl.handle.net/2003/35778
Title: Cointegration in singular ARMA models
Authors: Deistler, Manfred; Wagner, Martin
Abstract: We consider the cointegration properties of singular ARMA processes integrated of order one.
Such processes are necessarily cointegrated as opposed to the regular case. We show that in the
left coprime case the cointegrating space only depends upon the autoregressive polynomial at
one.
2017-02-03T11:14:06Z
Risk estimators for choosing regularization parameters in ill-posed problems - properties and limitations
http://hdl.handle.net/2003/35772
Title: Risk estimators for choosing regularization parameters in ill-posed problems - properties and limitations
Authors: Lucka, Felix; Proksch, Katharina; Brune, Christoph; Bissantz, Nicolai; Burger, Martin; Dette, Holger; Wübbeling, Frank
Abstract: This paper discusses the properties of certain risk estimators recently proposed to
choose regularization parameters in ill-posed problems. A simple approach is Stein's unbiased
risk estimator (SURE), which estimates the risk in the data space, while a recent
modification (GSURE) estimates the risk in the space of the unknown variable. It seems
intuitive that the latter is more appropriate for ill-posed problems, since the properties
in the data space do not tell much about the quality of the reconstruction. We provide
theoretical studies of both estimators for linear Tikhonov regularization in a finite
dimensional setting and estimate the quality of the risk estimators, which also leads to
asymptotic convergence results as the dimension of the problem tends to infinity. Unlike
previous papers, which studied image processing problems with a very low degree of
ill-posedness, we are interested in the behavior of the risk estimators for increasing ill-posedness.
Interestingly, our theoretical results indicate that the quality of the GSURE
risk can deteriorate asymptotically for ill-posed problems, which is confirmed by a detailed
numerical study. The latter shows that in many cases the GSURE estimator leads
to extremely small regularization parameters, which obviously cannot stabilize the reconstruction.
Similar but less severe issues with respect to robustness also appear for the
SURE estimator, which in comparison to the rather conservative discrepancy principle
leads to the conclusion that regularization parameter choice based on unbiased risk estimation
is not a reliable procedure for ill-posed problems. A similar numerical study for
sparsity regularization demonstrates that the same issue appears in nonlinear variational
regularization approaches.
2017-02-01T10:36:22Z
Climate change, population ageing and public spending: Evidence on individual preferences
http://hdl.handle.net/2003/35771
Title: Climate change, population ageing and public spending: Evidence on individual preferences
Authors: Andor, Mark; Schmidt, Christoph M.; Sommer, Stephan
Abstract: Economic theory, as well as empirical research, suggest that elderly people
prefer public spending on policies yielding short-term benefits. This might be bad
news for policies aimed at combating climate change: while the unavoidable costs of
these policies arise today, the expected benefits occur in the distant future. Drawing
on data from over 12,000 households and using the ordered logit and the generalized
ordered logit model, we analyze whether attitudes towards climate change and climate
policies, as well as public spending preferences, differ with respect to age. Our
estimates show that elderly people are less concerned about climate change, but more
concerned about other global challenges. Furthermore, they are less likely to support
climate-friendly policies, such as the subsidization of renewables, and allocate less
public resources to environmental policies. Thus, our results suggest that the ongoing
demographic change in industrialized countries may undermine climate policies.
2017-01-31T14:58:48Z
Robust estimation of change-point location
http://hdl.handle.net/2003/35748
Title: Robust estimation of change-point location
Authors: Gerstenberger, Carina
Abstract: We introduce a robust estimator of the location parameter for the change-point in the
mean based on the Wilcoxon statistic and establish its consistency for L1 near epoch
dependent processes. It is shown that the consistency rate depends on the magnitude
of change. A simulation study is performed to evaluate finite sample properties of the
Wilcoxon-type estimator in standard cases, as well as under heavy-tailed distributions and
disturbances by outliers, and to compare it with a CUSUM-type estimator. It shows that
the Wilcoxon-type estimator is equivalent to the CUSUM-type estimator in standard cases,
but outperforms the CUSUM-type estimator in the presence of heavy tails or outliers in the
data.
2017-01-11T09:33:07Z
On MSE-optimal crossover designs
http://hdl.handle.net/2003/35743
Title: On MSE-optimal crossover designs
Authors: Neumann, Christoph; Kunert, Joachim
Abstract: In crossover designs, each subject receives a series of treatments
one after the other. Most papers on optimal crossover designs consider an
estimate which is corrected for carryover effects. We look at the estimate
for direct effects of treatment, which is not corrected for carryover effects.
If there are carryover effects, this estimate will be biased. We try to find a
design that minimizes the mean square error, that is the sum of the squared
bias and the variance. It turns out that the designs which are optimal for
the corrected estimate are highly efficient for the uncorrected estimate.
2017-01-06T12:30:55Z
Ordinal pattern dependence between hydrological time series
http://hdl.handle.net/2003/35732
Title: Ordinal pattern dependence between hydrological time series
Authors: Fischer, Svenja; Schumann, Andreas; Schnurr, Alexander
Abstract: Ordinal patterns provide a method to measure correlation between time series. In
contrast to classical correlation measures like the Pearson correlation coefficient they
are able to measure not only linear correlation but also non-linear correlation even
in the presence of non-stationarity. Hence, they are a noteworthy alternative to the
classical approaches when considering discharge series. Discharge series naturally
show a high variation as well as single extraordinary extreme events and, caused by
anthropogenic and climatic impacts, non-stationary behaviour. Here, the method
of ordinal patterns is used to compare pairwise discharge series derived from macro- and
mesoscale catchments in Germany. Differences of coincident groups were detected
for winter and summer annual maxima. Hydrological series, which are mainly
driven by annual climatic conditions (yearly discharges and low water discharges)
showed other and in some cases surprising interdependencies between macroscale
catchments. Anthropogenic impacts such as the construction of a reservoir or different
flood conditions caused by urbanization could be detected.
2016-12-22T12:29:13Z
A simple test for white noise in functional time series
http://hdl.handle.net/2003/35731
Title: A simple test for white noise in functional time series
Authors: Bagchi, Pramita; Characiejus, Vaidotas; Dette, Holger
Abstract: We propose a new procedure for white noise testing of a functional time series.
Our approach is based on an explicit representation of the L2-distance between the
spectral density operator and its best (L2-)approximation by a spectral density operator
corresponding to a white noise process. The estimation of this distance can be
easily accomplished by sums of periodogram kernels and it is shown that an appropriately
standardized version of the estimator is asymptotically normally distributed
under the null hypothesis (of functional white noise) and under the alternative. As a
consequence we obtain a very simple test (using the quantiles of the normal distribution)
for the hypothesis of a white noise functional process. In particular, the test
requires neither the estimation of a long run variance (including a fourth order
cumulant) nor resampling procedures to calculate critical values. Moreover, in contrast
to all other methods proposed in the literature, our approach also allows testing
for "relevant" deviations from white noise and constructing confidence intervals for
a measure that quantifies the discrepancy of the underlying process from a functional
white noise process.
2016-12-22T12:27:11Z
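The L2-distance described in the abstract above can be made concrete with a standard Hilbert-space projection argument. What follows is only a sketch under assumed notation that does not appear in this listing: $\mathcal{F}_\omega$ stands for the spectral density operator at frequency $\omega \in [-\pi, \pi]$, $\|\cdot\|_2$ for the Hilbert-Schmidt norm, and $G$ for a frequency-independent operator playing the role of a white noise spectral density operator. The paper's own notation and normalisation may differ.

```latex
\begin{align*}
% Minimal integrated squared distance to a frequency-independent operator G (assumed notation)
m^2 &= \min_{G} \int_{-\pi}^{\pi} \|\mathcal{F}_\omega - G\|_2^2 \, d\omega, \\
% The minimiser is the frequency average of the spectral density operators
G^\ast &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \mathcal{F}_\omega \, d\omega, \\
% Expanding the square at G = G^\ast gives an explicit expression
m^2 &= \int_{-\pi}^{\pi} \|\mathcal{F}_\omega\|_2^2 \, d\omega
  - \frac{1}{2\pi} \Big\| \int_{-\pi}^{\pi} \mathcal{F}_\omega \, d\omega \Big\|_2^2 .
\end{align*}
```

Under this reading, $m^2 = 0$ exactly when $\mathcal{F}_\omega$ does not depend on $\omega$ (almost everywhere), which is the white noise case. A hypothesis of no "relevant" deviation would then take the form $m^2 \le \Delta$ for a threshold $\Delta > 0$, where the symbol $\Delta$ is likewise an assumption made here for illustration.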