Eldorado Collection:
http://hdl.handle.net/2003/71
2023-12-07T02:27:49Z
Explainable online ensemble of deep neural network pruning for time series forecasting
http://hdl.handle.net/2003/42086
Title: Explainable online ensemble of deep neural network pruning for time series forecasting
Authors: Saadallah, Amal; Jakobs, Matthias; Morik, Katharina
Abstract: The complex and evolving nature of time series data makes forecasting one of the most challenging tasks in machine learning. Typical methods for forecasting are designed to model time-evolving dependencies between data observations. However, it is generally accepted that none of them are universally valid for every application. Therefore, methods for learning heterogeneous ensembles by combining a diverse set of forecasters appear as a promising solution to tackle this task. While several approaches in the context of time series forecasting have focused on how to combine individual models in an ensemble, ranging from simple and enhanced averaging tactics to applying meta-learning methods, few works have tackled the task of ensemble pruning, i.e., selecting the individual models that take part in the ensemble. In addition, in classical ML literature, ensemble pruning techniques are mostly restricted to operate in a static manner. To deal with changes in the relative performance of models as well as changes in the data distribution, we employ gradient-based saliency maps for online ensemble pruning of deep neural networks. This method consists of generating individual models’ performance saliency maps that are subsequently used to prune the ensemble by taking into account both accuracy and diversity. In addition, the saliency maps can be exploited to provide suitable explanations for the reason behind selecting specific models to construct an ensemble that plays the role of a forecaster at a certain time interval or instant. An extensive empirical study on many real-world datasets demonstrates that our method achieves excellent or on-par results in comparison to the state-of-the-art approaches as well as several baselines. Our code is available on GitHub (https://github.com/MatthiasJakobs/os-pgsm/tree/ecml_journal_2022).
2022-08-02T00:00:00Z
Approaching phase retrieval with deep learning
http://hdl.handle.net/2003/42053
Title: Approaching phase retrieval with deep learning
Authors: Uelwer, Tobias
Abstract: Phase retrieval is the process of reconstructing images from only magnitude measurements. The problem is particularly challenging as most of the information about the image is contained in the missing phase. An important phase retrieval problem is Fourier phase retrieval, where the magnitudes of the Fourier transform are given. This problem is relevant in many areas of science, e.g., in X-ray crystallography, astronomy, microscopy, array imaging, and optics. In addition to Fourier phase retrieval, we also take a closer look at two additional phase retrieval problems: Fourier phase retrieval with a reference image and compressive Gaussian phase retrieval.
Most methods for phase retrieval, e.g., the error-reduction algorithm or Fienup's hybrid input-output algorithm, are optimization-based algorithms that solely minimize an error function to reconstruct the image. These methods usually make strong assumptions about the measured magnitudes, which do not always hold in practice. Thus, they only work reliably for easy instances of the phase retrieval problem but fail drastically for difficult instances.
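To make the error-reduction idea concrete, the following is a minimal one-dimensional sketch, not the implementation used in the thesis. It assumes a real, nonnegative signal with a known support mask; the function names (`dft`, `idft`, `error_reduction`) and the toy signal are hypothetical, and a naive O(n^2) DFT is used only to keep the example dependency-free.

```python
import cmath
import random


def dft(x):
    """Naive discrete Fourier transform (fine for tiny signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def error_reduction(magnitudes, support, iterations=50, seed=0):
    """Alternate between the Fourier magnitude constraint and the object constraint.

    magnitudes: measured Fourier magnitudes (assumed noise-free here)
    support:    list of booleans, True where the signal may be non-zero (assumed known)
    Returns the final estimate and the Fourier-domain error per iteration.
    """
    rng = random.Random(seed)
    n = len(magnitudes)
    # Random start that already satisfies the object constraint.
    x = [rng.random() if support[t] else 0.0 for t in range(n)]
    errors = []
    for _ in range(iterations):
        X = dft(x)
        # Distance between the current magnitudes and the measured ones.
        errors.append(sum((abs(Xk) - m) ** 2 for Xk, m in zip(X, magnitudes)) ** 0.5)
        # Fourier constraint: keep the current phase, impose the measured magnitude.
        X = [m * cmath.exp(1j * cmath.phase(Xk)) for Xk, m in zip(X, magnitudes)]
        y = idft(X)
        # Object constraint: real, nonnegative, zero outside the support.
        x = [max(y_t.real, 0.0) if support[t] else 0.0 for t, y_t in enumerate(y)]
    return x, errors


truth = [0.0, 0.0, 1.0, 2.0, 3.0, 1.0, 0.0, 0.0]
support = [v > 0 for v in truth]
measured = [abs(v) for v in dft(truth)]
estimate, errors = error_reduction(measured, support)
print(errors[0], errors[-1])
```

The recorded error never increases from one iteration to the next, which is the property that gives the algorithm its name; it does not guarantee that the true signal is recovered.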
With the recent advances in the development of graphics processing units (GPUs), deep neural networks (DNNs) have become fashionable again and have led to breakthroughs in many research areas. In this thesis, we show how DNNs can be applied to solve the more difficult instances of phase retrieval problems when training data is available. On the one hand, we show how supervised learning can be used to greatly improve the reconstruction quality when training images and their corresponding measurements are available. We analyze the ability of these methods to generalize to out-of-distribution data. On the other hand, we take a closer look at an existing unsupervised method that relies on generative models. Unsupervised methods are agnostic toward the measurement process, which is particularly useful for Gaussian phase retrieval. We apply this method to the Fourier phase retrieval problem and demonstrate how the reconstruction performance can be further improved with different initialization schemes. Furthermore, we demonstrate how optimizing intermediate representations of the underlying generative model can help overcome the limited range of the model and, thus, can help to reach better solutions. Finally, we show how backpropagation can be used to learn reference images using a modification of the well-established error-reduction algorithm and discuss whether learning a reference image is always efficient. As is common in machine learning research, we evaluate all methods on benchmark image datasets, as this allows for easy reproducibility of the experiments and comparability to related methods. To better understand how the methods work, we perform extensive ablation experiments and also analyze the influence of measurement noise and missing measurements.
2023-01-01T00:00:00Z
Explainable adaptation of time series forecasting
http://hdl.handle.net/2003/41371
Title: Explainable adaptation of time series forecasting
Authors: Saadallah, Amal
Abstract: A time series is a collection of data points captured over time, commonly found in many fields such as healthcare, manufacturing, and transportation. Accurately predicting the future behavior of a time series is crucial for decision-making, and several Machine Learning (ML) models have been applied to solve this task. However, changes in the time series, known as concept drift, can affect model generalization to future data and thus require online adaptive forecasting methods. This thesis aims to extend the State-of-the-Art (SoA) in the ML literature for time series forecasting by developing novel online adaptive methods.
The first part focuses on online time series forecasting, including a framework for selecting time series variables and developing ensemble models that are adaptive to changes in time series data and model performance. Empirical results show the usefulness and competitiveness of the developed methods and their contribution to the explainability of both model selection and ensemble pruning processes. Regarding the second part, the thesis contributes to the literature on online ML model-based quality prediction for three Industry 4.0 applications: NC-milling, bolt installation in the automotive industry, and Surface Mount Technology (SMT) in electronics manufacturing. The thesis shows how process simulation can be used to generate additional knowledge and how such knowledge can be integrated efficiently into the ML process. The thesis also presents two applications of explainable model-based quality prediction and their impact on smart industry practices.
2022-01-01T00:00:00Z
Machine learning for acquiring knowledge in astro-particle physics
http://hdl.handle.net/2003/41174
Title: Machine learning for acquiring knowledge in astro-particle physics
Authors: Bunse, Mirko
Abstract: This thesis explores the fundamental aspects of machine learning, which are involved with acquiring knowledge in the research field of astro-particle physics. This research field substantially relies on machine learning methods, which reconstruct the properties of astro-particles from the raw data that specialized telescopes record. These methods are typically trained from resource-intensive simulations, which reflect the existing knowledge about the particles—knowledge that physicists strive to expand. We study three fundamental machine learning tasks, which emerge from this goal.
First, we address ordinal quantification, the task of estimating the prevalences of ordered classes in sets of unlabeled data. This task emerges from the need for testing the agreement of astro-physical theories with the class prevalences that a telescope observes. To this end, we unify existing methods on quantification, propose an alternative optimization process, and develop regularization techniques to address ordinality in quantification problems, both in and outside of astro-particle physics. These advancements provide more accurate reconstructions of the energy spectra of cosmic gamma ray sources and, hence, support physicists in drawing conclusions from their telescope data.
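As a point of reference for what quantification means, here is a hedged sketch of the classical binary "adjusted count" baseline, which corrects the fraction of positive predictions using the classifier's true and false positive rates. It is not the ordinal method developed in the thesis; the function names and the example rates are assumptions chosen for illustration.

```python
def classify_and_count(predictions):
    """Fraction of items predicted positive (predictions are 0/1)."""
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Estimate the true positive-class prevalence in an unlabeled set.

    tpr, fpr: true/false positive rates of the classifier, assumed to have been
    estimated beforehand on held-out labeled data.
    """
    if tpr == fpr:
        raise ValueError("tpr and fpr must differ for the correction to be defined")
    observed = classify_and_count(predictions)
    # Invert: observed = prevalence * tpr + (1 - prevalence) * fpr
    estimate = (observed - fpr) / (tpr - fpr)
    # Clip to the valid range of a prevalence.
    return min(1.0, max(0.0, estimate))


predictions = [1] * 5 + [0] * 5
print(adjusted_count(predictions, tpr=0.8, fpr=0.2))
```

The correction follows from writing the observed positive rate as a mixture of the true and false positive rates and solving for the prevalence.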
Second, we address learning under class-conditional label noise. More particularly, we focus on a novel setting, in which one of the class-wise noise rates is known and one is not. This setting emerges from a data acquisition protocol, through which astro-particle telescopes simultaneously observe a region of interest and several background regions. We enable learning under this type of label noise with algorithms for consistent, noise-aware decision thresholding. These algorithms yield binary classifiers, which outperform the existing state-of-the-art in gamma hadron classification with the FACT telescope. Moreover, unlike the state-of-the-art, our classifiers are entirely trained from the real telescope data and thus do not require any resource-intensive simulation.
Third, we address active class selection, the task of actively finding those proportions of classes which optimize the classification performance. In astro-particle physics, this task emerges from the simulation, which produces training data in any desired class proportions. We clarify the implications of this setting from two theoretical perspectives, one of which provides us with bounds on the resulting classification performance. We employ these bounds in a certificate of model robustness, which declares a set of class proportions for which the model is accurate with a high probability. We also employ these bounds in an active strategy for class-conditional data acquisition. Our strategy uniquely considers existing uncertainties about those class proportions that have to be handled during the deployment of the classifier, while being theoretically well-justified.
2022-01-01T00:00:00Z
Some representation learning tasks and the inspection of their models
http://hdl.handle.net/2003/41168
Title: Some representation learning tasks and the inspection of their models
Authors: Pfahler, Lukas
Abstract: Today, the field of machine learning knows a wide range of tasks with a wide range
of supervision sources, ranging from the traditional classification tasks with neatly labeled
data, over data with noisy labels to data with no labels, where we have to rely
on other forms of supervision, like self-supervision. In the first part of this thesis, we
design machine learning tasks for applications where we do not immediately have access
to neatly-labeled training data.
First, we design unsupervised representation learning tasks for training embedding
models for mathematical expressions that allow retrieval of related formulae. We train
convolutional neural networks, transformer models and graph neural networks to embed
formulas from scientific articles into a real-valued vector space using contextual similarity
tasks as well as self-supervised tasks. We base our studies on a novel dataset
that consists of over 28 million formulae that we have extracted from scientific articles
published on arXiv.org. We represent the formulas in different input formats — images,
sequences or trees — depending on the embedding model. We compile an evaluation
dataset with annotated search queries from several different disciplines and showcase the
usefulness of our approach for deploying a search engine for mathematical expressions.
Second, we investigate machine learning tasks in astrophysics. Prediction models are
currently trained on simulated data, with hand-crafted features and using multiple single-task
models. In contrast, we build a single multi-task convolutional neural network that
works directly on telescope images and uses convolution layers to learn suitable feature
representations automatically. We design loss functions for each task and propose a
novel way to combine the different loss functions to account for their different scales and
behaviors. Next, we explore another form of supervision that does not rely on simulated
training data, but learns from actual telescope recordings. Through the framework of
noisy label learning, we propose an approach for learning gamma hadron classifiers that
outperforms existing classifiers trained on simulated, fully-labeled data. Our method is
general: it can be used for training models in scenarios that fit our noise assumption of
class-conditional label noise with exactly one known noise probability.
In the second part of this work, we develop methods to inspect models and gain
trust in their decisions. We focus on large, non-linear models that can no longer be
understood in their entirety through plain inspection of their trainable parameters. We
investigate three approaches for establishing trust in models.
First, we propose a method to highlight influential input nodes for similarity computations
performed by graph neural networks. We test this approach with our embedding
models for retrieval of related formulas and show that it can help understand the similarity
scores computed by the models.
Second, we investigate explanation methods that provide explanations based on the
training process that produced the model. This way we provide explanations that are
not merely an approximation of the computation of the prediction function, but actually
an investigation into why the model learned to produce an output grounded in the actual
data. We propose two different methods for tracking the training process and show how
they can be easily implemented within existing deep learning frameworks.
Third, we contribute a method to verify the adversarial robustness of random forest
classifiers. Our method is based on knowledge distillation of a random forest model into
a decision tree model. We bound the approximation error of using the decision tree as
a proxy model to the given random forest model and use these bounds to provide guarantees
on the adversarial robustness of the random forest. Consequently, our robustness
guarantees are approximative, but we can provably control the quality of our results
using a hyperparameter.
2022-01-01T00:00:00Z
Ensemble learning with discrete classifiers on small devices
http://hdl.handle.net/2003/41132
Title: Ensemble learning with discrete classifiers on small devices
Authors: Buschjäger, Sebastian
Abstract: Machine learning has become an integral part of everyday life ranging from applications in AI-powered search queries to (partial) autonomous driving. Many of the advances in machine learning and its application have been possible due to increases in computation power, i.e., by reducing manufacturing sizes while maintaining or even increasing energy consumption. However, 2-3 nm manufacturing is within reach, making further miniaturization increasingly difficult while thermal design power limits are simultaneously reached, rendering entire parts of the chip useless for certain computational loads.
In this thesis, we investigate discrete classifier ensembles as a resource-efficient alternative that can be deployed to small devices that only require small amounts of energy. Discrete classifiers are classifiers that can be applied -- and oftentimes also trained -- without the need for costly floating-point operations. Hence, they are ideally suited for deployment to small devices with limited resources.
The disadvantage of discrete classifiers is that their predictive performance often lags behind that of their floating-point siblings. Here, the combination of multiple discrete classifiers into an ensemble can help to improve the predictive performance while still having a manageable resource consumption.
This thesis studies discrete classifier ensembles from a theoretical point of view, an algorithmic point of view, and a practical point of view. In the theoretical investigation, the bias-variance decomposition and the double-descent phenomenon are examined. The bias-variance decomposition of the mean-squared error is re-visited and generalized to an arbitrary twice-differentiable loss function, which serves as a guiding tool throughout the thesis. Similarly, the double-descent phenomenon is -- for the first time -- studied comprehensively in the context of tree ensembles and specifically random forests. Contrary to established literature, the experiments in this thesis indicate that there is no double-descent in random forests.
While the training of ensembles is well-studied in literature, the deployment to small devices is often neglected. Additionally, the training of ensembles on small devices has not been considered much so far. Hence, the algorithmic part of this thesis focuses on the deployment of discrete classifiers and the training of ensembles on small devices. First, a novel combination of ensemble pruning (i.e., removing classifiers from the ensemble) and ensemble refinement (i.e., re-training of classifiers in the ensemble) is presented, which uses a novel proximal gradient descent algorithm to minimize a combined loss function. The resulting algorithm removes unnecessary classifiers from an already trained ensemble while improving the performance of the remaining classifiers at the same time. Second, this algorithm is extended to the more challenging setting of online learning in which the algorithm receives training examples one by one. The resulting shrub ensembles algorithm allows the training of ensembles in an online fashion while maintaining a strictly bounded memory consumption. It outperforms existing state-of-the-art algorithms under resource constraints and offers competitive performance in the general case.
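The pruning half of this idea can be illustrated with a generic proximal gradient step on ensemble weights. The sketch below is an assumption-laden simplification, not the algorithm from the thesis: it uses a squared loss with an L1 penalty, keeps the member predictions fixed (no refinement), and the names `soft_threshold` and `prune_ensemble` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink towards zero, clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def prune_ensemble(predictions, targets, lam=0.1, step=0.1, iterations=200):
    """Learn sparse ensemble weights with proximal gradient descent.

    predictions[i][j]: output of ensemble member i on example j (kept fixed here)
    targets[j]:        desired output on example j
    Members whose weight reaches exactly 0.0 can be removed from the ensemble.
    """
    m, n = len(predictions), len(targets)
    weights = [1.0 / m] * m
    for _ in range(iterations):
        residual = [sum(weights[i] * predictions[i][j] for i in range(m)) - targets[j]
                    for j in range(n)]
        # Gradient of 0.5 * mean squared error with respect to each weight.
        grad = [sum(residual[j] * predictions[i][j] for j in range(n)) / n
                for i in range(m)]
        # Gradient step on the loss, then proximal step for the L1 penalty.
        weights = [soft_threshold(weights[i] - step * grad[i], step * lam)
                   for i in range(m)]
    return weights


targets = [1.0, 2.0, 3.0, 4.0]
predictions = [[1.0, 2.0, 3.0, 4.0],   # a useful member
               [0.0, 0.0, 0.0, 0.0]]   # a member that contributes nothing
print(prune_ensemble(predictions, targets))
```

In this toy setting the uninformative member ends with weight exactly zero, while the useful member keeps a weight slightly below one because of the penalty.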
Last, this thesis studies the deployment of decision tree ensembles to small devices by optimizing their memory layout. The key insight here is that decision trees have a probabilistic inference time because different observations can take different paths from the root to a leaf.
By estimating the probability of visiting a particular node in the tree, one can place it favorably in the memory to maximize the caching behavior and, thus, increase its performance without changing the model.
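A minimal sketch of this idea follows, under the assumption that visit probabilities are estimated empirically from sample data and that "favorable placement" simply means storing frequently visited nodes first. The tree encoding, the function names, and the sample data are all hypothetical; the thesis's actual layout strategy may differ.

```python
from collections import Counter

# Toy tree: node id -> (feature index, threshold, left child, right child); None marks a leaf.
TREE = {0: (0, 0.5, 1, 2), 1: None, 2: (1, 0.3, 3, 4), 3: None, 4: None}


def visited_nodes(tree, x):
    """Return the ids of all nodes on the root-to-leaf path taken by observation x."""
    node, path = 0, [0]
    while tree[node] is not None:
        feature, threshold, left, right = tree[node]
        node = left if x[feature] <= threshold else right
        path.append(node)
    return path


def visit_probabilities(tree, samples):
    """Empirical probability that an observation passes through each node."""
    counts = Counter()
    for x in samples:
        counts.update(visited_nodes(tree, x))
    return {node: counts[node] / len(samples) for node in tree}


def memory_order(tree, samples):
    """Order nodes so that the most frequently visited ones come first in memory."""
    probs = visit_probabilities(tree, samples)
    return sorted(tree, key=lambda node: (-probs[node], node))


samples = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.9), (0.1, 0.5)]
print(memory_order(TREE, samples))  # [0, 2, 3, 1, 4]
```

The model itself is unchanged; only the order in which its nodes would be laid out changes.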
Finally, several real-world applications of tree ensembles and Binarized Neural Networks are presented.
2022-01-01T00:00:00Z
Randomized outlier detection with trees
http://hdl.handle.net/2003/40232
Title: Randomized outlier detection with trees
Authors: Buschjäger, Sebastian; Honysz, Philipp-Jan; Morik, Katharina
Abstract: Isolation forest (IF) is a popular outlier detection algorithm that isolates outlier observations from regular observations by building multiple random isolation trees. The average number of comparisons required to isolate a given observation can then be used as a measure of its outlierness. Multiple extensions of this approach have been proposed in the literature including the extended isolation forest (EIF) as well as the SCiForest. However, we find a lack of theoretical explanation of why IF, EIF, and SCiForest offer such good practical performance. In this paper, we present a theoretical framework that views these approaches from a distributional viewpoint. Using this viewpoint, we show that isolation-based approaches first accurately approximate the data distribution and then approximate the coefficients of mixture components using the average path length. Using this framework, we derive the generalized isolation forest (GIF) that also trains random isolation trees, but combines them in a way that moves beyond the average path length. That is, GIF splits the data into multiple sub-spaces by sampling random splits as the original IF variants do and directly estimates the mixture coefficients of a mixture distribution to score the outlierness on entire regions of data. In an extensive evaluation, we compare GIF with 18 state-of-the-art outlier detection methods on 14 different datasets. We show that GIF outperforms three competing tree-based methods and performs competitively with other nearest-neighbor approaches while having a lower runtime. Last, we highlight a use-case study that uses GIF to detect transaction fraud in financial data.
2020-12-15T00:00:00Z
Geospatial IoT—the need for event-driven architectures in contemporary spatial data infrastructures
http://hdl.handle.net/2003/38397
Title: Geospatial IoT—the need for event-driven architectures in contemporary spatial data infrastructures
Authors: Rieke, Matthes; Bigagli, Lorenzo; Herle, Stefan; Jirka, Simon; Kotsev, Alexander; Liebig, Thomas; Malewski, Christian; Paschke, Thomas; Stasch, Christoph
Abstract: The nature of contemporary spatial data infrastructures lies in the provision of geospatial information in an on-demand fashion. Although recent applications identified the need to react to real-time information in a time-critical way, research efforts in the field of geospatial Internet of Things in particular have identified substantial gaps in this context, ranging from a lack of standardisation for event-based architectures to the meaningful handling of real-time information as “events”. This manuscript presents work in the field of event-driven architectures as part of spatial data infrastructures with a particular focus on sensor networks and the devices capturing in-situ measurements. The current landscape of spatial data infrastructures is outlined and used as the basis for identifying existing gaps that keep certain geospatial applications from using real-time information. We present a selection of approaches, developed in different research projects, to overcome these gaps. Although designed for specific application domains, these approaches share commonalities as well as orthogonal solutions and can build the foundation of an overall event-driven spatial data infrastructure.
2018-09-25T00:00:00Z
How is a data-driven approach better than random choice in label space division for multi-label classification?
http://hdl.handle.net/2003/38382
Title: How is a data-driven approach better than random choice in label space division for multi-label classification?
Authors: Szymanski, Piotr; Kajdanowicz, Tomasz; Kersting, Kristian
Abstract: We propose using five data-driven community detection approaches from social networks to partition the label space in the task of multi-label classification as an alternative to random partitioning into equal subsets as performed by RAkELd. We evaluate modularity-maximizing using fast greedy and leading eigenvector approximations, infomap, walktrap and label propagation algorithms. For this purpose, we propose to construct a label co-occurrence graph (both weighted and unweighted versions) based on training data and perform community detection to partition the label set. Then, each partition constitutes a label space for separate multi-label classification sub-problems. As a result, we obtain an ensemble of multi-label classifiers that jointly covers the whole label space. Based on the binary relevance and label powerset classification methods, we compare community detection methods to label space divisions against random baselines on 12 benchmark datasets over five evaluation measures. We discover that data-driven approaches are more efficient and more likely to outperform RAkELd than binary relevance or label powerset is, in every evaluated measure. For all measures, apart from Hamming loss, data-driven approaches are significantly better than RAkELd (α = 0.05), and at least one data-driven approach is more likely to outperform RAkELd than a priori methods in the case of RAkELd’s best performance. This is the largest RAkELd evaluation published to date, with 250 samplings per value for 10 values of the RAkELd parameter k on 12 datasets.
2016-07-30T00:00:00Z
A mathematical theory of making hard decisions: model selection and robustness of matrix factorization with binary constraints
http://hdl.handle.net/2003/38270
Title: A mathematical theory of making hard decisions: model selection and robustness of matrix factorization with binary constraints
Authors: Heß, Sibylle Charlotte
Abstract: One of the first and most fundamental tasks in machine learning is to group observations within a dataset. Given a notion of similarity, finding those instances which are outstandingly similar to each other has manifold applications. Recommender systems and topic analysis in text data are examples which are most intuitive to grasp. The interpretation of the groups, called clusters, is facilitated if the assignment of samples is definite. Especially in high-dimensional data, denoting a degree to which an observation belongs to a specified cluster requires a subsequent processing of the model to filter the most important information. We argue that a good summary of the data provides hard decisions on the following question: how many groups are there, and which observations belong to which clusters? In this work, we contribute to the theoretical and practical background of clustering tasks, addressing one or both aspects of this question. Our overview of state-of-the-art clustering approaches details the challenges of our ambition to provide hard decisions. Based on this overview, we develop new methodologies for two branches of clustering: the one concerns the derivation of nonconvex clusters, known as spectral clustering; the other addresses the identification of biclusters, a set of samples together with similarity defining features, via Boolean matrix factorization. One of the main challenges in both considered settings is the robustness to noise. Assuming that the issue of robustness is controllable by means of theoretical insights, we have a closer look at those aspects of established clustering methods which lack a theoretical foundation. In the scope of Boolean matrix factorization, we propose a versatile framework for the optimization of matrix factorizations subject to binary constraints. 
Boolean factorizations in particular have so far been computed by intuitive methods that implement greedy heuristics and lack quality guarantees for the obtained solutions. In contrast, we propose to build upon recent advances in nonconvex optimization theory. This enables us to provide convergence guarantees to local optima of a relaxed objective, requiring only approximately binary factor matrices. By means of this new optimization scheme PAL-Tiling, we propose two approaches to automatically determine the number of clusters. The one is based on information theory, employing the minimum description length principle, and the other is a novel statistical approach, controlling the false discovery rate. The flexibility of our framework PAL-Tiling enables the optimization of novel factorization schemes. In a different context, where every data point belongs to a pre-defined class, a characterization of the classes may be obtained by Boolean factorizations. However, there are cases where this traditional factorization scheme is not sufficient. Therefore, we propose the integration of another factor matrix, reflecting class-specific differences within a cluster. Our theoretical considerations are complemented by empirical evaluations, showing how our methods combine theoretical soundness with practical advantages.
2018-01-01T00:00:00Z
Exponential families on resource-constrained systems
http://hdl.handle.net/2003/36877
Title: Exponential families on resource-constrained systems
Authors: Piatkowski, Nico Philipp
Abstract: This work is about the estimation of exponential family models on resource-constrained
systems. Our main goal is learning probabilistic models on devices with highly restricted
storage, arithmetic, and computational capabilities, so-called ultra-low-power
devices. Enhancing the learning capabilities of such devices opens up opportunities for
intelligent ubiquitous systems in all areas of life, from medicine, over robotics, to home
automation—to mention just a few. We investigate the inherent resource consumption of
exponential families, review existing techniques, and devise new methods to reduce the
resource consumption. The resource consumption, however, must not be reduced at all
cost. Exponential families possess several desirable properties that must be preserved:
Any probabilistic model encodes a conditional independence structure—our methods
keep this structure intact. Exponential family models are theoretically well-founded.
Instead of merely finding new algorithms based on intuition, our models are formalized
within the framework of exponential families and derived from first principles. We do
not introduce new assumptions which are incompatible with the formal derivation of the
base model, and our methods do not rely on properties of particular high-level applications.
To reduce the memory consumption, we combine and adapt reparametrization
and regularization in an innovative way that facilitates the sparse parametrization of
high-dimensional non-stationary time-series. The procedure allows us to load models in
memory constrained systems, which would otherwise not fit. We provide new theoretical
insights and prove that the uniform distance between the data generating process
and our reparametrized solution is bounded. To reduce the arithmetic complexity of
the learning problem, we derive the integer exponential family, based on the very definition
of sufficient statistics and maximum entropy estimation. New integer-valued
inference and learning algorithms are proposed, based on variational inference, proximal
optimization, and regularization. The weaker the underlying system, the larger the
benefit of this technique; e.g., probabilistic inference on a state-of-the-art ultra-low-power
microcontroller can be accelerated by a factor of 250. While our integer inference
is fast, the underlying message passing relies on the variational principle, which is inexact
and has unbounded error on general graphs. Since exact inference and other existing
methods with bounded error exhibit exponential computational complexity, we employ
near minimax optimal polynomial approximations to yield new stochastic algorithms
for approximating the partition function and the marginal probabilities. Changing the
polynomial degree allows us to control the complexity and the error of our new stochastic
method. We provide an error bound that is parametrized by the number of samples, the
polynomial degree, and the norm of the model’s parameter vector. Moreover, important
intermediate quantities can be precomputed and shared with the weak computational device
to reduce the resource requirement of our method even further. All new techniques
are empirically evaluated on synthetic and real-world data, and the results confirm the
properties which are predicted by our theoretical derivation. Our novel techniques allow
a broader range of models to be learned on resource-constrained systems and imply
several new research possibilities.
2018-01-01T00:00:00Z
Distributed analysis of vertically partitioned sensor measurements under communication constraints
http://hdl.handle.net/2003/35815
Title: Distributed analysis of vertically partitioned sensor measurements under communication constraints
Authors: Stolpe, Marco
Abstract: Nowadays, large amounts of data are automatically generated by devices and sensors. They measure, for instance, parameters of production processes, environmental conditions of transported goods, energy consumption of smart homes, traffic volume, air pollution and water consumption, or pulse and blood pressure of individuals. The collection and transmission of data is enabled by electronics, software, sensors and network connectivity embedded into physical objects. The objects and infrastructure connecting such objects are called the Internet of Things (IoT). In 2010, already 12.5 billion devices were connected to the IoT, a number about twice as large as the world's population at that time. The IoT provides us with data about our physical environment, at a level of detail never known before in human history. Understanding such data creates opportunities to improve our way of living, learning, working, and entertaining. For instance, the information obtained from data analysis modules embedded into existing processes could help their optimization, leading to more sustainable systems which save resources in sectors such as manufacturing, logistics, energy and utilities, the public sector, or healthcare.
The IoT's inherently distributed nature, the resource constraints and dynamism of its networked participants, as well as the amounts and diverse types of data collected, challenge even the most advanced automated data analysis methods known today. Currently, there is a strong research focus on the centralization of all data in the cloud, processing it according to the paradigm of parallel high-performance computing. However, the resources of devices and sensors on the data-generating side might not suffice to transmit all data. For instance, pervasive distributed systems such as wireless sensor networks are highly communication-constrained, as are high-throughput streaming applications, or those where data volumes are simply too large to be sent over existing communication lines, like satellite connections. Hence, the IoT requires a new generation of distributed algorithms which are resource-aware and intelligently reduce the amount of data transmitted and processed throughout the analysis chain.
This thesis deals with the distributed analysis of vertically partitioned sensor measurements under communication constraints, which is a particularly challenging scenario. Here, it is not the observations that are distributed over nodes, but their feature values. The learning of accurate prediction models may require the combination of information from different nodes, necessarily leading to communication. The main question is how to design communication-efficient algorithms for this scenario, while at the same time preserving sufficient accuracy.
The first part of the thesis introduces fundamental concepts. An overview of the IoT and its many applications is given, with a special focus on data analysis, the vertically partitioned data scenario, and accompanying research questions. Then, basic notions of machine learning and data mining are introduced. A selection of existing distributed data mining approaches is presented and discussed in more detail. Distributed learning in the vertically partitioned data scenario is then motivated by a smart manufacturing case study. In a hot rolling mill, different machines assess parameters describing the processing of single steel blocks, whose quality should be predicted as early as possible, by analysis of distributed measurements. Each machine produces not a single value series, but many of them. Their heterogeneity leads to challenging questions concerning the steps of preprocessing and finding a good representation for learning, for which solutions are proposed. Another problem is that quality information is not given for individual blocks, but only for charges of blocks. How can we nevertheless predict the quality of individual blocks? Time constraints lead to questions typical for the vertically partitioned data scenario. Which data should be analyzed locally, to match the constraints, and which should be sent to a central server?
Learning from aggregated label information is a relatively novel problem in machine learning research. A new algorithm for the task is developed and evaluated, the Learning from Label Proportions by Clustering (LLPC) algorithm. The algorithm's performance is compared to three other state-of-the-art approaches, in terms of accuracy and running time. It can be shown that LLPC achieves results with lower running time, while accuracy is comparable to that of its competitors, or significantly higher. The proposed algorithm comes with many other benefits, like ease of implementation and a small memory footprint.
For highly decentralized systems, the Training of Local Models from (Label) Counts (TLMC) algorithm is proposed. The method builds on LLPC, reducing communication by transferring only label counts for batches of observations between nodes. Feasibility of the approach is demonstrated by evaluating the algorithm's performance in the context of traffic flow prediction. It is shown that TLMC is much more communication-efficient than centralization of all data, but that accuracy can nevertheless compete with that of a centrally trained global model.
Finally, a communication-efficient distributed algorithm for anomaly detection is proposed, the Vertically Distributed Core Vector Machine (VDCVM). It can be shown that the proposed algorithm communicates up to an order of magnitude less data during learning, in comparison to another state-of-the-art approach, or training a global model by the centralization of all data. Nevertheless, in many relevant cases, the VDCVM achieves similar or even higher accuracy on several controlled and benchmark datasets.
A main result of the thesis is that communication-efficient learning is possible in cases where features from different nodes are conditionally independent, given the target value to be predicted. Most efficient are local models, which exchange label information between nodes. In comparison to consensus algorithms, which transmit labels repeatedly, TLMC sends labels only once between nodes. Communication could be reduced even further by learning from counts of labels. In the context of traffic flow prediction, the accuracy achieved is still sufficient in comparison to centralizing all data and training a global model. In the case of anomaly detection, similar results could be achieved by utilizing a sampling approach which draws only as many observations as needed to reach a (1+ε)-approximation of the minimum enclosing ball (MEB). The developed approaches have many applications in communication-constrained settings, in the sectors mentioned above. It has been shown that data can be reduced and learned from before it even enters the cloud. Decentralized processing might thus enable the analysis of big data masses as more and more devices are connected to the IoT.2017-01-01T00:00:00ZAutomatic methods to extract latent meanings in large text corpora
http://hdl.handle.net/2003/35753
Title: Automatic methods to extract latent meanings in large text corpora
Authors: Pölitz, Christian
Abstract: This thesis concentrates on Data Mining in Corpus Linguistics. We show the use of modern Data Mining by developing efficient and effective methods for research and teaching in Corpus Linguistics in the fields of lexicography and semantics. Modern language resources, as provided by the Common Language Resources and Technology Infrastructure (http://clarin.eu), offer a large number of heterogeneous information resources of written language. Besides large text corpora, additional information about the sources or publication dates of the documents in the corpora is available. Further, information about words from dictionaries or WordNets offers prior information on the word distributions. Starting with pre-studies in lexicography and semantics with large text corpora, we investigate the use of latent variable methods to extract hidden concepts in large text collections. We show that these hidden concepts correspond to meanings of words and subjects in text collections. This motivates an investigation of latent variable methods for large corpora to support linguistic research. In an extensive survey, latent variable models are described. Mathematical and geometrical foundations are explained to motivate the latent variable methods. We distinguish two starting points for latent variable models depending on how we represent documents internally. The first representation is based on geometric objects in a vector space, and latent variables are represented by vectors. Latent factor models are described to extract latent variables by finding factorizations of matrices summarizing the document objects. The second representation is based on random sequences, and the latent variables are random variables on which the sequences conditionally depend. Latent topic models are described to extract latent variables by finding conditionally dependent variables. We explain state-of-the-art methods for factor and topic models.
To show the quality and hence the use of latent variable methods for corpus linguistics, different evaluation methods are discussed. Qualitative evaluation methods are described to effectively present the results of the latent variable methods to users. State-of-the-art quantitative evaluation methods are summarized to illustrate how to measure the quality of latent variable methods automatically. Additionally, we propose new methods to efficiently estimate the quality of latent variable methods for corpora with time information about the documents. Besides standard evaluation methods based on likelihoods and coherences of the extracted hidden concepts, we develop methods to estimate the coherence of the concepts in terms of temporal aspects and likelihoods that include time.
Based on the survey on latent variable methods, we interpret the latent variable methods as an optimization problem that finds latent variables to optimally describe the document corpus. To efficiently integrate additional information about a corpus from modern language resources, we propose to extend the optimization for the latent variables with a regularization that includes this additional information. In terms of the different latent variable models, regularizations are proposed to either align latent factors or jointly model latent topics with information about the documents in the corpus.
From pre-studies and collaborations with researchers from corpus linguistics, we compiled use cases to investigate the regularized latent variable methods for linguistic research and teaching. Two major applications are investigated. In diachronic linguistics, we show efficient regularized latent topic models to jointly model latent variables with time stamps from documents. In variety linguistics, we integrate information about the sources of the documents to model similarities and dissimilarities between corpora.
Finally, a software package, developed as a plugin for the Data Mining toolkit RapidMiner to implement the methods from the thesis, is described. The interfaces to the language resources and text corpora, the text processing methods, the latent variable methods and the evaluation methods are specified. We give detailed information about how the software is used on the use cases. The integration of the developed methods in modern language resources like WebLicht or the Dictionary of the German Languages is explained to show the acceptance of our method in corpus linguistics research and teaching.2016-01-01T00:00:00ZGraphical models beyond standard settings: lifted decimation, labeling, and counting
http://hdl.handle.net/2003/35295
Title: Graphical models beyond standard settings: lifted decimation, labeling, and counting
Authors: Hadiji, Fabian
Abstract: With increasing complexity and growing problem sizes in AI and Machine Learning, inference and learning are still major issues in Probabilistic Graphical Models (PGMs). On the other hand, many problems are specified in such a way that symmetries arise from the underlying model structure. Exploiting these symmetries during inference, which is referred to as "lifted inference", has led to significant efficiency gains. This thesis provides several enhanced versions of known algorithms that are shown to be liftable as well, and thereby applies lifting in "non-standard" settings. By doing so, the understanding of the applicability of lifted inference and lifting in general is extended. Among various other experiments, it is shown how lifted inference in combination with an innovative Web-based data harvesting pipeline is used to label author-paper-pairs with geographic information in online bibliographies. This results in a large-scale transnational bibliography containing affiliation information over time for roughly one million authors. Analyzing this dataset reveals the importance of understanding count data. Although counting is done literally everywhere, mainstream PGMs have widely been neglecting count data. In the case where the ranges of the random variables are defined over the natural numbers, crude approximations to the true distribution are often made by discretization or a Gaussian assumption. To handle count data, Poisson Dependency Networks (PDNs) are introduced, which represent a new class of non-standard PGMs naturally handling count data.2015-01-01T00:00:00ZMining big data streams for multiple concepts
http://hdl.handle.net/2003/34363
Title: Mining big data streams for multiple concepts
Authors: Bockermann, Christian2015-01-01T00:00:00ZAbout the exploration of data mining techniques using structured features for information extraction
http://hdl.handle.net/2003/29487
Title: About the exploration of data mining techniques using structured features for information extraction
Authors: Jungermann, Felix
Abstract: The World Wide Web is a huge source of information. The amount of information available in the World Wide Web grows every day. It is impossible to handle this amount of information by hand. Special techniques have to be used to deliver smaller excerpts of information which become manageable. Unfortunately, such techniques, search engines for instance, deliver only a certain view of the information's original appearance. The delivered information is present in various types of files like websites, text documents, video clips, audio files and the like. The extraction of relevant and interesting pieces of information out of these files is very complex and time-consuming. Special techniques which allow for an automatic extraction of interesting informational units are analyzed in this work. Such techniques are based on Machine Learning methods. In contrast to traditional Machine Learning tasks, the processing of text documents in this context requires particular techniques. The structure of natural language contained in text documents poses constraints which should be respected by the Machine Learning method. These constraints and the specially tuned methods respecting them are another important aspect of this work. After defining all needed formalisms of Machine Learning which are used in this work, I present multiple approaches of Machine Learning applicable to the field of Information Extraction. I describe the historical development from the first approaches of Information Extraction through Named Entity Recognition to Relation Extraction. The possibilities of using linguistic resources for the creation of feature sets for Information Extraction purposes are presented. I show how Relation Extraction is formally defined, and I additionally show what kinds of methods are used for Relation Extraction in Machine Learning.
I focus on Relation Extraction techniques which benefit on the one hand from minimum optimization and on the other hand from efficient data structures. Most of the experiments and implementations described in this work were done using RapidMiner, the open source framework for Data Mining. To apply this framework to Information Extraction tasks, I developed an extension called Information Extraction Plugin, which is described in detail. Finally, I present applications which explicitly benefit from the collaboration of Data Mining and Information Extraction.2012-06-26T00:00:00ZResource-aware annotation through active learning
http://hdl.handle.net/2003/27172
Title: Resource-aware annotation through active learning
Authors: Tomanek, Katrin
Abstract: The annotation of corpora has become a crucial prerequisite for
information extraction systems which heavily rely on supervised
machine learning techniques and therefore require large amounts of
annotated training material. Annotation, however, requires human
intervention and is thus an extremely costly, labor-intensive, and
error-prone process. The burden of annotation is one of the major
obstacles when well-established information extraction systems are to
be applied to real-world problems and so a pressing research question
is how annotation can be made more efficient.
Most annotated corpora are built by collecting the documents to be
annotated on a random sampling basis or based on simple keyword
search. Only recently have more sophisticated approaches to selecting
the base material in order to reduce annotation effort been
investigated. One promising direction is known as Active Learning (AL)
where only examples of high utility for classifier training are
selected for manual annotation. Because of this intelligent selection,
classifiers of a certain target performance can be obtained with fewer
labeled data points.
This thesis centers around the question of how AL can be applied as a
resource-aware strategy for linguistic annotation. A set of
requirements is defined and several approaches and adaptations to the
standard form of AL are proposed to meet these requirements. This
includes: (1) a novel method to monitor and stop the AL-driven
annotation process; (2) an approach to semi-supervised AL where only
highly critical tokens have to actually be manually annotated while
the rest is automatically tagged; (3) a discussion and empirical
investigation of the reusability of actively drawn samples; (4) a
comparative study of how class imbalance can be reduced up front
during AL-driven data acquisition; (5) two methods for selective
sampling of examples which are useful for multiple learning problems;
(6) an extensive evaluation of the proposed approaches to AL for Named
Entity Recognition with respect to both savings in corpus size and
actual annotation time; and finally (7) three methods by which these
approaches can be made cost-conscious so as to reduce annotation time
even more.2010-05-12T15:56:32ZNon-convex and multi-objective optimization in data mining
http://hdl.handle.net/2003/26104
Title: Non-convex and multi-objective optimization in data mining
Authors: Mierswa, Ingo2009-05-07T09:39:56ZDistributed collaborative structuring
http://hdl.handle.net/2003/25745
Title: Distributed collaborative structuring
Authors: Wurst, Michael
Abstract: Making Inter- and Intranet resources available in a structured way is one of the most important and challenging problems today. An underlying structure allows users to search for information, documents or relationships without a clearly defined information need. While search and filtering technology is becoming more and more powerful, the development of such explorative access methods lags behind. This work is concerned with the development of large-scale data mining methods that allow information spaces to be structured based on loosely coupled user annotations and navigation patterns. An essential challenge that has not yet been fully recognized in this context is heterogeneity. Different users and user groups often have different preferences and needs on how to access an information collection. While current Business Intelligence, Information Retrieval or Content Management solutions allow for a certain degree of personalization, these approaches are still very static. This considerably limits their applicability in heterogeneous environments. This work is based on a novel paradigm, called collaborative structuring. This term is chosen as a generalization of the term collaborative filtering. Instead of only filtering items, collaborative structuring allows users to organize information spaces in a loosely coupled way, based on patterns emerging through data mining. A first contribution of the work is to define the conceptual notion of collaborative structuring as a combinatorial optimization problem and to put it into relation with existing research in the areas of data and web mining. As a second contribution, highly scalable, distributed optimization strategies are proposed and analyzed. Finally, the proposed approaches are quantitatively evaluated against existing methods using several real-world data sets.
Also, practical experience from two application areas is given, namely information access for heterogeneous expert communities and collaborative media organization.2008-07-17T09:43:44ZKnowledge discovery in databases at a conceptual level
http://hdl.handle.net/2003/24945
Title: Knowledge discovery in databases at a conceptual level
Authors: Euler, Timm
Abstract: Knowledge Discovery in Databases (KDD) is a nontrivial process
centered around one or more applications of a Machine Learning
algorithm to real world data. Steps leading towards this central
step prepare the examples from which the algorithm learns, and
thus create the example representation language. Steps following
the central step may deploy the learned results to new data. In
this thesis, the complete process is described from a conceptual
view, and the MiningMart software is presented which supports the
whole process, but puts its focus on data preparation for KDD. This
preparation phase is the most time-consuming part of the process,
and is comprehensively supported in new ways by the contributions
towards MiningMart made in this thesis. The result is greatly
reduced user effort for rapid prototyping, modelling, execution,
publication and re-use of KDD processes.2008-01-15T10:14:14ZScalable and accurate knowledge discovery in real world databases
http://hdl.handle.net/2003/24813
Title: Scalable and accurate knowledge discovery in real world databases
Authors: Scholz, Martin2007-10-31T13:57:43ZLearning interpretable models
http://hdl.handle.net/2003/23008
Title: Learning interpretable models
Authors: Rüping, Stefan
Abstract: Interpretability is an important, yet often neglected criterion when
applying machine learning algorithms to real-world tasks. An
understandable model enables users to gain more knowledge from their
data and to participate in the knowledge discovery process in a more
detailed way. Hence, learning interpretable models is a challenging
task, whose complexity comes from the problems that interpretability is
a fuzzy, subjective concept and human mental capabilities are in some
ways astonishingly limited. At the same time, interpretability is a
critical problem, because it is crucial for problems that cannot be
solved purely automatically.
The work presented in this thesis is structured along the three
dimensions of understandability, accuracy, and efficiency. It contains
contributions on the levels of the optimization of the interpretability
of a learner with and without knowledge of its internals (white box and
black box approach), the description of a model's errors by local
patterns and the improvement of global models with local models.
Starting from an analysis of the requirements for and measures of
interpretability in the context of knowledge discovery, diverse possible
approaches of generating understandable models are investigated, with a
particular focus on interpretable Support Vector Machines and local
effects in the data. Problems of existing techniques and ad-hoc
approaches to understandability optimization are analyzed and improved
algorithms are developed.2006-10-20T13:06:43ZAn interpretative approach to the model driven development of web applications
http://hdl.handle.net/2003/22277
Title: An interpretative approach to the model driven development of web applications
Authors: Haustein, Stefan
Abstract: The increasing size and complexity of web applications have led to a situation where the traditional approach of creating and managing a set of plain HTML files is inappropriate in many cases. Consistency in structure, look and feel, and hyperlinks needs to be maintained, and support for different content formats may be required. The combination of XML Schema, XML and XSLT is able to improve this situation, but the expressive power of XML Schema is insufficient for application domains where more than a purely hierarchical structure is required.
In this work, we have chosen the XML toolchain as a guideline to construct an alternative basis for web information systems at a higher level of abstraction, namely UML class diagrams. We have identified a UML counterpart or implemented a substitute for each constituent of the XML processing chain, showing that it is possible to build a consistent UML-based system for model driven web applications.
Since our approach is based on model interpretation, a system prototype can be created by simply drawing a conceptual model in the form of a UML class diagram---a step that is required in the relevant development methodologies anyway. By making this first step immediately operational without any compilation or transformation steps, the gap between web development methodologies and actual system implementation has been narrowed significantly.2006-04-07T10:25:01ZAutomatisierte Merkmalsextraction aus Audiodaten
http://hdl.handle.net/2003/21425
Title: Automatisierte Merkmalsextraction aus Audiodaten
Authors: Mierswa, Ingo2005-05-31T00:00:00ZEntdeckung von funktionalen Abhängigkeiten und unären Inklusionsabhängigkeiten in relationalen Datenbanken
http://hdl.handle.net/2003/2634
Title: Entdeckung von funktionalen Abhängigkeiten und unären Inklusionsabhängigkeiten in relationalen Datenbanken
Authors: Brockhausen, Peter2003-12-12T00:00:00ZmAGENTa
http://hdl.handle.net/2003/2633
Title: mAGENTa
Authors: Bullerdick, Kai; Ritthoff, Oliver2003-12-12T00:00:00ZErweiterung des EMILE-Verfahrens zum induktiven Lernen von kontextfreien Grammatiken für natürliche Sprache
http://hdl.handle.net/2003/2631
Title: Erweiterung des EMILE-Verfahrens zum induktiven Lernen von kontextfreien Grammatiken für natürliche Sprache
Authors: Dörnenburg, Erik2003-12-12T00:00:00ZAnwendung von Data Mining Verfahren auf Datenbestände der Bauwirtschaft
http://hdl.handle.net/2003/2632
Title: Anwendung von Data Mining Verfahren auf Datenbestände der Bauwirtschaft
Authors: Chliapnikov, Victor2003-12-12T00:00:00ZAnwendung eines Data Mining-Verfahrens auf Versicherungsvertragsdaten
http://hdl.handle.net/2003/2630
Title: Anwendung eines Data Mining-Verfahrens auf Versicherungsvertragsdaten
Authors: Fisseler, Jens2003-12-12T00:00:00ZAuffinden interessanter Wertebereiche in Datenbankattributen
http://hdl.handle.net/2003/2629
Title: Auffinden interessanter Wertebereiche in Datenbankattributen
Authors: Franzel, Christian2003-12-12T00:00:00ZLernen intervallbezogener Merkmale eines mobilen Roboters
http://hdl.handle.net/2003/2628
Title: Lernen intervallbezogener Merkmale eines mobilen Roboters
Authors: Goebel, Michael2003-12-12T00:00:00ZGraphbasierte Navigation innerhalb von Gebäuden für einen autonomen Roboter
http://hdl.handle.net/2003/2627
Title: Graphbasierte Navigation innerhalb von Gebäuden für einen autonomen Roboter
Authors: Haustein, Stefan2003-12-12T00:00:00ZInformationsextraktion aus Freitext-Einträgen einer Datenbank
http://hdl.handle.net/2003/2626
Title: Informationsextraktion aus Freitext-Einträgen einer Datenbank
Authors: Hölscher, Markus2003-12-12T00:00:00ZBenutzergeführtes Lernen von Dokument-Strukturauszeichnungen aus Formatierungsmerkmalen
http://hdl.handle.net/2003/2625
Title: Benutzergeführtes Lernen von Dokument-Strukturauszeichnungen aus Formatierungsmerkmalen
Authors: Hüppe, Christian2003-12-12T00:00:00ZEinsatz eines intelligenten, lernenden Agenten für das World Wide Web
http://hdl.handle.net/2003/2624
Title: Einsatz eines intelligenten, lernenden Agenten für das World Wide Web
Authors: Joachims, Thorsten2003-12-12T00:00:00ZRule Set Quality Measures For Inductive Learning Algorithms
http://hdl.handle.net/2003/2623
Title: Rule Set Quality Measures For Inductive Learning Algorithms
Authors: Klinkenberg, Ralf2003-12-12T00:00:00ZEntwicklung und Realisierung eines Konzepts zum Lesen der Beschriftung von Mikrochips
http://hdl.handle.net/2003/2622
Title: Entwicklung und Realisierung eines Konzepts zum Lesen der Beschriftung von Mikrochips
Authors: Knappmann, Stefan2003-12-12T00:00:00ZPositionsvorhersage von bewegten Objekten in großformatigen Bildsequenzen
http://hdl.handle.net/2003/2621
Title: Positionsvorhersage von bewegten Objekten in großformatigen Bildsequenzen
Authors: Loos, Hartmut S.2003-12-12T00:00:00ZEntwicklung eines Agenten zur Unterstützung der Umformatierung von Literaturhinweisen
http://hdl.handle.net/2003/2620
Title: Entwicklung eines Agenten zur Unterstützung der Umformatierung von Literaturhinweisen
Authors: Mansfeld, Eckart2003-12-12T00:00:00ZURLChecker
http://hdl.handle.net/2003/2619
Title: URLChecker
Authors: Masloch, André2003-12-12T00:00:00ZMenschliches und Maschinelles Lernen
http://hdl.handle.net/2003/2617
Title: Menschliches und Maschinelles Lernen
Authors: Mühlenbrock, Martin2003-12-12T00:00:00ZTechniken des Data Mining zur Analyse von Telekommunikationsnetzwerken
http://hdl.handle.net/2003/2618
Title: Techniken des Data Mining zur Analyse von Telekommunikationsnetzwerken
Authors: Michaelis, Stefan2003-12-12T00:00:00ZWissensentdeckung in Datenbanken mit dynamischer Anpassung des Hypothesentests
http://hdl.handle.net/2003/2616
Title: Wissensentdeckung in Datenbanken mit dynamischer Anpassung des Hypothesentests
Authors: Münstermann, Dirk2003-12-12T00:00:00ZAnnehmbarkeit verschiedener Verfahren zur Wissensentdeckung auf E-Commerce Daten
http://hdl.handle.net/2003/2615
Title: Annehmbarkeit verschiedener Verfahren zur Wissensentdeckung auf E-Commerce Daten
Authors: Neifach, Marina2003-12-12T00:00:00ZEntwicklung eines wissensbasierten Assistentensystems zur Analyse von Fall-Kontroll-Studien
http://hdl.handle.net/2003/2614
Title: Entwicklung eines wissensbasierten Assistentensystems zur Analyse von Fall-Kontroll-Studien
Authors: Robers, Ursula2003-12-12T00:00:00ZErwerb funktionaler, räumlicher und kausaler Beziehungen von Fahrzeugteilen aus einer technischen Dokumentation
http://hdl.handle.net/2003/2613
Title: Erwerb funktionaler, räumlicher und kausaler Beziehungen von Fahrzeugteilen aus einer technischen Dokumentation
Authors: Siebert, Mark2003-12-12T00:00:00ZRepräsentation operationaler Begriffe zum Lernen aus Roboter-Sensordaten
http://hdl.handle.net/2003/2612
Title: Repräsentation operationaler Begriffe zum Lernen aus Roboter-Sensordaten
Authors: Sklorz, Stefan2003-12-12T00:00:00ZEin Algorithmus zur Lösung des Farthest-Pair-Problems
http://hdl.handle.net/2003/2611
Title: Ein Algorithmus zur Lösung des Farthest-Pair-Problems
Authors: Stolpe, Marco2003-12-12T00:00:00ZEin Multiagentensystem zur Erstellung eines persönlichen Pressespiegels
http://hdl.handle.net/2003/2610
Title: Ein Multiagentensystem zur Erstellung eines persönlichen Pressespiegels
Authors: Veltmann, Georg2003-12-12T00:00:00ZLernen qualitativer Merkmale aus numerischen Robotersensordaten
http://hdl.handle.net/2003/2609
Title: Lernen qualitativer Merkmale aus numerischen Robotersensordaten
Authors: Wessel, Stephanie2003-12-12T00:00:00ZVerwaltung großer Datenmengen für die effiziente Anwendung des Apriori-Algorithmus zur Wissensentdeckung in Datenbanken
http://hdl.handle.net/2003/2608
Title: Verwaltung großer Datenmengen für die effiziente Anwendung des Apriori-Algorithmus zur Wissensentdeckung in Datenbanken
Authors: Wiechers, Frank2003-12-12T00:00:00ZProduktionsplanung mit Hilfe von Multiagentensystemen
http://hdl.handle.net/2003/2607
Title: Produktionsplanung mit Hilfe von Multiagentensystemen
Authors: Lüdecke, Sascha2003-12-12T00:00:00ZInformationsextraktion durch Zusammenfassung maschinell selektierter Textsegmente
http://hdl.handle.net/2003/2606
Title: Informationsextraktion durch Zusammenfassung maschinell selektierter Textsegmente
Authors: Euler, Timm
Abstract: This diploma thesis develops two stages that make it possible to shorten texts with respect to their content. The first stage selects individual sentences from arbitrary texts according to whether they belong to a given topic. In the second stage, the extracted sentences are shortened, as far as possible without losing important information. One example application examined in more detail is the conversion of email texts containing appointment arrangements into (length-limited) SMS messages. The methods are generally applicable to text summarization or information extraction.2002-02-26T00:00:00ZMaschinelle Lernverfahren zum adaptiven Informationsfiltern bei sich verändernden Konzepten
http://hdl.handle.net/2003/2604
Title: Maschinelle Lernverfahren zum adaptiven Informationsfiltern bei sich verändernden Konzepten
Authors: Klinkenberg, Ralf2000-01-19T00:00:00ZModellierung eines intensivmedizinischen Behandlungsprotokolls zur Validierung anhand realer Patientendaten
http://hdl.handle.net/2003/2605
Title: Modellierung eines intensivmedizinischen Behandlungsprotokolls zur Validierung anhand realer Patientendaten
Authors: Scholz, Martin
Abstract: This thesis presents a new approach to developing intensive-care treatment protocols for use at the bedside. In contrast to previous approaches, the modeling of medical knowledge is validated at a very early stage against recorded medical time-course data. The key to far-reaching support for medical experts is formalization and representation as a knowledge-based system. After a model draft that is not yet executable has been translated into such a formalism, the advantages of the procedure are demonstrated by way of example in several experiments with time-course data.2002-02-26T00:00:00ZSuche und Extraktion relevanter Informationen aus dem Internet zur Ähnlichkeitsbewertung verschiedener Aspekte einer virtuellen Konferenz
http://hdl.handle.net/2003/2603
Title: Suche und Extraktion relevanter Informationen aus dem Internet zur Ähnlichkeitsbewertung verschiedener Aspekte einer virtuellen Konferenz
Authors: Schwering, Christian
Abstract: The subject of this diploma thesis is the design and implementation of an information agent system for obtaining person-specific data from the World Wide Web. The data found in and extracted from selected data sources are then used for knowledge discovery. The aim is to discover rules that shed light on the relationships between persons. The focus is on a practical investigation of the possibilities and difficulties of agent systems in the World Wide Web. This work is the electronic version of the diploma thesis "Schwering, Christian: Suche und Extraktion relevanter Informationen aus dem Internet zur Ähnlichkeitsbewertung verschiedener Aspekte einer virtuellen Konferenz".2000-01-18T00:00:00ZAutomatisierte WWW-Veröffentlichung auf der Basis formaler Auszeichnungssprachen
http://hdl.handle.net/2003/2602
Title: Automatisierte WWW-Veröffentlichung auf der Basis formaler Auszeichnungssprachen
Authors: Mintert, Stefan
Abstract: The result of this diploma thesis is a system that prepares SGML/XML-encoded documents for publication on the World Wide Web. The user, i.e. the reader, can search for keywords in one of several categories. The search term and search category replace full-text search, which is often too imprecise. In contrast to the output of conventional search functions, the reader does not receive a predefined excerpt of the document (e.g. a chapter or a section) as the result. Instead, the excerpt is determined depending on the text passage that the search identified as a hit. Such document excerpts and the search categories are defined with reference to the document type, not to the concrete text. This approach has the advantage that the necessary configuration has to be done only once per document type and then applies to all documents of that type. This work is the electronic version of the diploma thesis "Mintert, Stefan: Automatisierte WWW-Veröffentlichung auf der Basis formaler Auszeichnungssprachen".2000-01-12T00:00:00ZEstimating the generalization performance of a SVM efficiently
http://hdl.handle.net/2003/2601
Title: Estimating the generalization performance of a SVM efficiently
Authors: Joachims, Thorsten
Abstract: This paper proposes and analyzes an approach to estimating the generalization performance of a support vector machine (SVM) for text classification. Without any computation-intensive resampling, the new estimators are computationally much more efficient than cross-validation or bootstrap, since they can be computed immediately from the form of the hypothesis returned by the SVM. Moreover, the estimators developed here address the special performance measures needed for text classification. While they can be used to estimate error rate, one can also estimate the recall, the precision, and the F1. A theoretical analysis and experiments on three text classification collections show that the new method can effectively estimate the performance of SVM text classifiers in a very efficient way. The paper is written in English.2000-01-12T00:00:00ZZeitreihenprognose für Warenwirtschaftssysteme unter Berücksichtigung asymmetrischer Kostenfunktionen
http://hdl.handle.net/2003/2600
Title: Zeitreihenprognose für Warenwirtschaftssysteme unter Berücksichtigung asymmetrischer Kostenfunktionen
Authors: Rüping, Stefan1999-11-29T00:00:00ZEntwurf eines interaktiven Werkzeugs zur Datenverwaltung in einem situierten Lernsystem
http://hdl.handle.net/2003/2599
Title: Entwurf eines interaktiven Werkzeugs zur Datenverwaltung in einem situierten Lernsystem
Authors: Zitzler, Eckart1999-11-22T00:00:00ZAnwendung des Lernverfahrens RDT auf eine relationale Datenbank
http://hdl.handle.net/2003/2598
Title: Anwendung des Lernverfahrens RDT auf eine relationale Datenbank
Authors: Lindner, Guido
Abstract: Combining machine learning methods with relational databases is of growing interest for various areas such as knowledge discovery in databases, deductive databases, and machine learning itself. This thesis describes the application of the logic-oriented learning method RDT to relational databases. After an introduction to the structure of relational databases and a description of RDT, the question of how to represent databases for logic-based learning is discussed. It is further shown how certain database properties can be exploited to restrict the hypothesis space. Finally, RDT applied to relational databases (RDT/DB) is evaluated in experimental tests on the KRK domain, which is well known in machine learning, and on an application from the field of robot navigation. This work is the electronic version of the diploma thesis "Lindner, Guido: Anwendung des Lernverfahrens RDT auf eine relationale Datenbank". A summary of the diploma thesis has also been published as Research Report No. 12 of the Lehrstuhl Informatik VIII at the University of Dortmund under the title "Lindner, Guido: Logikbasiertes Lernen in relationalen Datenbanken" (http://eldorado.uni-dortmund.de:8080/FB4/ls8/reports/report12).1999-11-22T00:00:00ZAchieving intelligence in mobility
http://hdl.handle.net/2003/2597
Title: Achieving intelligence in mobility
Authors: Kaiser, Michael
Abstract: This paper presents an integrated approach to the application of machine learning tasks that can be observed throughout a number of typical applications of mobile robots and puts those tasks into perspective with respect to both existing and newly developed learning techniques. The actual realization of the approach has been carried out on the two mobile robots PRIAMOS and TESEO, which are both operating in a real office environment. In this context, several experimental results are presented. This paper appeared in: IEEE-Expert: Special Track on Intelligent Robotic Systems, Vol. 10, No. 2, April 1995.1999-11-09T00:00:00ZMaking large scale SVM learning practical
http://hdl.handle.net/2003/2596
Title: Making large scale SVM learning practical
Authors: Joachims, Thorsten
Abstract: Training a support vector machine (SVM) leads to a quadratic optimization problem with bound constraints and one linear equality constraint. Although this type of problem is well understood, there are many issues to be considered in designing an SVM learner. In particular, for large learning tasks with many training examples, off-the-shelf optimization techniques for general quadratic programs quickly become intractable in their memory and time requirements. SVMlight is an implementation of an SVM learner which addresses the problem of large tasks. This chapter presents algorithmic and computational results developed for SVMlight V2.0, which make large-scale SVM training more practical. The results give guidelines for the application of SVMs to large domains. Also published in: 'Advances in Kernel Methods - Support Vector Learning', Bernhard Schölkopf, Christopher J. C. Burges, and Alexander J. Smola (eds.), MIT Press, Cambridge, USA, 1998. The paper is written in English.1999-10-29T00:00:00ZText categorization with support vector machines
http://hdl.handle.net/2003/2595
Title: Text categorization with support vector machines
Authors: Joachims, Thorsten
Abstract: This paper explores the use of Support Vector Machines (SVMs) for learning text classifiers from examples. It analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task. Empirical results support the theoretical findings. SVMs achieve substantial improvements over the currently best performing methods and they behave robustly over a variety of different learning tasks. Furthermore, they are fully automatic, eliminating the need for manual parameter tuning. The paper is written in English.1999-10-29T00:00:00ZAutomatische Kategorisierung von Volltexten unter Anwendung von NLP-Techniken
http://hdl.handle.net/2003/2594
Title: Automatische Kategorisierung von Volltexten unter Anwendung von NLP-Techniken
Authors: Schewe, Sandra
Abstract: This thesis deals with obtaining information from data as made available by the World Wide Web. The focus is on processing full texts, because a large share of the data on the WWW is available in this form. To support information gathering, the full texts are categorized, so that a user can either search a category for specific information or be presented with texts sorted by topic, from which subject areas can be chosen according to interest. Techniques from the field of Natural Language Processing (NLP techniques for short) are used to categorize the texts. Considerations regarding the particular properties of the German language lead to the procedure presented here. Experiments show to what extent the use of NLP techniques, and thus taking language into account, is beneficial. The paper is written in German.1999-10-29T00:00:00ZOptimizing chain datalog programs and their inference procedures
http://hdl.handle.net/2003/2592
Title: Optimizing chain datalog programs and their inference procedures
Authors: Rieger, Anke
Abstract: We present methods for optimizing chain Datalog programs by restructuring and post-processing. The rules of the programs intensionally define a set of target concepts, which are to be derived via forward chaining. The restructuring methods transform the rules such that redundancies and ambiguities, which prevent efficient evaluation, are removed without changing the coverage of the target concepts. The post-processing method increases the coverage by introducing recursive rules in the chain Datalog program. Based on the correspondence between chain Datalog programs and context-free languages, which in our case reduce to regular ones, we present a method to map restructured and/or post-processed programs to prefix acceptors, which are deterministic finite state automata whose input/output alphabets consist of predicates. We present an efficient marker passing method which is applied to a prefix acceptor and which optimizes inferences. We prove that this method is sound and complete, i.e., it calculates the minimum Herbrand model of the chain Datalog program which has been mapped to the respective prefix acceptor. As the developments presented in this paper have been motivated by an ILP application to robotics, we have applied the methods to this real-world domain. The experimental results at the end of the paper reflect the improvements we have gained. The paper is written in English.1999-10-29T00:00:00ZCORA - a knowledge based system for the analysis of case control studies
http://hdl.handle.net/2003/2593
Title: CORA - a knowledge based system for the analysis of case control studies
Authors: Robers, Ursula
Abstract: When carrying out a statistical analysis, the researcher faces the problem of choosing an appropriate statistical technique from a large number of competing methods. Most common statistical software offers different methods for analysing the data without giving any support concerning the adequacy of a method for a particular data set. This paper outlines the main features of the computer system CORA, which provides a statistical analysis of stratified contingency tables and additionally supports the researcher at the different steps of this analysis. The support given by the system consists of two different aspects. On the one hand, the help system of CORA contains general information on the implemented statistical methods, which can be obtained on request. On the other hand, an advice tool recommends an adequate statistical method depending on the actual empirical case-control data to be analysed. To build the advice tool, a set of rules discovered by machine learning from simulation studies is integrated into the system CORA. The paper is written in English.1999-10-29T00:00:00ZData preparation for inductive learning in robotics
http://hdl.handle.net/2003/2591
Title: Data preparation for inductive learning in robotics
Authors: Rieger, Anke
Abstract: The application of logic-based learning algorithms in real-world domains, such as robotics, requires extensive data engineering, including the transformation of numerical tabular representations of real-world data to logic-based representations, feature and concept selection, the generation of the respective descriptions, and the composition of training and test sets that meet the requirements of the respective learning algorithms. We are developing a tool that supports a user of inductive logic-based algorithms in handling these tasks. The tool is developed in the context of a robot navigation domain, in which different logic-based algorithms are applied to learn operational concepts. The paper is written in English.1999-10-29T00:00:00ZInferring probabilistic automata from sensor data for robot navigation
http://hdl.handle.net/2003/2590
Title: Inferring probabilistic automata from sensor data for robot navigation
Authors: Rieger, Anke
Abstract: We address the problem of guiding a robot in such a way that it can decide, based on perceived sensor data, which future actions to choose in order to reach a goal. To realize this guidance, the robot has access to a (probabilistic) automaton (PA) whose final states represent concepts that have to be recognized in order to verify that a goal has been achieved. The contribution of this work is to learn these PAs from classified sensor data of robot traces through known environments. Within this framework, we account for the uncertainties arising from ambiguous perceptions. We introduce a knowledge structure, called a prefix tree, in which the sample data, represented as cases, are organized. The prefix tree is used to derive and estimate the parameters of deterministic as well as probabilistic automata models, which reflect the knowledge implicit in the data and which are used for recognition in a restricted first-order logic framework.1999-10-29T00:00:00ZMLnet report: training in Europe on machine learning
http://hdl.handle.net/2003/2589
Title: MLnet report: training in Europe on machine learning
Authors: Ellebrecht, Mario; Morik, Katharina
Abstract: Machine learning techniques offer opportunities for a variety of applications and the theory of machine learning investigates problems that are of interest for other fields of computer science (e.g., complexity theory, logic programming, pattern recognition). However, the impacts of machine learning can only be recognized by those who know the techniques and are able to apply them. Hence, teaching machine learning is necessary before this field can diversify computer science. In order to find out where machine learning is taught in which context and to what extent, MLnet has gathered data from several European countries. In this report, an incomplete overview of the training situation in Europe is given and a questionnaire for obtaining a more complete survey is proposed.1999-10-29T00:00:00ZThe expanded implication problem of data dependencies
http://hdl.handle.net/2003/2588
Title: The expanded implication problem of data dependencies
Authors: Bell, Siegfried
Abstract: The implication problem is the problem of deciding whether a given set of dependencies implies or entails another dependency. Up to now, the entailment of excluded dependencies, or independencies, has been considered only on a metalogical level, which is not suitable for inferring them automatically. But the inference of independencies is important for new topics in database research like semantic query optimization. In this paper, the expanded implication problem is discussed in order to decide implications of dependencies and independencies. The main result is an axiomatization of functional, inclusion and multivalued independencies and the corresponding inference relations. We also discuss the use of independencies in knowledge discovery in databases and semantic query optimization.1999-10-29T00:00:00ZDatengesteuertes Lernen von syntaktischen Einschränkungen des Hypothesenraumes für modellbasiertes Lernen
http://hdl.handle.net/2003/2587
Title: Datengesteuertes Lernen von syntaktischen Einschränkungen des Hypothesenraumes für modellbasiertes Lernen
Authors: Lübbe, Marcus
Abstract: Learning methods for predicate-logic formalisms are suitable as tools that support the construction and maintenance of complex domain theories, because they can both incorporate background knowledge into the learning process and handle relational dependencies between the objects of the theory. However, the greater expressive power compared with classical methods based on propositional logic also leads to greater complexity of the learning task. The inductive learning method RDT of the MOBAL workbench uses model knowledge in the form of rule models to restrict the search space. These syntactic constraints on the learning goal allow the user to control the learning task precisely, but if the formula schemata corresponding to the learning goal are missing, the learning goal cannot be reached. This thesis therefore presents a heuristic approach to the automatic acquisition of rule models that is based on computing most specific generalizations. To take background knowledge into account, the parts of this knowledge relevant to the learning goal are linked with the examples. Computing most specific generalizations of rule models serves to generalize the rule models step by step. A new extension of theta-subsumption to rule models and a notion of redundancy for such formula schemata are further components of this work. The paper is written in German.1999-10-28T00:00:00ZLogikbasiertes Lernen in relationalen Datenbanken
http://hdl.handle.net/2003/2586
Title: Logikbasiertes Lernen in relationalen Datenbanken
Authors: Lindner, Guido
Abstract: Combining machine learning methods with relational databases is of growing interest for various areas, e.g. knowledge discovery, deductive databases, and machine learning itself. This article describes the application of the logic-oriented learning method RDT to relational databases. In this context, the question of how to represent the database for logic-based learning is discussed. Initial experimental tests on the KRK domain, which is well known in machine learning, and on an application from the field of robot navigation reveal some advantages and disadvantages of RDT/DB. The paper is written in German.1999-10-28T00:00:00ZComputational models of learning in astronomy
http://hdl.handle.net/2003/2585
Title: Computational models of learning in astronomy
Authors: Mühlenbrock, Martin
Abstract: Human learning appears to be heavily influenced by prior knowledge, yet the complex relationships between individual conceptions and their influence on the learning process are still subject to research. The computational representation of learning processes is assumed to yield a deeper insight into the interdependence of background knowledge and the product of learning. Based on a cross age study on children's explanations of the day/night cycle conducted by S. Vosniadou and W. F. Brewer, children's conceptions of the celestial bodies and their conceptions of the appearance and disappearance of objects have been modeled within the knowledge representation system MOBAL. The formal and operational models help to specify the interconceptual relations and the conceptual development reconciling the culturally accepted scientific explanation of the day/night cycle with alternative conceptions.1999-10-28T00:00:00ZKI und Neuroinformatik
http://hdl.handle.net/2003/2584
Title: KI und Neuroinformatik
Authors: Morik, Katharina1999-10-28T00:00:00ZTurning an action formalism into a planner
http://hdl.handle.net/2003/2583
Title: Turning an action formalism into a planner
Authors: Hertzberg, Joachim; Thiebaux, Sylvie
Abstract: The paper describes a case study that explores the idea of building a planner with a neat semantics of the plans it produces, by choosing some action formalism that is "ideal" for the planning application and building the planner accordingly. In general (and particularly so for the action formalism used in this study, which is quite expressive), this strategy is unlikely to yield fast and efficient planners if the formalism is used naively. Therefore, we adopt the idea that the planner approximates the theoretically ideal plans, where the approximation gets closer the more run time the planner is allowed. As the particular formalism underlying our study allows a significant degree of uncertainty to be modeled and copes with the ramification problem, we end up with a planner that is functionally comparable to modern anytime uncertainty planners, yet is based on a neat formal semantics. To appear in the Journal of Logic and Computation, 1994. The paper is written in English.1999-10-28T00:00:00ZExperimentelle Analyse zweier logik-basierter Lernverfahren
http://hdl.handle.net/2003/2581
Title: Experimentelle Analyse zweier logik-basierter Lernverfahren
Authors: Lindner, Guido; Robers, Ursula
Abstract: A crucial problem of logic-based learning is the size of the hypothesis space. Ways of restricting it include heuristically searching a complete hypothesis space (e.g. FOIL) or exhaustively searching a restricted hypothesis space (e.g. RDT). While theoretical analysis examines learnability, we want to determine through an experimental comparison of RDT and FOIL how the different restrictions of the search space play out in practice. For our experiments we used the KRK domain and, in addition, a newly modeled domain, the choice of place of residence for students. The paper is written in German.1999-10-28T00:00:00ZPlanen von Aktionen und Reaktionen
http://hdl.handle.net/2003/2582
Title: Planen von Aktionen und Reaktionen
Authors: Hertzberg, Joachim
Abstract: This text gives an overview of older and more recent AI methods for planning, i.e. for generating action plans. It examines the state of the art in AI from three intertwined perspectives: (1) There are restricted forms of action planning that are well understood; the text briefly introduces "classical" planning. (2) There are no generally applicable methods for finding goal-directed actions under conditions such as time pressure or uncertainty of the available information; the text introduces the corresponding problems and describes some current approaches. (3) There are logical calculi that can cleanly describe or determine events and their effects under some of these conditions; the text describes one such calculus in more detail and sketches how it can be used to build planners with clear semantics that nevertheless work. The paper is written in German.1999-10-28T00:00:00ZGRDT: enhancing model based learning for its application in robot navigation
http://hdl.handle.net/2003/2580
Title: GRDT: enhancing model based learning for its application in robot navigation
Authors: Klingspor, Volker
Abstract: Robotics is one of the most challenging applications for the use of machine learning. Machine learning can offer an increase in flexibility and applicability in many robotic domains. In this paper, we sketch a framework to apply inductive logic programming (ILP) techniques to learning tasks of autonomous mobile robots. We point out differences between three existing algorithms used within this framework and their results. Since all of these algorithms have problems in solving the tasks, we developed GRDT (grammar based rule discovery tool), an algorithm combining their ideas and techniques. The paper is written in English.1999-10-28T00:00:00ZA note on paraconsistent entailment in machine learning
http://hdl.handle.net/2003/2579
Title: A note on paraconsistent entailment in machine learning
Authors: Bell, Siegfried; Weber, Steffo
Abstract: Recent publications witness that there is a growing interest in multi-valued logics for machine learning; some of them arose as a more or less formal description of a computer program's inferential behaviour. The referred origin of these systems is Belnap's four-valued logic, which has been adopted for the various needs of knowledge representation in a machine learning system. However, it is unclear what an inconsistent knowledge base entails. We investigate Mobal's logic < and show how to interpret the term `paraconsistent inference' of this system. It turns out that the meaning of the basic connective ! of < can be represented as a combination of two systems of Kleene's strong three-valued logic, where the two systems differ in the set of designated truth values. The resulting logic is functionally complete but the entailment relation is not axiomatizable. This drawback yields a fundamental difference between nonmonotonicity within belief-revision and non-monotonic reasoning systems like Servi's refinement 1 of Gabbay's .1999-10-27T00:00:00ZLearning action oriented perceptual features for robot navigation
http://hdl.handle.net/2003/2578
Title: Learning action oriented perceptual features for robot navigation
Authors: Morik, Katharina; Rieger, Anke
Abstract: Machine learning can offer an increase in the flexibility and applicability of robotics at several levels of control. In this paper, we characterize two symbolic learning tasks in the field of robotics. We outline an approach for learning features from sensory data and for using these features to learn more complex ones. We illustrate our approach with first experiments in the field of navigation. The paper is written in English.1999-10-26T00:00:00ZReport 0, Lehrstuhl für Künstliche Intelligenz
http://hdl.handle.net/2003/2577
Title: Report 0, Lehrstuhl für Künstliche Intelligenz
Authors: Morik, Katharina
Abstract: This report presents the members of the AI unit at the University of Dortmund and their activities. It will be updated annually. The paper is written in German.1999-10-04T00:00:00ZA three-valued logic for Inductive Logic Programming
http://hdl.handle.net/2003/2576
Title: A three-valued logic for Inductive Logic Programming
Authors: Bell, Siegfried; Weber, Steffo
Abstract: Inductive Logic Programming (ILP) is closely related to Logic Programming (LP) by name. We extract the basic differences between ILP and LP by comparing both and give definitions of the basic assumptions of their paradigms, e.g. the closed world assumption, the open domain assumption and the open world assumption used in ILP. The paper is written in English.1999-09-09T00:00:00ZNeuronale Netzwerke
http://hdl.handle.net/2003/2575
Title: Neuronale Netzwerke
Authors: Rieger, Anke
Abstract: The report gives an introduction to neural networks. Starting with the basic terminology, different types of neural networks are described. Several applications of neural networks are shown, e.g. pattern recognition, content-addressable memory, and optimization problems. The major part of the report is focused on learning. Methods for learning from examples as well as methods for learning from observations are described. This report has been used as part of a script for a graduate student course in AI. It aims at teaching the basics of neural networks with the intention of making the mathematical techniques used in this context accessible. The paper is written in German.1999-08-12T00:00:00ZMaschinelles Lernen
http://hdl.handle.net/2003/2574
Title: Maschinelles Lernen
Authors: Morik, Katharina
Abstract: This report gives an overview of machine learning. The report concentrates on methods rather than on the large number of systems. The logic-based approaches are described in some detail. The main paradigms are indicated and used for presenting practical techniques in a unified way. The paper is written in German.1999-08-12T00:00:00ZDiscovery of data dependencies in relational databases
http://hdl.handle.net/2003/2573
Title: Discovery of data dependencies in relational databases
Authors: Bell, Siegfried; Brockhausen, Peter
Abstract: Knowledge discovery in databases is not only the nontrivial extraction of implicit, previously unknown and potentially useful information from databases. We argue that, in contrast to machine learning, knowledge discovery in databases should be applied to real-world databases. Since real-world databases are known to be very large, they raise problems of access. Therefore, real-world databases can only be accessed by database management systems, and the number of accesses has to be reduced to a minimum. Considering this property, we are forced to use, for example, standard set-oriented interfaces of relational database management systems in order to apply methods of knowledge discovery in databases. We present a system for discovering data dependencies, which is built upon a set-oriented interface. The main effort has been put into the discovery of value restrictions and of unary inclusion and functional dependencies in relational databases. The system also embodies an inference relation to minimize database access.1999-08-11T00:00:00Z