Some representation learning tasks and the inspection of their models
Date
2022
Abstract
Today, the field of machine learning encompasses a wide range of tasks with a wide range
of supervision sources, ranging from traditional classification tasks with neatly labeled
data, through data with noisy labels, to data with no labels at all, where we have to rely
on other forms of supervision, such as self-supervision. In the first part of this thesis, we
design machine learning tasks for applications where we do not immediately have access
to neatly labeled training data.
First, we design unsupervised representation learning tasks for training embedding
models for mathematical expressions that allow retrieval of related formulas. We train
convolutional neural networks, transformer models, and graph neural networks to embed
formulas from scientific articles into a real-valued vector space, using contextual similarity
tasks as well as self-supervised tasks. We base our studies on a novel dataset
of over 28 million formulas that we have extracted from scientific articles
published on arXiv.org. We represent the formulas in different input formats (images,
sequences, or trees), depending on the embedding model. We compile an evaluation
dataset with annotated search queries from several different disciplines and showcase the
usefulness of our approach for deploying a search engine for mathematical expressions.
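The abstract does not fix a concrete training objective; a common way to implement such a contextual similarity task is a contrastive loss over pairs of embeddings. A minimal PyTorch sketch, assuming batches of embeddings for formulas and for contextually related formulas, with in-batch negatives (the loss form is an assumption, not necessarily the thesis's exact objective):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor_emb, positive_emb, temperature=0.1):
    """InfoNCE-style contrastive loss: pull each formula embedding toward
    the embedding of a contextually related formula, push it away from all
    other formulas in the batch (in-batch negatives)."""
    anchor = F.normalize(anchor_emb, dim=1)
    positive = F.normalize(positive_emb, dim=1)
    logits = anchor @ positive.t() / temperature  # pairwise cosine similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)        # diagonal entries are positives
```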
Second, we investigate machine learning tasks in astrophysics. Prediction models in
this field are currently trained on simulated data, with hand-crafted features, using multiple
single-task models. In contrast, we build a single multi-task convolutional neural network that
works directly on telescope images and uses convolution layers to learn suitable feature
representations automatically. We design loss functions for each task and propose a
novel way to combine the different loss functions to account for their different scales and
behaviors.
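The combination scheme itself is a contribution of the thesis and is not spelled out in the abstract; as an illustration of the problem it addresses, a widely used baseline weights each task loss by a learnable log-variance (uncertainty weighting in the style of Kendall et al., 2018). A sketch with hypothetical task names:

```python
import torch
import torch.nn as nn

class WeightedMultiTaskLoss(nn.Module):
    """Combine per-task losses with learnable log-variance weights, so that
    tasks whose losses live on different scales do not dominate the total."""
    def __init__(self, num_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total

# usage (task names are assumptions, not from the thesis):
# criterion = WeightedMultiTaskLoss(num_tasks=3)
# total = criterion([energy_loss, direction_loss, classification_loss])
```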
Next, we explore another form of supervision that does not rely on simulated
training data but learns from actual telescope recordings. Through the framework of
noisy label learning, we propose an approach for learning gamma-hadron classifiers that
outperforms existing classifiers trained on simulated, fully labeled data. Our method is
general: it can be used to train models in any scenario that fits our noise assumption of
class-conditional label noise with exactly one known noise probability.
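The abstract states only the noise assumption; a textbook estimator for this setting is the unbiased loss correction of Natarajan et al. (2013). A binary sketch, assuming (hypothetically) that hadron recordings leak into the gamma class with one known probability:

```python
import torch
import torch.nn.functional as F

def noise_corrected_bce(logits, noisy_labels, rho_pos=0.0, rho_neg=0.1):
    """Unbiased binary cross-entropy under class-conditional label noise.

    rho_pos = P(noisy label 0 | true label 1),
    rho_neg = P(noisy label 1 | true label 0).
    In the one-known-rate setting, one rate is fixed (here rho_pos = 0)
    and the other is the known noise probability."""
    loss_as_1 = F.binary_cross_entropy_with_logits(
        logits, torch.ones_like(logits), reduction="none")
    loss_as_0 = F.binary_cross_entropy_with_logits(
        logits, torch.zeros_like(logits), reduction="none")
    y = noisy_labels.float()
    corrected = (
        y * ((1 - rho_neg) * loss_as_1 - rho_pos * loss_as_0)
        + (1 - y) * ((1 - rho_pos) * loss_as_0 - rho_neg * loss_as_1)
    ) / (1 - rho_pos - rho_neg)
    return corrected.mean()
```

In expectation over the noise process, this corrected loss equals the clean loss, which is what allows training on noisy telescope labels at all; whether the thesis uses this particular correction is not stated in the abstract.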
In the second part of this work, we develop methods to inspect models and gain
trust in their decisions. We focus on large, non-linear models that can no longer be
understood in their entirety through plain inspection of their trainable parameters. We
investigate three approaches for establishing trust in models.
First, we propose a method to highlight influential input nodes for similarity computations
performed by graph neural networks. We test this approach with our embedding
models for retrieval of related formulas and show that it can help understand the similarity
scores computed by the models.
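The attribution method itself is not detailed in the abstract; for intuition, even a simple gradient-times-input scheme over node features yields per-node influence scores for a pairwise similarity. A sketch assuming a hypothetical encoder that maps a PyTorch-Geometric-style graph with node feature matrix x to a single embedding vector:

```python
import torch
import torch.nn.functional as F

def node_influence(encoder, graph_a, graph_b):
    """Attribute a pairwise similarity score to the input nodes of graph_a
    via gradient x input: nodes whose features move the cosine similarity
    the most receive the largest influence scores."""
    graph_a.x.requires_grad_(True)           # track gradients w.r.t. node features
    sim = F.cosine_similarity(encoder(graph_a), encoder(graph_b), dim=-1)
    sim.sum().backward()
    # aggregate feature-wise gradients into one score per node
    return (graph_a.x.grad * graph_a.x).sum(dim=-1).abs()
```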
Second, we investigate explanation methods that derive explanations from the
training process that produced the model. This way, we provide explanations that are
not merely an approximation of the prediction function's computation, but an actual
investigation into why the model learned to produce an output, grounded in the training
data. We propose two different methods for tracking the training process and show how
they can be easily implemented within existing deep learning frameworks.
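The two tracking methods are contributions of the thesis; the abstract claims only that they integrate easily with existing frameworks. As an illustration of that claim, a hypothetical PyTorch training loop that records per-example losses over the course of training, so a prediction can later be related to how the fit to individual training points evolved:

```python
import torch

def train_with_tracking(model, loader, optimizer, loss_fn, epochs=10):
    """Track the training process by logging every example's loss at every
    epoch. Assumes `loader` yields (indices, (inputs, targets)) and that
    `loss_fn` returns per-example losses (reduction='none')."""
    history = {}  # example index -> list of per-epoch losses
    for _ in range(epochs):
        for idx, (x, y) in loader:
            optimizer.zero_grad()
            losses = loss_fn(model(x), y)   # one loss value per example
            losses.mean().backward()
            optimizer.step()
            for i, l in zip(idx.tolist(), losses.detach().tolist()):
                history.setdefault(i, []).append(l)
    return history
```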
Third, we contribute a method to verify the adversarial robustness of random forest
classifiers. Our method is based on knowledge distillation of a random forest model into
a decision tree model. We bound the approximation error of using the decision tree as
a proxy for the given random forest and use these bounds to provide guarantees
on the adversarial robustness of the random forest. Consequently, our robustness
guarantees are approximate, but we can provably control the quality of our results
using a hyperparameter.
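The error bounds and the resulting robustness certificate are the thesis's contribution and are not reproduced here; the distillation step itself can be sketched with scikit-learn, where the proxy tree's depth plays the role of a quality-controlling hyperparameter (dataset and parameter values below are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Distill: fit a single tree on the forest's predicted labels; its depth
# controls how closely the proxy can match the forest.
proxy = DecisionTreeClassifier(max_depth=8, random_state=0)
proxy.fit(X, forest.predict(X))

# Empirical disagreement between proxy and forest -- the quantity the
# thesis bounds analytically before certifying robustness on the proxy.
disagreement = (proxy.predict(X) != forest.predict(X)).mean()
print(f"proxy/forest disagreement: {disagreement:.3f}")
```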
Keywords
Maschinelles Lernen, Künstliche Intelligenz, Deep Learning, Trustworthy AI
Subjects based on RSWK
Maschinelles Lernen, Künstliche Intelligenz, Deep learning