Probabilistic inferences under maximum entropy for Description Logics
Loading...
Date
2024
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
In practical applications of knowledge-based systems, handling uncertain knowledge is crucial. Probability theory provides a widely accepted and rigorous framework for formalizing uncertainty and deriving reliable inferences. Probability-based approaches to knowledge representation and reasoning are used in fields such as medicine, for example in diagnostics, and economics, including fraud detection. A common challenge with many probabilistic methods, like Bayesian Networks and Markov Logic Networks, is the need for a fully specified probability distribution. The principle of maximum entropy (MaxEnt) offers a solution to this issue. It enables the inductive completion of an underspecified probability distribution, typically provided as a probabilistic knowledge base, by ensuring that the information content of the MaxEnt distribution is minimal among all possible extensions. This results in cautious and principled reasoning, which is essential in the aforementioned application areas.
The application of the principle of maximum entropy is well-understood for probabilistic knowledge bases consisting of propositional conditional rules of the form "if A holds, then B holds with probability p." However, in practice, one is often interested in uncertain statements which can be formulated satisfactorily by using richer formal languages only. These are, in particular, statements about properties of individuals and objects, as well as their relationships with one another. Relational logics and Description Logics are formal logics well-suited for representing this type of ontological knowledge. Consequently, this thesis explores probability-based reasoning under the principle of maximum entropy for probabilistic conditional knowledge bases formulated over relational logics or Description Logics. A key challenge here is that the underlying probability space grows exponentially with the domain size, i.e., the number of individuals and objects, which transforms the principle of maximum entropy, already known as a black-box approach, into a true "monster" of complexity.
This thesis centers on a structured analysis of and efficient reasoning with probabilistic knowledge bases for large domains under the principle of maximum entropy. A major contribution is the development of "typed model counting", which establishes a formal framework for these analyses. Typed model counting extends knowledge compilation methods, such as those from "first-order model counting", to handle conditional statements effectively. Comparable advanced compilation methods have been developed for conditional knowledge bases only in limited cases, such as in the form of "weighted conditional impacts".
Typed model counting yields a highly compact representation of the information needed to compute the MaxEnt distribution from a knowledge base. This compact form necessitates new, problem-specific methods for calculating the MaxEnt distribution. Thus, a second major contribution of this thesis is the development of "condensed iterative scaling", an adaptation of the generalized iterative scaling algorithm, which again is a numerical method designed to approximate log-linear models like the MaxEnt distribution. Condensed iterative scaling is tailored specifically to the output produced by typed model counting.
Eventually, we introduce with ALC-ME a probabilistic Description Logic under a MaxEnt semantics, and apply advanced model counting techniques to enable domain-lifted inferences within ALC-ME. We also address challenges that arise when applying the principle of maximum entropy to infinite domains, proposing a solution based on "satisfiability modulo theory (SMT)". Overall, this thesis provides methods that, for the first time, enable the structural analysis of complex relational probabilistic knowledge bases under the principle of maximum entropy.
Description
Table of contents
Keywords
Probabilistic reasoning, Principle of maximum entropy, Aggregating semantics, Probabilistic description logics, Typed model counting