Dempster-Shafer theory, also known as belief theory or evidence theory, is a mathematical framework for reasoning and decision-making under uncertainty. It provides a way to combine and manage evidence from multiple sources, even when the evidence may be uncertain or conflicting. Dempster-Shafer theory extends classical probability theory by allowing for the representation and combination of incomplete or uncertain information.
The key concepts in Dempster-Shafer theory are:
- Basic Probability Assignment (BPA): In Dempster-Shafer theory, instead of assigning precise probabilities to individual events, belief is assigned to sets of events. The basic probability assignment is defined over the power set of events and gives each subset a value that captures the degree of evidence or support committed exactly to that subset.
- Mass Function: The mass function, also known as the basic belief assignment, is a mathematical function that describes the belief or uncertainty associated with each subset of events. It assigns a mass value to each subset of events and represents the degree of support or evidence for that subset. The mass function must satisfy certain normalization properties: the mass of the empty set is zero and the masses of all subsets sum to one.
- Combination Rule: The combination rule is used to combine belief functions from multiple sources of evidence. It determines how the belief functions are aggregated or fused to obtain an overall belief function. Dempster's rule of combination is the most commonly used rule; it combines the belief functions while taking into account the degree of overlap or conflict between the sources of evidence.
- Dempster’s Rule of Combination: Dempster’s rule combines the belief functions by taking into account the degrees of conflict between the sources of evidence. It calculates the belief values for each subset of events based on the belief values assigned by each individual source. The rule involves combining the masses of evidence and then renormalizing the resulting belief function.
- Decision Making: Dempster-Shafer theory can be used for decision-making by applying decision rules to the combined belief function. Decision rules determine how the belief values are mapped to decisions or actions. Common decision rules include the maximum-belief rule, where the decision is based on the subset of events with the highest belief value, and the threshold rule, where a decision is made only if the belief exceeds a predefined threshold value; a minimal sketch of both rules follows this list.
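As a concrete illustration of these decision rules (not part of the original formulation here), the following minimal Python sketch assumes a combined mass function stored as a dictionary that maps frozensets of hypotheses to masses; the helper names bel, max_belief_decision and threshold_decision, and the example values, are illustrative.

```python
# A minimal sketch of the two decision rules named above, assuming a combined
# mass function stored as a dict that maps frozensets of hypotheses to masses;
# all names and example values are hypothetical.

def bel(masses, hypothesis):
    """Belief in a hypothesis: the sum of the masses of all its non-empty subsets."""
    return sum(v for s, v in masses.items() if s and s <= hypothesis)

def max_belief_decision(masses, hypotheses):
    """Maximum-belief rule: pick the single hypothesis with the highest belief."""
    return max(hypotheses, key=lambda h: bel(masses, frozenset([h])))

def threshold_decision(masses, hypothesis, threshold=0.5):
    """Threshold rule: accept the hypothesis only if its belief reaches the threshold."""
    return bel(masses, frozenset([hypothesis])) >= threshold

# Hypothetical combined mass function over the frame {"A", "B"}.
m = {frozenset(["A"]): 0.6, frozenset(["B"]): 0.1, frozenset(["A", "B"]): 0.3}
print(max_belief_decision(m, ["A", "B"]))   # -> A
print(threshold_decision(m, "A"))           # -> True (belief 0.6 >= 0.5)
```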
Dempster-Shafer theory provides a framework for reasoning and decision-making when dealing with uncertain or conflicting evidence. It allows for the fusion of evidence from different sources and provides a quantitative measure of belief or uncertainty for each event or hypothesis. Dempster-Shafer theory has applications in various fields, such as expert systems, decision support systems, pattern recognition, and information fusion.
Dempster-Shafer Theory (DST)
DST is a mathematical theory of evidence based on belief functions and plausible reasoning. It is used to combine separate pieces of information (evidence) to calculate the probability of an event.
DST offers an alternative to traditional probabilistic theory for the mathematical representation of uncertainty.
DST can be regarded as a more general approach to representing uncertainty than the Bayesian approach.
Bayesian methods are sometimes inappropriate
Example:
Let A represent the proposition “Moore is attractive”. Then the axioms of probability insist that P(A) + P(¬A) = 1.
Now suppose that Andrew does not even know who “Moore” is. Then:
We cannot say that Andrew believes the proposition if he has no idea what it means.
It is also not fair to say that he disbelieves the proposition.
It would therefore be meaningful to denote Andrew’s belief in both A and ¬A as 0, i.e. B(A) = B(¬A) = 0.
Classical probability does not allow this (since P(A) + P(¬A) must equal 1), and neither do certainty factors.
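Andrew's total ignorance can be captured in DST by the "vacuous" belief assignment, which places all mass on the whole frame {A, ¬A}. The tiny sketch below is an illustration (the labels "A" and "notA" are hypothetical) showing that B(A) and B(¬A) then both come out as 0.

```python
# A minimal sketch of Andrew's state of total ignorance: all mass is placed on
# the whole frame {A, not-A} (the "vacuous" belief assignment), so the belief
# in A and in not-A both come out as 0, while each remains fully plausible.
m = {frozenset(["A", "notA"]): 1.0}

def bel(masses, hyp):
    # Belief = sum of the masses of all non-empty subsets of the hypothesis.
    return sum(v for s, v in masses.items() if s and s <= hyp)

print(bel(m, frozenset(["A"])))      # 0 -> B(A) = 0
print(bel(m, frozenset(["notA"])))   # 0 -> B(not-A) = 0
```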
Dempster-Shafer Model
The idea is to allocate a number between 0 and 1 to indicate a degree of belief in a proposition, as in the probability framework.
However, it is not treated as a probability but as a belief mass. The distribution of masses is called a basic belief assignment.
In other words, in this formalism a degree of belief (referred to as a mass) is represented as a belief function rather than as a Bayesian probability distribution.
Example: Belief assignment
Suppose a system has five members, say five independent states, of which exactly one is actual. If the original set is called S, with | S | = 5, then
the set of all subsets (the power set) is called 2^S.
If each possible subset is represented as a binary vector (describing whether each member is present or not by writing 1 or 0), then 2^5 = 32 subsets are possible, ranging from the empty subset ( 0, 0, 0, 0, 0 ) to the “everything” subset ( 1, 1, 1, 1, 1 ).
The “empty” subset represents a “contradiction”, which is not true in any state, and is thus assigned a mass of zero;
the remaining masses are normalized so that their total is 1.
The “everything” subset is labelled “unknown”; it represents the state where any of the elements could be the actual one, in the sense that you cannot tell which is actual.
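The construction can be made concrete with a short sketch; the state names s1..s5 and the raw evidence weights below are hypothetical, chosen only to show the 32 binary vectors, the zero mass on the empty subset, and the normalization step.

```python
# A small sketch of the power-set construction above; state names and raw
# evidence weights are hypothetical.
from itertools import product

S = ["s1", "s2", "s3", "s4", "s5"]

# Every subset of S encoded as a 0/1 vector (1 = member present).
power_set = [bits for bits in product([0, 1], repeat=len(S))]
print(len(power_set))            # 32, from (0,0,0,0,0) "empty" to (1,1,1,1,1) "everything"

# Hypothetical raw evidence weights on a few subsets; the empty subset stays at 0.
raw = {(0, 0, 0, 0, 0): 0.0,     # "contradiction" -> mass 0
       (1, 0, 0, 0, 0): 2.0,
       (0, 1, 1, 0, 0): 1.0,
       (1, 1, 1, 1, 1): 1.0}     # "unknown": cannot tell which state is actual
total = sum(raw.values())
m = {subset: w / total for subset, w in raw.items()}
print(round(sum(m.values()), 6)) # 1.0 after normalization
```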
Belief and Plausibility
Shafer’s framework allows for belief about propositions to be represented as intervals, bounded by two values, belief (or support) and plausibility:
belief ≤ plausibility
Belief in a hypothesis is constituted by the sum of the masses of all sets enclosed by it (i.e. the sum of the masses of all subsets of the hypothesis). It is the amount of belief that directly supports the given hypothesis at least in part, and it forms a lower bound.
Plausibility is 1 minus the sum of the masses of all sets whose intersection with the hypothesis is empty. It is an upper bound on how much the hypothesis could possibly be believed, because there is only so much evidence that contradicts the hypothesis.
Example:
Consider the proposition “the cat in the box is dead.”
Suppose we have a belief of 0.5 and a plausibility of 0.8 for the proposition.
The evidence allows us to state that the proposition is true with confidence 0.5, while the evidence contrary to the hypothesis (i.e. “the cat is alive”) has confidence 0.2.
The remaining mass of 0.3 (the gap between the 0.5 supporting evidence and the 0.2 contrary evidence) is “indeterminate”, meaning that the cat could be either dead or alive. This interval represents the level of uncertainty based on the evidence in the system.
Suppose the masses for the “Alive” and “Dead” hypotheses are derived from two signals, which have respective reliabilities of 0.2 and 0.5. The all-encompassing “Either” hypothesis (which simply acknowledges there is a cat in the box) picks up the slack so that the sum of the masses is 1. Belief for the “Alive” and “Dead” hypotheses equals their corresponding masses because they have no proper subsets;
Belief for “Either” consists of the sum of all three masses (Either, Alive, and Dead) because “Alive” and “Dead” are each subsets of “Either”.
“Alive” plausibility is 1 - m(Dead) and “Dead” plausibility is 1 - m(Alive). “Either” plausibility is the sum m(Alive) + m(Dead) + m(Either).
The universal hypothesis (“Either”) will always have 100% belief and plausibility; it acts as a checksum of sorts.
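The numbers in this example can be checked with a small sketch; the masses 0.2 (“Alive”), 0.5 (“Dead”) and 0.3 (“Either”) are taken from the example above, while the function names bel and pls are illustrative.

```python
# A small sketch that checks the cat example: masses of 0.2 ("Alive"),
# 0.5 ("Dead") and 0.3 ("Either") give belief 0.5 and plausibility 0.8 for
# "Dead", and belief 0.2 and plausibility 0.5 for "Alive".
m = {frozenset(["Alive"]): 0.2,
     frozenset(["Dead"]): 0.5,
     frozenset(["Alive", "Dead"]): 0.3}   # "Either" = the whole frame

def bel(masses, hyp):
    # Sum of the masses of all non-empty subsets of the hypothesis.
    return sum(v for s, v in masses.items() if s and s <= hyp)

def pls(masses, hyp):
    # 1 minus the mass of everything that contradicts the hypothesis,
    # i.e. the sum of the masses of the sets that intersect it.
    return sum(v for s, v in masses.items() if s & hyp)

for h in (frozenset(["Dead"]), frozenset(["Alive"]), frozenset(["Alive", "Dead"])):
    print(sorted(h), round(bel(m, h), 3), round(pls(m, h), 3))
# ['Dead'] 0.5 0.8
# ['Alive'] 0.2 0.5
# ['Alive', 'Dead'] 1.0 1.0
```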
Dempster-Shafer Calculus
In the preceding examples, specific instances of belief and plausibility have been presented. Their generalization should now be easy to understand.
The Dempster-Shafer (DS) theory requires a Universe of Discourse U (or Frame of Judgment) consisting of mutually exclusive alternatives, corresponding to an attribute value domain. For instance, in satellite image classification the set U may consist of all possible classes of interest.
Each subset S ⊆ U is assigned a basic probability m(S), a belief Bel(S), and a plausibility Pls(S), so that m(S), Bel(S), Pls(S) ∈ [0, 1] and Pls(S) ≥ Bel(S), where
m, the basic probability, represents the strength of an item of evidence;
e.g., the evidence that a group of pixels belongs to a certain class may be assigned the value m.
Bel(S) summarizes all the reasons to believe S.
Pls(S) expresses how much one should believe in S if all currently unknown facts were to support S.
The true belief in S is somewhere in the belief interval [Bel(S), Pls(S)].
The basic probability assignment m is defined as a function m : 2^U → [0, 1], where m(∅) = 0 and the sum of m over all subsets of U is 1 (i.e., ∑_{S ⊆ U} m(S) = 1).
For a given basic probability assignment m, the belief Bel of a subset A of U is Bel(A) = ∑_{B ⊆ A} m(B), and the plausibility Pls of a subset A of U is Pls(A) = 1 - Bel(A′), where A′ is the complement of A in U.
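A minimal sketch of these definitions, assuming a small hypothetical frame U = {a, b, c} and an arbitrary basic probability assignment, is given below; it checks the normalization conditions and uses Pls(A) = 1 - Bel(A′) for the plausibility.

```python
# A minimal sketch of the definitions above; the frame U and the basic
# probability assignment m are hypothetical.
U = frozenset(["a", "b", "c"])

# Hypothetical basic probability assignment over 2^U (unlisted subsets get 0).
m = {frozenset(["a"]): 0.4, frozenset(["a", "b"]): 0.3, U: 0.3}
assert abs(sum(m.values()) - 1.0) < 1e-9      # masses over all subsets sum to 1
assert m.get(frozenset(), 0.0) == 0.0         # m(empty set) = 0

def bel(A):
    # Bel(A) = sum of m(B) over all subsets B of A.
    return sum(v for B, v in m.items() if B <= A)

def pls(A):
    # Pls(A) = 1 - Bel(A'), where A' is the complement of A in U.
    return 1.0 - bel(U - A)

A = frozenset(["a", "b"])
print(round(bel(A), 3), round(pls(A), 3))     # 0.7 1.0 -> belief interval [0.7, 1.0]
```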
Summary:
The confidence interval is the interval of probabilities [B, PL] within which the true probability lies, based on the belief “B” and plausibility “PL” provided by some evidence “E” for a proposition “P”.
The belief brings together all the evidence that would lead us to believe in the proposition P with some certainty.
The plausibility brings together the evidence that is compatible with the proposition P and is not inconsistent with it.
If “Ω” is the set of possible outcomes, then a mass probability “M” is defined for each member of the set 2^Ω and takes values in the range [0, 1]. The null set, “∅”, is also a member of 2^Ω.
Example
If Ω is the set { Flu (F), Cold (C), Pneumonia (P) }
Then 2^Ω is the set {∅, {F}, {C}, {P}, {F, C}, {F, P}, {C, P}, {F, C, P}}. The confidence interval is then defined as [B(E), PL(E)], where
B(E) = ∑_{A ⊆ E} M(A), i.e. the sum of all the evidence that makes us believe in the correctness of the proposition, and
PL(E) = 1 - B(¬E) = 1 - ∑_{A ⊆ ¬E} M(A), i.e. 1 minus the sum of all the evidence that contradicts the proposition.
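Using the {Flu, Cold, Pneumonia} frame, the confidence interval [B(E), PL(E)] can be computed as in the sketch below; the mass values are hypothetical, since the text does not supply any.

```python
# A small sketch on the {Flu, Cold, Pneumonia} frame above; the mass values are
# hypothetical and serve only to show how the confidence interval [B(E), PL(E)]
# for the proposition E = {F} would be computed.
OMEGA = frozenset(["F", "C", "P"])

M = {frozenset(["F"]): 0.6,
     frozenset(["F", "C"]): 0.2,
     OMEGA: 0.2}                    # leftover mass kept on the whole frame

def B(E):
    # B(E) = sum of M(A) over all subsets A of E.
    return sum(v for A, v in M.items() if A <= E)

def PL(E):
    # PL(E) = 1 - B(not E).
    return 1.0 - B(OMEGA - E)

E = frozenset(["F"])
print(round(B(E), 3), round(PL(E), 3))   # 0.6 1.0 -> confidence interval [0.6, 1.0]
```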
Combining Beliefs
The Dempster-Shafer calculus combines the available pieces of evidence, resulting in a belief and a plausibility for the combined evidence that represent a consensus on the correspondence. The model maximizes the belief in the combined evidence.
The rule of combination states that two basic probability assignments M1 and M2 are combined into a third basic probability assignment by the normalized orthogonal sum M1 ⊕ M2, stated below.
Suppose M1 and M2 are two belief functions.
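The orthogonal sum referred to here is, in its standard form, Dempster's rule of combination: for every non-empty subset A, (M1 ⊕ M2)(A) = ∑_{B ∩ C = A} M1(B)·M2(C) / (1 - K), where the conflict is K = ∑_{B ∩ C = ∅} M1(B)·M2(C). The sketch below is an illustrative implementation of this standard rule; the set labels and mass values are hypothetical.

```python
# A minimal sketch of the standard normalized orthogonal sum (Dempster's rule
# of combination), assuming masses are dicts over frozensets; for non-empty A,
#   (M1 (+) M2)(A) = ( sum of M1(B) * M2(C) over all B, C with B ∩ C = A ) / (1 - K),
# where K = sum of M1(B) * M2(C) over all B, C with B ∩ C = ∅ (the conflict).
def combine(m1, m2):
    combined = {}
    conflict = 0.0
    for B, v1 in m1.items():
        for C, v2 in m2.items():
            inter = B & C
            if inter:
                combined[inter] = combined.get(inter, 0.0) + v1 * v2
            else:
                conflict += v1 * v2                 # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}   # renormalize

# Hypothetical evidence from two independent sources over the frame {"Dead", "Alive"}.
m1 = {frozenset(["Dead"]): 0.5, frozenset(["Alive"]): 0.2, frozenset(["Dead", "Alive"]): 0.3}
m2 = {frozenset(["Dead"]): 0.6, frozenset(["Dead", "Alive"]): 0.4}
for A, v in combine(m1, m2).items():
    print(sorted(A), round(v, 3))   # Dead: 0.773, Alive: 0.091, {Dead, Alive}: 0.136
```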