Bayesian data analysis

Bayesian data analysis is based on [[:Wikipedia:Bayesian inference|Bayesian inference]].
<ref>D.S. Sivia, ''Data Analysis: A Bayesian Tutorial'', Oxford University Press, USA (1996) ISBN 0198518897</ref>
<ref>P. Gregory, ''Bayesian Logical Data Analysis for the Physical Sciences'', Cambridge University Press, Cambridge (2005) ISBN 052184150X</ref>
Briefly, this approach is based on the following straightforward property of probability distributions. Let ''p(x,y)'' be the joint probability of observing ''x'' and ''y'' simultaneously. Let ''p(x|y)'' be the ''conditional'' probability of observing ''x'', given ''y''. Then, by definition
:<math>p(x|y)p(y) = p(x,y) = p(y|x)p(x)\,</math>
from which follows ''Bayes' theorem'':
:<math>p(x|y) = \frac{p(y|x)p(x)}{p(y)}</math>
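As a minimal numerical illustration of Bayes' theorem, consider a diagnostic test for a rare condition (all numbers here are made up for illustration):

```python
# Hypothetical numbers for a rare-condition diagnostic test.
p_x = 0.01             # prior p(x): condition present
p_y_given_x = 0.95     # likelihood p(y|x): positive test, given the condition
p_y_given_not_x = 0.05 # false-positive rate p(y|not x)

# evidence p(y), by the law of total probability
p_y = p_y_given_x * p_x + p_y_given_not_x * (1 - p_x)

# posterior p(x|y), by Bayes' theorem
p_x_given_y = p_y_given_x * p_x / p_y
print(round(p_x_given_y, 3))  # 0.161
```

Even after a positive test, the posterior probability of the condition remains modest, because the small prior ''p(x)'' weighs against the likelihood.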
 
== Interpretation of Bayes' Theorem ==
 
In the framework of data analysis, this expression is interpreted as follows.
Given the initial knowledge of a system, quantified by the ''prior'' probability ''p(x)'' of system states ''x'', one makes an observation ''y'' with probability (or degree of confidence) ''p(y)'', adding to the prior knowledge.
The ''posterior'' probability ''p(x|y)'' quantifies this enhanced knowledge of the system, ''given'' the observation ''y''.
Thus, the Bayesian inference rule allows one to (a) gradually improve the knowledge of a system by adding observations, and (b) easily combine information from diverse sources by formulating the degree of knowledge of the system and the observations in terms of probability distributions.
The quantity ''p(y|x)'', called the ''likelihood'', is the fundamental quantity linking the prior and posterior distributions: it expresses the probability of observing ''y'', given the prior knowledge ''x''.
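Point (a), the gradual improvement of knowledge by adding observations, can be sketched with a conjugate Beta–Bernoulli model (the prior and the data below are hypothetical, chosen only for illustration):

```python
# Sequential Bayesian updating: the posterior after each observation
# becomes the prior for the next (hypothetical Bernoulli data).
a, b = 1.0, 1.0               # flat Beta(1, 1) prior on a success probability
observations = [1, 0, 1, 1, 0, 1]
for y in observations:
    # conjugate update: Beta(a, b) prior + Bernoulli datum -> Beta posterior
    a, b = a + y, b + (1 - y)
posterior_mean = a / (a + b)  # mean of Beta(5, 3)
print(posterior_mean)         # 0.625
```

Each datum tightens the distribution around the underlying success probability, which is exactly the gradual refinement of knowledge described above.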
 
== Forward modelling ==
 
The likelihood is evaluated using a ''forward model'' of the experiment, returning the value of simulated measurements while ''assuming'' a given physical state ''x'' of the experimental system.
Mathematically, this forward model (mapping system parameters to measurements) is often much easier to evaluate than the reverse mapping (from measurements to system parameters), as the latter is often the inverse of a projection, which is therefore typically ill-determined.
On the other hand, evaluating the forward model requires detailed knowledge of the physical system and the complete measurement process.
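A forward model and its associated likelihood can be sketched as follows; the profile shape, chord positions, and Gaussian error model are all assumptions made for this example, not taken from any particular diagnostic:

```python
# Hypothetical system state x = (n0, p): a profile n(r) = n0 * (1 - r^2)^p,
# sampled by a diagnostic at a few fixed radial positions.
R_CHORDS = [0.1, 0.4, 0.7]

def forward_model(n0, p):
    """Simulated measurements, *assuming* the physical state (n0, p)."""
    return [n0 * (1.0 - r * r) ** p for r in R_CHORDS]

def log_likelihood(y_obs, n0, p, sigma):
    """Gaussian log-likelihood of observing y_obs, given the state (n0, p)."""
    y_sim = forward_model(n0, p)
    return -0.5 * sum(((yo - ys) / sigma) ** 2 for yo, ys in zip(y_obs, y_sim))
```

Evaluating `forward_model` is a direct function call, whereas recovering ''(n0, p)'' from the three measurements would require inverting the projection — the ill-determined direction mentioned above.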
 
== Comparison with Function Parametrization ==
 
[[Function parametrization]] (FP) is another statistical technique for recovering system parameters from diverse measurements.
Like FP, Bayesian data analysis requires a ''forward model'' to predict the measurement readings for any given state of the physical system, and the state of the physical system and the measurement process are ''parametrized''. However:
* instead of computing an estimate of the inverse of the forward model (as with FP), Bayesian analysis finds the best model state corresponding to a specific measurement by a maximization procedure (maximization of the likelihood);
* the handling of error propagation is more sophisticated within Bayesian analysis, allowing non-Gaussian error distributions and fully general, complex parameter interdependencies; and
* additionally, it provides a systematic way to include prior knowledge into the analysis.
Typically, the maximization process is CPU intensive, so that Bayesian analysis is not usually suited for real-time data analysis (unlike FP).
 
== Integrated Data Analysis ==
 
The goal of Integrated Data Analysis (IDA) is to combine the information from a set of diagnostics providing complementary information in order to recover the best possible reconstruction of the actual state of the system subjected to measurement. This goal overlaps with the goal of Bayesian data analysis, but IDA applies Bayesian inference in a relatively loose manner to allow incorporating information obtained with traditional or non-Bayesian methods.
<ref>[http://dx.doi.org/10.1088/0741-3335/44/8/306 R. Fischer, C. Wendland, A. Dinklage, et al, ''Thomson scattering analysis with the Bayesian probability theory'', Plasma Phys. Control. Fusion '''44''' (2002) 1501]</ref>
<ref>[http://dx.doi.org/10.1088/0741-3335/45/7/304 R. Fischer, A. Dinklage, and E. Pasch, ''Bayesian modelling of fusion diagnostics'', Plasma Phys. Control. Fusion '''45''' (2003) 1095-1111]</ref>
<ref>[http://www.new.ans.org/pubs/journals/fst/a_10892 R. Fischer, C.J. Fuchs, B. Kurzan, et al., ''Integrated Data Analysis of Profile Diagnostics at ASDEX Upgrade'', Fusion Sci. Technol. '''58''' (2010) 675]</ref>
<ref>[http://link.aip.org/link/doi/10.1063/1.3608551 B.Ph. van Milligen, T. Estrada, E. Ascasíbar, et al, ''Integrated data analysis at TJ-II: the density profile'', Rev. Sci. Instrum. '''82''' (2011) 073503]</ref>
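The combination of complementary diagnostics can be sketched as follows: for independent measurement errors, the joint likelihood is the product of the per-diagnostic likelihoods, so the log-likelihoods simply add (all numbers below are hypothetical):

```python
def log_like(d, alpha, sigma):
    # Gaussian log-likelihood of a single measurement d of parameter alpha
    return -0.5 * ((d - alpha) / sigma) ** 2

d1, s1 = 1.0, 0.2  # diagnostic 1: more precise
d2, s2 = 1.4, 0.4  # diagnostic 2: complementary, less precise

# scan a parameter grid; independent diagnostics -> log-likelihoods add
alphas = [i * 0.001 for i in range(2001)]
combined = [log_like(d1, a, s1) + log_like(d2, a, s2) for a in alphas]
alpha_best = alphas[combined.index(max(combined))]
print(round(alpha_best, 3))  # 1.08
```

The result matches the inverse-variance weighted mean, (''d''&#8321;/&sigma;&#8321;&sup2; + ''d''&#8322;/&sigma;&#8322;&sup2;)/(1/&sigma;&#8321;&sup2; + 1/&sigma;&#8322;&sup2;) = 1.08, illustrating how the more precise diagnostic dominates the combined estimate.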
== Bayes' Theorem ==
For parameter estimation, Bayes' theorem is expressed as follows:
:<math>
P(\vec {\alpha} |\vec {d},\vec{\sigma},I)
= \frac{{L}(\vec {d} | \vec {\alpha}, \vec{\sigma}, I)\pi(\vec {\alpha} |I)}{\int  {L}(\vec {d}|\vec {\alpha}, \vec{\sigma}, I)\pi(\vec {\alpha} |I)d\vec {\alpha}}
</math>
Here, ''&alpha;'' is the set of model parameters, and ''P'' is the probability distribution of these parameters, ''given'' the experimental data ''d'', their errors ''&sigma;'', and additional information ''I''.
Bayes' theorem expresses this probability distribution as the product of two factors: the ''likelihood'' ''L'' of obtaining the cited experimental data ''given'' values of the model parameters ''&alpha;'' (as well as ''&sigma;'' and ''I''), and the ''prior distribution'' ''&pi;'', which expresses the knowledge concerning the model parameters preceding any measurement.
The likelihood ''L'' is computed using a ''forward model'' of the experiment (see above), returning the value of simulated measurements while ''assuming'' a given physical state of the experimental system.
The normalization (the integral in the denominator) ensures that the posterior is a proper probability distribution; it is not needed, however, to determine the best values of the parameters and their errors.
The optimum reconstruction is determined by ''maximizing'' the posterior ''P'', varying the parameters ''&alpha;''.
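On a discretized parameter space, the expression above can be evaluated directly. The sketch below uses a single hypothetical parameter ''&alpha;'', one measurement with Gaussian errors, and a flat prior:

```python
import math

d, sigma = 1.2, 0.3                      # one measurement and its error
alphas = [i * 0.01 for i in range(301)]  # parameter grid on [0, 3]

def likelihood(d, alpha, sigma):
    # Gaussian likelihood L(d | alpha, sigma)
    return math.exp(-0.5 * ((d - alpha) / sigma) ** 2)

prior = [1.0 / len(alphas)] * len(alphas)                # flat pi(alpha)
unnorm = [likelihood(d, a, sigma) * p for a, p in zip(alphas, prior)]
evidence = sum(unnorm)                                   # the denominator
posterior = [u / evidence for u in unnorm]               # normalized P

# the optimum reconstruction maximizes the posterior; the normalization
# does not change the location of the maximum
alpha_map = alphas[posterior.index(max(posterior))]
print(round(alpha_map, 2))  # 1.2
```

With a flat prior the posterior peaks at the likelihood maximum; an informative prior would shift the maximum accordingly.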

== See also ==