# Model validation

In fusion plasma physics, rather complex models are used to simulate, e.g., transport.
Given the large number of parameters in such (numerical) models, the question arises whether these models are 'really true', i.e., whether they accurately describe physical reality.
In other words, the models should be *validated*.^{[1]}^{[2]}

## Model verification and validation

The term 'verification' is understood to refer to an internal consistency check of the model in itself, i.e., to verify that it actually computes what it is meant to compute without (numerical or conceptual) error. This contrasts with 'validation', which refers to checking that the model actually describes the physical reality to which it applies within a given error margin. Evidently, validation is the more difficult of the two checks.

## The logical trap

Anyone would agree that the logical inference 'if A is true, then B must be true' combined with the observation that 'B is true' does not imply that 'A is true'.
And yet this mistake appears to be rather common: if a given plasma model (A) describes a given experimental result (B), it is inferred that the model must be correct. This inference is erroneous, because the agreement may be fortuitous or due to constraints that are hard to identify, or the data interpretation problem may be *badly posed* (see below).
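The fallacy can be made explicit by enumerating truth values. The following is a minimal sketch in plain Python; the helper `implies` is introduced here purely for illustration. It lists every combination of A and B that is consistent with both premises, 'if A then B' and 'B is true':

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: 'if a is true, then b must be true'."""
    return (not a) or b


# Keep only the truth assignments (A, B) that satisfy both premises:
# (1) 'if A then B' and (2) 'B is true'.
consistent = [
    (a, b)
    for a, b in product([True, False], repeat=2)
    if implies(a, b) and b
]

print(consistent)
```

Two assignments survive, one with A true and one with A false, so observing B leaves A undecided.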

There are several ways of avoiding this trap:

First, the logical inference can be completed with a clause like 'A is the *only* circumstance for which B can be true'. Then, if B is true, it is obvious that A must be true.
An example of this is the bootstrap current, which has been observed and for which neoclassical (NC) theory provides the only available reasonable explanation.
It is therefore generally considered that NC theory has been validated via the correct prediction of the bootstrap current.

Second, the number of experiments for which 'A implies B' holds can be increased. E.g., if a model (A) is found to describe the observations (B) for a large number of significantly different cases (experimental situations), without 'tweaking' parameters, then confidence in the validity of the model is enhanced. It can, however, never be *proven* that the model will always work in this way: its validity will always remain subject to further testing.
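This procedure can be sketched as follows. The example is illustrative only: the one-parameter model `y = k*x`, the synthetic data, the measurement error `SIGMA` and the acceptance level `THRESHOLD` are all assumptions made up for this sketch. The parameter is calibrated once on a single case and then frozen while the model is confronted with independent cases.

```python
# Illustrative sketch: model form, data and threshold are made up.

def model(x, k):
    """Hypothetical one-parameter model for the observable."""
    return k * x


def fit_k(xs, ys):
    """Closed-form least-squares estimate of k for y = k*x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def chi2_per_point(xs, ys, sigma, k):
    """Mean squared residual, in units of the measurement error."""
    return sum(((y - model(x, k)) / sigma) ** 2 for x, y in zip(xs, ys)) / len(xs)


SIGMA = 0.2      # assumed measurement error
THRESHOLD = 2.0  # assumed acceptance level for chi2 per point

# Calibrate once, on a single case.
k_frozen = fit_k([1.0, 2.0, 3.0], [2.1, 3.9, 6.0])

# Independent cases, evaluated WITHOUT re-fitting k.
cases = {
    "case_B": ([4.0, 5.0], [8.1, 9.8]),
    "case_C": ([10.0, 20.0], [17.0, 30.0]),
}
results = {
    name: chi2_per_point(xs, ys, SIGMA, k_frozen) <= THRESHOLD
    for name, (xs, ys) in cases.items()
}

for name, ok in results.items():
    print(name, "consistent" if ok else "inconsistent")
```

The important design choice is that `k_frozen` is never adjusted after calibration. A case that fails, as `case_C` does with these made-up numbers, marks a limit of the model's range of validity rather than an invitation to re-tune.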

## Hidden assumptions

The equations describing the behaviour of plasmas are mostly known (Maxwell's equations, etc.) but are intractable due to the large number of particles involved.
Hence, simplifying assumptions are always made, usually of the type 'assume *X >> Y* '. It is quite common that these assumptions are not made fully explicit, which entails the risk that they are violated in some specific case without this being detected. Therefore, it is important to clarify as precisely as possible under what conditions the model is valid, and to check that these conditions are met for all relevant applications of the model.
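One way to keep such assumptions explicit is to record each ordering 'X >> Y' next to the model and test it before the model is applied. The sketch below is hypothetical: the class `Assumption`, the threshold `min_ratio` (one possible numerical reading of '>>') and the example values are assumptions for illustration, not part of any particular code.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    """An ordering assumption of the form 'large >> small'."""
    name: str
    large: float
    small: float
    min_ratio: float = 10.0  # assumed meaning of '>>'; choose per model

    def holds(self) -> bool:
        return abs(self.large) >= self.min_ratio * abs(self.small)


def violated(assumptions):
    """Return the names of all assumptions that are not satisfied."""
    return [a.name for a in assumptions if not a.holds()]


# Illustrative values in metres (made up for this sketch).
checks = [
    Assumption("system size >> gyroradius", large=0.2, small=0.003),
    Assumption("gradient length >> gyroradius", large=0.02, small=0.003),
]
failed = violated(checks)
print(failed)
```

With these numbers the second ordering is flagged, since the ratio is below the chosen threshold. Running such a check for every case in which the model is applied makes a violated assumption visible instead of silent.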

Of course, initial and boundary conditions are just as important to specify.

## Circular reasoning

Ideally, one would like the model only to take input from boundary and/or initial conditions, and predict the experimental outcome.
However, experimental measurements are often taken as input (e.g., a density profile) to predict (using a model, e.g., NC theory) another experimental profile (e.g., the radial electric field). In this case, agreement with radial electric field measurements only shows *consistency* with the theory used; it does not prove that the model is correct in and of itself. The agreement may be due to constraints that are also obeyed by other (transport) theories, while the input profile may be (partly) a result of turbulent transport (not contained in NC theory).

## Badly posed problems

A so-called 'badly posed problem' is one in which many different choices of model parameters map to the same measurement outcome (within the measurement error); i.e., the mapping from parameters to measurements acts as a projection. Consequently, no analysis of the available measurements can reveal the 'true' value of all model parameters, even if the model itself is correct. It may only be possible to determine the values of some parameters (with their corresponding errors), while others remain largely or completely undetermined because the measurements carry no information about them.

Of course, models cannot be validated with respect to parameters that are poorly determined in a given experimental situation.
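A toy illustration of such a degeneracy, under made-up assumptions: in the hypothetical model `y = a*b*x` only the product `a*b` enters the prediction, so the synthetic data below constrain that product but say nothing about `a` and `b` separately.

```python
# Illustrative sketch: the model and the data are made up.

XS = [1.0, 2.0, 3.0]
YS = [6.1, 11.9, 18.0]
SIGMA = 0.2  # assumed measurement error


def model(x, a, b):
    """Hypothetical model in which only the product a*b is observable."""
    return a * b * x


def chi2(a, b):
    """Sum of squared residuals, in units of the measurement error."""
    return sum(((y - model(x, a, b)) / SIGMA) ** 2 for x, y in zip(XS, YS))


# Three parameter pairs with the same product, and one with a different product.
candidates = [(1.0, 6.0), (2.0, 3.0), (6.0, 1.0), (2.0, 4.0)]
scores = {ab: chi2(*ab) for ab in candidates}

for (a, b), s in scores.items():
    print(f"a={a}, b={b}, a*b={a * b}: chi2={s:.2f}")
```

The three pairs with `a*b = 6` fit the data equally well, whereas the pair with `a*b = 8` is clearly rejected. The data therefore determine the product, while the individual values of `a` and `b` remain undetermined however small the measurement error is.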