# Assessing map uncertainty
Once you have generated an antigenic map there are several methods for assessing uncertainty, and different potential sources of uncertainty to consider. The different functions in Racmacs relate to these different sources as follows:
| Uncertainty | Method |
|---|---|
| Geometric uncertainty | `triangulationBlobs()` |
| Sample variation | `bootstrapMap()` with `method = "resample"` or `method = "bayesian"` |
| Measurement noise and bias | `bootstrapMap()` with `method = "noisy"` |
This article discusses these sources of error and the corresponding functions to help you decide how best to apply them to your data.
With all these methods it is important to remember that we are assessing the degree of uncertainty in the solution to the map, given the map as a model for your data. This is an important distinction since we are not addressing the question of how well a particular map represents the data, which is covered in other articles. It is analogous to fitting a straight regression line through some data: the line could be a bad fit to the true patterns in the data, yet you could still be very confident that, if you really insist on fitting a straight line, you have the best parameters for the slope and intercept. It is the same with these methods: we are addressing the confidence we have in the best-fit parameters of our model, not whether the best fit is a good representation of the data.
A first source of uncertainty to consider is how constrained the point positions in your map are, given the information you have. In an extreme case, if you had only a single measurement for an antigen against a single serum, you could calculate an expected distance of the antigen from that serum, but you would not have sufficient information to triangulate its position in, say, two-dimensional space. In reality the situation is rarely as clear-cut as that and, indeed, in that particular scenario Racmacs would exclude the point with a warning that there were too few titrations to coordinate it. More commonly, the titer information is such that the optimal solution places a point at a particular position on the map, but there is a broader region where the solution is almost as good. This is what the `triangulationBlobs()` function tries to represent. For each point it performs a grid search to see how stress varies for that point in different regions of the space, then draws a blob to demarcate the region where the total stress increase is below a certain threshold. As you can imagine, the stress threshold you choose is an important determinant of how big the blob regions will be. The default of a 1-unit increase in stress is generally sensible, but there is no a priori reason for this value, and it is the relative size of the blobs that gives the most useful information on how constrained certain point positions are relative to others.
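As a minimal sketch of this workflow, the example below reads the example 2004 H3N2 map bundled with Racmacs and calculates triangulation blobs. The `stress_lim` and `grid_spacing` values shown mirror the documented defaults rather than recommendations, and the bundled file path is as used in the package's other articles.

```r
library(Racmacs)

# Read the example 2004 H3N2 map bundled with Racmacs
map <- read.acmap(system.file("extdata/h3map2004.ace", package = "Racmacs"))

# Make sure the map is fully relaxed before calculating blobs
map <- relaxMap(map)

# Grid-search each point and mark the region where total stress
# increases by less than 1 unit
map <- triangulationBlobs(map, stress_lim = 1, grid_spacing = 0.25)

# View the map with the blobs overlaid
view(map)
```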
It is tempting to think of these triangulation blobs as somehow representing confidence or credibility regions, but this is not the case: for example, a point position may be well coordinated geometrically but heavily dependent on a particular titer or set of titers that may not themselves be reliable. In our own work these triangulation blobs mostly help inform whether there is enough information to begin with to really determine the positions of strains well in a given number of dimensions, or whether, for example, more titrations against different types of sera are required.
A key question in statistics is how well parameters fitted against a population sample are likely to represent the parameters of the population as a whole. The standard error of the mean is a classic estimator of this. There is the same issue with antigenic maps, for example how consistent would the map solution be if we imagine a different subset of similar antigens and sera had been titrated? Since there is no clear analytical solution to this, we apply the approach of bootstrapping to estimate these types of uncertainty.
There are several variant bootstrapping methods, and the `bootstrapMap()` function performs these types of analysis with the option to choose between three different approaches: `"resample"` and `"bayesian"`, discussed in this section, and `"noisy"`, discussed further below.
Of the methods above, `method = "resample"` is the most vanilla flavour of bootstrapping, taking the standard approach of resampling the original data with replacement, so that in the resulting dataset some entries will be represented multiple times and some not at all, thereby introducing variation into the original sample.
`method = "bayesian"` is conceptually similar to `"resample"`, but rather than actually resampling the data it achieves a similar effect by re-weighting the data according to a Dirichlet distribution with each repeat. A key difference is that no point is ever actually totally excluded in the bayesian method, so if your map is particularly dependent on a certain antigen or antigens for overall geometric coordination, the bayesian method will tend to yield more consistent results. On the flip side, the resample method can help express uncertainty due to precisely this issue of reliance on a small number of particular strains or sera for overall map coordination, and is therefore the more conservative of the two approaches.
The primary option in these bootstrapping approaches is the number of bootstrap repeats to perform, since each repeat is one realisation of a map under an alternative simulated sample. To get an accurate assessment of how positions vary between runs, at least 1000 repeats are recommended; more repeats help you observe positional uncertainty and variation better, but become more expensive, both in computing time and in the memory needed to store the results.
Further options include whether to apply the resampling to both antigens and sera (the default) or just one or the other; for example, `antigens = FALSE` may help explore how sensitive your particular constellation of antigens is to different samples of sera. You can also decide whether to reoptimize the map from scratch with each bootstrap repeat (the default) or simply relax the map from its current configuration. The latter is considerably faster but is not recommended, since each repeat will have more of a tendency to become trapped in a local optimum similar to the starting configuration and therefore under-represent the true map uncertainty. Relatedly, when reoptimizing from scratch you can also specify the number of optimization runs for each sample. For maps that are easy to optimize only a small number of runs is required, but for larger, more complex maps this should ideally be around the number of optimization runs it took to consistently find the lowest-stress solution for your original map in the first place. Since reoptimization from scratch can be so computationally expensive (1000 bootstrap repeats with 200 reoptimizations from scratch each = 200,000 optimization runs!) the temptation is to skimp on these numbers, but if too few runs are allowed to give a chance of finding optimal solutions, uncertainty in the resulting sample may be over-represented.
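As a sketch of how such a run might look, the call below uses illustrative values for the repeat and optimization counts. Apart from `method` and `antigens`, which are named in this article, the argument names here are assumptions to verify against `?bootstrapMap` in your installed version of Racmacs.

```r
# A sketch of a resampling bootstrap run; argument names other than
# `method` are assumptions here, so check ?bootstrapMap before running
bootstrapped_map <- bootstrapMap(
  map,
  method                   = "resample", # or "bayesian"
  bootstrap_repeats        = 1000,       # at least 1000 recommended
  reoptimize               = TRUE,       # reoptimize from scratch each repeat
  optimizations_per_repeat = 100         # match the runs your original map needed
)
```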
Generally, the resampling-based bootstrap methods described above work best where there are a sufficient number of similar antigens and sera that the sample is meaningfully representative of the distribution of variation in a theoretical wider population, for example multiple antigens and sera from each antigenic cluster in the 2004 H3 map. In reality this is not always the case, but if we have some idea of the expected size of the measurement error associated with the data, we can get some idea of how it may be affecting uncertainty in the resulting map by applying imaginary additional noise to our dataset and seeing how much map positions vary as a result. This is not a perfect approach, since each sample will then have measurement noise applied twice, once originally and once artificially, but since in a simple system the artificial noise will on average cancel out measurement noise as much as it adds to it, it still works as a reasonable estimator of the uncertainty due to measurement noise.
In all datasets we have studied so far we find there are two important categories of noise to consider. The first is the noise expected from general assay variation, which tends to lead to unbiased variation in repeat data. We call this "titer" noise, and it is generally what most people think about when it comes to experimental noise. The second, and far more pernicious, is noise in antigen measurements that leads to a consistent bias in any given set of repeats; for influenza this category includes things like slight fluctuations in starting virus concentration after titrating to 4 HAUs, different RBCs being used, etc. Relatedly, there is also bias that may be consistently present for an antigen even between repeats, due to inherent but non-antigenically related differences, often referred to as high- or low-avidity strains, or strains with a tendency to be more easily neutralized, etc. Since both these types of error have the same result of biasing all titers against an antigen to be consistently higher or lower, we call this "antigen" noise.
While antigenic cartography typically does a great job of averaging out titer noise, antigen noise is trickier, since there is no a priori way to know whether an antigen has low titers because of additional antigenic escape or because of falsely biased low reactivity in the assay. In the real world we often have clues, for example biased variation between repeats done at different times, or low titers against homologous sera, but such information is not built into the cartography software. We can, however, simulate the effects of such noise in creating uncertainty in the map, and that is the idea of the `method = "noisy"` bootstrap approach. As you can perhaps imagine, in addition to the options shared with the methods above, the key parameters are how much titer and antigen noise to simulate. Based on our influenza work, both the titer noise and the antigen noise are set at a default standard deviation of 0.7, but depending on your data these may be wild over- or under-estimates. By default both titer and antigen noise are applied, with titer noise randomly determined and applied separately to each cell in the titer table for each repeat, and antigen noise determined and applied to each row. In practice, especially given the tricky nature of estimating antigen reactivity bias, it can be useful to look at the effect of titer noise on its own first, by setting `antigen_noise_sd = 0`, and then see the effect in combination with some possible per-antigen noise.
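A sketch of that two-step approach is below. `antigen_noise_sd` is named in this article; `titer_noise_sd` is assumed here by analogy, so check `?bootstrapMap` for the exact noise-argument names in your version.

```r
# Step 1: titer noise only, suppressing per-antigen reactivity bias
noisy_titer_only <- bootstrapMap(
  map,
  method           = "noisy",
  titer_noise_sd   = 0.7, # default based on influenza data
  antigen_noise_sd = 0    # switch off per-antigen reactivity noise
)

# Step 2: add possible per-antigen reactivity noise as well
noisy_full <- bootstrapMap(
  map,
  method           = "noisy",
  titer_noise_sd   = 0.7,
  antigen_noise_sd = 0.7
)
```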
Once you have run your desired bootstrap method you will want to visualise the results. One way is to open a map on which you have run bootstrapping in the Racmacs viewer: if bootstrapping data is present, then when you select a point a cloud of points appears, representing the relative position of that point in each of the bootstrap runs, allowing you to visualise the variation explicitly. A second approach is to use the `bootstrapBlobs()` function to calculate an area that encompasses a certain proportion of the positional variation across the repeats. In this case the `conf.level` argument sets the proportion of variation to capture, with the default of 0.68 approximately equivalent to one standard deviation of the normal distribution.
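Putting the last steps together, a minimal sketch (continuing from the `bootstrapped_map` object created above) might look like this:

```r
# Summarise the bootstrap repeats as blobs capturing 68% of the
# positional variation for each point, then view the result
bootstrapped_map <- bootstrapBlobs(bootstrapped_map, conf.level = 0.68)
view(bootstrapped_map)
```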