Take a map object and perform cross-validation, measuring how well titers are predicted when they are excluded from the map.

dimensionTestMap(
  map,
  dimensions_to_test = 1:5,
  test_proportion = 0.1,
  minimum_column_basis = "none",
  fixed_column_bases = rep(NA, numSera(map)),
  number_of_optimizations = 1000,
  replicates_per_dimension = 100,
  options = list()
)

Arguments

map

The acmap data object

dimensions_to_test

A numeric vector of dimensions to be tested

test_proportion

The proportion of data to be used as the test set for each test run

minimum_column_basis

The minimum column basis to use

fixed_column_bases

A vector of fixed column bases with NA for sera where the minimum column basis should be applied

number_of_optimizations

The number of optimizations to perform when creating each map for the dimension test

replicates_per_dimension

The number of tests to perform per dimension tested

options

Map optimizer options; see RacOptimizer.options()

Value

Returns a data frame with the following columns: "dimensions", the dimension tested; "mean_rmse_detectable", the mean prediction RMSE for detectable titers across all runs; "var_rmse_detectable", the variance of the prediction RMSE for detectable titers across all runs, useful for estimating confidence intervals; "mean_rmse_nondetectable" and "var_rmse_nondetectable", the equivalents for non-detectable titers.
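
As a rough illustration of how the mean and variance columns might be combined, the sketch below builds approximate 95% intervals for the mean RMSE and picks the dimension with the lowest mean. The data frame is filled with made-up values, not output from a real map. The normal approximation, with the standard error taken as sqrt(variance / replicates), is an assumption of this sketch and not something the function computes itself.

```r
# Made-up results in the shape described above (illustrative values only)
results <- data.frame(
  dimensions           = 1:3,
  mean_rmse_detectable = c(1.80, 1.20, 1.15),
  var_rmse_detectable  = c(0.04, 0.01, 0.01)
)

# Assumed to match the replicates_per_dimension used for the test
replicates <- 100

# Assumption: normal approximation to the sampling distribution of the mean
se <- sqrt(results$var_rmse_detectable / replicates)
results$lower <- results$mean_rmse_detectable - 1.96 * se
results$upper <- results$mean_rmse_detectable + 1.96 * se

# Dimension with the lowest mean prediction error for detectable titers
best <- results$dimensions[which.min(results$mean_rmse_detectable)]
```

Overlapping intervals between neighbouring dimensions suggest the extra dimension adds little predictive value, so the lowest mean alone should not be read as decisive.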

Details

For each run, the antigen-serum (ag-sr) titers that were randomly excluded are predicted from the relative positions of the corresponding antigen and serum in the map trained without them. An RMSE is then calculated by comparing the predicted log titers inferred from the map with the measured log titers. This is done separately for detectable titers (e.g. 40) and non-detectable titers (e.g. <10). For non-detectable titers, if the predicted log titer is equal to or lower than the log-titer threshold, the error is set to 0.
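
The error calculation described above can be sketched in base R as follows. `prediction_rmse` is a hypothetical helper written for this illustration and is not part of the package; the inputs are arbitrary log-scale numbers, not real titers. For non-detectable titers, `measured` is assumed to hold the log-titer threshold.

```r
# Hypothetical helper (not a package function): RMSE between predicted and
# measured log titers for a set of held-out measurements.
prediction_rmse <- function(predicted, measured, nondetectable = FALSE) {
  err <- predicted - measured
  if (nondetectable) {
    # `measured` is the log-titer threshold here; a prediction at or below
    # the threshold is consistent with the measurement, so its error is 0
    err[predicted <= measured] <- 0
  }
  sqrt(mean(err^2))
}

# Detectable titers: every prediction error contributes
det_rmse <- prediction_rmse(
  predicted = c(5.5, 3.0, 4.0),
  measured  = c(5.0, 3.0, 6.0)
)

# Non-detectable titers: only the prediction above the threshold contributes
nd_rmse <- prediction_rmse(
  predicted     = c(1.0, 2.0, 4.0),
  measured      = c(2.0, 2.0, 2.0),
  nondetectable = TRUE
)
```

Keeping the two groups separate matters because the one-sided rule for non-detectable titers would otherwise pull the combined RMSE towards zero.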