Title: Dataset Shift with Outlier Scores
Description: Test for no adverse shift in two-sample comparison when we have a training set, the reference distribution, and a test set. The approach is flexible and relies on a robust and powerful test statistic, the weighted AUC. Technical details are in Kamulete, V. M. (2021) <arXiv:1908.04000>. Modern notions of outlyingness such as trust scores and prediction uncertainty can be used as the underlying scores, for example.
Authors: Vathy M. Kamulete [aut, cre]; Royal Bank of Canada (RBC) [cph] (research supported and funded by RBC)
Maintainer: Vathy M. Kamulete <[email protected]>
License: GPL (>= 3)
Version: 0.1.2
Built: 2024-11-10 04:45:41 UTC
Source: https://github.com/vathymut/dsos
Convert P-value to Bayes Factor
as_bf(pvalue)
pvalue: P-value.
Bayes Factor (scalar value).
Marsman, M., & Wagenmakers, E. J. (2017). Three insights from a Bayesian interpretation of the one-sided P value. Educational and Psychological Measurement, 77(3), 529-539.
[as_pvalue()] to convert Bayes factor to p-value.
Other bayesian-test: as_pvalue(), bf_compare(), bf_from_os()
library(dsos)
bf_from_pvalue <- as_bf(pvalue = 0.5)
bf_from_pvalue
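A quick round trip for illustration: since as_pvalue() converts in the other direction, applying it to the result should recover the original p-value (a sketch, not part of the package examples).
library(dsos)
p <- 0.05
bf <- as_bf(pvalue = p)  # p-value to Bayes factor
as_pvalue(bf)            # converting back should recover 0.05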
Convert Bayes Factor to P-value
as_pvalue(bf)
bf: Bayes factor.
p-value (scalar value).
Marsman, M., & Wagenmakers, E. J. (2017). Three insights from a Bayesian interpretation of the one-sided P value. Educational and Psychological Measurement, 77(3), 529-539.
[as_bf()] to convert p-value to Bayes factor.
Other bayesian-test: as_bf(), bf_compare(), bf_from_os()
library(dsos)
pvalue_from_bf <- as_pvalue(bf = 1)
pvalue_from_bf
Test for no adverse shift with outlier scores. Like goodness-of-fit testing, this two-sample comparison takes the training (outlier) scores, os_train, as the reference. The method checks whether the test scores, os_test, are worse off relative to this reference set.
at_from_os(os_train, os_test)
os_train: Outlier scores in training (reference) set.
os_test: Outlier scores in test set.
Li and Fine (2010) derive the asymptotic null distribution for the weighted AUC (WAUC), the test statistic. This approach does not use permutations and can, as a result, be much faster because it sidesteps the need to refit the underlying scoring function. This works well for large samples. The prefix at stands for asymptotic test, to tell it apart from the prefix pt, the permutation test.
A named list of class outlier.test containing:
statistic: observed WAUC statistic
seq_mct: sequential Monte Carlo test, when applicable
p_value: p-value
outlier_scores: outlier scores from training and test set
The outlier scores should all mimic out-of-sample behaviour. Mind that in-sample training scores are biased (overfitted) whereas the test scores are out-of-sample; this mismatch – in-sample versus out-of-sample scores – voids the validity of the test. A simple fix is to compute the training scores on an independent (fresh) validation set. This follows the train/validation/test sample-splitting convention, and the validation set then effectively serves as the reference set or distribution.
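A minimal sketch of this validation-set fix, using a Mahalanobis-distance score purely as a stand-in for whatever outlier detector is actually in use:
library(dsos)
set.seed(12345)
x_train <- matrix(rnorm(200 * 4), ncol = 4)  # used only to fit the scorer
x_valid <- matrix(rnorm(100 * 4), ncol = 4)  # fresh (independent) reference set
x_test <- matrix(rnorm(100 * 4), ncol = 4)   # test set under scrutiny
mu <- colMeans(x_train)
sigma <- cov(x_train)
os_valid <- mahalanobis(x_valid, center = mu, cov = sigma)  # out-of-sample reference scores
os_test <- mahalanobis(x_test, center = mu, cov = sigma)    # out-of-sample test scores
at_from_os(os_valid, os_test)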
Kamulete, V. M. (2022). Test for non-negligible adverse shifts. In The 38th Conference on Uncertainty in Artificial Intelligence. PMLR.
Gandy, A. (2009). Sequential implementation of Monte Carlo tests with uniformly bounded resampling risk. Journal of the American Statistical Association, 104(488), 1504-1511.
[at_oob()] for variant requiring a scoring function. [pt_from_os()] for permutation test with the outlier scores.
Other asymptotic-test: at_oob()
library(dsos)
set.seed(12345)
os_train <- rnorm(n = 100)
os_test <- rnorm(n = 100)
test_result <- at_from_os(os_train, os_test)
test_result
Test for no adverse shift with outlier scores. Like goodness-of-fit testing, this two-sample comparison takes the training set, x_train, as the reference. The method checks whether the test set, x_test, is worse off relative to this reference set. The function scorer assigns an outlier score to each instance/observation in both the training and test set.
at_oob(x_train, x_test, scorer)
x_train: Training (reference/validation) sample.
x_test: Test sample.
scorer: Function which returns a named list with outlier scores from the training and test sample. The first argument to scorer is the training sample and the second, the test sample.
Li and Fine (2010) derive the asymptotic null distribution for the weighted AUC (WAUC), the test statistic. This approach does not use permutations and can, as a result, be much faster because it sidesteps the need to refit the scoring function, scorer. This works well for large samples. The prefix at stands for asymptotic test, to tell it apart from the prefix pt, the permutation test.
A named list of class outlier.test containing:
statistic: observed WAUC statistic
seq_mct: sequential Monte Carlo test, when applicable
p_value: p-value
outlier_scores: outlier scores from training and test set
The scoring function, scorer, predicts out-of-bag scores to mimic out-of-sample behaviour. The suffix oob stands for out-of-bag to highlight this point. This out-of-bag variant avoids refitting the underlying algorithm from scorer at every permutation. It can, as a result, be computationally appealing.
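A minimal sketch of such a scorer, using cross-fitted Mahalanobis distances as a stand-in for genuine out-of-bag scores from a real outlier detector:
library(dsos)
set.seed(12345)
cv_scorer <- function(x_train, x_test, k = 5) {
  x_train <- as.matrix(x_train)
  x_test <- as.matrix(x_test)
  # Assign each training row to one of k folds and score it with a model
  # fitted on the remaining folds, so training scores are not in-sample.
  folds <- sample(rep_len(seq_len(k), nrow(x_train)))
  os_train <- numeric(nrow(x_train))
  for (i in seq_len(k)) {
    held_out <- folds == i
    mu <- colMeans(x_train[!held_out, , drop = FALSE])
    sigma <- cov(x_train[!held_out, , drop = FALSE])
    os_train[held_out] <- mahalanobis(x_train[held_out, , drop = FALSE], mu, sigma)
  }
  # Score the test sample with the model fitted on the full training sample.
  os_test <- mahalanobis(x_test, colMeans(x_train), cov(x_train))
  list(train = os_train, test = os_test)
}
data(iris)
at_oob(iris[1:50, 1:4], iris[51:100, 1:4], scorer = cv_scorer)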
Kamulete, V. M. (2022). Test for non-negligible adverse shifts. In The 38th Conference on Uncertainty in Artificial Intelligence. PMLR.
Gandy, A. (2009). Sequential implementation of Monte Carlo tests with uniformly bounded resampling risk. Journal of the American Statistical Association, 104(488), 1504-1511.
[pt_oob()] for (faster) p-value approximation via out-of-bag predictions. [pt_refit()] for (slower) p-value approximation via refitting.
Other asymptotic-test: at_from_os()
library(dsos)
set.seed(12345)
data(iris)
setosa <- iris[1:50, 1:4]        # Training sample: Species == 'setosa'
versicolor <- iris[51:100, 1:4]  # Test sample: Species == 'versicolor'
# Using a fake scoring function
scorer <- function(tr, te) list(train = runif(nrow(tr)), test = runif(nrow(te)))
oob_test <- at_oob(setosa, versicolor, scorer = scorer)
oob_test
Test for no adverse shift with outlier scores. Like goodness-of-fit testing, this two-sample comparison takes the training (outlier) scores, os_train, as the reference. The method checks whether the test scores, os_test, are worse off relative to the training set.
bf_compare(os_train, os_test, threshold = 1/12, n_pt = 4000)
os_train: Outlier scores in training (reference) set.
os_test: Outlier scores in test set.
threshold: Threshold for adverse shift. Defaults to 1/12, the asymptotic value of the test statistic when the two samples are drawn from the same distribution.
n_pt: The number of permutations.
This compares the Bayesian to the frequentist approach for convenience. The Bayesian test mimics bf_from_os() and the frequentist one, pt_from_os(). The Bayesian test computes Bayes factors based on both the asymptotic threshold (defaults to 1/12) and the exchangeable threshold. The latter calculates the threshold as the median weighted AUC (WAUC) after n_pt permutations, assuming the outlier scores are exchangeable; this is recommended for small samples. The frequentist test converts the one-sided (one-tailed) p-value to a Bayes factor – see the as_bf function.
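The correspondence with the frequentist test can be checked directly: up to Monte Carlo error, converting the permutation p-value with as_bf() should land in the same ballpark as the frequentist entry returned by bf_compare(). A quick sketch:
library(dsos)
set.seed(12345)
os_train <- rnorm(n = 100)
os_test <- rnorm(n = 100)
freq_test <- pt_from_os(os_train, os_test)
as_bf(freq_test$p_value)                   # permutation p-value converted to a Bayes factor
bf_compare(os_train, os_test)$frequentist  # frequentist BF reported by bf_compare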
A list of Bayes factors (BF) for 3 different test specifications:
frequentist: frequentist BF.
bayes_noperm: Bayesian BF with the asymptotic threshold.
bayes_perm: Bayesian BF with the exchangeable threshold.
The outlier scores should all mimic out-of-sample behaviour. Mind that in-sample training scores are biased (overfitted) whereas the test scores are out-of-sample; this mismatch – in-sample versus out-of-sample scores – voids the validity of the test. A simple fix is to compute the training scores on an independent (fresh) validation set. This follows the train/validation/test sample-splitting convention, and the validation set then effectively serves as the reference set or distribution.
[bf_from_os()] for bayes factor, the Bayesian test. [pt_from_os()] for p-value, the frequentist test.
Other bayesian-test: as_bf(), as_pvalue(), bf_from_os()
library(dsos)
set.seed(12345)
os_train <- rnorm(n = 100)
os_test <- rnorm(n = 100)
bayes_test <- bf_compare(os_train, os_test)
bayes_test
# To run in parallel on a local cluster, uncomment the next two lines.
# library(future)
# future::plan(future::multisession)
parallel_test <- bf_compare(os_train, os_test)
parallel_test
Test for no adverse shift with outlier scores. Like goodness-of-fit testing, this two-sample comparison takes the training (outlier) scores, os_train, as the reference. The method checks whether the test scores, os_test, are worse off relative to the training set.
bf_from_os(os_train, os_test, n_pt = 4000, threshold = 1/12)
os_train: Outlier scores in training (reference) set.
os_test: Outlier scores in test set.
n_pt: The number of permutations.
threshold: Threshold for adverse shift. Defaults to 1/12, the asymptotic value of the test statistic when the two samples are drawn from the same distribution.
The posterior distribution of the test statistic is based on n_pt (bootstrap) permutations. The method uses the Bayesian bootstrap as the resampling procedure, as in Gu et al. (2008). Johnson (2005) shows how to turn a test statistic into a Bayes factor. The test statistic is the weighted AUC (WAUC).
A named list of class outlier.bayes containing:
posterior: posterior distribution of the WAUC test statistic
threshold: WAUC threshold for adverse shift
adverse_probability: probability of adverse shift
bayes_factor: Bayes factor
outlier_scores: outlier scores from training and test set
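A short sketch of how these elements might be inspected, assuming posterior is a numeric vector of bootstrap WAUC draws:
library(dsos)
set.seed(12345)
os_train <- rnorm(n = 100)
os_test <- rnorm(n = 100)
bayes_test <- bf_from_os(os_train, os_test)
quantile(bayes_test$posterior, probs = c(0.025, 0.5, 0.975))  # spread of the posterior WAUC
bayes_test$threshold             # WAUC threshold for adverse shift
bayes_test$adverse_probability   # posterior probability of adverse shift
bayes_test$bayes_factor          # Bayes factor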
The outlier scores should all mimic out-of-sample behaviour. Mind that in-sample training scores are biased (overfitted) whereas the test scores are out-of-sample; this mismatch – in-sample versus out-of-sample scores – voids the validity of the test. A simple fix is to compute the training scores on an independent (fresh) validation set. This follows the train/validation/test sample-splitting convention, and the validation set then effectively serves as the reference set or distribution.
Kamulete, V. M. (2023). Are you OK? A Bayesian test for adverse shift. Manuscript in preparation.
Johnson, V. E. (2005). Bayes factors based on test statistics. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(5), 689-701.
Gu, J., Ghosal, S., & Roy, A. (2008). Bayesian bootstrap estimation of ROC curve. Statistics in medicine, 27(26), 5407-5420.
Other bayesian-test: as_bf(), as_pvalue(), bf_compare()
library(dsos)
set.seed(12345)
os_train <- rnorm(n = 100)
os_test <- rnorm(n = 100)
bayes_test <- bf_from_os(os_train, os_test)
bayes_test
# To run in parallel on a local cluster, uncomment the next two lines.
# library(future)
# future::plan(future::multisession)
parallel_test <- bf_from_os(os_train, os_test)
parallel_test
Plot Bayesian test for no adverse shift.
## S3 method for class 'outlier.bayes'
plot(x, ...)
x: An outlier.bayes object, as returned by bf_from_os().
...: Placeholder to be compatible with the S3 method.
A ggplot2 plot with outlier scores and p-value.
Other s3-method: plot.outlier.test(), print.outlier.bayes(), print.outlier.test()
set.seed(12345)
os_train <- rnorm(n = 3e2)
os_test <- rnorm(n = 3e2)
test_to_plot <- bf_from_os(os_train, os_test)
plot(test_to_plot)
Plot frequentist test for no adverse shift.
## S3 method for class 'outlier.test'
plot(x, ...)
x: An outlier.test object, as returned by at_from_os() or pt_from_os().
...: Placeholder to be compatible with the S3 method.
A ggplot2 plot with outlier scores and p-value.
Other s3-method: plot.outlier.bayes(), print.outlier.bayes(), print.outlier.test()
set.seed(12345)
os_train <- rnorm(n = 3e2)
os_test <- rnorm(n = 3e2)
test_to_plot <- at_from_os(os_train, os_test)
# Also: pt_from_os(os_train, os_test) for permutation test
plot(test_to_plot)
Print Bayesian test for no adverse shift.
## S3 method for class 'outlier.bayes'
print(x, ...)
x: An outlier.bayes object, as returned by bf_from_os().
...: Placeholder to be compatible with the S3 method.
Print to screen: display Bayes factor and other information.
Other s3-method: plot.outlier.bayes(), plot.outlier.test(), print.outlier.test()
set.seed(12345)
os_train <- rnorm(n = 3e2)
os_test <- rnorm(n = 3e2)
test_to_print <- bf_from_os(os_train, os_test)
test_to_print
Print frequentist test for no adverse shift.
## S3 method for class 'outlier.test'
print(x, ...)
x: An outlier.test object, as returned by at_from_os() or pt_from_os().
...: Placeholder to be compatible with the S3 method.
Print to screen: display p-value and other information.
Other s3-method: plot.outlier.bayes(), plot.outlier.test(), print.outlier.bayes()
set.seed(12345)
os_train <- rnorm(n = 3e2)
os_test <- rnorm(n = 3e2)
test_to_print <- at_from_os(os_train, os_test)
# Also: pt_from_os(os_train, os_test) for permutation test
test_to_print
Test for no adverse shift with outlier scores. Like goodness-of-fit testing, this two-sample comparison takes the training (outlier) scores, os_train, as the reference. The method checks whether the test scores, os_test, are worse off relative to the training set.
pt_from_os(os_train, os_test, n_pt = 2000)
os_train: Outlier scores in training (reference) set.
os_test: Outlier scores in test set.
n_pt: The number of permutations.
The null distribution of the test statistic is based on n_pt permutations. For speed, this is implemented as a sequential Monte Carlo test with the simctest package. See Gandy (2009) for details. The prefix pt refers to permutation test. This approach does not use the asymptotic null distribution for the test statistic. This is the recommended approach for small samples. The test statistic is the weighted AUC (WAUC).
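As a quick sanity check, the permutation p-value can be compared with its asymptotic counterpart on the same scores; with reasonably large samples the two should roughly agree (a sketch, subject to Monte Carlo error):
library(dsos)
set.seed(12345)
os_train <- rnorm(n = 300)
os_test <- rnorm(n = 300)
pt_from_os(os_train, os_test)$p_value  # permutation (sequential Monte Carlo) p-value
at_from_os(os_train, os_test)$p_value  # asymptotic p-value, see at_from_os()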
A named list of class outlier.test containing:
statistic: observed WAUC statistic
seq_mct: sequential Monte Carlo test, when applicable
p_value: p-value
outlier_scores: outlier scores from training and test set
The outlier scores should all mimic out-of-sample behaviour. Mind that in-sample training scores are biased (overfitted) whereas the test scores are out-of-sample; this mismatch – in-sample versus out-of-sample scores – voids the validity of the test. A simple fix is to compute the training scores on an independent (fresh) validation set. This follows the train/validation/test sample-splitting convention, and the validation set then effectively serves as the reference set or distribution.
Kamulete, V. M. (2022). Test for non-negligible adverse shifts. In The 38th Conference on Uncertainty in Artificial Intelligence. PMLR.
Gandy, A. (2009). Sequential implementation of Monte Carlo tests with uniformly bounded resampling risk. Journal of the American Statistical Association, 104(488), 1504-1511.
[pt_oob()] for variant requiring a scoring function. [at_from_os()] for asymptotic test with the outlier scores.
Other permutation-test: pt_oob(), pt_refit()
library(dsos)
set.seed(12345)
os_train <- rnorm(n = 100)
os_test <- rnorm(n = 100)
null_test <- pt_from_os(os_train, os_test)
null_test
Test for no adverse shift with outlier scores. Like goodness-of-fit testing, this two-sample comparison takes the training set, x_train, as the reference. The method checks whether the test set, x_test, is worse off relative to this reference set. The function scorer assigns an outlier score to each instance/observation in both the training and test set.
pt_oob(x_train, x_test, scorer, n_pt = 2000)
x_train: Training (reference/validation) sample.
x_test: Test sample.
scorer: Function which returns a named list with outlier scores from the training and test sample. The first argument to scorer is the training sample and the second, the test sample.
n_pt: The number of permutations.
The null distribution of the test statistic is based on n_pt permutations. For speed, this is implemented as a sequential Monte Carlo test with the simctest package. See Gandy (2009) for details. The prefix pt refers to permutation test. This approach does not use the asymptotic null distribution for the test statistic. This is the recommended approach for small samples. The test statistic is the weighted AUC (WAUC).
A named list of class outlier.test containing:
statistic: observed WAUC statistic
seq_mct: sequential Monte Carlo test, when applicable
p_value: p-value
outlier_scores: outlier scores from training and test set
The scoring function, scorer, predicts out-of-bag scores to mimic out-of-sample behaviour. The suffix oob stands for out-of-bag to highlight this point. This out-of-bag variant avoids refitting the underlying algorithm from scorer at every permutation. It can, as a result, be computationally appealing.
Kamulete, V. M. (2022). Test for non-negligible adverse shifts. In The 38th Conference on Uncertainty in Artificial Intelligence. PMLR.
Gandy, A. (2009). Sequential implementation of Monte Carlo tests with uniformly bounded resampling risk. Journal of the American Statistical Association, 104(488), 1504-1511.
[pt_refit()] for (slower) p-value approximation via refitting. [at_oob()] for p-value approximation from asymptotic null distribution.
Other permutation-test: pt_from_os(), pt_refit()
library(dsos)
set.seed(12345)
data(iris)
idx <- sample(nrow(iris), 2 / 3 * nrow(iris))
iris_train <- iris[idx, ]
iris_test <- iris[-idx, ]
# Use a synthetic (fake) scoring function for illustration
scorer <- function(tr, te) list(train = runif(nrow(tr)), test = runif(nrow(te)))
pt_test <- pt_oob(iris_train, iris_test, scorer = scorer)
pt_test
Test for no adverse shift with outlier scores. Like goodness-of-fit testing, this two-sample comparison takes the training set, x_train, as the reference. The method checks whether the test set, x_test, is worse off relative to this reference set. The function scorer assigns an outlier score to each instance/observation in both the training and test set.
pt_refit(x_train, x_test, scorer, n_pt = 2000)
x_train: Training (reference/validation) sample.
x_test: Test sample.
scorer: Function which returns a named list with outlier scores from the training and test sample. The first argument to scorer is the training sample and the second, the test sample.
n_pt: The number of permutations.
The null distribution of the test statistic is based on n_pt permutations. For speed, this is implemented as a sequential Monte Carlo test with the simctest package. See Gandy (2009) for details. The prefix pt refers to permutation test. This approach does not use the asymptotic null distribution for the test statistic. This is the recommended approach for small samples. The test statistic is the weighted AUC (WAUC).
A named list of class outlier.test containing:
statistic: observed WAUC statistic
seq_mct: sequential Monte Carlo test, when applicable
p_value: p-value
outlier_scores: outlier scores from training and test set
The scoring function, scorer, predicts out-of-sample scores by refitting the underlying algorithm from scorer at every permutation. The suffix refit emphasizes this point. This is in contrast to the out-of-bag variant, pt_oob, which only fits once. This method can be computationally expensive.
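A rough way to see the cost difference, reusing the fake runif scorer from the examples (with a trivial scorer the gap is small; with a real detector, pt_refit pays the refitting cost at every permutation while pt_oob fits only once):
library(dsos)
set.seed(12345)
data(iris)
setosa <- iris[1:50, 1:4]
versicolor <- iris[51:100, 1:4]
scorer <- function(tr, te) list(train = runif(nrow(tr)), test = runif(nrow(te)))
system.time(pt_oob(setosa, versicolor, scorer = scorer))
system.time(pt_refit(setosa, versicolor, scorer = scorer))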
Kamulete, V. M. (2022). Test for non-negligible adverse shifts. In The 38th Conference on Uncertainty in Artificial Intelligence. PMLR.
Gandy, A. (2009). Sequential implementation of Monte Carlo tests with uniformly bounded resampling risk. Journal of the American Statistical Association, 104(488), 1504-1511.
[pt_oob()] for (faster) p-value approximation via out-of-bag predictions. [at_oob()] for p-value approximation from asymptotic null distribution.
Other permutation-test: pt_from_os(), pt_oob()
library(dsos)
set.seed(12345)
data(iris)
setosa <- iris[1:50, 1:4]        # Training sample: Species == 'setosa'
versicolor <- iris[51:100, 1:4]  # Test sample: Species == 'versicolor'
scorer <- function(tr, te) list(train = runif(nrow(tr)), test = runif(nrow(te)))
pt_test <- pt_refit(setosa, versicolor, scorer = scorer)
pt_test
Computes the weighted AUC with the weighting scheme described in Kamulete, V. M. (2021). This assumes that the training set is the reference distribution and specifies a particular functional form to derive weights from threshold scores.
wauc_from_os(os_train, os_test, weight = NULL)
os_train: Outlier scores in training (reference) set.
os_test: Outlier scores in test set.
weight: Optional numeric vector of weights; defaults to NULL.
The weighted AUC (scalar value) given the weighting scheme.
Kamulete, V. M. (2022). Test for non-negligible adverse shifts. In The 38th Conference on Uncertainty in Artificial Intelligence. PMLR.
library(dsos)
set.seed(12345)
os_train <- rnorm(n = 100)
os_test <- rnorm(n = 100)
test_stat <- wauc_from_os(os_train, os_test)
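Under no shift, the WAUC should hover around 1/12, the asymptotic value quoted for the threshold argument of bf_compare() and bf_from_os(). A small simulation sketch:
library(dsos)
set.seed(12345)
wauc_null <- replicate(200, {
  scores <- rnorm(200)
  wauc_from_os(scores[1:100], scores[101:200])
})
mean(wauc_null)  # should be close to 1 / 12 (about 0.083)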