
Returns a function that computes the selective p-value for a hypothesized value b of beta1, conditional on rejection of the overall F-test. The p-value is estimated by Monte Carlo integration that accounts for the selection event.

Usage

get_pselb(
  X,
  y,
  sigma_sq,
  yPy = NULL,
  rss = NULL,
  alpha_ov = 0.05,
  B = 10000,
  min_select = B,
  B_max = 1e+07,
  verbose = FALSE
)

Arguments

X

A numeric matrix of predictors with n rows and p columns. The first column corresponds to the coefficient of interest.

y

A numeric response vector of length n.

sigma_sq

The variance of the error term. Must be specified in advance.

yPy

Optional value of the quadratic form y' P_X y. If NULL, it is computed internally.

rss

Optional residual sum of squares. If NULL, it is computed internally using y and yPy.

alpha_ov

Significance level for the overall F-test. Default is 0.05.

B

Number of Monte Carlo samples drawn per iteration. Default is 10,000.

min_select

Minimum number of selected samples required to estimate the p-value. Default is equal to B.

B_max

Maximum number of Monte Carlo samples drawn in any iteration. Default is 10 million.

verbose

Logical flag indicating whether to print progress messages during sampling.
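
The optional yPy and rss arguments can be precomputed when the same design matrix and response are reused. The sketch below shows one way to obtain them in base R, assuming P_X denotes the orthogonal projection onto the column space of X and that rss equals y'y minus y' P_X y. The simulated X and y are purely illustrative and are not taken from the package.

```r
# Illustrative data (hypothetical, for demonstration only)
set.seed(1)
n <- 50
p <- 3
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rnorm(n)

# Project y onto the column space of X via a QR decomposition
qrX <- qr(X)
fitted_y <- qr.fitted(qrX, y)

# Quadratic form y' P_X y (assumes P_X is the orthogonal projection onto col(X))
yPy <- sum(fitted_y^2)

# Residual sum of squares, using y'y = y' P_X y + rss
rss <- sum(y^2) - yPy
```

These values could then be passed as the yPy and rss arguments instead of leaving them NULL.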

Value

A function that takes a numeric argument b and returns the selective p-value for that hypothesized value of beta1. If the observed data do not satisfy the selection condition, the returned function always returns NA.
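
To see whether the observed data satisfy the selection condition, and hence whether the returned function can produce anything other than NA, the overall F-test can be checked directly. The sketch below assumes a model without an intercept and the usual overall F-statistic, (yPy / p) / (rss / (n - p)), compared against the 1 - alpha_ov quantile of an F distribution with p and n - p degrees of freedom. The exact statistic used inside get_pselb may differ, and the simulated data are hypothetical.

```r
# Illustrative data (hypothetical): only the first coefficient is nonzero
set.seed(1)
n <- 50
p <- 3
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(X %*% c(1, 0, 0) + rnorm(n))
alpha_ov <- 0.05

# Explained and residual sums of squares
yPy <- sum(qr.fitted(qr(X), y)^2)
rss <- sum(y^2) - yPy

# Assumed form of the overall F-statistic (no intercept)
F_obs <- (yPy / p) / (rss / (n - p))

# Critical value at level alpha_ov
F_crit <- qf(1 - alpha_ov, df1 = p, df2 = n - p)

# TRUE when the overall F-test rejects, i.e. the selection event occurs
is_selected <- F_obs >= F_crit
```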

Details

The returned function estimates the conditional probability that a test statistic T(b) exceeds its observed value, given that the data pass the F-test threshold. Sampling continues until at least min_select Monte Carlo draws satisfy the selection condition. If this cannot be achieved within the sampling limits set by B and B_max, the returned function returns NA.
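
The sampling scheme described above can be illustrated with a generic sketch. It is not the package's implementation: the helper cond_tail_mc and its arguments draw, selected, and stat are hypothetical, the batch size is held fixed at B, and B_max is treated here as a cap on the total number of draws, which may differ from how get_pselb schedules its sampling. The toy problem conditions a bivariate standard normal on a chi-square selection event.

```r
# Estimate P(stat >= t_obs | selected) by batched Monte Carlo.
# Hypothetical helper: draw(m) returns m samples, selected() and stat()
# act on those samples row-wise.
cond_tail_mc <- function(draw, selected, stat, t_obs,
                         B = 10000, min_select = B, B_max = 1e7) {
  n_sel <- 0      # draws satisfying the selection condition
  n_exceed <- 0   # selected draws whose statistic reaches t_obs
  n_drawn <- 0    # total draws so far
  while (n_sel < min_select && n_drawn + B <= B_max) {
    samp <- draw(B)
    keep <- selected(samp)
    n_sel <- n_sel + sum(keep)
    n_exceed <- n_exceed + sum(stat(samp)[keep] >= t_obs)
    n_drawn <- n_drawn + B
  }
  # Too few selected draws within the budget: no estimate is returned
  if (n_sel < min_select) return(NA_real_)
  n_exceed / n_sel
}

# Toy example: two independent standard normals, selected when their
# squared norm exceeds the 95% chi-square quantile
set.seed(42)
draw <- function(m) matrix(rnorm(2 * m), ncol = 2)
selected <- function(z) rowSums(z^2) > qchisq(0.95, df = 2)
stat <- function(z) abs(z[, 1])

p_hat <- cond_tail_mc(draw, selected, stat, t_obs = 2,
                      B = 10000, min_select = 2000)

# A selection event that never occurs exhausts the budget and yields NA
p_none <- cond_tail_mc(draw, function(z) rep(FALSE, nrow(z)), stat,
                       t_obs = 2, B = 1000, min_select = 10, B_max = 5000)
```

Requiring a minimum number of selected draws keeps the ratio estimate from being based on a handful of samples when the selection event is rare, at the cost of returning NA when the budget runs out.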