Construct the Selective P-Value Function for a Single Coefficient

Returns a function that computes the selective p-value for a hypothesized value b of beta1, conditional on the overall F-test having rejected. The p-value is estimated by Monte Carlo integration, accounting for the selection event.

Usage

get_pselb(
  X,
  y,
  sigma_sq,
  yPy = NULL,
  rss = NULL,
  alpha_ov = 0.05,
  B = 10000,
  min_select = B,
  B_max = 1e+07,
  verbose = FALSE
)

Arguments

X: A numeric matrix of predictors with n rows and p columns. The first column corresponds to the coefficient of interest.
y: A numeric response vector of length n.
sigma_sq: The variance of the error term. Must be specified in advance.
yPy: Optional value of the quadratic form y' P_X y. If NULL, it is computed internally.
rss: Optional residual sum of squares. If NULL, it is computed internally using y and yPy.
alpha_ov: Significance level for the overall F-test. Default is 0.05.
B: Number of Monte Carlo samples drawn per iteration. Default is 10,000.
min_select: Minimum number of selected samples required to estimate the p-value. Default is equal to B.
B_max: Maximum number of Monte Carlo samples drawn in any iteration. Default is 10 million.
verbose: Logical flag indicating whether to print progress messages during sampling.
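
To make the roles of yPy, rss, and alpha_ov concrete, the following is a minimal sketch of how these quantities could be computed by hand for a model without an intercept. It assumes the overall test is the standard F-test of all p coefficients against zero; the package's internal computation may differ. The simulated data and the variable names f_stat, f_crit, and selected are illustrative and not part of the package.

```r
# Simulated data for illustration only.
set.seed(42)
n <- 50
p <- 3
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% c(1, 0, 0)) + rnorm(n)

# Quadratic form y' P_X y, where P_X projects onto the column space of X.
# qr.fitted() returns P_X y, so the inner product with y gives y' P_X y.
yPy <- sum(qr.fitted(qr(X), y) * y)

# Residual sum of squares, obtained from y and yPy: y'y - y' P_X y.
rss <- sum(y^2) - yPy

# Overall F statistic (assumed form) and its critical value at alpha_ov = 0.05.
f_stat <- (yPy / p) / (rss / (n - p))
f_crit <- qf(1 - 0.05, df1 = p, df2 = n - p)

# TRUE when the observed data satisfy the (assumed) selection condition.
selected <- f_stat >= f_crit
```

Values computed this way could be passed as the yPy and rss arguments to avoid recomputation when the function is called repeatedly on the same data.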

Value

A function that takes a numeric argument b and returns the selective p-value corresponding to that value of beta1. If the selection condition is not met by the observed data, the returned function will always return NA.

Details

The returned function estimates the conditional probability that a test statistic T(b) exceeds the observed value, given that the data pass the F-test threshold. Sampling proceeds until at least min_select Monte Carlo draws satisfy the selection condition. If this cannot be achieved within the sample limits set by B and B_max, the returned function returns NA.
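
The sampling scheme can be illustrated with a small, self-contained sketch. It is not the package's implementation: the helper mc_conditional_pvalue, the toy sampler toy_batch, and the max_draws argument are hypothetical, and the sketch assumes a fixed batch size with a cap on the total number of draws, which may differ from how B and B_max are actually used. The toy model takes Z ~ N(0, 1), T = |Z|, and the selection event |Z| > 1, so the estimate can be compared with a closed-form answer.

```r
# Hypothetical helper: estimates P(T >= t_obs | selected) by drawing batches
# until at least min_select draws satisfy the selection event.
mc_conditional_pvalue <- function(draw_batch, t_obs, B = 1000,
                                  min_select = B, max_draws = 1e6) {
  kept <- numeric(0)  # statistics from draws that passed selection
  n_drawn <- 0
  while (length(kept) < min_select) {
    # Give up if another batch would exceed the total sampling budget.
    if (n_drawn + B > max_draws) return(NA_real_)
    batch <- draw_batch(B)
    n_drawn <- n_drawn + B
    kept <- c(kept, batch$stat[batch$selected])
  }
  # Proportion of selected draws at least as extreme as the observed value.
  mean(kept >= t_obs)
}

# Toy sampler: statistic |Z| with selection event |Z| > 1.
toy_batch <- function(B) {
  z <- rnorm(B)
  list(stat = abs(z), selected = abs(z) > 1)
}

set.seed(1)
p_hat <- mc_conditional_pvalue(toy_batch, t_obs = 2, B = 10000)

# Closed form for the toy model: P(|Z| >= 2) / P(|Z| > 1).
p_exact <- pnorm(-2) / pnorm(-1)

# With a budget too small to collect min_select selected draws, the result is NA.
p_na <- mc_conditional_pvalue(toy_batch, t_obs = 2, B = 100,
                              min_select = 1000, max_draws = 200)
```

Only draws that satisfy the selection event enter the final proportion, which is why a rare selection event requires many more raw draws than min_select.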