step_measure_drift_spline() creates a specification of a recipe step
that corrects for signal drift using smoothing splines fit to QC samples.
This offers more flexibility than linear correction while being more stable
than LOESS for sparse QC data.
Arguments
- recipe
A recipe object.
- ...
One or more selector functions to choose feature columns. For feature-level data, select the numeric response columns. For curve-level data with .measures, leave empty to apply to all locations.
- run_order_col
Name of the column containing run order (injection sequence). Must be numeric/integer.
- sample_type_col
Name of the column containing sample type.
- qc_type
Value(s) in sample_type_col that identify QC samples to use for drift modeling. Default is "qc".
- apply_to
Which samples to apply correction to:
"all" (default): Correct all samples
"unknown": Only correct unknown samples
- df
Degrees of freedom for the smoothing spline. Default is NULL, which uses cross-validation to select the optimal df. Lower values give a smoother curve.
- spar
Smoothing parameter (alternative to df). If NULL (default), cross-validation is used.
- min_qc
Minimum number of QC samples required. Default is 5.
- role
Not used by this step.
- trained
Logical indicating if the step has been trained.
- skip
Logical. Should the step be skipped when baking?
- id
Unique step identifier.
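The effect of df, and of leaving both df and spar unset, can be sketched directly with stats::smooth.spline(), the function this step uses. This is only an illustration on made-up QC values: how the step forwards df and spar internally, and which cross-validation criterion it requests, are assumptions here (cv = TRUE below asks smooth.spline() for leave-one-out cross-validation; its own default is generalized cross-validation).

```r
# Made-up QC responses along a run (not from any real instrument)
set.seed(1)
qc_run_order <- seq(1, 30, by = 3)   # 10 QC injections
qc_response  <- 100 + 10 * sin(qc_run_order / 5) +
  rnorm(length(qc_run_order), sd = 2)

# Fixed degrees of freedom: a lower df gives a smoother, stiffer curve
fit_df3 <- stats::smooth.spline(qc_run_order, qc_response, df = 3)
fit_df6 <- stats::smooth.spline(qc_run_order, qc_response, df = 6)

# Neither df nor spar supplied: smooth.spline() selects the smoothing
# level itself (assumption: this mirrors the step's df = NULL behaviour)
fit_cv <- stats::smooth.spline(qc_run_order, qc_response, cv = TRUE)

# Residual sum of squares: the more flexible fit tracks the QCs more closely
rss <- function(fit) sum((predict(fit, qc_run_order)$y - qc_response)^2)
c(rss_df3 = rss(fit_df3), rss_df6 = rss(fit_df6), df_cv = fit_cv$df)
```

With few QC samples, a flexible fit can chase noise rather than drift, which is presumably why the step enforces min_qc.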
Details
How It Works
Uses stats::smooth.spline() to fit a flexible curve through the QC responses.
When neither df nor spar is specified, the amount of smoothing is chosen by
cross-validation, so the spline adapts to the complexity of the data.
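The idea above can be sketched in base R as follows. This is a minimal illustration, not the step's actual implementation: the correction rule (dividing each response by the predicted drift and rescaling to the QC median) is an assumption, since the exact rule is not stated here, and the variable names are hypothetical.

```r
# Simulated run with non-linear drift (same shape as the Examples data)
set.seed(123)
n <- 30
run_order   <- seq_len(n)
sample_type <- rep(c("qc", "unknown", "unknown", "unknown", "unknown", "qc"), 5)
feature1    <- 100 + sin(run_order / 5) * 10 + rnorm(n, sd = 2)

is_qc <- sample_type == "qc"

# Fit the drift curve to the QC samples only; smoothing is chosen automatically
fit <- stats::smooth.spline(run_order[is_qc], feature1[is_qc])

# Predict the drift at every injection, QC and unknown alike
drift <- predict(fit, x = run_order)$y

# ASSUMED correction rule: divide out the drift, then rescale to the QC median
corrected <- feature1 / drift * median(feature1[is_qc])

# Spread of the QC samples before and after correction
c(before = sd(feature1[is_qc]), after = sd(corrected[is_qc]))
```

Restricting the assignment to sample_type == "unknown" would correspond to apply_to = "unknown".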
See also
step_measure_drift_linear() for linear correction,
step_measure_drift_qc_loess() for LOESS-based correction.
Other drift-correction:
step_measure_drift_linear(),
step_measure_drift_qc_loess(),
step_measure_qc_bracket()
Examples
library(recipes)
# Data with non-linear drift
set.seed(123)
data <- data.frame(
  sample_id = paste0("S", 1:30),
  sample_type = rep(c("qc", "unknown", "unknown", "unknown", "unknown", "qc"), 5),
  run_order = 1:30,
  feature1 = 100 + sin((1:30) / 5) * 10 + rnorm(30, sd = 2)
)
rec <- recipe(~ ., data = data) |>
  update_role(sample_id, new_role = "id") |>
  step_measure_drift_spline(feature1) |>
  prep()
corrected <- bake(rec, new_data = NULL)