step_measure_drift_linear() creates a specification of a recipe step
that corrects for linear signal drift across run order using QC or reference
samples. This is a simpler alternative to LOESS when drift is approximately
linear.
Arguments
- recipe
A recipe object.
- ...
One or more selector functions to choose feature columns. For feature-level data, select the numeric response columns. For curve-level data with
.measures, leave empty to apply to all locations.
- run_order_col
Name of the column containing run order (injection sequence). Must be numeric/integer.
- sample_type_col
Name of the column containing sample type.
- qc_type
Value(s) in sample_type_col that identify QC samples to use for drift modeling. Default is "qc".
- apply_to
Which samples to apply correction to:
"all" (default): Correct all samples
"unknown": Only correct unknown samples
- min_qc
Minimum number of QC samples required. Default is 5.
- role
Not used by this step.
- trained
Logical indicating if the step has been trained.
- skip
Logical. Should the step be skipped when baking?
- id
Unique step identifier.
Details
When to Use
Use linear drift correction when:
Drift is approximately linear over the run
You have only a few QC samples (at least 3 are required)
You want a more conservative correction
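One way to judge whether drift is "approximately linear" is to compare a straight-line fit of the QC signal against a fit with an added curvature term. The following is a minimal base R sketch on simulated QC data. The data frame `qc` and the choice of an F-test are illustrative assumptions, not part of this step.

```r
set.seed(1)

# Hypothetical QC-only data: run order and one feature with linear drift
qc <- data.frame(run_order = seq(1, 40, by = 5))
qc$feature1 <- 100 + 0.5 * qc$run_order + rnorm(nrow(qc), sd = 1)

# Straight-line fit versus a fit with an added quadratic term
fit_linear <- lm(feature1 ~ run_order, data = qc)
fit_quad   <- lm(feature1 ~ poly(run_order, 2), data = qc)

# F-test for the extra curvature term; a small p-value suggests
# that the drift is not well described by a straight line
comparison  <- anova(fit_linear, fit_quad)
p_curvature <- comparison[["Pr(>F)"]][2]
p_curvature
```

With only a handful of QC samples this test has little power, so a plot of QC signal against run order is usually worth checking as well.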
For non-linear drift patterns, use step_measure_drift_qc_loess() or
step_measure_drift_spline().
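To make the idea of linear drift correction concrete, the sketch below fits a straight line to the QC samples, predicts the trend at every injection, and divides it out. It uses base R only. The ratio-based rescaling to the QC median is an assumed convention for illustration, and the step itself may normalize differently.

```r
set.seed(42)

# Simulated data mirroring the layout used in the Examples section
dat <- data.frame(
  sample_type = rep(c("qc", "unknown", "unknown", "unknown", "qc"), 4),
  run_order   = 1:20
)
dat$feature1 <- 100 + dat$run_order * 0.5 + rnorm(20, sd = 2)

# 1. Fit a straight line to the QC samples only
qc_rows <- dat$sample_type == "qc"
fit <- lm(feature1 ~ run_order, data = dat[qc_rows, ])

# 2. Predict the drift trend at every injection (QC and unknown)
trend <- predict(fit, newdata = dat)

# 3. Divide out the trend and rescale to the QC median (assumed convention)
qc_median <- median(dat$feature1[qc_rows])
dat$feature1_corrected <- dat$feature1 / trend * qc_median

# Compare the QC slope before and after correction
slope_before <- coef(fit)[["run_order"]]
slope_after  <- coef(
  lm(feature1_corrected ~ run_order, data = dat[qc_rows, ])
)[["run_order"]]
```

After correction the QC slope should be much closer to zero than before, which is a quick sanity check for any drift-correction method.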
See also
step_measure_drift_qc_loess() for LOESS-based correction,
step_measure_drift_spline() for spline-based correction.
Other drift-correction:
step_measure_drift_qc_loess(),
step_measure_drift_spline(),
step_measure_qc_bracket()
Examples
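library(recipes)
library(measure)  # provides step_measure_drift_linear()

set.seed(123)  # make the simulated noise reproducible

# Data with linear drift; QC samples open and close each block of five injections
data <- data.frame(
  sample_id = paste0("S", 1:20),
  sample_type = rep(c("qc", "unknown", "unknown", "unknown", "qc"), 4),
  run_order = 1:20,
  feature1 = 100 + (1:20) * 0.5 + rnorm(20, sd = 2)
)

# Define the recipe and estimate the drift model from the QC samples
rec <- recipe(~ ., data = data) |>
  update_role(sample_id, new_role = "id") |>
  step_measure_drift_linear(feature1) |>
  prep()

# Apply the correction to the training data
corrected <- bake(rec, new_data = NULL)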