step_measure_subtract_blank() creates a specification of a recipe step
that subtracts or divides by a blank measurement. The blank can be provided
externally or learned from training data.
Usage
step_measure_subtract_blank(
  recipe,
  blank = NULL,
  blank_col = NULL,
  blank_value = NULL,
  method = "subtract",
  measures = NULL,
  role = NA,
  trained = FALSE,
  learned_blank = NULL,
  skip = FALSE,
  id = recipes::rand_id("measure_subtract_blank")
)
Arguments
- recipe
A recipe object. The step will be added to the sequence of operations for this recipe.
- blank
An optional external blank to use. Can be:
A measure_tbl object with location and value columns
A numeric vector (must match the number of locations in data)
A data.frame with location and value columns (will be interpolated)
If NULL, the blank is learned from training data using blank_col and blank_value.
- blank_col
An optional column name (unquoted) that identifies sample types. Used with blank_value to identify blank samples in training data.
- blank_value
The value in blank_col that identifies blank samples. When the step is prepped, the mean of all blank samples is computed and stored for use during baking.
- method
The correction method to apply:
"subtract" (default): Subtract the blank from each spectrum
"divide": Divide each spectrum by the blank
- measures
An optional character vector of measure column names to process. If NULL (the default), all measure columns (columns with class measure_list) will be processed.
- role
Not used by this step since no new variables are created.
- trained
A logical to indicate if the quantities for preprocessing have been estimated.
- learned_blank
A named list containing the learned blank values for each measure column. This is NULL until the step is trained.
- skip
A logical. Should the step be skipped when the recipe is baked?
- id
A character string that is unique to this step to identify it.
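The blank argument notes that a data.frame blank "will be interpolated" onto the locations in the data. This page does not say which interpolation is used, so the sketch below assumes linear interpolation with stats::approx() purely for illustration. The names blank_df and data_locations are hypothetical and not part of the package.

```r
# Hypothetical blank measured on a coarse grid of locations
blank_df <- data.frame(
  location = c(1, 3, 5),
  value    = c(0.10, 0.30, 0.50)
)

# Hypothetical locations at which the sample spectra were measured
data_locations <- 1:5

# Assumption: linear interpolation; the step's actual method may differ
blank_on_grid <- stats::approx(
  x    = blank_df$location,
  y    = blank_df$value,
  xout = data_locations
)$y

# One blank value per data location, ready for element-wise correction
blank_on_grid
```

A numeric vector blank skips this alignment entirely, which is why its length must already match the number of locations.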
Value
An updated version of recipe with the new step added to the
sequence of any existing operations.
Details
Blank subtraction is a fundamental preprocessing step in analytical chemistry. It removes background signal that is present in all measurements but is not related to the analyte of interest.
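The arithmetic behind the two method options can be sketched in base R on a single toy spectrum. The vectors below are made-up illustrative values, and the sketch assumes the spectrum and the blank are already aligned on the same locations; it is not the package's internal code.

```r
# Hypothetical spectrum and blank, measured at the same five locations
spectrum <- c(0.52, 0.61, 0.75, 0.66, 0.58)
blank    <- c(0.10, 0.10, 0.12, 0.11, 0.10)

# method = "subtract": remove an additive background signal
corrected_subtract <- spectrum - blank

# method = "divide": express the signal relative to the blank
corrected_divide <- spectrum / blank
```

Subtraction suits backgrounds that add to the signal, while division suits cases where the blank acts as a multiplicative reference.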
Two modes of operation:
External blank: You provide a blank spectrum directly via the blank argument. This is useful when you have a known reference blank.
Learned blank: You specify which samples are blanks in your training data using blank_col and blank_value. During prep(), the mean of all blank samples is computed and stored. This approach is useful for batch-specific blank correction.
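The learned-blank mode can be illustrated with a small base R sketch: average the rows flagged as blanks at each location, then subtract that mean from every spectrum. The matrix spectra and the vector sample_type are hypothetical stand-ins for the recipe's internal data, and whether blank rows are kept after baking is not stated on this page; the sketch simply keeps them.

```r
# Hypothetical training data: one row per sample, one column per location
spectra <- rbind(
  c(0.10, 0.12, 0.11),  # blank
  c(0.12, 0.10, 0.13),  # blank
  c(0.50, 0.80, 0.60),  # sample
  c(0.40, 0.70, 0.55)   # sample
)
sample_type <- c("blank", "blank", "sample", "sample")

# "prep" stage: average the blank rows at each location and store the result
learned_blank <- colMeans(spectra[sample_type == "blank", , drop = FALSE])

# "bake" stage: subtract the stored blank from every spectrum
corrected <- sweep(spectra, 2, learned_blank, FUN = "-")
```

Because the mean is stored at prep time, the same blank is applied to new data at bake time even when that data contains no blank samples.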
Common use cases:
UV-Vis: Remove solvent absorbance
IR: Remove atmospheric CO2/H2O interference
Fluorescence: Remove buffer background and Raman scatter
Chromatography: Remove ghost peaks and solvent artifacts
No selectors should be supplied to this step function. The data should be
in the internal format produced by step_measure_input_wide() or
step_measure_input_long().
Tidying
When you tidy() this step, a tibble with columns
terms, method, blank_source, and id is returned.
See also
step_measure_subtract_reference() for simpler subtraction of an external reference
Other measure-preprocessing:
step_measure_absorbance(),
step_measure_calibrate_x(),
step_measure_calibrate_y(),
step_measure_derivative(),
step_measure_derivative_gap(),
step_measure_emsc(),
step_measure_kubelka_munk(),
step_measure_log(),
step_measure_map(),
step_measure_msc(),
step_measure_normalize_istd(),
step_measure_osc(),
step_measure_ratio_reference(),
step_measure_snv(),
step_measure_subtract_reference(),
step_measure_transmittance()
Examples
library(recipes)
library(measure)  # provides the step_measure_*() functions and meats_long

# Example with external blank (numeric vector)
# The vector length must match the number of locations in the data
blank_spectrum <- rep(0.1, 100)
rec <- recipe(water + fat + protein ~ ., data = meats_long) |>
  update_role(id, new_role = "id") |>
  step_measure_input_long(transmittance, location = vars(channel)) |>
  step_measure_subtract_blank(blank = blank_spectrum)

# Example learning blank from training data
# (assuming sample_type column with "blank" values)
# rec <- recipe(outcome ~ ., data = my_data) |>
#   step_measure_input_long(...) |>
#   step_measure_subtract_blank(blank_col = sample_type, blank_value = "blank")