step_measure_baseline_poly() creates a specification of a recipe step
that applies polynomial baseline correction to measurement data. The method
fits a polynomial to the spectrum, optionally with iterative peak exclusion.
Usage
step_measure_baseline_poly(
recipe,
measures = NULL,
degree = 2L,
max_iter = 0L,
threshold = 1.5,
role = NA,
trained = FALSE,
skip = FALSE,
id = recipes::rand_id("measure_baseline_poly")
)
Arguments
- recipe
A recipe object. The step will be added to the sequence of operations for this recipe.
- measures
An optional character vector of measure column names to process. If NULL (the default), all measure columns (columns with class measure_list) will be processed.
- degree
Polynomial degree for baseline fitting. Default is 2 (quadratic). Higher degrees fit more complex baselines but risk overfitting. Tunable via baseline_degree().
- max_iter
Maximum number of iterations for peak exclusion. Default is 0 (no iteration, fit polynomial to all points). Set to a positive integer to iteratively exclude points above the fitted baseline.
- threshold
Number of standard deviations above baseline for a point to be excluded in iterative fitting. Default is 1.5. Only used when max_iter > 0.
- role
Not used by this step since no new variables are created.
- trained
A logical to indicate if the quantities for preprocessing have been estimated.
- skip
A logical. Should the step be skipped when the recipe is baked?
- id
A character string that is unique to this step to identify it.
Value
An updated version of recipe with the new step added to the
sequence of any existing operations.
Details
Polynomial baseline correction fits a polynomial function to the spectrum and subtracts it. This is effective for removing smooth, curved baselines caused by instrumental drift, scattering, or other slowly varying effects.
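As a minimal sketch of the fit-and-subtract idea, the following uses base R only (it is not the package's internal code) to fit a quadratic to a synthetic spectrum and subtract it. The data and variable names are made up for illustration.

```r
# Synthetic spectrum: curved baseline plus one narrow peak (illustrative data)
x <- 1:100
true_baseline <- 0.5 + 0.02 * x - 0.0001 * x^2
peak <- exp(-(x - 50)^2 / (2 * 3^2))
y <- true_baseline + peak

# Fit a degree-2 polynomial to every point, then subtract the fitted curve
fit <- lm(y ~ poly(x, 2))
baseline <- fitted(fit)
corrected <- y - baseline

# Index of the largest corrected value (the peak position)
which.max(corrected)
```

Because the peak itself takes part in the fit here, the estimated baseline is pulled slightly upward near the peak.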
When max_iter > 0, the algorithm uses iterative peak exclusion:
1. Fit polynomial to all points
2. Calculate residuals (spectrum - baseline)
3. Exclude points where residual > threshold * SD(residuals)
4. Refit polynomial to remaining points
5. Repeat until convergence or max_iter reached
This iterative approach prevents peaks from pulling up the baseline estimate.
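The loop described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: poly_baseline() is a hypothetical helper, the standard deviation is taken over the points still included in the fit, and exclusion is cumulative (a dropped point is never re-admitted).

```r
# Hypothetical helper (not part of the package): polynomial baseline with
# optional iterative peak exclusion
poly_baseline <- function(x, y, degree = 2L, max_iter = 0L, threshold = 1.5) {
  dat <- data.frame(x = x, y = y)
  keep <- rep(TRUE, nrow(dat))
  # Initial fit uses all points; with max_iter = 0 this is the final fit
  fit <- lm(y ~ poly(x, degree), data = dat)
  for (i in seq_len(max_iter)) {
    # Residuals of every point against the current baseline
    res <- dat$y - predict(fit, newdata = dat)
    # Drop points lying more than threshold * SD above the baseline
    # (assumption: SD is computed from the points still in the fit)
    new_keep <- keep & (res <= threshold * sd(res[keep]))
    # Stop on convergence, or if too few points would remain for the fit
    if (identical(new_keep, keep) || sum(new_keep) <= degree + 1L) break
    keep <- new_keep
    # Refit to the remaining points only
    fit <- lm(y ~ poly(x, degree), data = dat[keep, ])
  }
  # Evaluate the final polynomial at every x, including excluded points
  unname(predict(fit, newdata = dat))
}

# Synthetic data: known quadratic baseline, one tall peak, a little noise
set.seed(1)
x <- 1:200
true_baseline <- 2 + 0.01 * x - 0.00002 * x^2
y <- true_baseline + 5 * exp(-(x - 100)^2 / (2 * 4^2)) + rnorm(200, sd = 0.05)

plain <- poly_baseline(x, y, degree = 2L)
iterated <- poly_baseline(x, y, degree = 2L, max_iter = 5L, threshold = 1.5)

# Largest deviation from the known baseline, without and with peak exclusion
max(abs(plain - true_baseline))
max(abs(iterated - true_baseline))
```

On this synthetic example, the iterated estimate tracks the known baseline more closely than the plain fit, because the peak region no longer takes part in the fit.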
Degree selection:
- degree = 1: Linear baseline (for simple drift)
- degree = 2: Quadratic (most common, handles gentle curvature)
- degree = 3-5: Higher-order (for complex baselines, use cautiously)
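To see why higher degrees call for caution, the sketch below (base R, synthetic data, illustrative names only) fits polynomials of increasing degree to a spectrum with a quadratic baseline and one broad peak, and reports the sum of squares left after subtraction. Once the degree is high enough to capture the baseline, any further drop means the polynomial is absorbing the peak itself.

```r
# Synthetic spectrum: quadratic baseline plus one broad peak (illustrative)
x <- seq(0, 1, length.out = 200)
y <- (1 + 2 * x - 1.5 * x^2) + exp(-(x - 0.5)^2 / (2 * 0.08^2))

degrees <- c(1L, 2L, 5L, 9L)

# Sum of squares remaining after subtracting a polynomial of each degree
signal_left <- vapply(degrees, function(d) {
  corrected <- y - fitted(lm(y ~ poly(x, d)))
  sum(corrected^2)
}, numeric(1))
names(signal_left) <- paste0("degree_", degrees)

signal_left
```

The values can only stay level or fall as the degree grows, since each lower-degree fit is a special case of the higher-degree one. A low value is therefore not evidence of a better baseline.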
No selectors should be supplied to this step function. The data should be
in the internal format produced by step_measure_input_wide() or
step_measure_input_long().
Tidying
When you tidy() this step, a tibble with columns
terms, degree, and id is returned.
See also
Other measure-baseline:
step_measure_baseline_airpls(),
step_measure_baseline_als(),
step_measure_baseline_arpls(),
step_measure_baseline_auto(),
step_measure_baseline_custom(),
step_measure_baseline_gpc(),
step_measure_baseline_minima(),
step_measure_baseline_morph(),
step_measure_baseline_py(),
step_measure_baseline_rf(),
step_measure_baseline_rolling(),
step_measure_baseline_snip(),
step_measure_baseline_tophat(),
step_measure_detrend()
Examples
library(recipes)
library(measure) # assumed to be the package providing step_measure_*() and meats_long
# Simple polynomial baseline (no iteration)
rec <- recipe(water + fat + protein ~ ., data = meats_long) |>
  update_role(id, new_role = "id") |>
  step_measure_input_long(transmittance, location = vars(channel)) |>
  step_measure_baseline_poly(degree = 2) |>
  prep()
bake(rec, new_data = NULL)
#> # A tibble: 215 × 5
#>       id water   fat protein .measures
#>    <int> <dbl> <dbl>   <dbl> <meas>
#>  1     1  60.5  22.5    16.7 [100 × 2]
#>  2     2  46    40.1    13.5 [100 × 2]
#>  3     3  71     8.4    20.5 [100 × 2]
#>  4     4  72.8   5.9    20.7 [100 × 2]
#>  5     5  58.3  25.5    15.5 [100 × 2]
#>  6     6  44    42.7    13.7 [100 × 2]
#>  7     7  44    42.7    13.7 [100 × 2]
#>  8     8  69.3  10.6    19.3 [100 × 2]
#>  9     9  61.4  19.9    17.7 [100 × 2]
#> 10    10  61.4  19.9    17.7 [100 × 2]
#> # ℹ 205 more rows
# With iterative peak exclusion
rec2 <- recipe(water + fat + protein ~ ., data = meats_long) |>
  update_role(id, new_role = "id") |>
  step_measure_input_long(transmittance, location = vars(channel)) |>
  step_measure_baseline_poly(degree = 3, max_iter = 5, threshold = 2) |>
  prep()