step_measure_baseline_fastchrom() creates a specification of a recipe step that applies fast baseline correction optimized for chromatography data.

Usage

step_measure_baseline_fastchrom(
  recipe,
  measures = NULL,
  lambda = 1e+06,
  window = 50L,
  max_iter = 10L,
  role = NA,
  trained = FALSE,
  skip = FALSE,
  id = recipes::rand_id("measure_baseline_fastchrom")
)

Arguments

recipe

A recipe object.

measures

An optional character vector of measure column names.

lambda

Smoothness parameter for the penalized least squares fit. Larger values give a smoother, stiffer baseline. Default is 1e6.

window

Window size for local minima detection. Default is 50.

max_iter

Maximum number of refinement iterations. Default is 10.

role

Not used.

trained

Logical indicating if the step has been trained.

skip

Logical. Should the step be skipped when baking?

id

Unique step identifier.

Value

An updated recipe with the new step added.

Details

This algorithm combines morphological operations with penalized least squares for fast and robust baseline estimation:

  1. Finds local minima using a rolling window

  2. Smooths the minima to get an initial baseline estimate

  3. Iteratively refines the estimate using weighted penalized least squares (at most max_iter iterations)

Particularly effective for size-exclusion and gel permeation chromatography (SEC/GPC) and for other analytical techniques with well-defined peaks.
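
The three steps above can be illustrated with a minimal base R sketch. This is not the package's internal implementation: the function names (rolling_min, pls_smooth, sketch_baseline), the second-order difference penalty, the asymmetric weight of 0.001 for points above the baseline, and the convergence tolerance are all assumptions chosen for illustration. Only the argument names lambda, window, and max_iter and their defaults come from this page.

```r
# Illustrative sketch only; all function names here are hypothetical and
# this is NOT the code used by step_measure_baseline_fastchrom().

# Step 1: rolling-window minimum centred on each point.
rolling_min <- function(y, window) {
  n <- length(y)
  half <- window %/% 2L
  vapply(seq_len(n), function(i) {
    min(y[max(1L, i - half):min(n, i + half)])
  }, numeric(1))
}

# Weighted penalized least squares smoother (assumed form): minimises
# sum(w * (y - z)^2) + lambda * sum(diff(z, differences = 2)^2)
# by solving (W + lambda * D'D) z = W y.
pls_smooth <- function(y, w, lambda) {
  n <- length(y)
  D <- diff(diag(n), differences = 2)  # (n - 2) x n second-difference matrix
  as.numeric(solve(diag(w) + lambda * crossprod(D), w * y))
}

sketch_baseline <- function(y, lambda = 1e6, window = 50L, max_iter = 10L) {
  # Step 2: smooth the rolling minima to get an initial baseline.
  base <- pls_smooth(rolling_min(y, window), rep(1, length(y)), lambda)

  # Step 3: refine. Points above the current baseline (likely peaks) get a
  # small weight; points at or below it keep full weight.
  for (i in seq_len(max_iter)) {
    w <- ifelse(y > base, 0.001, 1)
    new_base <- pls_smooth(y, w, lambda)
    converged <- max(abs(new_base - base)) < 1e-6 * max(abs(y))
    base <- new_base
    if (converged) break
  }
  base
}

# Synthetic chromatogram: linear drift plus one Gaussian peak.
x <- 1:200
drift <- 0.01 * x
y <- drift + 5 * exp(-(x - 100)^2 / (2 * 5^2))

est <- sketch_baseline(y)
corrected <- y - est  # baseline-corrected signal
```

In this sketch the rolling minimum only seeds the first set of weights. A dense solve is used for brevity; for long signals a sparse or banded solver would be the more practical choice.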

Examples

library(recipes)

# \donttest{
rec <- recipe(water + fat + protein ~ ., data = meats_long) |>
  update_role(id, new_role = "id") |>
  step_measure_input_long(transmittance, location = vars(channel)) |>
  step_measure_baseline_fastchrom(lambda = 1e6, window = 50) |>
  prep()

bake(rec, new_data = NULL)
#> # A tibble: 215 × 6
#>       id water   fat protein .measures channel    
#>    <int> <dbl> <dbl>   <dbl>    <meas> <list>     
#>  1     1  60.5  22.5    16.7 [100 × 2] <int [100]>
#>  2     2  46    40.1    13.5 [100 × 2] <int [100]>
#>  3     3  71     8.4    20.5 [100 × 2] <int [100]>
#>  4     4  72.8   5.9    20.7 [100 × 2] <int [100]>
#>  5     5  58.3  25.5    15.5 [100 × 2] <int [100]>
#>  6     6  44    42.7    13.7 [100 × 2] <int [100]>
#>  7     7  44    42.7    13.7 [100 × 2] <int [100]>
#>  8     8  69.3  10.6    19.3 [100 × 2] <int [100]>
#>  9     9  61.4  19.9    17.7 [100 × 2] <int [100]>
#> 10    10  61.4  19.9    17.7 [100 × 2] <int [100]>
#> # ℹ 205 more rows
# }