Python-Based Baseline Correction via pybaselines
Source: R/baseline-py.R
step_measure_baseline_py() creates a specification of a recipe step
that applies baseline correction using the Python pybaselines library,
which provides 50+ baseline correction algorithms.
Usage
step_measure_baseline_py(
recipe,
method = "asls",
...,
subtract = TRUE,
measures = NULL,
role = NA,
trained = FALSE,
skip = FALSE,
id = recipes::rand_id("measure_baseline_py")
)
Arguments
- recipe
A recipe object. The step will be added to the sequence of operations for this recipe.
- method
The pybaselines method to use. Common methods include:
Whittaker methods: "asls", "iasls", "airpls", "arpls", "drpls", "psalsa"
Polynomial methods: "poly", "modpoly", "imodpoly", "loess", "quant_reg"
Morphological: "mor", "imor", "rolling_ball", "tophat"
Spline: "pspline_asls", "pspline_airpls", "mixture_model"
Smooth: "snip", "swima", "noise_median"
Classification: "dietrich", "golotvin", "fastchrom"
See the pybaselines documentation for the full list.
- ...
Additional arguments passed to the pybaselines method. Common parameters include:
lam: Smoothness parameter for Whittaker methods (default varies by method)
p: Asymmetry parameter for ALS methods (default ~0.01)
poly_order: Polynomial degree for polynomial methods
half_window: Window size for morphological methods
max_half_window: Maximum window for SNIP method
- subtract
If TRUE (default), the baseline is subtracted from the signal. If FALSE, the baseline values replace the original values (useful for extracting baselines).
- measures
An optional character vector of measure column names to process. If NULL (the default), all measure columns (columns with class measure_list) will be processed.
- role
Not used by this step since no new variables are created.
- trained
A logical to indicate if the quantities for preprocessing have been estimated.
- skip
A logical. Should the step be skipped when the recipe is baked?
- id
A character string that is unique to this step to identify it.
Value
An updated version of recipe with the new step added to the
sequence of any existing operations.
Details
This step provides access to the comprehensive pybaselines Python library, which implements over 50 baseline correction algorithms across several categories:
Whittaker Methods
Based on penalized least squares with asymmetric weights:
asls: Asymmetric Least Squares (good general-purpose method)
iasls: Improved ALS, which also penalizes the first derivative of the residual
airpls: Adaptive iteratively reweighted penalized least squares
arpls: Asymmetrically reweighted penalized least squares
psalsa: Peaked Signal's Asymmetric Least Squares Algorithm
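To make the roles of lam and p concrete, here is a minimal base R sketch of the asymmetric least squares idea behind asls. It is an illustration under simplifying assumptions (dense matrices, a fixed iteration count, synthetic data), not the pybaselines implementation and not what this step runs internally. The function name als_baseline and the test signal are hypothetical.

```r
# Illustrative sketch only: dense-matrix asymmetric least squares (ALS).
# als_baseline() is a hypothetical helper, not part of measure or pybaselines.
als_baseline <- function(y, lam = 1e5, p = 0.01, n_iter = 10) {
  n <- length(y)
  D <- diff(diag(n), differences = 2)  # second-difference operator, (n - 2) x n
  penalty <- lam * crossprod(D)        # roughness penalty: lam * t(D) %*% D
  w <- rep(1, n)                       # start with equal weights
  z <- y
  for (i in seq_len(n_iter)) {
    # Weighted penalized least squares fit
    z <- solve(diag(w) + penalty, w * y)
    # Points above the fit (likely peaks) get weight p, points below get 1 - p
    w <- ifelse(y > z, p, 1 - p)
  }
  as.numeric(z)
}

# Synthetic signal: sloping baseline plus one Gaussian peak
x <- seq(0, 1, length.out = 200)
true_baseline <- 2 + 3 * x
y <- true_baseline + 10 * exp(-((x - 0.5)^2) / (2 * 0.02^2))

baseline <- als_baseline(y)
corrected <- y - baseline  # corresponds to subtract = TRUE
```

A larger lam stiffens the baseline, while a smaller p makes the fit ignore points above it more strongly. Real spectra are usually long enough that sparse matrices are preferable to the dense ones used here.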
Polynomial Methods
Fit polynomials to baseline regions:
poly: Simple polynomial fitting
modpoly: Modified polynomial (iterative)
imodpoly: Improved modified polynomial
loess: Local regression (LOESS)
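The iterative idea behind modpoly can be sketched in base R: fit a polynomial, clip the signal to the fit wherever the signal lies above it, and refit until the fit stops changing. This is a simplified illustration, not the pybaselines code. The function name modpoly_baseline, the stopping rule, and the synthetic data are assumptions made for the example.

```r
# Illustrative sketch only: iterative "modified polynomial" baseline.
# modpoly_baseline() is a hypothetical helper, not part of measure or pybaselines.
modpoly_baseline <- function(x, y, poly_order = 2, max_iter = 100, tol = 1e-3) {
  y_work <- y
  fit <- unname(fitted(lm(y_work ~ poly(x, degree = poly_order))))
  for (i in seq_len(max_iter)) {
    # Clip the working signal to the current fit so peaks stop pulling it upward
    y_work <- pmin(y_work, fit)
    new_fit <- unname(fitted(lm(y_work ~ poly(x, degree = poly_order))))
    # Stop once the fit changes by less than tol (relative L2 norm)
    rel_change <- sqrt(sum((new_fit - fit)^2)) / sqrt(sum(fit^2))
    fit <- new_fit
    if (rel_change < tol) break
  }
  fit
}

# Synthetic signal: quadratic baseline plus one Gaussian peak
x <- seq(0, 1, length.out = 200)
true_baseline <- 1 + 2 * x - x^2
y <- true_baseline + 10 * exp(-((x - 0.5)^2) / (2 * 0.02^2))

baseline <- modpoly_baseline(x, y, poly_order = 2)
```

Compared with a single polynomial fit, the clipping step keeps the peak from dragging the estimated baseline upward underneath it.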
Morphological Methods
Based on mathematical morphology:
mor: Morphological opening
imor: Improved morphological
rolling_ball: Rolling ball algorithm
tophat: Top-hat transform
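As a rough illustration of the opening operation these methods build on, the base R sketch below applies an erosion (running minimum) followed by a dilation (running maximum), then forms the top-hat transform as the signal minus its opening. The helper names running_extreme and morph_opening are hypothetical, and the pybaselines methods add refinements not shown here.

```r
# Illustrative sketch only: morphological opening and top-hat on a 1-D signal.
# running_extreme() and morph_opening() are hypothetical helpers.
running_extreme <- function(y, half_window, fun) {
  n <- length(y)
  vapply(seq_len(n), function(i) {
    # Window is truncated at the edges of the signal
    fun(y[max(1, i - half_window):min(n, i + half_window)])
  }, numeric(1))
}

morph_opening <- function(y, half_window) {
  eroded <- running_extreme(y, half_window, min)  # erosion: running minimum
  running_extreme(eroded, half_window, max)       # dilation: running maximum
}

# Flat baseline of 1 with a peak three points wide
y <- rep(1, 50)
y[24:26] <- c(3, 5, 3)

baseline <- morph_opening(y, half_window = 3)  # window of 7 points, wider than the peak
tophat <- y - baseline                         # peak heights 2, 4, 2 at positions 24 to 26
```

The window must be wider than the widest peak. Otherwise part of the peak survives the erosion and ends up in the baseline.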
Requirements
This step requires the reticulate package and Python with pybaselines
installed. Install pybaselines with:
reticulate::py_require("pybaselines")
No selectors should be supplied to this step function. The data should be
in the internal format produced by step_measure_input_wide() or
step_measure_input_long().
Tidying
When you tidy() this step, a tibble with columns
terms, method, subtract, and id is returned.
See also
step_measure_baseline_als(), step_measure_baseline_custom() for
R-based alternatives.
Other measure-baseline:
step_measure_baseline_airpls(),
step_measure_baseline_als(),
step_measure_baseline_arpls(),
step_measure_baseline_auto(),
step_measure_baseline_custom(),
step_measure_baseline_gpc(),
step_measure_baseline_minima(),
step_measure_baseline_morph(),
step_measure_baseline_poly(),
step_measure_baseline_rf(),
step_measure_baseline_rolling(),
step_measure_baseline_snip(),
step_measure_baseline_tophat(),
step_measure_detrend()
Examples
if (FALSE) { # measure:::.pybaselines_available()
library(recipes)
library(measure)

# Asymmetric Least Squares baseline correction
rec <- recipe(water + fat + protein ~ ., data = meats_long) |>
  update_role(id, new_role = "id") |>
  step_measure_input_long(transmittance, location = vars(channel)) |>
  step_measure_baseline_py(method = "asls", lam = 1e6, p = 0.01) |>
  prep()

bake(rec, new_data = NULL)

# Using SNIP algorithm
rec2 <- recipe(water + fat + protein ~ ., data = meats_long) |>
  update_role(id, new_role = "id") |>
  step_measure_input_long(transmittance, location = vars(channel)) |>
  step_measure_baseline_py(method = "snip", max_half_window = 40) |>
  prep()
}