step_measure_align_cow() creates a specification of a recipe step that
aligns spectra using Correlation Optimized Warping (COW). This method uses
piecewise linear warping to correct for non-linear shifts.
Arguments
- recipe
A recipe object.
- measures
An optional character vector of measure column names.
- reference
How to determine the reference:
"mean" (default, mean spectrum from training), "median" (median spectrum from training), or "first" (first sample).
- segment_length
Length of each segment for warping. Default is 30. Tunable via align_segment_length().
- slack
Maximum compression/expansion per segment in points. Default is 1. A slack of 1 means each segment can shrink or expand by 1 point.
- role
Not used.
- trained
Logical indicating if the step has been trained.
- skip
Logical. Should the step be skipped when baking?
- id
Unique step identifier.
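The three reference options can be illustrated on a plain numeric matrix with one spectrum per row. This is only a sketch of the idea: compute_reference() is a hypothetical helper written for this example, not a function exported by the package, and the step's internal handling of the training data may differ.

```r
# Toy training set: three spectra (rows) measured at four locations (columns).
spectra <- rbind(
  c(1, 2, 3, 4),
  c(2, 4, 6, 8),
  c(9, 9, 9, 9)
)

# Hypothetical helper, for illustration only: build a reference spectrum
# from the rows of `x` according to `method`.
compute_reference <- function(x, method = c("mean", "median", "first")) {
  method <- match.arg(method)
  switch(method,
    mean   = colMeans(x),                  # point-wise mean over samples
    median = apply(x, 2, stats::median),   # point-wise median over samples
    first  = x[1, ]                        # the first sample as-is
  )
}

ref_mean   <- compute_reference(spectra, "mean")    # 4 5 6 7
ref_median <- compute_reference(spectra, "median")  # 2 4 6 8
ref_first  <- compute_reference(spectra, "first")   # 1 2 3 4
```

In this toy example the median is less affected than the mean by the flat third spectrum, which is the usual reason to prefer it when the training set contains atypical samples.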
Details
Correlation Optimized Warping (COW) divides signals into segments and uses dynamic programming to find the optimal piecewise linear warping that maximizes correlation with the reference spectrum.
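The core operation can be sketched for a single segment: resample a stretch of the sample onto the reference segment's length by linear interpolation, then score each candidate end point by its correlation with the reference. This is a simplified illustration, not the package's implementation. warp_segment() and the toy signals are assumptions made for the example, and the full algorithm additionally uses dynamic programming to choose the end points of all segments jointly.

```r
# Hypothetical helper, for illustration only: linearly resample the part of
# `x` between positions `from` and `to` onto `n_out` evenly spaced points.
warp_segment <- function(x, from, to, n_out) {
  stats::approx(
    x = seq_along(x), y = x,
    xout = seq(from, to, length.out = n_out)
  )$y
}

# Toy signals: a broad peak on 21 points, and the same peak stretched so
# that it occupies 23 points in the sample.
peak       <- function(t) exp(-(t - 11)^2 / 50)
ref_seg    <- peak(1:21)
sample_sig <- peak(1 + (0:22) * 20 / 22)

# With slack = 2, the end of the sample segment may move by -2..+2 points
# relative to its nominal position (point 21).
slack   <- 2
offsets <- -slack:slack

# Correlation between the reference segment and each warped candidate.
scores <- vapply(offsets, function(u) {
  warped <- warp_segment(sample_sig, from = 1, to = 21 + u,
                         n_out = length(ref_seg))
  stats::cor(warped, ref_seg)
}, numeric(1))

best_offset <- offsets[which.max(scores)]
```

Because the sample was constructed as the reference stretched by two points, the candidate that compresses it by two points scores highest here.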
Key parameters:
- segment_length: Controls the resolution of warping. Smaller segments allow more local corrections but increase computation.
- slack: Controls how much each segment can stretch or compress. Larger values allow more flexibility but may introduce artifacts.
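To get a feel for how the two parameters interact, the following sketch tabulates the number of segments and the range of lengths each segment may take for a spectrum of 100 points. It assumes segments are formed by simple integer division of the spectrum length; how the package treats a leftover partial segment is not stated here and may differ.

```r
# Assumed spectrum length, for illustration only.
n_points <- 100

# All combinations of a few candidate parameter values.
grid <- expand.grid(
  segment_length = c(10, 20, 30),
  slack          = c(1, 2, 5)
)

# Number of whole segments (assumption: remainder points are ignored here).
grid$n_segments <- n_points %/% grid$segment_length

# Each segment may shrink or grow by up to `slack` points, so it can take
# 2 * slack + 1 different lengths.
grid$min_length          <- grid$segment_length - grid$slack
grid$max_length          <- grid$segment_length + grid$slack
grid$lengths_per_segment <- 2 * grid$slack + 1

grid
```

Shorter segments and larger slack both enlarge the set of warpings that must be searched, which is the trade-off between flexibility and computation noted above.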
This is a pure R implementation based on Nielsen et al. (1998).
References
Nielsen, N.P.V., Carstensen, J.M., and Smedsgaard, J. (1998). Aligning of single and multiple wavelength chromatographic profiles for chemometric data analysis using correlation optimised warping. Journal of Chromatography A, 805, 17-35.
See also
Other measure-align:
step_measure_align_dtw(),
step_measure_align_ptw(),
step_measure_align_reference(),
step_measure_align_shift()
Examples
library(recipes)
rec <- recipe(water + fat + protein ~ ., data = meats_long) |>
  update_role(id, new_role = "id") |>
  step_measure_input_long(transmittance, location = vars(channel)) |>
  step_measure_align_cow(segment_length = 20, slack = 2) |>
  prep()
bake(rec, new_data = NULL)
#> # A tibble: 215 × 5
#> id water fat protein .measures
#> <int> <dbl> <dbl> <dbl> <meas>
#> 1 1 60.5 22.5 16.7 [100 × 2]
#> 2 2 46 40.1 13.5 [100 × 2]
#> 3 3 71 8.4 20.5 [100 × 2]
#> 4 4 72.8 5.9 20.7 [100 × 2]
#> 5 5 58.3 25.5 15.5 [100 × 2]
#> 6 6 44 42.7 13.7 [100 × 2]
#> 7 7 44 42.7 13.7 [100 × 2]
#> 8 8 69.3 10.6 19.3 [100 × 2]
#> 9 9 61.4 19.9 17.7 [100 × 2]
#> 10 10 61.4 19.9 17.7 [100 × 2]
#> # ℹ 205 more rows