step_measure_snv() creates a specification of a recipe step that applies
Standard Normal Variate transformation to spectral data. SNV normalizes each
spectrum to have zero mean and unit standard deviation.
Usage
step_measure_snv(
  recipe,
  measures = NULL,
  role = NA,
  trained = FALSE,
  skip = FALSE,
  id = recipes::rand_id("measure_snv")
)
Arguments
- recipe
A recipe object. The step will be added to the sequence of operations for this recipe.
- measures
An optional character vector of measure column names to process. If
NULL (the default), all measure columns (columns with class measure_list) will be processed. Use this to limit processing to specific measure columns when working with multiple measurement types.
- role
Not used by this step since no new variables are created.
- trained
A logical to indicate if the quantities for preprocessing have been estimated.
- skip
A logical. Should the step be skipped when the recipe is baked by
recipes::bake()? While all operations are baked when recipes::prep() is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using skip = TRUE as it may affect the computations for subsequent operations.
- id
A character string that is unique to this step to identify it.
Value
An updated version of recipe with the new step added to the
sequence of any existing operations.
Details
Standard Normal Variate (SNV) is a row-wise transformation that normalizes each spectrum independently. For a spectrum \(x\), the transformation is:
$$SNV(x) = \frac{x - \bar{x}}{s_x}$$
where \(\bar{x}\) is the mean and \(s_x\) is the standard deviation of the spectrum values.
SNV is commonly used to remove multiplicative effects of scatter and particle size in NIR spectroscopy. After SNV transformation, each spectrum will have a mean of zero and a standard deviation of one.
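The formula can be illustrated with a minimal base R sketch. This is not the package's internal implementation: snv_one() is a hypothetical helper, the toy matrix stands in for real spectra, and the sketch assumes the sample standard deviation (R's sd(), with an n - 1 denominator).

```r
# Hypothetical helper (not part of the package): SNV for a single spectrum.
# Assumes the sample standard deviation, as computed by stats::sd().
snv_one <- function(x) {
  (x - mean(x)) / sd(x)
}

# Toy data: each row is one spectrum. The second row is the first
# multiplied by 2, mimicking a purely multiplicative scatter effect.
spectra <- rbind(
  c(1.0, 2.0, 3.0, 4.0),
  c(2.0, 4.0, 6.0, 8.0)
)

# Apply SNV row-wise; apply() returns spectra as columns, so transpose back.
snv_spectra <- t(apply(spectra, 1, snv_one))

rowMeans(snv_spectra)        # each row mean is numerically zero
apply(snv_spectra, 1, sd)    # each row standard deviation is one
```

Because the two toy spectra differ only by a scale factor, their SNV-transformed rows are identical, which is the multiplicative correction described above.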
No selectors should be supplied to this step function. The data should be
in the internal format produced by step_measure_input_wide() or
step_measure_input_long().
The measurement locations are preserved unchanged.
Tidying
When you tidy() this step, a tibble with columns
terms (set to ".measures") and id is returned.
See also
Other measure-preprocessing:
step_measure_absorbance(),
step_measure_calibrate_x(),
step_measure_calibrate_y(),
step_measure_derivative(),
step_measure_derivative_gap(),
step_measure_emsc(),
step_measure_kubelka_munk(),
step_measure_log(),
step_measure_map(),
step_measure_msc(),
step_measure_normalize_istd(),
step_measure_osc(),
step_measure_ratio_reference(),
step_measure_subtract_blank(),
step_measure_subtract_reference(),
step_measure_transmittance()
Examples
library(recipes)
library(measure)  # provides the step_measure_*() steps and the meats_long data

rec <-
  recipe(water + fat + protein ~ ., data = meats_long) |>
  update_role(id, new_role = "id") |>
  step_measure_input_long(transmittance, location = vars(channel)) |>
  step_measure_snv() |>
  prep()

bake(rec, new_data = NULL)
#> # A tibble: 215 × 5
#> id water fat protein .measures
#> <int> <dbl> <dbl> <dbl> <meas>
#> 1 1 60.5 22.5 16.7 [100 × 2]
#> 2 2 46 40.1 13.5 [100 × 2]
#> 3 3 71 8.4 20.5 [100 × 2]
#> 4 4 72.8 5.9 20.7 [100 × 2]
#> 5 5 58.3 25.5 15.5 [100 × 2]
#> 6 6 44 42.7 13.7 [100 × 2]
#> 7 7 44 42.7 13.7 [100 × 2]
#> 8 8 69.3 10.6 19.3 [100 × 2]
#> 9 9 61.4 19.9 17.7 [100 × 2]
#> 10 10 61.4 19.9 17.7 [100 × 2]
#> # ℹ 205 more rows