step_measure_parafac() creates a specification of a recipe step that
applies Parallel Factor Analysis (PARAFAC) to multi-dimensional measurement
data, extracting component scores as features for modeling.
Usage
step_measure_parafac(
recipe,
...,
n_components = 3L,
center = TRUE,
scale = FALSE,
max_iter = 500L,
tol = 1e-06,
prefix = "parafac_",
role = "predictor",
trained = FALSE,
skip = FALSE,
id = recipes::rand_id("measure_parafac")
)
Arguments
- recipe
A recipe object.
- ...
One or more selector functions to choose measure columns. If empty, all nD measure columns are used.
- n_components
Number of PARAFAC components to extract. Default is 3.
- center
Logical. Should data be centered before decomposition? Default is TRUE.
- scale
Logical. Should data be scaled before decomposition? Default is FALSE.
- max_iter
Maximum number of iterations. Default is 500.
- tol
Convergence tolerance. Default is 1e-6.
- prefix
Prefix for output column names. Default is "parafac_".
- role
Not used.
- trained
Logical indicating if the step has been trained.
- skip
Logical. Should the step be skipped when baking?
- id
Unique step identifier.
Details
PARAFAC (also known as CANDECOMP/PARAFAC or CP decomposition) decomposes a three-way or higher-order array into a sum of rank-one tensors. For measurement data such as EEM (excitation-emission matrices) or LC-DAD (liquid chromatography with diode-array detection), the extracted components can often be interpreted as the underlying chemical species.
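As a sketch of the model behind this description, the standard three-way PARAFAC decomposition can be written as below. The symbols are illustrative and do not come from this package: x_{ijk} is one element of the data array, F is the number of components (n_components), and a, b, c are the loadings in each mode. Treating the first mode as samples, so that a_{if} gives the component scores returned as features, is an assumption about how the step arranges the data.

```latex
% Three-way PARAFAC model with F components (notation is illustrative)
% i = 1..I (assumed: samples), j = 1..J and k = 1..K (e.g. excitation and emission)
x_{ijk} = \sum_{f=1}^{F} a_{if} \, b_{jf} \, c_{kf} + e_{ijk}
```

Here e_{ijk} is the residual not explained by the F components. Each term a_{if} b_{jf} c_{kf} is one rank-one tensor in the sum.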
See also
step_measure_tucker() for Tucker decomposition
Other measure-multiway:
step_measure_mcr_als(),
step_measure_tucker()
Examples
if (FALSE) { # \dontrun{
library(recipes)
# After ingesting EEM data as 2D measurements
rec <- recipe(concentration ~ ., data = eem_data) |>
step_measure_input_long(
fluorescence,
location = vars(excitation, emission)
) |>
step_measure_parafac(n_components = 3) |>
prep()
bake(rec, new_data = NULL)
} # }