step_sec_mw_averages() creates a specification of a recipe step that calculates molecular weight averages from size exclusion chromatography data.

Usage

step_sec_mw_averages(
  recipe,
  measures = NULL,
  calibration = NULL,
  integration_range = NULL,
  output_cols = c("mn", "mw", "mz", "mp", "dispersity"),
  include_uncertainty = FALSE,
  calibration_error = NULL,
  prefix = "mw_",
  role = "predictor",
  trained = FALSE,
  skip = FALSE,
  id = recipes::rand_id("sec_mw_averages")
)

Arguments

recipe

A recipe object.

measures

An optional character vector of measure column names.

calibration

Calibration method for converting the x-axis to log10(MW). Can be:

  • NULL (default): Assumes x-axis is already log10(MW)

  • A numeric vector of length 2: Linear calibration c(slope, intercept) where log10(MW) = slope * x + intercept

  • "auto": Estimate from data range (assumes typical polymer range)
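As a minimal sketch of what the linear option describes, the base R snippet below applies log10(MW) = slope * x + intercept to a few retention volumes. The slope, intercept, and volume values are made up for illustration; this is not the step's internal implementation.

```r
# Hypothetical linear calibration in the form c(slope, intercept)
calibration <- c(-0.5, 10)

# Hypothetical retention volumes (mL)
x <- c(10, 11, 12)

# log10(MW) = slope * x + intercept
log_mw <- calibration[1] * x + calibration[2]

# Back-transform to molecular weight
mw <- 10^log_mw

log_mw
mw
```

A negative slope reflects the usual SEC behavior in which larger molecules elute earlier.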

integration_range

Optional numeric vector c(min, max) specifying the x-axis range for integration. If NULL, uses full range.

output_cols

Character vector of metrics to calculate. Default includes all: c("mn", "mw", "mz", "mp", "dispersity").

include_uncertainty

Logical. If TRUE, calculates and outputs uncertainty estimates for MW averages. Requires calibration_error to be specified. Default is FALSE.

calibration_error

Calibration error (RMSE) in log10(MW) units, used for uncertainty propagation. Required when include_uncertainty = TRUE. Can be obtained from the tidy() output of step_sec_conventional_cal().

prefix

Prefix for output column names. Default is "mw_".

role

Role for generated columns. Default is "predictor".

trained

Logical indicating if the step has been trained.

skip

Logical. Should the step be skipped when baking?

id

Unique step identifier.

Value

An updated recipe with the new step added.

Details

This step calculates standard molecular weight averages from SEC/GPC data:

Metric  Formula                  Description
Mn      sum(w) / sum(w/M)        Number-average molecular weight
Mw      sum(wM) / sum(w)         Weight-average molecular weight
Mz      sum(wM^2) / sum(wM)      Z-average molecular weight
Mp      M at peak maximum        Peak molecular weight
D       Mw/Mn                    Dispersity (polydispersity index)

The detector signal is assumed to be proportional to weight concentration. For RI detection, this is typically valid. For UV detection, response factors may need to be applied first.
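The formulas can be checked by hand on a small data set. The sketch below treats the baseline-corrected signal as the weight w at each slice and evaluates each metric directly in base R. The M and w values are hypothetical, and evenly spaced slices are assumed; the step itself may handle slice spacing and integration differently.

```r
# Hypothetical slice data: molecular weight M and detector signal w
M <- c(1e3, 1e4, 1e5)
w <- c(1, 2, 1)

mn <- sum(w) / sum(w / M)         # number-average molecular weight
mw <- sum(w * M) / sum(w)         # weight-average molecular weight
mz <- sum(w * M^2) / sum(w * M)   # z-average molecular weight
mp <- M[which.max(w)]             # M at the peak maximum
dispersity <- mw / mn             # Mw/Mn

c(mn = mn, mw = mw, mz = mz, mp = mp, dispersity = dispersity)
```

For any distribution with more than one molecular weight present, the averages are ordered Mn < Mw < Mz, so the dispersity is greater than 1.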

Uncertainty Propagation:

When include_uncertainty = TRUE, the step calculates uncertainty estimates based on calibration error propagation. The uncertainties account for:

  • Calibration curve fit error (RMSE in log10 MW units)

  • MW distribution width effects on different averages

The propagation follows:

  • Mn uncertainty is enhanced for wide distributions, because Mn is the average most sensitive to the low-MW end

  • Mw uncertainty equals the relative calibration error

  • Mz uncertainty is enhanced, because Mz is the average most sensitive to the high-MW end

  • Dispersity uncertainty is obtained by propagating the Mw and Mn uncertainties through the ratio Mw/Mn
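The enhancement factors for Mn and Mz are not stated here, so the sketch below covers only two generic pieces, under stated assumptions. First, it converts an RMSE in log10(MW) units to an approximate relative error in MW using the first-order factor ln(10). Second, it combines the relative uncertainties of Mw and Mn in quadrature for their ratio, which assumes the two errors are independent (a simplification). The 1.5 multiplier for Mn is a placeholder, not a value from the package, and the step may compute all of these quantities differently.

```r
calibration_error <- 0.02   # RMSE in log10(MW), same value as in the Examples

# First-order propagation: d(MW) / MW = ln(10) * d(log10(MW))
rel_mw <- log(10) * calibration_error

# Placeholder: assume Mn carries a larger relative uncertainty (hypothetical factor)
rel_mn <- 1.5 * rel_mw

# Ratio Mw/Mn, assuming independent errors: relative uncertainties add in quadrature
rel_dispersity <- sqrt(rel_mw^2 + rel_mn^2)

c(rel_mw = rel_mw, rel_mn = rel_mn, rel_dispersity = rel_dispersity)
```

With an RMSE of 0.02 in log10(MW), the first-order relative error in MW is roughly 4.6%.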

Prerequisites:

  • Data should be baseline corrected

  • X-axis should represent retention time/volume or log10(MW)

  • Integration limits should exclude solvent peaks
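To illustrate the last prerequisite, the sketch below restricts a chromatogram to an integration window so that a late-eluting solvent peak is excluded. The data are hypothetical, and treating both bounds as inclusive is an assumption; the snippet is meant only to show the idea behind integration_range = c(min, max), not how the step applies it.

```r
# Hypothetical chromatogram: retention volume x and baseline-corrected signal y
x <- seq(10, 20, by = 1)
y <- c(0, 1, 4, 9, 4, 1, 0, 0, 6, 6, 0)   # bump near x = 18-19 mimics a solvent peak

# Window chosen to cover the polymer peak only
integration_range <- c(10, 16)

# Keep only points inside the window (inclusive bounds assumed)
keep <- x >= integration_range[1] & x <= integration_range[2]
x_int <- x[keep]
y_int <- y[keep]

data.frame(x = x_int, y = y_int)
```

Leaving the solvent peak inside the window would add spurious signal at the low-MW end and bias Mn downward in particular.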

Examples

if (FALSE) { # \dontrun{
library(recipes)
library(measure)

# Assuming x-axis is already calibrated to log10(MW)
rec <- recipe(~., data = sec_triple_detect) |>
  step_measure_input_wide(starts_with("signal_")) |>
  step_sec_baseline() |>
  step_sec_mw_averages() |>
  prep()

# With uncertainty propagation (calibration_error from tidy() of calibration step)
rec_with_unc <- recipe(~., data = sec_triple_detect) |>
  step_measure_input_wide(starts_with("signal_")) |>
  step_sec_baseline() |>
  step_sec_mw_averages(
    include_uncertainty = TRUE,
    calibration_error = 0.02  # RMSE in log10(MW)
  ) |>
  prep()
} # }