Validate adjusted OD flows against a benchmark overall — validate_flow

Compares bias-adjusted MPD flows with benchmark (e.g., census) OD flows. Adjusted flows are treated as estimates (x) and benchmark flows as targets (y). The function returns summary fit metrics that capture both concordance (Pearson and Spearman correlation) and aggregate error (RMSE, MAE, MAPE). For OD-level residual auditing, use validate_flow_residuals() or the lower-level validate_flow_pairs() table.

validate_flow_benchmark() is retained for backwards compatibility. New code should prefer validate_flow_overall().

Usage

validate_flow_overall(
  adj_df,
  benchmark_od_df,
  flow_col_adj = "flow_adj",
  flow_col_bench = "flow",
  drop_zeros = TRUE,
  na_rm = TRUE,
  by_source = FALSE,
  return_joined = TRUE,
  method_name = NA_character_,
  comparisons = "adjusted_vs_benchmark",
  flow_col_mpd = "flow"
)

validate_flow_benchmark(...)

Arguments

adj_df: Data frame with at least: origin, destination, and a column of adjusted flows (default name "flow_adj"). If present, an mpd_source column is carried through.
benchmark_od_df: Data frame with at least: origin, destination, and a column of benchmark flows (default name "flow").
flow_col_adj: Name of adjusted flow column in adj_df. Default "flow_adj".
flow_col_bench: Name of benchmark flow column in benchmark_od_df. Default "flow".
drop_zeros: Logical; if TRUE, drop rows where either x or y equals 0 before computing metrics. Default TRUE.
na_rm: Logical; if TRUE, remove rows with non-finite x or y before computing metrics. Default TRUE.
by_source: Logical; if TRUE and an mpd_source column exists (in both inputs, or in adj_df only), compute metrics per mpd_source as well as overall. Default FALSE.
return_joined: Logical, return the joined row-level data in the result list. Default TRUE.
method_name: Optional label for the adjustment method (e.g. "adjust_inverse_penetration", "adjust_selection_rate"). Stored in the output for comparison workflows.
comparisons: Flow comparison(s) to compute. The package convention is that the first series in the ID is the x-axis/baseline, the second series is the y-axis/reference, and signed residuals are Y - X. Default "adjusted_vs_benchmark". Use "all" for all supported comparisons.
flow_col_mpd: Name of raw MPD flow column in adj_df. Default "flow". Required only when comparisons includes raw MPD flows.
...: Arguments passed to validate_flow_overall().
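
The row preparation implied by flow_col_adj, flow_col_bench, drop_zeros and na_rm can be sketched in base R. This is a minimal illustration, not the package's implementation: the inner join on origin and destination, the helper name prep_pairs(), and the toy data are all assumptions.

```r
# Hypothetical helper (not part of the package): join adjusted and benchmark
# flows on origin/destination, then apply the na_rm and drop_zeros filters.
# Assumption: an inner join; the package's actual join may differ.
prep_pairs <- function(adj_df, benchmark_od_df,
                       flow_col_adj = "flow_adj", flow_col_bench = "flow",
                       drop_zeros = TRUE, na_rm = TRUE) {
  adj <- data.frame(
    origin = adj_df$origin,
    destination = adj_df$destination,
    x = adj_df[[flow_col_adj]]          # adjusted flows are the estimates (x)
  )
  bench <- data.frame(
    origin = benchmark_od_df$origin,
    destination = benchmark_od_df$destination,
    y = benchmark_od_df[[flow_col_bench]]  # benchmark flows are the targets (y)
  )
  joined <- merge(adj, bench, by = c("origin", "destination"))

  if (na_rm) {
    # Keep only rows where both flows are finite (drops NA, NaN, Inf)
    joined <- joined[is.finite(joined$x) & is.finite(joined$y), , drop = FALSE]
  }
  if (drop_zeros) {
    # Drop rows where either flow is exactly zero; NA comparisons are kept
    is_zero <- joined$x == 0 | joined$y == 0
    joined <- joined[!(is_zero %in% TRUE), , drop = FALSE]
  }
  joined
}

# Toy inputs (made up for illustration)
adj <- data.frame(
  origin = c("A", "A", "B", "B"),
  destination = c("B", "C", "A", "C"),
  flow_adj = c(120, 0, 80, NA)
)
bench <- data.frame(
  origin = c("A", "A", "B", "B"),
  destination = c("B", "C", "A", "C"),
  flow = c(100, 10, 90, 40)
)

pairs <- prep_pairs(adj, bench)
pairs_all <- prep_pairs(adj, bench, drop_zeros = FALSE, na_rm = FALSE)
```

With the defaults, the A to C pair (zero adjusted flow) and the B to C pair (missing adjusted flow) are excluded, so only two OD pairs would enter the metrics. Setting both flags to FALSE keeps all four joined rows.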

Value

A list with:

method (if provided)
comparison metadata, n, sum_x, sum_y, sum_adj, sum_bench
pearson_r, spearman_rho
rmse, mae, mape
ols_intercept, ols_slope, r_squared (from lm(y ~ x))
(optional) summary: a tibble of metric rows when more than one comparison is requested
(optional) by_source: a tibble of per-source metrics when by_source = TRUE
(optional) data: the joined tibble used for the calculations
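
The scalar metrics in the returned list can be approximated with base R. The sketch below follows the documented convention (x = adjusted, y = benchmark, signed residuals y - x). The helper name fit_metrics(), the toy vectors, and the MAPE definition (absolute error as a percentage of the benchmark value) are assumptions rather than the package's code.

```r
# Hypothetical helper (not part of the package): summary fit metrics for
# estimates x (adjusted flows) against targets y (benchmark flows).
fit_metrics <- function(x, y) {
  resid <- y - x            # signed residuals, Y - X
  fit <- lm(y ~ x)          # OLS of benchmark on adjusted flows
  list(
    n = length(x),
    sum_x = sum(x),
    sum_y = sum(y),
    pearson_r = cor(x, y, method = "pearson"),
    spearman_rho = cor(x, y, method = "spearman"),
    rmse = sqrt(mean(resid^2)),
    mae = mean(abs(resid)),
    # Assumed definition: percentage error relative to the benchmark
    mape = 100 * mean(abs(resid / y)),
    ols_intercept = unname(coef(fit)[1]),
    ols_slope = unname(coef(fit)[2]),
    r_squared = summary(fit)$r.squared
  )
}

# Toy flows (made up for illustration)
x <- c(120, 80, 45, 210)
y <- c(100, 90, 50, 200)
m <- fit_metrics(x, y)
```

Because MAPE divides by y, rows with a zero benchmark flow would produce infinite values, which is one reason a setting such as drop_zeros = TRUE is useful before computing these metrics.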