Skip to contents

Create quantile-quantile (QQ) plots to compare the distribution of variables between treatment groups before and after weighting. This function helps assess covariate balance by visualizing how well the quantiles align between groups.

Usage

plot_qq(
  .data,
  .var,
  .group,
  .wts = NULL,
  quantiles = seq(0.01, 0.99, 0.01),
  include_observed = TRUE,
  treatment_level = NULL,
  na.rm = FALSE
)

Arguments

.data

A data frame containing the variables.

.var

Variable to plot. Can be unquoted (e.g., age) or quoted (e.g., "age").

.group

Column name of treatment/group variable. Can be unquoted (e.g., qsmk) or quoted (e.g., "qsmk").

.wts

Optional weighting variable(s). Can be unquoted variable names, a character vector, or NULL. Multiple weights can be provided to compare different weighting schemes. Default is NULL (unweighted).

quantiles

Numeric vector of quantiles to compute. Default is seq(0.01, 0.99, 0.01) for 99 quantiles.

include_observed

Logical. If using .wts, also show observed (unweighted) QQ plot? Defaults to TRUE.

treatment_level

The reference treatment level to use for comparisons. If NULL (default), uses the last level for factors or the maximum value for numeric variables.

na.rm

Logical; if TRUE, drop NA values before computation.

Value

A ggplot2 object.

Details

QQ plots display the quantiles of one distribution against the quantiles of another. Perfect distributional balance appears as points along the 45-degree line (y = x). This function automatically adds this reference line and appropriate axis labels.

For an alternative visualization of the same information, see geom_ecdf(), which shows the empirical cumulative distribution functions directly.

See also

  • geom_ecdf() for ECDF plots, an alternative distributional visualization

  • geom_qq2() for the underlying geom used by this function

  • qq() for computing QQ data without plotting

Examples

library(ggplot2)

# Basic QQ plot (observed)
plot_qq(nhefs_weights, age, qsmk)


# With weighting
plot_qq(nhefs_weights, age, qsmk, .wts = w_ate)


# Compare multiple weighting schemes
plot_qq(nhefs_weights, age, qsmk, .wts = c(w_ate, w_att))


# For propensity scores
plot_qq(nhefs_weights, .fitted, qsmk, .wts = w_ate)


# Without observed comparison
plot_qq(nhefs_weights, age, qsmk, .wts = w_ate, include_observed = FALSE)