pithist {topmodels}    R Documentation

PIT Histograms for Assessing Goodness of Fit of Probability Models

Description

PIT histograms graphically compare empirical probabilities from fitted models with a uniform distribution.

Usage

pithist(object, ...)

## Default S3 method:
pithist(object, newdata = NULL, plot = TRUE, flavor = NULL,
  style = c("histogram", "lines"), type = c("random", "proportional"), nsim = 1L, 
  delta = NULL, freq = FALSE, breaks = NULL, confint = TRUE, 
  confint_level = 0.95, confint_type = c("exact", "approximation"),
  single_graph = FALSE, xlim = c(0, 1), ylim = c(0, NA),
  xlab = "PIT", ylab = if (freq) "Frequency" else "Density", main = NULL, ...)

Arguments

object

an object from which probability integral transforms can be extracted with procast.

newdata

optionally, a data frame in which to look for variables with which to predict. If omitted, the original observations are used.

plot

logical. Should the plot method be called to draw the computed PIT histogram?

flavor

Should the PIT histogram be a base or ggplot2 style graphic? Accordingly, the invisible return value is either a data.frame or a tibble. Either set flavor explicitly to "base" or "tidyverse", or it is chosen automatically depending on whether the packages ggplot2 and dplyr or tibble are loaded.

style

character specifying the style of the PIT histogram, either "histogram" or "lines".

type

character. In case of discrete distributions, should the PITs be drawn randomly from the corresponding interval or distributed proportionally?

nsim

integer. If type is "random", how many simulated PITs should be drawn?

delta

numeric. The minimal difference used to compute the range of probabilities corresponding to each observation in order to get (randomized) quantile residuals. For NULL, the minimal observed difference in the response divided by 5e-6 is used.

freq

logical. If TRUE, the PIT histogram is represented by frequencies, the counts component of the result; if FALSE, probability densities, component density, are plotted (so that the histogram has a total area of one).

breaks

numeric. Breaks for the histogram intervals.

confint

logical. Should confidence intervals be drawn?

confint_level

numeric. The confidence level required.

confint_type

character. Which type of confidence interval should be computed? According to Agresti and Coull (1998), an approximation can be better than the "exact" method for interval estimation of binomial proportions.

single_graph

logical. Should all computed PIT histograms be plotted in a single graph?

xlim, ylim

graphical parameters. These may pertain either to the whole plot or just the histogram or just the fitted line.

xlab, ylab, main

graphical parameters.

...

further graphical parameters.
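
As a rough illustration of what freq and the confidence interval arguments refer to, the following base R sketch builds the histogram by hand from hypothetical, perfectly uniform PIT values. It assumes that, under uniformity, each bin count follows a binomial distribution with success probability equal to the bin width; whether pithist computes its "exact" and "approximation" intervals in precisely this way is an assumption, not something stated on this page.

```r
set.seed(123)
pit <- runif(500)                        # hypothetical, perfectly calibrated PITs
breaks <- seq(0, 1, length.out = 11)     # 10 equally wide bins
h <- hist(pit, breaks = breaks, plot = FALSE)

h$counts     # bin heights on the frequency scale (cf. freq = TRUE)
h$density    # bin heights on the density scale (cf. freq = FALSE)

# Reference interval for one bin count under uniformity
# (assumption: count ~ Binomial(n, p) with p = bin width).
n <- length(pit)
p <- diff(breaks)[1]
level <- 0.95
alpha <- 1 - level

# Interval from binomial quantiles.
ci_exact <- qbinom(c(alpha / 2, 1 - alpha / 2), size = n, prob = p)

# Interval from the normal approximation to the binomial.
ci_approx <- n * p + qnorm(c(alpha / 2, 1 - alpha / 2)) * sqrt(n * p * (1 - p))

# Convert from the frequency scale to the density scale.
ci_exact / (n * p)
ci_approx / (n * p)
```

Bins whose height falls outside such an interval hint at a departure from uniformity, though with many bins a few excursions are expected by chance.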

Details

PIT histograms graphically compare the probability integral transform (PIT), i.e., observed probabilities from fitted probability models, with a uniform distribution. The function leverages the procast generic and then essentially draws a hist.
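
The idea can be sketched in base R without procast. The example below computes PIT values by hand for a Gaussian linear model; it assumes the predictive distribution is normal with the fitted mean and the residual standard error from summary(), which may differ from the estimate a package uses internally.

```r
# Fit a linear model to the built-in cars data.
m <- lm(dist ~ speed, data = cars)

# Assumption: predictive distribution N(mu, sigma^2) with sigma taken
# from summary(); a package may use a different variance estimate.
mu <- fitted(m)
sigma <- summary(m)$sigma

# PIT: evaluate the fitted CDF at each observed response.
pit <- pnorm(cars$dist, mean = mu, sd = sigma)

# Histogram on [0, 1]; an approximately flat histogram suggests
# that the model is well calibrated.
h <- hist(pit, breaks = seq(0, 1, by = 0.1), freq = FALSE,
          xlab = "PIT", main = "PIT histogram (manual sketch)")
abline(h = 1, lty = 2)   # density of the uniform reference distribution
```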

In case of discrete distributions the PIT is either drawn randomly from the corresponding interval or distributed proportionally in the histogram (FIXME: not yet implemented).
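
Both variants can be sketched in base R for a Poisson model. The data and model below are hypothetical and serve only as an illustration; the proportional variant follows the nonrandomized PIT of Czado et al. (2009) and is not necessarily how pithist would implement it.

```r
set.seed(1)
# Hypothetical count data and fitted Poisson regression (illustration only).
d <- data.frame(x = runif(200))
d$y <- rpois(200, lambda = exp(0.5 + d$x))
m <- glm(y ~ x, family = poisson, data = d)
lambda <- fitted(m)

# For a discrete response the PIT is only determined up to the
# interval [F(y - 1), F(y)], where F is the fitted CDF.
lower <- ppois(d$y - 1, lambda)   # F(y - 1); equals 0 for y = 0
upper <- ppois(d$y, lambda)       # F(y)

# Random: draw one uniform value per observation from its interval.
pit_random <- runif(length(lower), min = lower, max = upper)

# Proportional: spread each observation's unit mass uniformly over its
# interval and average the resulting conditional CDFs over observations.
breaks <- seq(0, 1, by = 0.1)
cdf_u <- function(u) mean(pmin(pmax((u - lower) / (upper - lower), 0), 1))
F_breaks <- sapply(breaks, cdf_u)
prop <- diff(F_breaks)                 # share of mass per bin
density_prop <- prop / diff(breaks)    # bin heights on the density scale
```

The random variant changes from draw to draw, whereas the proportional variant is deterministic given the fitted model.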

References

Czado C, Gneiting T, Held L (2009). “Predictive Model Assessment for Count Data.” Biometrics, 65(4), 1254–1261.

Agresti A, Coull BA (1998). “Approximate is Better than “Exact” for Interval Estimation of Binomial Proportions.” The American Statistician, 52(2), 119–126.

See Also

procast, hist

Examples

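## fit a linear model and two crch models (the second left-censored at 30)
require("crch")
m1 <- lm(dist ~ speed, data = cars)
m2 <- crch(dist ~ speed | speed, data = cars)
m3 <- crch(dist ~ speed | speed, left = 30, data = cars)

## compute PIT histograms (only the first one is plotted directly)
pit1 <- pithist(m1)
pit2 <- pithist(m2, plot = FALSE)
pit3 <- pithist(m3, plot = FALSE)

## customize the colors of a single PIT histogram
plot(pit1, confint = "red", ref = "blue", fill = "lightblue")

## combine all three PIT histograms and draw them in the "lines" style
plot(c(pit1, pit2, pit3), col = c(1, 2, 3), style = "lines")

## draw two PIT histograms in a single graph and add the third one
plot(c(pit1, pit2), col = c(1, 2), single_graph = TRUE)
lines(pit3, col = 3)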

[Package topmodels version 0.1-0 Index]