- Transformed distributions now work better for monotonic increasing functions when the domain of the untransformed distribution is R (#129).
- The `quantile()` method for `dist_multivariate_normal()` now defaults to equicoordinate quantiles. To obtain marginal quantiles, you should use `quantile(dist, p, type = "marginal")`.
- `support()` now shows whether the interval of support is open or closed (@venpopov, #97).
- Added default method for `cdf()` which estimates the CDF using Monte Carlo integration (@robjhyndman, #122).
- Added `dist_gk()` for g-and-k distributions.
- Added `dist_gh()` for g-and-h distributions.
- Added `dist_gev()` for the Generalised Extreme Value distribution and `dist_gpd()` for the Generalised Pareto distribution (@robjhyndman, #124).
- `dist_mixture()` now displays the components of the mixture when the output width is sufficiently wide (@statasaurus, #112).
- `generate()` now respects `dimnames()` for multivariate distributions.
- `dist_mixture()` now supports multivariate distributions (@robjhyndman, #122).
- Fixed error when using `-` as a unary operator on a distribution other than `dist_normal()` (@venpopov, #95).
- Density for transformed distributions now correctly gives 0 instead of NaNs for values outside the support of the distribution (@venpopov, #97).
- Fixed `quantile()` and `cdf()` for transformed distributions with monotonically decreasing transformations (#100).
- Fixed multivariate `dist_sample()` methods not structuring multivariate results correctly as matrices.
- The `cdf()` method for `dist_multivariate_normal()` now gives P(X <= q) rather than P(X > q) for consistency with all other `cdf()` methods.
- The `quantile()` method for `dist_multivariate_normal()` now correctly gives the boundaries when `p=0` or `p=1` when `type="equicoordinate"`.
- The `quantile()` method for `dist_multivariate_normal()` now only square roots the diagonal elements when `type="marginal"`.
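As a rough illustration of the unary-minus and decreasing-transformation fixes above, the sketch below negates a uniform distribution and checks the results against values worked out by hand. It assumes a package version that includes these fixes; the variable names are illustrative only.

```r
library(distributional)

# Negating a distribution yields a transformed distribution whose
# transformation (x -> -x) is monotonically decreasing.
u <- dist_uniform(0, 1)
neg_u <- -u

# For U ~ Uniform(0, 1): P(-U <= -0.25) = P(U >= 0.25) = 0.75
p <- cdf(neg_u, -0.25)

# The 0.25 quantile of -U solves 1 + x = 0.25, i.e. x = -0.75
q <- quantile(neg_u, 0.25)
```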
- All graphics-related functionality has been removed from the package in favour of the ggdist (https://cran.r-project.org/package=ggdist) package. This breaking change was made to substantially reduce the package's dependencies, focusing the functionality on representing vectors of distributions.
Small patch to resolve issues with CRAN checks.
- Fixed object structure resulting from transforming sample distributions (#81).
- Improved reliability of `quantile(<dist_mixture>)`.
- Defined `cdf(<dist_sample>)` as Pr(X <= x), not Pr(X < x).
- Fixed S3 generic argument name `p` for `log_quantile()`.
- Added Math and Ops methods for sample distributions, which apply the functions directly to the samples.
- Added `mean` and `sd` as aliases for `mu` and `sigma` respectively in `dist_normal()` and `dist_student_t()` to match arguments of the stats package interface (#76).
- Added `scale` argument for alternative specification for `dist_burr()` and `dist_gamma()`.
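The argument aliases and sample arithmetic described above might be used as in this minimal sketch. The parameter values and variable names are arbitrary choices made for illustration.

```r
library(distributional)

# `mean` / `sd` are aliases for `mu` / `sigma`
a <- dist_normal(mu = 1, sigma = 2)
b <- dist_normal(mean = 1, sd = 2)

# A gamma distribution specified by rate, or equivalently by scale = 1 / rate
g_rate  <- dist_gamma(shape = 2, rate = 2)
g_scale <- dist_gamma(shape = 2, scale = 0.5)

# Arithmetic on a sample distribution acts on the stored samples
s <- dist_sample(list(c(1, 2, 3)))
doubled <- s * 2

mean(a)
mean(b)
mean(g_rate)
mean(g_scale)
mean(doubled)
```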
- Generics introduced by this package now allow `na.rm` and other parameters to be passed to distribution methods, even if these parameters aren't used. The package no longer checks the usage of `...` with the `ellipsis` package. If you'd like to check that all `...` are used, you can write your own wrapping functions.
- Lists of functions can now be used in `dist_transformed()`, allowing the transformation to differ for each distribution.
- `covariance()` and other matrix output functions of multivariate distributions now name the result using the distribution's dimension names.
- Improved handling of mixture distribution quantiles at boundaries {0,1}.
- Fixed issue with computing multiple values from a univariate distribution with named dimensions (#79).
- Added `dist_categorical()` for the Categorical distribution.
- Added `dist_lognormal()` for the log-normal distribution. Mathematical conversion shortcuts have also been added, so `exp(dist_normal())` produces `dist_lognormal()`.
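A small sketch of the conversion shortcut, comparing the shortcut against the direct constructor. The parameter values are arbitrary. The comparison uses quantiles and the mean, which should agree however the object is represented internally.

```r
library(distributional)

# Exponentiating a normal distribution gives a log-normal distribution
ln_shortcut <- exp(dist_normal(0, 1))
ln_direct   <- dist_lognormal(mu = 0, sigma = 1)

# Both have median exp(0) = 1
quantile(ln_shortcut, 0.5)
quantile(ln_direct, 0.5)

# The log-normal mean is exp(mu + sigma^2 / 2)
mean(ln_direct)
```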
- Added `parameters()` generic for obtaining the distribution's parameters.
- Added `family(<distribution>)` for getting the distribution's family name.
- Added `covariance()` to return the covariance of a distribution.
- Added `support()` to identify the distribution's region of support (#8).
- Added `log_likelihood()` for computing the log-likelihood of observing a sample from a distribution.
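The accessors listed above could be exercised roughly as follows. This sketch assumes that `log_likelihood()` takes the sample as a list of numeric vectors, one per distribution; the values are made up for illustration.

```r
library(distributional)

d <- dist_normal(mu = 1, sigma = 2)

# Parameters as a data frame with one row per distribution
params <- parameters(d)

# Family name of each distribution in the vector
fam <- family(d)

# Region of support
support(d)

# Log-likelihood of an observed sample (assumed: a list of numeric vectors)
obs <- c(0, 1, 2)
ll <- log_likelihood(d, list(obs))
```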
- `variance()` now always returns a variance. It will not default to providing a covariance matrix for matrices. This also applies to multivariate distributions such as `dist_multivariate_normal()`. The covariance can now be obtained using the `covariance()` function.
- `dist_wrap()` can now search for distribution functions in any environment, not just packages. If the `package` argument is `NULL`, it will search the calling environment for the functions. You can also provide a package name as before, and additionally an arbitrary environment to this argument.
- `median()` methods will now ignore the `na.rm` option when it does not apply to that distribution type (#72).
- `dist_sample()` now allows for missing values to be stored. Note that `density()`, `quantile()` and `cdf()` will remove these missing values by default. This behaviour can be changed with the `na.rm` argument.
- `<hilo>` objects now support non-numeric and multivariate distributions. `<hilo>` vectors that have different bound types cannot be mixed (#74).
- Improved performance of default methods of `mean()` and `variance()`, which no longer use sampling based means and variances for univariate continuous distributions (#71, @mjskay).
- `dist_binomial()` distributions now return integers for `quantile()` and `generate()` methods.
- Added conditional examples for distributions using functions from supported packages.
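To make the `variance()` / `covariance()` split and the missing-value handling more concrete, here is a minimal sketch. The covariance matrix and sample values are invented for illustration, and the exact shape of the returned objects may differ between versions.

```r
library(distributional)

# A bivariate normal with variances 4 and 3 and covariance 2
sigma <- matrix(c(4, 2, 2, 3), ncol = 2)
mvn <- dist_multivariate_normal(mu = list(c(1, 2)), sigma = list(sigma))

v <- variance(mvn)    # marginal variances only (the diagonal)
cv <- covariance(mvn) # full variance-covariance matrix

# Sample distributions may store missing values, which are dropped by
# default when computing quantiles
s <- dist_sample(list(c(1, 2, NA, 4)))
med <- quantile(s, 0.5)
```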
- Fixed fallback `format()` function for distribution classes that have not defined this method (#67).
- `variance()` on a `dist_multivariate_normal()` will now return the diagonal instead of the complete variance-covariance matrix.
- `dist_bernoulli()` will now return logical values for `quantile()` and `generate()`.
- Added `is_distribution()` to identify if an object is a distribution.
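For instance, `is_distribution()` can serve as a guard before calling distribution methods. A minimal sketch:

```r
library(distributional)

is_dist <- is_distribution(dist_normal(0, 1))  # a distribution vector
not_dist <- is_distribution(1:3)               # a plain integer vector
```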
- Improved NA structure of distributions, allowing it to work with `is.na()` and `vctrs` vector resizing / filling functionality.
- Added `as.character(<hilo>)` method, allowing datasets containing `hilo()` objects to be saved as a text file (#57).
- Fixed issue with `hdr()` range `size` incorrectly being treated as `100-size`, giving 5% ranges for 95% sizes and vice-versa (#61).
A small performance and methods release. Some issues with truncated distributions have been fixed, and some more distribution methods have been added which improve performance of common tasks.
- Added `dist_missing()` for representing unknown or missing (NA) distributions.
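A short sketch of how missing distributions might be created and detected, assuming the first argument of `dist_missing()` is the number of missing elements to create:

```r
library(distributional)

# Two missing distributions
m <- dist_missing(2)
missing_flags <- is.na(m)

# An ordinary distribution is not missing
known_flags <- is.na(dist_normal(0, 1))
```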
- Documentation improvements.
- Added `cdf()` method for `dist_sample()` which uses the empirical cdf.
- `dist_mixture()` now preserves `dimnames()` if all distributions have the same `dimnames()`.
- Added `density()` and `generate()` methods for sample distributions.
- Added `skewness()` method for `dist_sample()`.
- Improved performance for truncated Normal and sample distributions (#49).
- Improved vectorisation of distribution methods.
- Fixed issue with computing the median of `dist_truncated()` distributions.
- Fixed format method for `dist_truncated()` distributions with no upper or lower limit.
- Fixed issue with naming objects giving an invalid structure. It now gives an informative error (#23).
- Fixed documentation for Negative Binomial distribution (#46).
- Added `dist_wrap()` for wrapping distributions not yet added in the package.
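As an illustration, the sketch below wraps the normal distribution functions from the stats package (`dnorm()`, `pnorm()`, `qnorm()`, `rnorm()`) by passing their shared suffix `"norm"`. The parameter values are arbitrary.

```r
library(distributional)

# Wrap the d/p/q/r "norm" functions from the stats package
w <- dist_wrap("norm", mean = 1, sd = 2, package = "stats")

q_med <- quantile(w, 0.5)  # delegates to qnorm()
p_mid <- cdf(w, 1)         # delegates to pnorm()
```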
- Added `likelihood()` for computing the likelihood of observing a sample from a distribution.
- Added `skewness()` for computing the skewness of a distribution.
- Added `kurtosis()` for computing the kurtosis of a distribution.
- The `density()`, `cdf()` and `quantile()` methods now accept a `log` argument which will use/return probabilities as log probabilities.
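The new moment functions and the `log` argument could be tried as in this sketch on a standard normal. It assumes that `likelihood()` takes the sample as a list of numeric vectors; the sample values are invented.

```r
library(distributional)

d <- dist_normal(0, 1)

skew <- skewness(d)
kurt <- kurtosis(d)

# Log-scale density and probabilities
log_dens <- density(d, 0, log = TRUE)
log_prob <- cdf(d, 0, log = TRUE)

# With log = TRUE, quantile() takes log probabilities as input
q <- quantile(d, log(0.5), log = TRUE)

# Likelihood of an observed sample (assumed: a list of numeric vectors)
obs <- c(-1, 0, 1)
lik <- likelihood(d, list(obs))
```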
- Improved documentation for most distributions to include equations for the region of support, summary statistics, density functions and moments. This is the work of @alexpghayes in the `distributions3` package.
- Documentation improvements.
- Added support for displaying distributions with `View()`.
- `hilo()` intervals can no longer be added to other intervals, as this is a common mistake when aggregating forecasts.
- Incremented `d` for `numDeriv::hessian()` when computing mean and variance of transformed distributions.
- Graphics functionality provided by `autoplot.distribution()` is now deprecated in favour of using the `ggdist` package. The `ggdist` package allows distributions produced by distributional to be used directly with ggplot2 as aesthetics.
First release.
- `distribution`: Distributions are represented in a vectorised format using the vctrs package. This makes distributions suitable for inclusion in model prediction output. A `distribution` is a container for distribution-specific S3 classes.
- `hilo`: Intervals are also stored in a vector. A `hilo` consists of a `lower` bound, `upper` bound, and confidence `level`. Each numerical element can be extracted using `$`, for example `my_hilo$lower` to obtain the lower bounds.
- `hdr`: Highest density regions are currently stored as lists of `hilo` values. This is an experimental feature, and is likely to be expanded upon in an upcoming release.
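The `$` extraction described above might look like the following sketch, which builds 95% intervals for two normal distributions. The name `my_hilo` follows the text; the distributions are arbitrary.

```r
library(distributional)

# A vector of two normal distributions with means 0 and 10
d <- dist_normal(mu = c(0, 10), sigma = 1)

# 95% intervals, one per distribution
my_hilo <- hilo(d, size = 95)

lower <- my_hilo$lower  # lower bounds
upper <- my_hilo$upper  # upper bounds
level <- my_hilo$level  # confidence level of each interval
```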
Values of interest can be computed from the distribution using generic functions. The first release provides 9 functions for interacting with distributions:
- `density()`: The probability density/mass function (equivalent to `d...()`).
- `cdf()`: The cumulative distribution function (equivalent to `p...()`).
- `generate()`: Random generation from the distribution (equivalent to `r...()`).
- `quantile()`: Compute quantiles of the distribution (equivalent to `q...()`).
- `hilo()`: Compute probability intervals of probability distribution(s).
- `hdr()`: Compute highest density regions of probability distribution(s).
- `mean()`: Obtain the mean(s) of probability distribution(s).
- `median()`: Obtain the median(s) of probability distribution(s).
- `variance()`: Obtain the variance(s) of probability distribution(s).
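A brief tour of these generics on a standard normal distribution, as a sketch. The seed and evaluation points are arbitrary, and interval outputs are printed rather than checked.

```r
library(distributional)

d <- dist_normal(mu = 0, sigma = 1)

dens <- density(d, 0)       # same value as dnorm(0)
prob <- cdf(d, 0)           # same value as pnorm(0)
quant <- quantile(d, 0.975) # same value as qnorm(0.975)

set.seed(1)
draws <- generate(d, 5)     # a list with one vector of 5 draws per distribution

mu <- mean(d)
med <- median(d)
v <- variance(d)

hilo(d, size = 95)          # 95% probability interval
hdr(d, size = 95)           # 95% highest density region
```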
- Added an `autoplot()` method for visualising the probability density function (`density()`) or cumulative distribution function (`cdf()`) of one or more distributions.
- Added `geom_hilo_ribbon()` and `geom_hilo_linerange()` geometries for ggplot2. These geoms allow uncertainty to be shown graphically with `hilo()` intervals.
- Added 20 continuous probability distributions: `dist_beta()`, `dist_burr()`, `dist_cauchy()`, `dist_chisq()`, `dist_exponential()`, `dist_f()`, `dist_gamma()`, `dist_gumbel()`, `dist_hypergeometric()`, `dist_inverse_exponential()`, `dist_inverse_gamma()`, `dist_inverse_gaussian()`, `dist_logistic()`, `dist_multivariate_normal()`, `dist_normal()`, `dist_pareto()`, `dist_student_t()`, `dist_studentized_range()`, `dist_uniform()`, `dist_weibull()`
- Added 8 discrete probability distributions: `dist_bernoulli()`, `dist_binomial()`, `dist_geometric()`, `dist_logarithmic()`, `dist_multinomial()`, `dist_negative_binomial()`, `dist_poisson()`, `dist_poisson_inverse_gaussian()`
- Added 3 miscellaneous probability distributions: `dist_degenerate()`, `dist_percentile()`, `dist_sample()`
- Added `dist_inflated()` which inflates a specific value of a distribution by a given probability. This can be used to produce zero-inflated distributions.
- Added `dist_transformed()` for transforming distributions. This can be used to produce log distributions such as logNormal: `dist_transformed(dist_normal(), transform = exp, inverse = log)`
- Added `dist_mixture()` for producing weighted mixtures of distributions.
- Added `dist_truncated()` to impose boundaries on a distribution's domain via truncation.
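The four modifiers above can be combined with any base distribution. The sketch below shows one plausible use of each, with parameter values chosen arbitrarily and results compared against hand-derived quantities.

```r
library(distributional)

# Zero-inflated Poisson: an extra 40% probability mass at 0
zip <- dist_inflated(dist_poisson(3), prob = 0.4)
p_zero <- density(zip, 0)

# Log-normal via an explicit transformation and its inverse
ln <- dist_transformed(dist_normal(), transform = exp, inverse = log)
ln_median <- quantile(ln, 0.5)

# Equal-weight mixture of two normals
mix <- dist_mixture(dist_normal(0, 1), dist_normal(5, 1), weights = c(0.5, 0.5))
mix_mean <- mean(mix)

# Standard normal truncated to the non-negative half line
half <- dist_truncated(dist_normal(0, 1), lower = 0)
half_median <- quantile(half, 0.5)
```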