17 Aug 16:07

mayer79

f3cd709

CRAN release 2.6.0 Latest

Major bug fix

Fixes a major bug that caused responses to be used as covariates in the random forests. Thanks to @flystar233 for reporting it, see #78.
You can expect different and better imputations.

Major feature

Out-of-sample application is now possible! Thanks to @jeandigitale for pushing the idea in #58.

This means you can run imp <- missRanger(..., keep_forests = TRUE) and then apply its models to new data via predict(imp, newdata). The "missRanger" object can be saved and loaded as a binary file, e.g., via saveRDS()/readRDS(), for later use.
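A minimal sketch of this workflow, assuming missRanger >= 2.6.0 is installed. The iris data, the temporary file path, and the small num.trees value are illustrative choices, not part of the release.

```r
library(missRanger)

# Training data with artificially introduced missing values
iris_na <- generateNA(iris, p = 0.1, seed = 11)

# Fit the imputation models and keep the forests for later use
imp <- missRanger(iris_na, num.trees = 50, keep_forests = TRUE, seed = 1, verbose = 0)

# New rows, each with exactly one missing value
newdata <- iris[c(1, 51, 101), ]
newdata$Sepal.Length[1] <- NA
newdata$Species[2] <- NA
newdata$Petal.Width[3] <- NA

# Apply the stored forests to the new rows
filled <- predict(imp, newdata)

# Persist the fitted object and reuse it later
path <- tempfile(fileext = ".rds")
saveRDS(imp, path)
imp_reloaded <- readRDS(path)
filled_again <- predict(imp_reloaded, newdata)
```

Because keep_forests = TRUE switches data_only to FALSE by default, imp is a "missRanger" object here rather than a plain data frame.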

Note that out-of-sample imputation works best for rows in newdata with only one missing value (counting only missing values in variables used as covariates in the random forests). We call this the "easy case". In the "hard case", even multiple iterations (set by iter) can lead to unsatisfactory results.
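To check in advance how many rows of newdata fall into each case, something like the following base R sketch could be used. The helper count_missing_covariates() is hypothetical and not part of missRanger. It assumes you pass the names of the variables used as covariates, and it labels rows without missing values as "complete".

```r
# Hypothetical helper: number of missing values per row, restricted to covariates
count_missing_covariates <- function(newdata, covariates = colnames(newdata)) {
  rowSums(is.na(newdata[, covariates, drop = FALSE]))
}

newdata <- data.frame(
  a = c(1, NA, NA),
  b = c(2, 5, NA),
  c = c("x", "y", "z")
)

n_miss <- count_missing_covariates(newdata)

# 0 missing: nothing to impute; 1 missing: easy case; more: hard case
case <- ifelse(n_miss == 0, "complete", ifelse(n_miss == 1, "easy", "hard"))
table(case)
```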

The out-of-sample algorithm works as follows:

1. Impute univariately all relevant columns by randomly drawing values from the original unimputed data. This step will only impact "hard case" rows.
2. Replace univariate imputations by predictions of random forests. This is done sequentially over variables, where the variables are sorted to minimize the impact of univariate imputations. Optionally, this is followed by predictive mean matching (PMM).
3. Repeat Step 2 for "hard case" rows multiple times.
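The first step (univariate imputation by random draws) can be illustrated in base R. This is only a sketch of the idea, not the package's internal code. The helper impute_univariate() is hypothetical.

```r
# Hypothetical helper: fill NAs in x with random draws from the
# observed (non-missing) values of the corresponding training column
impute_univariate <- function(x, donors) {
  donors <- donors[!is.na(donors)]
  missing <- is.na(x)
  # Index-based sampling avoids sample()'s special case for length-1 numeric input
  idx <- sample.int(length(donors), size = sum(missing), replace = TRUE)
  x[missing] <- donors[idx]
  x
}

set.seed(1)
train_col <- c(5.1, NA, 4.7, 6.3, 5.8)  # column of the original unimputed data
new_col <- c(NA, 6.0, NA)               # same column in newdata

filled <- impute_univariate(new_col, donors = train_col)
```

Observed values in new_col stay untouched. Only the missing entries receive a draw from the training column.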

Possibly breaking changes

Columns of special types like date/time can no longer be imputed. You will need to convert them to numeric before imputation.
pmm() is more picky: xtrain and xtest must both be numeric, both be logical, or both be factors (with identical levels).
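One possible way to handle a date column under the new rule is to convert it to numeric, impute, and convert back. The sketch below assumes missRanger >= 2.6.0. The data and the column names visit, visit_num, and score are made up for illustration.

```r
library(missRanger)

set.seed(1)
n <- 30
df <- data.frame(
  visit = as.Date("2024-01-01") + sample(0:200, n),
  score = rnorm(n)
)
df$visit[c(3, 10)] <- NA
df$score[c(5, 20)] <- NA

# Dates become days since 1970-01-01; NA stays NA
df$visit_num <- as.numeric(df$visit)
df$visit <- NULL

imp <- missRanger(df, num.trees = 50, seed = 1, verbose = 0)

# Random forest predictions are not whole days, so round before converting back
imp$visit <- as.Date(round(imp$visit_num), origin = "1970-01-01")
```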

Minor changes in output object

Add original data as data_raw.
Renamed visit_seq to to_impute.

Other changes

Now requires ranger >= 0.16.0.
More compact vignettes.
Better examples and README.
Many relevant ranger() arguments are now explicit arguments in missRanger() to improve the tab-completion experience:
- num.trees = 500
- mtry = NULL
- min.node.size = NULL
- min.bucket = NULL
- max.depth = NULL
- replace = TRUE
- sample.fraction = if (replace) 1 else 0.632
- case.weights = NULL
- num.threads = NULL
- save.memory = FALSE
For variables that can't be used, more information is printed.
If keep_forests = TRUE, the argument data_only is set to FALSE by default.
"missRanger" object now stores pmm.k.
verbose argument is passed to ranger() as well.
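As an illustration of the explicit arguments listed above, a call could look like the following sketch. It assumes missRanger >= 2.6.0. The specific values are arbitrary and not recommendations.

```r
library(missRanger)

iris_na <- generateNA(iris, p = 0.1, seed = 11)

# ranger() settings are now named arguments of missRanger()
imp <- missRanger(
  iris_na,
  num.trees = 100,
  max.depth = 6,
  min.node.size = 5,
  replace = FALSE,
  sample.fraction = 0.5,
  num.threads = 1,
  seed = 1,
  verbose = 0
)
```

Since keep_forests is left at FALSE, the result is the imputed data frame.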

13 Jul 07:30

mayer79

2.5.0

20247e3

CRAN release 2.5.0

Bug fixes

Since release 2.3.0, negative formula terms were unintentionally no longer dropped, see #62. This is fixed now.
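For context, a negative term on the right-hand side of the formula excludes a variable from the covariates. A small sketch, assuming a missRanger version with this fix, using iris as example data:

```r
library(missRanger)

iris_na <- generateNA(iris, p = 0.1, seed = 11)

# Impute all variables, but do not use Species as a covariate
imp <- missRanger(
  iris_na,
  formula = . ~ . - Species,
  num.trees = 50,
  seed = 1,
  verbose = 0
)
```

Species still appears on the left-hand side via the dot, so it is imputed, just not used to impute the other columns.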

Enhancements

The vignette on multiple imputations has been revised, and its example now uses a larger number of donors in predictive mean matching.

19 Nov 09:56

mayer79

2.4.0

fa1775c

CRAN release 2.4.0

Future Output API

New argument data_only = TRUE to control whether only the imputed data should be returned (default), or an object of class "missRanger". This object contains the imputed data and information like OOB prediction errors, fixing #28. The value FALSE will later become the default in {missRanger 3.0.0}. This will be announced via a deprecation cycle.
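A short sketch of requesting the full object instead of the data only. The element names data and pred_errors are taken from the package documentation as I understand it, so treat them as assumptions if your version differs.

```r
library(missRanger)

iris_na <- generateNA(iris, p = 0.1, seed = 11)

# Return an object of class "missRanger" instead of a data frame
imp <- missRanger(iris_na, data_only = FALSE, num.trees = 50, seed = 1, verbose = 0)

class(imp)
head(imp$data)   # imputed data (assumed element name)
imp$pred_errors  # OOB prediction errors per iteration (assumed element name)
```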

Enhancements

New argument keep_forests = FALSE. Should the random forests of the best iteration (the one that generated the final imputed data) be added to the "missRanger" object? Note that this will use a lot of memory. Only relevant if data_only = FALSE. This solves #54.

Bug fixes

In case the algorithm did not converge, the data of the last iteration was returned instead of the current one. This has been fixed.

20 Oct 19:32

mayer79

2.3.0

ac24bae

CRAN release 2.3.0

Major improvements

missRanger() now works with syntactically invalid variable names like "1bad:variable". This solves an old issue, which recently resurfaced in a new issue.
missRanger() now works with any number of features, as long as the formula is left at its default (. ~ .). This solves this issue.
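To illustrate the first point, the following sketch renames a column to a syntactically invalid name and imputes with the default formula. It assumes missRanger >= 2.3.0. The data is iris with artificial missing values.

```r
library(missRanger)

df <- generateNA(iris, p = 0.1, seed = 11)

# A name that is not a valid R identifier
colnames(df)[1] <- "1bad:variable"

imp <- missRanger(df, num.trees = 50, seed = 1, verbose = 0)
colnames(imp)
```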

Other changes

Documentation improvement.
ranger() is now called via the x/y interface, not the formula interface anymore.

28 Apr 16:48

mayer79

2.2.1

300e9fd

CRAN release 2.2.1

Switch from importFrom to :: code style
Documentation improved

25 Mar 08:46

mayer79

2.2.0

382bb46

CRAN release 2.2.0

missRanger 2.2.0

Fewer dependencies

Removed {mice} from "suggested" packages.
Removed {dplyr} from "suggested" packages.
Removed {survival} from "suggested" packages.

Maintenance

Added GitHub Pages.
Introduced GitHub Actions.

29 Jan 11:44

mayer79

2.1.5

e65a394

Release 2.1.5

A maintenance release, mainly improving the package structuring.

10 Apr 14:05

mayer79

2.1.3

316acae

CRAN release 2.1.3

CRAN submission 2.1.3

Releases: mayer79/missRanger