Releases: gesistsa/rio
Slimmer loads and support for multi-object I/O
This release adds new functionality, including an `import_list()` function to retrieve multiple files from a directory or multiple data frames from a multi-object file (e.g., HTML page, .Rdata file, zip directory, etc.). Multi-object writing to Excel (.xlsx) and HTML is also supported via `export()`.
The release also streamlines the set of default packages, so that the package has fewer `Imports` dependencies, installs faster, and loads faster. `install_formats()` will install all `Suggests` packages to enable full file format support.
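As a rough illustration of the multi-object round trip described above, the sketch below writes a named list of data frames to one workbook and reads every sheet back. The file name `multi.xlsx` and the sample data are made up for the example, and it assumes the packages rio uses for .xlsx support are installed.

```r
library(rio)

# Two small built-in data sets, named so each can become its own sheet
dfs <- list(cars = head(mtcars), flowers = head(iris))

# Hypothetical output path in the session's temporary directory
path <- file.path(tempdir(), "multi.xlsx")

# export() accepts a list of data frames when the target is .xlsx
export(dfs, path)

# import_list() returns one data frame per sheet, as a list
sheets <- import_list(path)
length(sheets)
```

A single-sheet read of the same file would still go through `import()`, using `which` to pick the sheet.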
New features
- New function `import_list()` returns a list of data frames from a multi-object Excel workbook, .Rdata file, zip directory, or HTML file. (#126, #129)
- `import_list()` now returns a `NULL` entry for any failed imports, with a warning. (#149)
- `import_list()` gains additional arguments `rbind_fill` and `rbind_label` to control rbind-ing behavior. (#149)
- Added an `rbind` argument to `import_list()`. (#149)
- Added a `setclass` argument to `import_list()`, à la the same in `import()`.
- `export()` can now write a list of data frames to an Excel (.xlsx) workbook. (#142, h/t Jeremy Johnson)
- `export()` can now write a list of data frames to an HTML (.html) file.
- Improved documentation of mapping between file format support and the packages used for each format. (#151, h/t Patrick Kennedy)
- Moved all non-critical format packages to Suggests, rather than Imports. (#143)
- Import from and export to the clipboard now rely on `clipr::read_clip()` and `clipr::write_clip()`, respectively, thus (finally) providing Linux support. (#105, h/t Matthew Lincoln)
- Added support for Matlab formats. (#78, #98)
- Added support for fst format. (#138)
- With new data.table release, export using `fwrite()` is now the default for text-based file formats.
- Handle HTML tables with `<tbody>` elements. (h/t Mohamed Elgoussi)
- Google Spreadsheets can now be imported using any of the allowed formats (CSV, TSV, XLSX, ODS).
- Added support for writing to ODS files via `readODS::write_ods()`. (#96)
- Modified defaults and argument handling in internal function `read_delim()`.
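The `rbind` and `rbind_label` arguments listed above can be sketched as follows. This is an illustrative example only: the file names `part1.csv` and `part2.csv` and the label column name `source` are chosen for the demonstration and do not come from the release notes.

```r
library(rio)

# Write two small CSV files to the temporary directory (hypothetical names)
f1 <- file.path(tempdir(), "part1.csv")
f2 <- file.path(tempdir(), "part2.csv")
export(head(iris, 3), f1)
export(tail(iris, 3), f2)

# Default behavior: one list element per file
parts <- import_list(c(f1, f2))

# rbind = TRUE stacks the imports into a single data frame;
# rbind_label names the column that records which file each row came from
combined <- import_list(c(f1, f2), rbind = TRUE, rbind_label = "source")
nrow(combined)
```

Leaving `rbind` at its default keeps the files separate, which is usually safer when their columns differ.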
Bug fixes
- Further fixes to .csv.gz import/export. (#146, h/t Trevor Davis)
- Verbosity of `export(format = "fwf")` now depends on `options("verbose")`.
- Fixed various errors, warnings, and messages in fixed-width format tests.
- Bumped readxl dependency to `>= 0.1.1`. (#130, h/t Yongfa Chen)
- Pass explicit `excel_format` arguments when using readxl functions. (#130)
- Fixed a bug in `.import.rio_xls()` and `.import.rio_xlsx()` where the `sheet` argument would return an error.
- Fixed a bug in the import of delimited files when `fread = FALSE`. (#133, h/t Christopher Gandrud)
- Fixed a bug in `.import.rio_xls()` wherein the `which` argument was ignored. (h/t Mohamed Elgoussi)
- Fixed handling of "data.table", "tibble", and "data.frame" classes in `set_class()`. (#144)
Maintenance Release
This is a maintenance release primarily intended to continue support for SPSS, Stata, and SAS files after the update of haven to v1.0.0. The issues addressed since last release are:
New features
- Added support for importing from multi-table HTML files using the `which` argument. (#126)
- Update import and export methods to use new xml2 for XML and HTML export. (#86)
- Added support for export of .sas7bdat files via haven (#116)
- Restored support for import from SPSS portable via haven (#116)
- Improved behavior of `import()` and `export()` with respect to unrecognized file types. (#124, #125, h/t Jason Becker)
- Attempt to recognize compressed but non-archived file formats (e.g., ".csv.gz"). (#123, h/t trevorld)
Minor notes
- Added explicit tests of the S3 extension mechanism for `.import()` and `.export()`.
- Fix failing tests related to stricter variable name handling for Stata files in development version of haven. (#113, h/t Hadley Wickham)
- Updated import methods to reflect changed formal argument names in haven. (#116)
- Converted to roxygen2 documentation and made NEWS an explicit markdown file.
Maintenance Release
This is a patch release that contains a few small fixes:
- Changed feather Imports to Suggests to make rio installable on older R versions. (#104)
- Migrated CSVY-related code to separate package (https://github.com/leeper/csvy/). (#111)
- Fix import of European-style CSV files (sep = "," and sep2 = ";"). (#106, #107, h/t Stani Stadlmann)
- Removed unnecessary error in xlsx imports. (#103, h/t Kevin Wright)
- Noted new RStudio add-in, GREA, that uses rio. (#109)
- Note unsupported NumPy i/o via RcppCNPy. (#112)
Maintenance Release
This is a maintenance release with a few small bug fixes and the removal of support for importing SPSS portable files.
- Removed support for import of SPSS Portable (.por) files, given deprecation from haven. (#100)
- Fixed a bug in the handling of "labelled" class variables imported from haven, caused by haven returning "tbl_df" rather than "data.frame" class structures. (#102, h/t Pierre LaFortune)
- Improved use of the `sep` argument for import of delimited files to allow override of defaults. (#99, h/t Danny Parsons)
- Fixed a failing test of file compression that was found in v0.4.3 on some platforms.
- Fixed other tests to remove (unimportant) warnings.
v0.4.3
This is a small maintenance release with some changes that probably should have made it into v0.4.
New Features
Bug Fixes
v0.4
New Features
Improved Attribute Handling
- Attribute-handling behavior from v0.2 is restored, keeping attributes at the data.frame level. This has been made consistent across all `import()` methods to further increase consistency in the structure of imported data regardless of import method or file format. It also means import using `haven = TRUE` for SAS, Stata, and SPSS files should be inconsequential for data structure (compared to use of the "foreign" package) while retaining speed improvements. (#80)
- Added a `gather_attrs()` function that moves variable-level attributes to the data.frame level. (#80)
Extension Mechanism
- `.import()` and `.export()` are now exported S3 generics and documentation has been added to describe how to write rio extensions for new file types. An example of this functionality is shown in the new rio.db package. (#42, h/t Jason Becker)
- When rio receives an unrecognized file format, it now issues a message. The new internal `.import.default()` and `.export.default()` then produce an error. This enables add-on packages to support additional formats through new S3 methods of the form `.import.rio_EXTENSION()` and `.export.rio_EXTENSION()`.
New Format Support
- Added support for import from and export to HTML tables (#86)
- Added support for import from fixed-width format files via `readr::read_fwf()` with a specified `widths` argument. This may enable faster import of these types of files and provides a base-like interface for working with readr. (#48)
- Added support for import from and export to yaml. (#83)
- Export of CSVY files and metadata now supported by `export()`. (#73, #74)
Bug Fixes
- Fixed a bug in `import()` (introduced in #62, 7a7480e) that prevented import from clipboard. (h/t Kevin Wright)
- Export to tar now tries to correct for bugs in `tar()` that are being fixed in base R via PR#16716.
- `export()` returns a character string. (#82)
- Fixed error in export to CSVY with a commented yaml header. (#81, h/t Andrew MacDonald)
- Fixed a bug in import from remote URLs with incorrect file extensions.
- Fixed a bug when reading from an uncommented CSVY yaml header that contained single-line comments. (#84, h/t Tom Aldenberg)
Miscellaneous Improvements
- `import()` now uses xml2 to read XML structures and `export()` uses a custom method for writing to XML, thereby negating dependency on the XML package. (#67)
- Enhancements were made to import and export of CSVY to store attribute metadata as variable-level attributes (like imports from binary file formats).
- `import()` gains a `which` argument that is used to select which file to return from within a compressed tar or zip archive.
- `export()` now allows automatic file compression as tar, gzip, or zip using the `file` argument (e.g., `export(iris, "iris.csv.zip")`).
- Exporting factors to fixed-width format now saves those values as integer rather than numeric.
- Expanded verbosity of `export()` for fixed-width format files and added a commented header containing column class and width information.
- Expanded test suite and separated tests into format-specific files. (#51)
- Diagnostic messages were cleaned up to facilitate translation. (#57)
v0.3
New Features
- Added support for direct import from Google Sheets. (#60, #63, h/t Chung-hong Chan)
- Use readxl for Excel file imports.
- Support for import from CSVY files
- Improved support for importing from compressed directories, especially web-based compressed directories. (#38)
- New CONTRIBUTING.md describes how to contribute to the package.
Bug(-like) Fixes
- Modified behavior so that files imported using haven now store variable metadata at the data.frame level by default (unlike the default behavior in haven, which can cause problems). (#37, h/t Ista Zahn)
- Set a default numerical precision (of 2 decimal places) for export to fixed-width format.
Internal improvements
- Added test suite to test file import, export, and conversion, including some small example files.
- Set up message internationalization. Contributions of message translations are welcome.
- Refactored remote file retrieval into separate (non-exported) function used by `import()`. (#62)
- If file format for a remote file cannot be identified from the supplied URL or the final URL reported by `curl::curl_fetch_memory()`, the HTTP headers are checked for a filename in the Content-Disposition header. (#36)
- Use `urltools::url_parse()` to extract file extensions from complex URLs (e.g., those with query arguments). (#56)
- Added import dependency on data.table 1.9.5. (#39)