version 1.0.0 #84

Merged Nov 8, 2023

Commits (37)
286fbe7
use fr, update codec_check, drivetime standalone
cole-brokamp Oct 10, 2023
71e3062
remove other data sources for now
cole-brokamp Oct 10, 2023
29e9f1c
update codec_data to work with new fr
cole-brokamp Oct 10, 2023
ea2fed3
deprecate geography argument, keep interpolation outside of tdr md
cole-brokamp Oct 10, 2023
c078667
check_* functions can also take a tibble or list
cole-brokamp Oct 12, 2023
81b1239
fix data catalog page for now
cole-brokamp Oct 12, 2023
e50a708
package dependencies and check notes
cole-brokamp Oct 12, 2023
3aa3642
depend on specific version of fr for now
cole-brokamp Oct 12, 2023
6025801
depend on fr >= 0.1.0
cole-brokamp Oct 13, 2023
5feb02c
remotes typo
cole-brokamp Oct 13, 2023
d88c5c9
use vroom to test read csv file
cole-brokamp Oct 13, 2023
ba7ea01
add back hamilton_landcover
cole-brokamp Oct 13, 2023
678b383
add back hamilton_traffic
cole-brokamp Oct 13, 2023
ba6181e
use v in the version
cole-brokamp Oct 14, 2023
6964ce6
fr with dyn dots for hh_acs_measures import
cole-brokamp Oct 14, 2023
43ebb0c
don't print v
cole-brokamp Oct 14, 2023
5c03ac1
add back property code enforcement
cole-brokamp Oct 14, 2023
21f89e2
add tract_indices back
cole-brokamp Oct 14, 2023
80886d3
simplify traffic curation
cole-brokamp Oct 14, 2023
bee34a1
running hh_acs_measures to get more metadata
cole-brokamp Oct 14, 2023
5db6525
add crime risk data
cole-brokamp Oct 14, 2023
0035e7d
crime risk on homepage
cole-brokamp Oct 14, 2023
8dad8e8
use updated version of fr to use template for md in some
cole-brokamp Oct 15, 2023
ec730e9
simplify specs documentation; update version
cole-brokamp Oct 16, 2023
6b4ea2f
don't use articles
cole-brokamp Oct 16, 2023
161ee5f
rely on fr 0.2.0
cole-brokamp Oct 16, 2023
99fe046
import cincy >= 1.1.0
cole-brokamp Oct 17, 2023
c33e310
this will be version 1.0.0
cole-brokamp Oct 17, 2023
095c1cb
add ability to specify geography for codec_datajk
cole-brokamp Oct 24, 2023
dd4538b
add starter for shiny app
cole-brokamp Oct 24, 2023
4c9a04f
update names for crime risk fr_tdr
cole-brokamp Oct 24, 2023
6cc80b7
use updated fr dplyr functions in v0.3.0
cole-brokamp Oct 24, 2023
be98258
rework app to rely on fr pkg
andrew-vancil Nov 7, 2023
48580fd
use later version of fr
cole-brokamp Nov 8, 2023
781868d
simplify data page
cole-brokamp Nov 8, 2023
20d64d9
add andrew to author list
cole-brokamp Nov 8, 2023
5755dd6
finish data page for now
cole-brokamp Nov 8, 2023
24 changes: 11 additions & 13 deletions DESCRIPTION
@@ -1,11 +1,13 @@
Package: codec
Title: Community Data Explorer for Cincinnati
Version: 0.7.2
Version: 1.0.0
Authors@R: c(
person("Cole", "Brokamp",
email = "[email protected]",
role = c("aut", "cre")),
person("Erika", "Manning",
role = "aut"),
person("Andrew", "Vancil",
role = "aut")
)
Description: codec provides tools for working with metadata in R and storing it alongside data in a YAML file. This package serves as the definition of the CoDEC data specifications and provides helpers to contribute and validate CoDEC data.
@@ -18,33 +20,29 @@ Suggests:
roxygen2,
knitr,
rmarkdown,
dplyr,
curl,
glue,
mapview,
gh,
DT,
leaflet,
callr,
downloadthis,
cincy (>= 1.0.2),
tibble,
bsplus
Remotes:
geomarker-io/cincy
geomarker-io/cincy,
cole-brokamp/fr
Config/testthat/edition: 3
URL: https://github.com/geomarker-io/codec,
http://geomarker.io/codec/
BugReports: https://github.com/geomarker-io/codec/issues
Imports:
dplyr,
forcats,
fs,
purrr (>= 1.0.0),
readr,
rlang (>= 0.4.11),
stringi,
stringr,
tibble,
yaml,
yaml,
vroom,
cincy (>= 1.1.0),
fr (>= 0.4.0),
sf
VignetteBuilder: knitr
Depends:
24 changes: 1 addition & 23 deletions NAMESPACE
@@ -1,13 +1,5 @@
# Generated by roxygen2: do not edit by hand

export(":=")
export(.data)
export(add_attr_from_tdr)
export(add_attrs)
export(add_col_attrs)
export(add_type_attrs)
export(as_label)
export(as_name)
export(check_codec_tdr)
export(check_codec_tdr_csv)
export(check_files)
@@ -16,18 +8,4 @@ export(check_tdr_path)
export(codec_colors)
export(codec_data)
export(codec_tdr)
export(enquo)
export(enquos)
export(glimpse_attr)
export(glimpse_schema)
export(glimpse_tdr)
export(read_tdr)
export(read_tdr_csv)
export(write_tdr)
export(write_tdr_csv)
importFrom(rlang,":=")
importFrom(rlang,.data)
importFrom(rlang,as_label)
importFrom(rlang,as_name)
importFrom(rlang,enquo)
importFrom(rlang,enquos)
importFrom(cincy,interpolate)
84 changes: 0 additions & 84 deletions R/attributes.R

This file was deleted.

80 changes: 38 additions & 42 deletions R/codec_check.R
@@ -9,63 +9,51 @@
#' - the data contains a year (or year and month) column(s)
#' - all fields in the CSV data are described in the metadata and vice-versa
#' See `vignette("codec-specs")` for the CoDEC specifications.
#' @param tdr a codec tabular-data-resource
#' @param tdr_md a codec tabular-data-resource metadata list object
#' @param x a codec fr_tdr object (or data frame for check_census_tract_id(), check_date()
#' and a list for check_codec_tdr())
#' @param path path to tdr folder
#' @param name the name field from tabular-data-resource.yaml
#' @return for `check_codec_tdr_csv`, a tibble with added
#' tabular-data-resource attributes (equivalent to read_tdr_csv with `codec = TRUE`)
#' @importFrom cincy interpolate
#' @export
check_codec_tdr_csv <- function(path) {
check_files(path)
tdr <- read_tdr(path)$tdr
check_codec_tdr(tdr)
d <- fr::read_fr_tdr(fs::path(path, "tabular-data-resource.yaml"))

md_fields <- names(tdr$schema$fields)
d_fields <- names(readr::read_csv(read_tdr(path)$csv_file, n_max = 0, show_col_types = FALSE))
if(! all(d_fields %in% md_fields)) {
stop("the metadata does not describe all fields in the data", call. = FALSE)
}
if(! all(md_fields %in% d_fields)) {
stop("the metadata describes fields that are not in the data", call. = FALSE)
}

tdr_d <- read_tdr_csv(path)
check_data(tdr_d)
return(invisible(tdr_d))
}
check_codec_tdr(as.list(d))
check_census_tract_id(as.data.frame(d))
check_date(as.data.frame(d))

#' Check data
#' @rdname check_codec_tdr_csv
check_data <- function(tdr) {
check_census_tract_id(tdr)
check_date(tdr)
return(invisible(d))
}

#' Check census tract id column
#' @rdname check_codec_tdr_csv
check_census_tract_id <- function(tdr) {
check_census_tract_id <- function(x) {
census_tract_id_names <- paste0("census_tract_id", c("_2000", "_2010", "_2020"))
tdr_data <- as.data.frame(x)
tdr_data_names <- names(tdr_data)

# has census_tract_id_{year} or census_tract_id column
if (!any(names(tdr) %in% census_tract_id_names)) {
if (!any(tdr_data_names %in% census_tract_id_names)) {
stop("must contain a census tract id column called census_tract_id_2000, census_tract_id_2010, or census_tract_id_2020", call. = FALSE)
}

# make sure only one tract column
if (sum(names(tdr) %in% census_tract_id_names) > 1) {
if (sum(tdr_data_names %in% census_tract_id_names) > 1) {
stop("must contain only one census tract id column", call. = FALSE)
}

census_tract_id_name <- census_tract_id_names[census_tract_id_names %in% names(tdr)]
census_tract_id_name <- census_tract_id_names[census_tract_id_names %in% tdr_data_names]
census_tract_id_year <- stringr::str_extract(census_tract_id_name, "[0-9]+")

required_census_tract_ids <-
parse(text = paste0("cincy::tract_tigris_", census_tract_id_year)) |>
eval() |>
purrr::pluck(paste0("census_tract_id_", census_tract_id_year))

if (!all(required_census_tract_ids %in% tdr[[census_tract_id_name]])) {
if (!all(required_census_tract_ids %in% tdr_data[[census_tract_id_name]])) {
stop("the census tract id column, ",
census_tract_id_name,
", does not contain every census tract in ",
@@ -74,28 +62,31 @@ check_census_tract_id <- function(tdr) {
)
}

return(invisible(tdr))
return(invisible(x))
}
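
Review note: the column-detection part of `check_census_tract_id()` can be tried without `cincy` or `stringr` installed. The following is only a sketch under assumptions: `find_tract_column()` is a hypothetical helper that is not part of this PR, the tract id value is an illustrative placeholder, and base R `regmatches()` stands in for `stringr::str_extract()`. It does not reproduce the comparison against `cincy::tract_tigris_{year}`.

```r
# Candidate column names, built the same way as in check_census_tract_id()
census_tract_id_names <- paste0("census_tract_id", c("_2000", "_2010", "_2020"))

# Hypothetical helper (not in the package): find the single tract id column
# and the vintage year encoded in its name.
find_tract_column <- function(d) {
  hits <- census_tract_id_names[census_tract_id_names %in% names(d)]
  if (length(hits) == 0) {
    stop("must contain a census tract id column", call. = FALSE)
  }
  if (length(hits) > 1) {
    stop("must contain only one census tract id column", call. = FALSE)
  }
  # base R stand-in for stringr::str_extract(hits, "[0-9]+")
  list(name = hits, year = regmatches(hits, regexpr("[0-9]+", hits)))
}

# Illustrative data only; the tract id is a placeholder value
d <- data.frame(census_tract_id_2010 = "39061000200", year = 2019)
found <- find_tract_column(d)
found$name
found$year
```

Because the helper only looks at `names(d)`, it accepts anything that `as.data.frame()` can convert, which matches the widened `x` argument in this hunk.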

#' Check date
#' Check year or year-month column
#' @rdname check_codec_tdr_csv
check_date <- function(tdr) {
check_date <- function(x) {

tdr_data <- as.data.frame(x)
tdr_data_names <- names(tdr_data)

if (! "year" %in% names(tdr)) {
if (! "year" %in% tdr_data_names) {
stop("must contain a 'year' column", call. = FALSE)
}

years <- unique(tdr$year)
if (! identical(years, as.integer(years))) {
stop("the 'year' field must only contain integer years", call. = FALSE)
years <- unique(tdr_data$year)
if (! all(years %in% 1970:2099)) {
stop("the 'year' field must only contain integer years between 1970 and 2099", call. = FALSE)
}

if ("month" %in% names(tdr)) {
if (! all(tdr$month %in% 1:12)) {
if ("month" %in% tdr_data_names) {
if (! all(tdr_data$month %in% 1:12)) {
stop("the 'month' field must only contain integer values 1-12", call. = FALSE)
}
}
return(invisible(tdr))
return(invisible(x))
}
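
Review note: the year check in this hunk moves from `identical(years, as.integer(years))` to a range test with `%in% 1970:2099`. A minimal base R sketch of how the two conditions behave on a few inputs is below; `old_year_ok()` and `new_year_ok()` are hypothetical names used only for this comparison and do not exist in the package.

```r
# Hypothetical wrappers around the old and new conditions from check_date()
old_year_ok <- function(years) {
  years <- unique(years)
  identical(years, as.integer(years))
}
new_year_ok <- function(years) {
  all(unique(years) %in% 1970:2099)
}

# Whole-number years stored as doubles: the old condition rejects them
# because identical() also compares types, the new one accepts them
old_year_ok(c(2019, 2020))
new_year_ok(c(2019, 2020))

# Integer years outside the range pass the old condition but not the new one
old_year_ok(c(1969L, 2020L))
new_year_ok(c(1969L, 2020L))

# Fractional and missing values fail the new condition
new_year_ok(c(2019.5, 2020))
new_year_ok(c(NA, 2020))
```

One thing to keep in mind is that `%in%` coerces types when matching, so a character column such as `"2020"` would also pass the new condition.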

#' Check files
@@ -129,17 +120,18 @@ check_files <- function(path) {

# try to read (first 100 lines of) CSV file
test_read_csv_file <-
purrr::safely(readr::read_csv)(
purrr::safely(vroom::vroom)(
file = tdr_csv,
delim = ",",
n_max = 100,
col_names = TRUE,
show_col_types = FALSE,
locale = readr::locale(
locale = vroom::locale(
encoding = "UTF-8",
decimal_mark = ".",
grouping_mark = "",
),
name_repair = "check_unique",
.name_repair = "check_unique",
)

if (!is.null(test_read_csv_file$error)) {
@@ -152,7 +144,9 @@ check_files <- function(path) {
#' check CoDEC tdr
#' @rdname check_codec_tdr_csv
#' @export
check_codec_tdr <- function(tdr_md) {
check_codec_tdr <- function(x) {

tdr_md <- as.list(x)

# must have "name" and "path" descriptors
if (!purrr::pluck_exists(tdr_md, "name")) stop("`name` property descriptor is required", call. = FALSE)
@@ -212,7 +206,7 @@ check_codec_tdr <- function(tdr_md) {
)
}

return(invisible(tdr_md))
return(invisible(x))
}


@@ -242,6 +236,8 @@ check_tdr_path <- function(path) {
# path ends with .csv
if (! fs::path_ext(path) == "csv") stop("'path' must end with '.csv'", call. = FALSE)
# path can be a URL

is_url <- function(.x) grepl("^((http|ftp)s?|sftp)://", .x)
if (is_url(path)) return(invisible(NULL))
# if not URL, check for absolute path
if (fs::is_absolute_path(path)) stop("'path' must be a relative file path")
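
Review note: the `is_url()` helper added in this hunk is plain base R, so its pattern can be exercised on its own. The sketch below copies the one-line definition from the diff; the example paths and the `example.org` host are made up for illustration.

```r
# Copied from the hunk above: TRUE when the string starts with an
# http, https, ftp, ftps, or sftp scheme
is_url <- function(.x) grepl("^((http|ftp)s?|sftp)://", .x)

# Made-up example paths
paths <- c(
  "https://example.org/my_data/my_data.csv",
  "ftp://example.org/my_data.csv",
  "sftp://example.org/my_data.csv",
  "my_data/my_data.csv",
  "/home/user/my_data.csv",
  "file:///home/user/my_data.csv"
)
url_flags <- is_url(paths)
data.frame(path = paths, is_url = url_flags)
```

In `check_tdr_path()` a path recognized as a URL returns early, so only the non-URL strings go on to the absolute-path check. A `file://` path is not treated as a URL by this pattern.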