diff --git a/.Rbuildignore b/.Rbuildignore index dd6c114..85b0690 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -12,3 +12,4 @@ auditor_online_cache.zip ^README_cache$ ^.*\.Rproj$ ^\.Rproj\.user$ +auditor_online_scrape_2023-08-10.rds diff --git a/R/zzz.R b/R/zzz.R index f9ee7b2..6537171 100644 --- a/R/zzz.R +++ b/R/zzz.R @@ -19,3 +19,4 @@ utils::globalVariables("gaz") utils::globalVariables("f") utils::globalVariables("high_score") utils::globalVariables("score") +utils::globalVariables("apt_id") diff --git a/README.Rmd b/README.Rmd index f79f36a..942e99c 100644 --- a/README.Rmd +++ b/README.Rmd @@ -5,6 +5,8 @@ output: github_document ```{r, include = FALSE} +library(parcel) + knitr::opts_chunk$set( collapse = TRUE, comment = "#>", @@ -20,14 +22,25 @@ knitr::opts_chunk$set( -The goal of parcel is to provide tools for matching real-world addresses to reference sets of addresses; e.g., "352 Helen Street", "352 Helen St." or "352 helen st". This package is motivated by the included example data resources of auditor parcel tax data from Hamilton County, Ohio. +The goal of parcel is to provide tools for matching real-world addresses to reference sets of addresses; e.g., "352 Helen Street", "352 Helen St." or "352 helen st". This package is motivated by the included example data resources of auditor parcel tax data from Hamilton County, Ohio. Use `get_parcel_data()` to get the corresponding parcel data for a vector of addresses: + +```{r} +get_parcel_data( + c("1069 Overlook Avenue Cincinnati OH 45238", + "419 Elm St. 
Cincinnati OH 45238", + "3333 Burnet Ave Cincinnati OH 45219", + "3830 President Drive Cincinnati Ohio 45225", + "3544 Linwood Av Cincinnati OH 45226") +) +``` With this specific goal in mind, parcel includes: - functions for cleaning and tagging components of addresses: **`clean_address()`**, **`tag_address()`**, and **`create_address_stub()`** - the `cagis_parcels` tabular-data-resource, which contains parcel identifiers, parcel addresses, and parcel characteristics downloaded from the [Cincinnati Area Geographic Information System (CAGIS)](https://cagismaps.hamilton-co.org/cagisportal/mapdata/download) - the `hamilton_online_parcels` tabular-data-resource, which contains parcel characteristics scraped from [Hamilton County Auditor Online](https://wedge1.hcauditor.org/) -- functions for joining addresses to parcel identifiers based on an included model pretrained on electronic health record addresses in Hamilton County, OH: **`link_parcel()`** +- functions for joining addresses to parcel identifiers based on an included model pretrained on electronic health record addresses in Hamilton County, OH and a list of custom pseudo-identifiers for multi-building apartment complexes: **`link_parcel()`**, **`link_apt()`** + ## Installation @@ -47,18 +60,7 @@ The development version of parcel can be installed with: pak::pak("geomarker-io/parcel") ``` -## Example Usage - -Use `get_parcel_data()` to get the corresponding parcel data for a vector of addresses: - -```{r} -library(parcel) -get_parcel_data(c("1069 Overlook Avenue Cincinnati OH 45238", - "419 Elm St. Cincinnati OH 45238", - "3544 Linwood Av Cincinnati OH 45226")) -``` - -## Python, `miniconda`, and `virtualenv` +### Python, `miniconda`, and `virtualenv` `reticulate::py_install()` assumes a non-system version of Python is already installed and will offer to install Miniconda and create an environment specifically for R and the reticulate package. 
@@ -82,34 +84,11 @@ reticulate::py_config() reticulate::py_list_packages() ``` -## CAGIS Parcels Data - -The `cagis_parcels` tabular data resource (TDR) is created using the R scripts in `/inst` and stored within the package. It can be loaded using {[`codec`](https://geomarker.io/codec)}: - -```{r} -d_parcel <- codec::read_tdr_csv(fs::path_package("parcel", "cagis_parcels")) - -head(d_parcel) - -# without codec: -# read.csv(fs::path_package("parcel", "cagis_parcels")) -``` - -```{r} -#| results: asis -options(knitr.kable.NA = '') -codec::glimpse_attr(d_parcel) |> - knitr::kable() -``` +## Identifiers for Parcels and Properties -```{r} -#| results: asis -options(knitr.kable.NA = '') -codec::glimpse_schema(d_parcel) |> - knitr::kable() -``` +A `parcel_id` refers to the Hamilton County Auditor's "Parcel Number", which is referred to as the "Property Number" within the CAGIS Open Data and uniquely identifies properties. In rare cases, multiple addresses can share the same parcel boundaries, but have unique `parcel_id`s and in these cases, their resulting centroid coordinates would also be identical. -Some of the parcel characteristics do not make sense in certain contexts and should not be interpreted incorrectly; for example, the value of a parcel for a multi-family or multi-unit housing structure shouldn't be compared to the value of a parcel for a single-family household for the purposes of assesing individual-level SES. 
Within the process of matching to a parcel, an individual address could be merged with differing types and resolutions of data: +Within the process of matching to a parcel, an individual address could be merged with differing types and resolutions of data: ```mermaid %%{init: { "fontFamily": "arial" } }%% @@ -136,38 +115,74 @@ res -- multi-family dwelling --> lu("auditor land use type \n (e.g., two family hc --> npm(not matched \nto a parcel):::tool ``` -## Hamilton County Auditor Online Data +### Non-Residential Parcels -The `hamilton_online_parcels` TDR is created by linking a saved scraping of the [auditor's website](https://wedge1.hcauditor.org/) to the parcel identifiers in the `cagis_parcels` TDR. +Known non-residential addresses will be matched and returned with a special parcel identifier denoting that the matched parcel is non-residential (e.g., Cincinnati Children's Hospital Medical Center, Jobs and Family Services, Ronald McDonald House): -Similarly, the `hamilton_online_parcel` TDR is created using the R scripts in `/inst` and stored within the package. It can be loaded using {[`codec`](https://geomarker.io/codec)}: +```{r} +get_parcel_data( + c("222 E Central Parkway Cincinnati Ohio 45220", + "3333 Burnet Ave Cincinnati Ohio 45219", + "3333 Burnet Avenue Cincinnati Ohio 45219", + "350 Erkenbrecher Ave Cincinnati Ohio 45219") +) |> + dplyr::select(input_address, parcel_id) +``` + +### Condominiums + +Because "second line" address components (e.g., "Unit 2B") are not captured, a single address can refer to multiple parcels in the case of condos or otherwise shared building ownership. 
For example, the address "323 Fifth St" has six distinct `parcel_id`s, each with different home values and land uses: + +|parcel_id | market_total_value|land_use | +|:-----------|------------------:|:---------------------------| +|14500010321 | 397500|condominium unit | +|14500010317 | 123000|condominium office building | +|14500010320 | 180000|condominium unit | +|14500010319 | 255000|condominium unit | +|14500010322 | 388230|condominium unit | +|14500010318 | 239500|condominium unit | + +In this case, a special parcel identifier `TIED_MATCHES` is returned to denote that the address matched more than one parcel: ```{r} -d_online <- codec::read_tdr_csv(fs::path_package("parcel", "hamilton_online_parcels")) +get_parcel_data("323 Fifth St W Cincinnati OH 45202")$parcel_id +``` -head(d_online) +### Large Apartment Complexes + +Large apartment complexes often use multiple mailing addresses that are not the same as the parcel address(es). In these special cases, `link_apt()` is used to match addresses exactly based on their street name if the street number falls within a certain range: + +```{r} +str(parcel:::apt_defs) +``` + +## CAGIS Parcels Data + +The `cagis_parcels` tabular data resource (TDR) is created using the R scripts in `/inst` and stored within the package. 
It can be loaded using {[`codec`](https://geomarker.io/codec)}: + +```{r} +d_parcel <- codec::read_tdr_csv(fs::path_package("parcel", "cagis_parcels")) + +head(d_parcel) # without codec: -# read.csv(fs::path_package("parcel", "hamilton_online_parcels")) +# read.csv(fs::path_package("parcel", "cagis_parcels")) ``` ```{r} #| results: asis options(knitr.kable.NA = '') -codec::glimpse_attr(d_online) |> +codec::glimpse_attr(d_parcel) |> knitr::kable() ``` ```{r} #| results: asis options(knitr.kable.NA = '') -codec::glimpse_schema(d_online) |> +codec::glimpse_schema(d_parcel) |> knitr::kable() ``` - -## Inclusion/Exclusion Criteria for Parcel Data - Auditor parcel-level data were excluded if they (1) did not contain a parcel identifier, (2) did not contain a property address number/name, or (3) had a duplicated parcel identifier. Parcels with the following land use categories are included in the data resource and others are excluded. These were selected to reflect *residential* usages of parcels. @@ -182,16 +197,35 @@ d_parcel |> knitr::kable() ``` -## Non-Residential Parcels +Some of the parcel characteristics do not make sense in certain contexts and should not be interpreted incorrectly; for example, the value of a parcel for a multi-family or multi-unit housing structure shouldn't be compared to the value of a parcel for a single-family household for the purposes of assessing individual-level SES. -Known non-residential addresses will be matched and returned with a special parcel identifer denoting that the matched parcel is non-residential; e.g., Cincinnati Children's Hospital Medical Center, Jobs and Family Services, Ronald McDonald House): +## Hamilton County Auditor Online Data + +The `hamilton_online_parcels` TDR is created by linking a saved scraping of the [auditor's website](https://wedge1.hcauditor.org/) to the parcel identifiers in the `cagis_parcels` TDR. 
+ +Similarly, the `hamilton_online_parcels` TDR is created using the R scripts in `/inst` and stored within the package. It can be loaded using {[`codec`](https://geomarker.io/codec)}: + +```{r} +d_online <- codec::read_tdr_csv(fs::path_package("parcel", "hamilton_online_parcels")) + +head(d_online) + +# without codec: +# read.csv(fs::path_package("parcel", "hamilton_online_parcels")) +``` + +```{r} +#| results: asis +options(knitr.kable.NA = '') +codec::glimpse_attr(d_online) |> + knitr::kable() +``` ```{r} -c("222 E Central Parkway Cincinnati Ohio 45220", - "3333 Burnet Ave Cincinnati Ohio 45219", - "3333 Burnet Avenue Cincinnati Ohio 45219", - "350 Erkenbrecher Ave Cincinnati Ohio 45219") |> - get_parcel_data() +#| results: asis +options(knitr.kable.NA = '') +codec::glimpse_schema(d_online) |> + knitr::kable() ``` ## Estimating the number of households per parcel @@ -223,23 +257,3 @@ Certain calculations needs to be weighted by households instead of parcel; e.g. |other residential structure |0| |boataminium |0| -## Identifiers for Parcels and Properties - -A `parcel_id` refers to the Hamilton County Auditor's "Parcel Number", which is referred to as the "Property Number" within the CAGIS Open Data and uniquely identifies properties. In rare cases, multple addresses can share the same parcel boundaries, but have unique `parcel_id`s and in these cases, their resulting centroid coordinates would also be identical. - -Because "second line" address components (e.g., "Unit 2B") are not captured, a single address can refer to multiple parcels in the case of condos or otherwise shared building ownership. 
For example, the address "323 Fifth St" has six distinct `parcel_id`s, each with different home values and land uses: - -|parcel_id | market_total_value|land_use | -|:-----------|------------------:|:---------------------------| -|14500010321 | 397500|condominium unit | -|14500010317 | 123000|condominium office building | -|14500010320 | 180000|condominium unit | -|14500010319 | 255000|condominium unit | -|14500010322 | 388230|condominium unit | -|14500010318 | 239500|condominium unit | - -In this case, a special parcel identifier `TIED_MATCH` is returned to denote that the address matched more than one parcel: - -```{r} -get_parcel_data("323 Fifth St W Cincinnati OH 45202") -``` diff --git a/README.md b/README.md index f047498..54fd748 100644 --- a/README.md +++ b/README.md @@ -12,7 +12,33 @@ The goal of parcel is to provide tools for matching real-world addresses to reference sets of addresses; e.g., “352 Helen Street”, “352 Helen St.” or “352 helen st”. This package is motivated by the included example data resources of auditor parcel tax data from Hamilton County, -Ohio. +Ohio. Use `get_parcel_data()` to get the corresponding parcel data for a +vector of addresses: + +``` r +get_parcel_data( + c("1069 Overlook Avenue Cincinnati OH 45238", + "419 Elm St. Cincinnati OH 45238", + "3333 Burnet Ave Cincinnati OH 45219", + "3830 President Drive Cincinnati Ohio 45225", + "3544 Linwood Av Cincinnati OH 45226") +) +#> # A tibble: 5 × 23 +#> input_address parcel_id score parcel_address property_addr_number +#> +#> 1 1069 Overlook Avenue Cin… 1800A800… 0.717 1069 OVERLOOK… 1069 +#> 2 419 Elm St. 
Cincinnati O… 54000410… 0.733 419 ELM ST 419 +#> 3 3333 Burnet Ave Cincinna… nonres-c… 0.733 +#> 4 3830 President Drive Cin… president NA +#> 5 3544 Linwood Av Cincinna… 01900010… 0.733 3544 LINWOOD … 3544 +#> # ℹ 18 more variables: property_addr_street , property_addr_suffix , +#> # condo_id , condo_unit , parcel_centroid_lat , +#> # parcel_centroid_lon , market_total_value , land_use , +#> # acreage , homestead , rental_registration , +#> # RED_25_FLAG , year_built , n_total_rooms , n_bedrooms , +#> # n_full_bathrooms , n_half_bathrooms , +#> # online_market_total_value +``` With this specific goal in mind, parcel includes: @@ -28,7 +54,9 @@ With this specific goal in mind, parcel includes: Online](https://wedge1.hcauditor.org/) - functions for joining addresses to parcel identifiers based on an included model pretrained on electronic health record addresses in - Hamilton County, OH: **`link_parcel()`** + Hamilton County, OH and a list of custom pseudo-identifiers for + multi-building apartment complexes: **`link_parcel()`**, + **`link_apt()`** ## Installation @@ -49,32 +77,7 @@ The development version of parcel can be installed with: pak::pak("geomarker-io/parcel") ``` -## Example Usage - -Use `get_parcel_data()` to get the corresponding parcel data for a -vector of addresses: - -``` r -library(parcel) -get_parcel_data(c("1069 Overlook Avenue Cincinnati OH 45238", - "419 Elm St. Cincinnati OH 45238", - "3544 Linwood Av Cincinnati OH 45226")) -#> # A tibble: 3 × 23 -#> input_address parcel_id score parcel_address property_addr_number -#> -#> 1 1069 Overlook Avenue Cinc… 1800A800… 0.717 1069 OVERLOOK… 1069 -#> 2 419 Elm St. 
Cincinnati OH… 54000410… 0.733 419 ELM ST 419 -#> 3 3544 Linwood Av Cincinnat… 01900010… 0.733 3544 LINWOOD … 3544 -#> # ℹ 18 more variables: property_addr_street , property_addr_suffix , -#> # condo_id , condo_unit , parcel_centroid_lat , -#> # parcel_centroid_lon , market_total_value , land_use , -#> # acreage , homestead , rental_registration , -#> # RED_25_FLAG , year_built , n_total_rooms , n_bedrooms , -#> # n_full_bathrooms , n_half_bathrooms , -#> # online_market_total_value -``` - -## Python, `miniconda`, and `virtualenv` +### Python, `miniconda`, and `virtualenv` `reticulate::py_install()` assumes a non-system version of Python is already installed and will offer to install Miniconda and create an @@ -103,6 +106,136 @@ reticulate::py_config() reticulate::py_list_packages() ``` +## Identifiers for Parcels and Properties + +A `parcel_id` refers to the Hamilton County Auditor’s “Parcel Number”, +which is referred to as the “Property Number” within the CAGIS Open Data +and uniquely identifies properties. In rare cases, multiple addresses can +share the same parcel boundaries, but have unique `parcel_id`s and in +these cases, their resulting centroid coordinates would also be +identical. + +Within the process of matching to a parcel, an individual address could +be merged with differing types and resolutions of data: + +``` mermaid +%%{init: { "fontFamily": "arial" } }%% + +flowchart LR +classDef id fill:#fff,stroke:#000,stroke-width:1px; +classDef tool fill:#e8e8e8,stroke:#000,stroke-width:1px,stroke-dasharray: 5 2; +classDef data fill:#fff,stroke:#000,stroke-width:1px; + +addr(hospitalization):::id ---> hc("likely in \nHamilton County \n (by ZIP code)"):::data +addr ---> nhc("not in Hamilton County"):::tool + +hc --> inst[institutional parcel]:::id +inst -. 
"institution 'type' linkage\n (e.g., JFS, CCHMC, RMH)" .-> sdoh("temporary housing,\n foster care,\n low income housing tax credit"):::data + +hc --> res(residential parcel):::id + +res -- CCHMC \nlinkage --> hhh("home's hospitalization history \n (i.e., pedigree)"):::data +res -- CAGIS & \nODC linkage --> hce(housing code enforcement,\n public service calls, crime):::data + +res -- single family dwelling --> vat("family-level SES measures \n (e.g., value, age, condition, tenure)"):::data +res -- multi-family dwelling --> lu("auditor land use type \n (e.g., two family dwelling, \n apartment with 20-39 units)"):::data + +hc --> npm(not matched \nto a parcel):::tool +``` + +### Non-Residential Parcels + +Known non-residential addresses will be matched and returned with a +special parcel identifier denoting that the matched parcel is +non-residential (e.g., Cincinnati Children’s Hospital Medical Center, +Jobs and Family Services, Ronald McDonald House): + +``` r +get_parcel_data( + c("222 E Central Parkway Cincinnati Ohio 45220", + "3333 Burnet Ave Cincinnati Ohio 45219", + "3333 Burnet Avenue Cincinnati Ohio 45219", + "350 Erkenbrecher Ave Cincinnati Ohio 45219") +) |> + dplyr::select(input_address, parcel_id) +#> # A tibble: 4 × 2 +#> input_address parcel_id +#> +#> 1 222 E Central Parkway Cincinnati Ohio 45220 nonres-jfs-e +#> 2 3333 Burnet Ave Cincinnati Ohio 45219 nonres-cchmc +#> 3 3333 Burnet Avenue Cincinnati Ohio 45219 nonres-cchmc +#> 4 350 Erkenbrecher Ave Cincinnati Ohio 45219 nonres-rmh-350 +``` + +### Condominiums + +Because “second line” address components (e.g., “Unit 2B”) are not +captured, a single address can refer to multiple parcels in the case of +condos or otherwise shared building ownership. 
For example, the address +“323 Fifth St” has six distinct `parcel_id`s, each with different home +values and land uses: + +| parcel_id | market_total_value | land_use | +|:------------|-------------------:|:----------------------------| +| 14500010321 | 397500 | condominium unit | +| 14500010317 | 123000 | condominium office building | +| 14500010320 | 180000 | condominium unit | +| 14500010319 | 255000 | condominium unit | +| 14500010322 | 388230 | condominium unit | +| 14500010318 | 239500 | condominium unit | + +In this case, a special parcel identifier `TIED_MATCHES` is returned to +denote that the address matched more than one parcel: + +``` r +get_parcel_data("323 Fifth St W Cincinnati OH 45202")$parcel_id +#> [1] "TIED_MATCHES" +``` + +### Large Apartment Complexes + +Large apartment complexes often use multiple mailing addresses that are +not the same as the parcel address(es). In these special cases, +`link_apt()` is used to match addresses exactly based on their street +name if the street number falls within a certain range: + +``` r +str(parcel:::apt_defs) +#> List of 8 +#> $ president :List of 3 +#> ..$ street_name: chr [1:2] "president drive" "president dr" +#> ..$ range_low : num 3000 +#> ..$ range_high : num 3999 +#> $ tower :List of 3 +#> ..$ street_name: chr [1:4] "east tower drive" "e tower drive" "east tower dr" "e tower dr" +#> ..$ range_low : num 2000 +#> ..$ range_high : num 29999 +#> $ bahama :List of 3 +#> ..$ street_name: chr [1:3] "bahama terrace" "bahama te" "bahama ter" +#> ..$ range_low : num 5000 +#> ..$ range_high : num 5999 +#> $ hawaiian :List of 3 +#> ..$ street_name: chr [1:3] "hawaiian terrace" "hawaiian te" "hawaiian ter" +#> ..$ range_low : num 4000 +#> ..$ range_high : num 5999 +#> $ dewdrop :List of 3 +#> ..$ street_name: chr "dewdrop circle" +#> ..$ range_low : num 400 +#> ..$ range_high : num 599 +#> $ winneste :List of 3 +#> ..$ street_name: chr [1:3] "winneste avenue" "winneste ave" "winneste av" +#> ..$ range_low : num 
4000 +#> ..$ range_high : num 5999 +#> $ walden_glen:List of 3 +#> ..$ street_name: chr "walden glen circle" +#> ..$ range_low : num 2000 +#> ..$ range_high : num 2999 +#> $ clovernook :List of 3 +#> ..$ street_name: chr [1:3] "clovernook avenue" "clovernook ave" "clovernook av" +#> ..$ range_low : num 7000 +#> ..$ range_high : num 7999 +``` + ## CAGIS Parcels Data The `cagis_parcels` tabular data resource (TDR) is created using the R @@ -171,38 +304,49 @@ codec::glimpse_schema(d_parcel) |> | rental_registration | Rental Registration | | boolean | | | RED_25_FLAG | | | boolean | | -Some of the parcel characteristics do not make sense in certain contexts -and should not be interpreted incorrectly; for example, the value of a -parcel for a multi-family or multi-unit housing structure shouldn’t be -compared to the value of a parcel for a single-family household for the -purposes of assesing individual-level SES. Within the process of -matching to a parcel, an individual address could be merged with -differing types and resolutions of data: - -``` mermaid -%%{init: { "fontFamily": "arial" } }%% - -flowchart LR -classDef id fill:#fff,stroke:#000,stroke-width:1px; -classDef tool fill:#e8e8e8,stroke:#000,stroke-width:1px,stroke-dasharray: 5 2; -classDef data fill:#fff,stroke:#000,stroke-width:1px; - -addr(hospitalization):::id ---> hc("likely in \nHamilton County \n (by ZIP code)"):::data -addr ---> nhc("not in Hamilton County"):::tool +Auditor parcel-level data were excluded if they (1) did not contain a +parcel identifier, (2) did not contain a property address number/name, +or (3) had a duplicated parcel identifier. -hc --> inst[institutional parcel]:::id -inst -. "institution 'type' linkage\n (e.g., JFS, CCHMC, RMH)" .-> sdoh("temporary housing,\n foster care,\n low income housing tax credit"):::data +Parcels with the following land use categories are included in the data +resource and others are excluded. 
These were selected to reflect +*residential* usages of parcels. -hc --> res(residential parcel):::id +``` r +library(dplyr, warn.conflicts = FALSE) -res -- CCHMC \nlinkage --> hhh("home's hospitalization history \n (i.e., pedigree)"):::data -res -- CAGIS & \nODC linkage --> hce(housing code enforcement,\n public service calls, crime):::data +d_parcel |> + group_by(land_use) |> + summarize(n_parcels = n()) |> + arrange(desc(n_parcels)) |> + knitr::kable() +``` -res -- single family dwelling --> vat("family-level SES measures \n (e.g., value, age, condition, tenure)"):::data -res -- multi-family dwelling --> lu("auditor land use type \n (e.g., two family dwelling, \n apartment with 20-39 units)"):::data +| land_use | n_parcels | +|:-------------------------------|----------:| +| single family dwelling | 213044 | +| condominium unit | 19754 | +| two family dwelling | 11352 | +| apartment, 4-19 units | 5659 | +| landominium | 3047 | +| three family dwelling | 1863 | +| condo or pud garage | 1063 | +| other residential structure | 875 | +| metropolitan housing authority | 744 | +| apartment, 40+ units | 625 | +| apartment, 20-39 units | 457 | +| manufactured home | 204 | +| office / apartment over | 188 | +| boataminium | 141 | +| other commercial housing | 99 | +| mobile home / trailer park | 40 | +| lihtc res | 25 | -hc --> npm(not matched \nto a parcel):::tool -``` +Some of the parcel characteristics do not make sense in certain contexts +and should not be interpreted incorrectly; for example, the value of a +parcel for a multi-family or multi-unit housing structure shouldn’t be +compared to the value of a parcel for a single-family household for the +purposes of assessing individual-level SES. ## Hamilton County Auditor Online Data @@ -265,75 +409,6 @@ codec::glimpse_schema(d_online) |> | n_half_bathrooms | | | integer | | online_market_total_value | | May differ from the market_total_value from CAGIS auditor online data. 
This value is scraped from the auditor’s website. | number | -## Inclusion/Exclusion Criteria for Parcel Data - -Auditor parcel-level data were excluded if they (1) did not contain a -parcel identifier, (2) did not contain a property address number/name, -or (3) had a duplicated parcel identifier. - -Parcels with the following land use categories are included in the data -resource and others are excluded. These were selected to reflect -*residential* usages of parcels. - -``` r -library(dplyr, warn.conflicts = FALSE) - -d_parcel |> - group_by(land_use) |> - summarize(n_parcels = n()) |> - arrange(desc(n_parcels)) |> - knitr::kable() -``` - -| land_use | n_parcels | -|:-------------------------------|----------:| -| single family dwelling | 213044 | -| condominium unit | 19754 | -| two family dwelling | 11352 | -| apartment, 4-19 units | 5659 | -| landominium | 3047 | -| three family dwelling | 1863 | -| condo or pud garage | 1063 | -| other residential structure | 875 | -| metropolitan housing authority | 744 | -| apartment, 40+ units | 625 | -| apartment, 20-39 units | 457 | -| manufactured home | 204 | -| office / apartment over | 188 | -| boataminium | 141 | -| other commercial housing | 99 | -| mobile home / trailer park | 40 | -| lihtc res | 25 | - -## Non-Residential Parcels - -Known non-residential addresses will be matched and returned with a -special parcel identifer denoting that the matched parcel is -non-residential; e.g., Cincinnati Children’s Hospital Medical Center, -Jobs and Family Services, Ronald McDonald House): - -``` r -c("222 E Central Parkway Cincinnati Ohio 45220", - "3333 Burnet Ave Cincinnati Ohio 45219", - "3333 Burnet Avenue Cincinnati Ohio 45219", - "350 Erkenbrecher Ave Cincinnati Ohio 45219") |> - get_parcel_data() -#> # A tibble: 4 × 23 -#> input_address parcel_id score parcel_address property_addr_number -#> -#> 1 222 E Central Parkway Cin… nonres-j… 0.620 -#> 2 3333 Burnet Ave Cincinnat… nonres-c… 0.733 -#> 3 3333 Burnet Avenue 
Cincin… nonres-c… 0.720 -#> 4 350 Erkenbrecher Ave Cinc… nonres-r… 0.733 -#> # ℹ 18 more variables: property_addr_street , property_addr_suffix , -#> # condo_id , condo_unit , parcel_centroid_lat , -#> # parcel_centroid_lon , market_total_value , land_use , -#> # acreage , homestead , rental_registration , -#> # RED_25_FLAG , year_built , n_total_rooms , n_bedrooms , -#> # n_full_bathrooms , n_half_bathrooms , -#> # online_market_total_value -``` - ## Estimating the number of households per parcel Certain calculations needs to be weighted by households instead of @@ -365,45 +440,3 @@ households per parcel for each `land_use` code: | condominium office building | 0 | | other residential structure | 0 | | boataminium | 0 | - -## Identifiers for Parcels and Properties - -A `parcel_id` refers to the Hamilton County Auditor’s “Parcel Number”, -which is referred to as the “Property Number” within the CAGIS Open Data -and uniquely identifies properties. In rare cases, multple addresses can -share the same parcel boundaries, but have unique `parcel_id`s and in -these cases, their resulting centroid coordinates would also be -identical. - -Because “second line” address components (e.g., “Unit 2B”) are not -captured, a single address can refer to multiple parcels in the case of -condos or otherwise shared building ownership. 
For example, the address -“323 Fifth St” has six distinct `parcel_id`s, each with different home -values and land uses: - -| parcel_id | market_total_value | land_use | -|:------------|-------------------:|:----------------------------| -| 14500010321 | 397500 | condominium unit | -| 14500010317 | 123000 | condominium office building | -| 14500010320 | 180000 | condominium unit | -| 14500010319 | 255000 | condominium unit | -| 14500010322 | 388230 | condominium unit | -| 14500010318 | 239500 | condominium unit | - -In this case, a special parcel identifier `TIED_MATCH` is returned to -denote that the address matched more than one parcel: - -``` r -get_parcel_data("323 Fifth St W Cincinnati OH 45202") -#> # A tibble: 1 × 23 -#> input_address parcel_id score parcel_address property_addr_number -#> -#> 1 323 Fifth St W Cincinnati… TIED_MAT… 0.733 -#> # ℹ 18 more variables: property_addr_street , property_addr_suffix , -#> # condo_id , condo_unit , parcel_centroid_lat , -#> # parcel_centroid_lon , market_total_value , land_use , -#> # acreage , homestead , rental_registration , -#> # RED_25_FLAG , year_built , n_total_rooms , n_bedrooms , -#> # n_full_bathrooms , n_half_bathrooms , -#> # online_market_total_value -``` diff --git a/_pkgdown.yml b/_pkgdown.yml deleted file mode 100644 index ef7c6e0..0000000 --- a/_pkgdown.yml +++ /dev/null @@ -1,21 +0,0 @@ -url: http://geomarker.io/parcel/ - # bslib: - # bg: "#FFFFFF" - # fg: "#396175" - # primary: "#C28273" - # info: "#EACEC5" - # warning: "#E49865" - # grid-gutter-width: "0.0rem" - # border-radius: 0.5rem - # btn-border-radius: 0.25rem -reference: -- title: Address manipulation - contents: - - create_address_stub - - clean_address - - tag_address -- title: Parcel identification - contents: - - get_parcel_data - - link_parcel - - link_apt