Beer Production

The data this week comes from the Alcohol and Tobacco Tax and Trade Bureau (TTB). H/t to Bart Watson for sharing the source of the data.

There's a treasure trove of data here:

  • State-level beer production by year (2008-2019)
  • Number of brewers by production size by year (2008-2019)
  • Monthly beer stats aggregated across the US (2008-2019)

Some considerations:

  • A barrel of beer for this data is 31 gallons
  • Most data is in barrels removed/taxed or produced
  • Removals = "Total barrels removed subject to tax by the breweries comprising the named strata of data", essentially how much was produced and removed for consumption.
  • A LOT of data came from PDFs. I included all the code I used to grab the data and tidy it up; take a peek and try out your own mechanism for getting the tables out.
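
Since every volume in this data is reported in barrels, a small conversion helper can be handy when comparing against figures quoted in gallons. The sketch below only applies the 31-gallon definition given above; `barrels_to_gallons()` is a hypothetical helper name and the input values are made up for illustration.

```r
# One barrel of beer is 31 gallons in this data (see the considerations above).
GALLONS_PER_BARREL <- 31

# Hypothetical helper: convert a numeric vector of barrels to gallons.
barrels_to_gallons <- function(barrels) {
  barrels * GALLONS_PER_BARREL
}

# Made-up barrel counts, for illustration only (not values from the data).
example_barrels <- c(1, 100, 2500.5)
barrels_to_gallons(example_barrels)
```

The same function could be applied to a column such as `beer_states$barrels` once the data is loaded.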

Massive shoutout to pdftools by rOpenSci and stringr for doing a lot of the heavy lifting with the data cleaning and prep here.
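
To give a flavour of what tidying PDF text involves, here is a minimal sketch of splitting one whitespace-aligned table line into a label and numeric values. It is not the cleaning script used for this dataset: the input line is made up, and it uses base R so it runs without extra packages (stringr has equivalents, for example `str_squish()` for the whitespace step).

```r
# A made-up line in the style of text extracted from a PDF table
# (illustration only, not an actual line from the TTB reports).
raw_line <- "  Malt and malt products      1,234,567     1,200,000  "

# Trim the ends and collapse runs of whitespace to single spaces.
squished <- gsub("\\s+", " ", trimws(raw_line))

# The label is everything before the first digit.
label <- trimws(sub("[0-9].*$", "", squished))

# Pull out each comma-grouped number, then drop the commas and convert.
number_text <- regmatches(squished, gregexpr("[0-9][0-9,]*", squished))[[1]]
values <- as.numeric(gsub(",", "", number_text))

label
values
```

Real PDF tables tend to have more edge cases (wrapped labels, footnote markers, blank cells), so treat this as a starting point for your own extraction approach.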

Get the data here

# Get the Data

brewing_materials <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/brewing_materials.csv')
beer_taxed <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/beer_taxed.csv')
brewer_size <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/brewer_size.csv')
beer_states <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-31/beer_states.csv')

# Or read in with tidytuesdayR package (https://github.com/dslc-io/tidytuesdayR)
# PLEASE NOTE TO USE 2020 DATA YOU NEED TO USE tidytuesdayR version ? from GitHub

# Install via pak::pak("dslc-io/tidytuesdayR")

# Either ISO-8601 date or year/week works! Use one of the two calls below.

tuesdata <- tidytuesdayR::tt_load('2020-03-31')
tuesdata <- tidytuesdayR::tt_load(2020, week = 14)


brewing_materials <- tuesdata$brewing_materials

Data Dictionary

brewing_materials.csv

|variable |class |description |
|:---|:---|:---|
|data_type |character |Pounds of Material - this is a sub-table from beer_taxed |
|material_type |character |Grain product, Totals, Non-Grain Product (basically hops vs grains) |
|year |double |Year |
|month |integer |Month |
|type |character |Actual line-item from material type |
|month_current |double |Current amount for this year/month (pounds of material, per data_type) |
|month_prior_year |double |Prior year amount for same month (pounds of material, per data_type) |
|ytd_current |double |Cumulative year to date of current year |
|ytd_prior_year |double |Cumulative year to date for prior year |
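
The relationship between the monthly and year-to-date columns can be sketched as a running total within each year. This is an assumption about how the columns relate, not something confirmed against the published figures, and the values below are made up; the same idea would apply to the identically named columns in beer_taxed.csv.

```r
# Made-up monthly amounts for one material type in one year (illustration only).
toy <- data.frame(
  year = 2019,
  month = 1:4,
  month_current = c(10, 12, 9, 15)
)

# Running total within each year: the assumed relationship between
# month_current and ytd_current.
toy$ytd_check <- ave(toy$month_current, toy$year, FUN = cumsum)

toy
```

On the real data, comparing a column built this way (grouped by `year` and `type`) against `ytd_current` is a quick sanity check after loading.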

beer_states.csv

|variable |class |description |
|:---|:---|:---|
|state |character |State abbreviated |
|year |integer |Year |
|barrels |double |Barrels produced within each type |
|type |character |Type of production/use (On premise, Bottles/Cans, Kegs/Barrels) |

beer_taxed.csv

|variable |class |description |
|:---|:---|:---|
|data_type |character |Barrels Produced |
|tax_status |character |The Tax Status, factor with Totals, Taxable, Sub Total Taxable, Tax Free, Sub Total Tax-Free |
|year |double |Year |
|month |integer |Month |
|type |character |Type of production, either Total Production (Production) or specific sub-category and sub-totals |
|month_current |double |Current number of barrels for this year/month |
|month_prior_year |double |Prior year number of barrels for same month |
|ytd_current |double |Cumulative year to date of current year |
|ytd_prior_year |double |Cumulative year to date for prior year |

brewer_size.csv

|variable |class |description |
|:---|:---|:---|
|year |integer |Year |
|brewer_size |character |Range of production for brewer size, number of barrels produced |
|n_of_brewers |double |Number of brewers at that brewer size |
|total_barrels |double |Total barrels of beer produced at that brewer size |
|taxable_removals |double |Taxable barrels for removals - removals for consumption under taxation |
|total_shipped |double |Total barrels shipped - produced beer that is not taxed |

Cleaning Script

Please see download script via download_beer.R

Please see cleaning scripts via scrape_beers.R