Home

Design

This page is work-in-progress design planning. OzData was created at the Brisbane OzUnconf in April 2017. At the Melbourne OzUnconf we are extending and refactoring it.

Design goals

Problems to solve:

finding data across a wider range of sources than just data.gov.au
reducing the time between "this dataset looks useful" and actually querying it inside R
abstracting away the details of individual datasets and services so that people can get on with using the data

Terminology

data catalogue: a place where datasets and data services are listed, like data.gov.au
data service: an online API through which data can be queried
dataset: one or more files that have to be downloaded to be used
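
The three terms above can be read as a small data model. The sketch below is only an illustration of that reading, written in Python for compactness even though the packages themselves target R. All class names, field names and URLs are hypothetical and do not come from the OzData code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Dataset:
    """One or more files that have to be downloaded to be used."""
    title: str
    file_urls: Tuple[str, ...]


@dataclass(frozen=True)
class DataService:
    """An online API through which data can be queried."""
    title: str
    api_url: str


@dataclass
class DataCatalogue:
    """A place where datasets and data services are listed."""
    name: str
    entries: List[Union[Dataset, DataService]] = field(default_factory=list)

    def datasets(self) -> List[Dataset]:
        # Entries that must be downloaded before use.
        return [e for e in self.entries if isinstance(e, Dataset)]

    def services(self) -> List[DataService]:
        # Entries that are queried through an API.
        return [e for e in self.entries if isinstance(e, DataService)]


# Hypothetical entries; the URLs are placeholders, not real endpoints.
catalogue = DataCatalogue(
    name="example catalogue",
    entries=[
        Dataset("Rainfall observations", ("https://example.org/rainfall.zip",)),
        DataService("Statistics API", "https://example.org/api"),
    ],
)
print([d.title for d in catalogue.datasets()])
print([s.title for s in catalogue.services()])
```

Keeping datasets and data services as separate types mirrors the distinction drawn in the terminology: one is downloaded, the other is queried.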

We envision a set of packages that streamline the process of finding, accessing and using open datasets from government agencies and research organisations in Australia.

Functions

Search: Find datasets matching some query
Curated lists: Curated lists of datasets on some topic (e.g. "weather")
Generic datasets: (Things that work on thousands of different datasets)
Get tabular dataset: Download, unzip and lightly process some generic dataset that is ultimately a table
Known datasets: (Things that are individually implemented on a handful of particular datasets)
Get dataset: Download and process a dataset, manipulating it to be as useful as possible, possibly applying a known standard.
Known data services: Wrappers around web APIs (such as ABS.Stat)
Query
Preview: Download, process and do whatever is required to generate a visualisation.
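
As a rough illustration of the "Get tabular dataset" function listed above, the following sketch unzips an archive and lightly processes the first CSV it finds. It is written in Python only as a language-neutral illustration; the packages themselves target R. The function names, the choice of CSV as the tabular format and the specific clean-up steps (normalising header names, trimming whitespace, skipping blank lines) are assumptions, and the download step is left out so that the example runs offline on an archive built in memory.

```python
import csv
import io
import zipfile
from typing import Dict, List


def read_tabular_from_zip(zip_bytes: bytes) -> List[Dict[str, str]]:
    """Unzip an in-memory archive and return its first CSV file as rows."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("archive contains no CSV file")
        # utf-8-sig drops a byte-order mark if one is present.
        text = archive.read(csv_names[0]).decode("utf-8-sig")

    reader = csv.reader(io.StringIO(text))
    # Light processing: tidy header names into lower_snake_case.
    header = [name.strip().lower().replace(" ", "_") for name in next(reader)]
    rows = []
    for record in reader:
        cells = [cell.strip() for cell in record]
        if any(cells):  # skip blank lines
            rows.append(dict(zip(header, cells)))
    return rows


def make_example_zip() -> bytes:
    """Build a small hypothetical archive so the sketch needs no network."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("readme.txt", "not a table")
        archive.writestr(
            "stations.csv",
            "Station Name, State\nBrisbane , QLD\n\nMelbourne, VIC\n",
        )
    return buffer.getvalue()


rows = read_tabular_from_zip(make_example_zip())
print(rows)
```

A generic function like this can only make shallow assumptions about a file, which is why the list above separates it from the individually implemented "Known datasets" functions.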

Provide feedback

Saved searches
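
One possible reading of "Search" and "Saved searches" is sketched below: a keyword match over catalogue records, with a saved search being nothing more than a named query that can be re-run later. This is an assumption about the intended behaviour, not a description of OzData. The record fields, the example records and the matching rule (every query word must appear in the title or description) are all hypothetical, and Python is used only as a compact illustration.

```python
from typing import Dict, List


def search(records: List[Dict[str, str]], query: str) -> List[Dict[str, str]]:
    """Return records whose title or description contains every query word."""
    words = query.lower().split()
    matches = []
    for record in records:
        haystack = (
            record.get("title", "") + " " + record.get("description", "")
        ).lower()
        if all(word in haystack for word in words):
            matches.append(record)
    return matches


# Hypothetical records drawn from more than one catalogue.
records = [
    {"title": "Daily rainfall", "description": "Weather station observations",
     "source": "catalogue A"},
    {"title": "Road crashes", "description": "Crash locations by year",
     "source": "catalogue B"},
    {"title": "Weather warnings", "description": "Current warnings feed",
     "source": "catalogue B"},
]

# A saved search, in this sketch, is a stored query keyed by a name.
saved_searches = {"weather": "weather"}
results = search(records, saved_searches["weather"])
print([r["title"] for r in results])
```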