vlg

Dgraph's Very Large Graph (WIP) Project

Introduction

This work-in-progress project aims to create a public, large graph to showcase Dgraph's power and ease of use.

Why

Unlike our competitors, Dgraph does not maintain a large, publicly available graph. Further, the graph implementations currently used in benchmarking and "the tour" do not exercise Dgraph's newer features and capabilities, and they omit some of the difficult design problems (authentication, rate limiting, etc.) that developers face when building production applications with Dgraph.

What

The Dgraph VLG will be a publicly available graph that presents the Offshore Data Leaks dataset from the International Consortium of Investigative Journalists (ICIJ), which comprises over 2 million nodes and 3 million relationships encompassing the Panama Papers, Pandora Papers, and others.

How

We'll build the graph using practices common in production graphs:

Analyze the data and design the schema, import, and update strategies
Create the GraphQL SDL for the schema and document the reasons for the decisions made
Use the bulk loader for the initial load
Use a batch updater (maybe in several languages?) to load new data into the graph
Deploy the graph to HA production (Dgraph Cloud, or maybe a Kubernetes cluster instead, to illustrate how that's best accomplished). If possible, provide a simple deployment mechanism (scripts, Terraform, etc.) so that a potential Dgraph user can easily deploy the graph to their own production environment.
Monitor the health of the graph using the best available tools
Create a Docker container (separate repo) with a subset of the data that developers can use to get started with Dgraph quickly (one that's more complex than the 'million-movie' dataset)
Create a simple UI that allows for search and visualization of the graph (Linkurious, Graphistry?)

This repo will contain a folder for each step, and each folder will include detailed documentation describing the 'whys' and 'hows' behind that step. This README will be replaced by an Introduction page.

Goals

Publicize the existence of the VLG to showcase Dgraph's performance
Use the repo to instruct developers on best practices in schema, ETL and query design, as well as best deployment practices
Afford new Dgraph employees a chance to use Dgraph in the 'real world'
Serve as a test fixture for upcoming releases of Dgraph

Contributing

All Dgraph staff are invited to work on this project. If you've always wanted to learn a particular aspect of Dgraph, here's your chance. It also might be a good opportunity to "pair-program" with another member of staff to work together on an issue. When proposing new content for this repo, please create a branch and submit a pull request.

Future (possible) work

Extend the graph to allow user accounts. Extend the sample UI to allow accounts to save searches and make personal notes associated with graph entities.
Bring in entity data from other sources, such as from Google's GDELT project and the OCCRP.

