Horizontal Scaling of Graph Databases
Presented by: Ryan Hendrickson
A very real problem with graph databases is choosing which data to prioritize for storage. Most graph databases, including Neo4j, the number one graph database according to db-engines.com, do not scale horizontally. This leaves the size of the database constrained by the disk space and RAM allocated to the single server that hosts it. Because of these constraints, graph databases in practice store only a subset of the data: the data that is most valuable to connect as nodes, edges, and properties. The rest is stored in another database solution, such as MongoDB or Elasticsearch. In a world of big data, cheap hard drives, and near-infinite self-provisioned cloud resources, a single-server graph database solution is unacceptable.
This talk will explore the current state of horizontally scaling graph databases. What are the limitations? What kind of performance can be expected? Do horizontally scaling graph databases sacrifice features such as consistent replication, high availability, and fault tolerance?
This talk will show Dgraph in use on economy-sized cloud infrastructure, scaling horizontally with big data and acting as the primary database, removing the need to prioritize which fields to store and to run two databases for full data storage.