Graph databases are an emerging niche in the database landscape and a potential alternative to other database options. They should be on every CIO’s technology radar.
Graph databases are a horizontal or general-purpose technology much like relational databases, and are well suited for storing the relationships between users, their behaviors and products.
While the data model is different (it lends itself to simpler models and much better performance), the general idea of building a high-fidelity model and asking it questions remains.
“That said, if you’ve ever tried to shoehorn data into a relational database and found it full of null entries and convoluted schemas, then you were probably trying to wrangle a graph,” explained Jim Webber, chief scientist at Neo4j. “In a graph database that same model would likely have been quite pleasantly tame.”
So, what can you do with a graph database? Webber explained that as a general-purpose database, graphs can be used for lots of things.
“In truth, every CIO has many problems that can be sensibly managed as graphs in their businesses,” he said. “Choosing a graph for those business problems where connections between data is as important as the data itself should be a top priority in these cases.”
He added that, fortunately for CIOs, graphs are now quite mainstream, and implementing a graph solution is very straightforward in terms of tools, skills, staffing and operations.
Webber said the competitive advantage a graph offers is the most compelling point, for two reasons.
First, graphs cut through complexity. The model is flexible and humane: what you’d draw on a whiteboard is what you’d store in the database, so business owners and systems owners speak the same language.
“There’s no opaque denormalization or wizardry, just a simple model of nodes connected by relationships that build a high-fidelity model of your problem,” he said. “Things that are hard with RDBMS or NOSQL are not hard with graphs, and that leads to sustained competitive advantage.”
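To make the "nodes connected by relationships" idea concrete, here is a minimal sketch in plain Python. It is not how any particular graph database stores data; the `Node` and `Relationship` classes, the `describe` helper and the sample values are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A thing in the model, e.g. a user or a product (illustrative only)."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed connection between two nodes (illustrative only)."""
    start: Node
    kind: str
    end: Node
    properties: dict = field(default_factory=dict)


def describe(rel: Relationship) -> str:
    """Render a relationship the way it might be drawn on a whiteboard."""
    return f"({rel.start.properties['name']})-[:{rel.kind}]->({rel.end.properties['name']})"


# Hypothetical sample data: one user who bought one product.
alice = Node("User", {"name": "Alice"})
laptop = Node("Product", {"name": "Laptop"})
bought = Relationship(alice, "BOUGHT", laptop, {"quantity": 1})

print(describe(bought))
```

The point of the sketch is that the stored structure mirrors the whiteboard drawing: two boxes and a labeled arrow, with no join table in between.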
The second advantage, Webber pointed out, is that graphs are fast: they are based on graph theory rather than set theory (as relational databases are), and so work sympathetically with modern computers.
“When querying a graph, you don’t build huge intermediate sets and then filter, you simply follow the links,” he said.
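The "follow the links" idea can be sketched as a breadth-first walk over an in-memory adjacency list. This is an illustration only, not a description of how any graph database engine is implemented; the `FOLLOWS` data and the `reachable_within` function are hypothetical.

```python
from collections import deque

# Hypothetical sample data: each key follows the people in its list.
FOLLOWS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": [],
    "erin": ["alice"],
}


def reachable_within(graph, start, max_hops):
    """Return {node: hops} for every node reachable from start in at most max_hops.

    The walk only touches nodes that are actually connected to the start
    node; it never scans the whole data set to build an intermediate result.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]  # the start node is not part of the answer
    return hops


print(reachable_within(FOLLOWS, "alice", 2))
```

In this sketch the work done depends on how many links are followed from the start node, not on the total number of entries in `FOLLOWS`.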
Asked which skill sets organizations need to build to make use of graph database technology, and which types of IT professionals CIOs should look to hire, Webber explained that the graph world is split between graph data folks and graph data science folks, who meet in the middle where the data lives.
“There is also a strong recent trend for graph data science,” he said. “These are folks who are data scientists and understand that the relationships between records are just as valuable as the records themselves.”
Webber also pointed out that there are incredibly powerful techniques for extracting insight from graphs via unsupervised algorithms. For example, you can ask a graph to find neighborhoods, busy paths, popular nodes, similar nodes and so on, with minimal effort from the data scientist.
“The data scientist can then take those features and mix them into their machine learning model training and get better predictive outcomes,” he said.
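As a rough sketch of what such graph-derived features might look like, the plain-Python example below computes two simple ones: degree (a basic stand-in for "popular nodes") and Jaccard similarity of neighbor sets (a basic stand-in for "similar nodes"). These measures are chosen here for illustration and are not named in the interview; the `EDGES` data and the function names are hypothetical, and a production system would more likely use a dedicated graph library.

```python
# Hypothetical sample data: undirected "user bought product" edges.
EDGES = [
    ("alice", "laptop"),
    ("alice", "phone"),
    ("bob", "laptop"),
    ("bob", "phone"),
    ("carol", "phone"),
]


def build_adjacency(edges):
    """Map every node to the set of nodes it is directly connected to."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def degree(adjacency):
    """Number of direct connections per node; higher means more 'popular'."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def jaccard(adjacency, a, b):
    """Share of neighbors two nodes have in common (0.0 to 1.0)."""
    union = adjacency[a] | adjacency[b]
    if not union:
        return 0.0
    return len(adjacency[a] & adjacency[b]) / len(union)


adjacency = build_adjacency(EDGES)
degrees = degree(adjacency)

# One feature row per user, ready to be joined onto other training columns.
features = {
    user: [degrees[user], jaccard(adjacency, user, "alice")]
    for user in ("alice", "bob", "carol")
}
print(features)
```

Each row in `features` could then be appended to a model's existing input columns, which is the "mix them into their machine learning model training" step described in the quote.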
He added that while there is structure in a graph (it is a data structure, after all), adding to, changing and updating that graph is quite friendly.
“Sure, you can choose to lock down parts of the graph for governance reasons, but you can also let the graph grow organically otherwise, so that adapting to change is just a normal part of operations,” Webber said. “This combination of performance, scale, and simple ops makes graph systems successful now and in the future.”