For years, data has been stored in relational database management systems (RDBMS), in tables that are linked to each other through keys. With the advent of big data technologies, RDBMS have often proven too slow to process the volume of data, and NoSQL databases have taken over. While NoSQL DBMS are capable of processing large amounts of unstructured data, they lack the semantics of the data, the reference back to master data in enterprise RDBMS, and additional metadata. That is where graphs come into play.
Graphs are a more natural representation of data. They physically connect highly heterogeneous data points across an organization. Graphs support a variety of use cases:
- Recommendation engines
Companies like eBay use graphs to calculate product recommendations. They map users, their purchases, and other properties in a graph and infer new relationships from existing ones. For instance, users who bought or looked at this product also bought or looked at that product.
- Fraud detection
Graphs help to quickly visualize suspicious relationships in payment transactions. In addition, they help to visualize the relationships between the parties involved, such as complex company structures and transactions (see the Panama Papers).
- Master data management
Multi-domain master data management is taken to new heights by linking the right suppliers to the products and the products to the customers. The graph also makes it easier to run auto-classification and to cluster similar data accordingly.
- Knowledge graph
Due to their high elasticity, graphs can hold much more metadata, such as provenance and lineage, which makes it possible to create knowledge graphs across the enterprise. The knowledge graph can evolve to become a digital mirror of an organization.
- Advanced analytics
By linking heterogeneous data, a graph is a rich source for analysis and also makes it easy to apply machine learning (ML) on top. Most graph databases have native support for advanced analytics.
Graphs are paving the way for AI. The semi-structured nature of graphs is well suited to advanced data analytics, either with built-in AI functionality or through interfaces to frameworks such as TensorFlow. Property graphs in particular provide the right level of detail to run ML models on them. For instance, Neo4j offers out-of-the-box graph algorithms for community detection, similarity, and centrality. These can make other frameworks unnecessary and speed up the process significantly.
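To make the recommendation use case above more concrete, here is a minimal sketch in plain Python of the "users who bought this also bought that" inference. It is not how eBay or any particular graph database implements it; the purchase data and the names `PURCHASES`, `build_graph`, and `also_bought` are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical (user, product) purchase edges, invented for illustration.
PURCHASES = [
    ("alice", "laptop"),
    ("alice", "mouse"),
    ("bob", "laptop"),
    ("bob", "mouse"),
    ("bob", "keyboard"),
    ("carol", "laptop"),
    ("carol", "keyboard"),
    ("dave", "monitor"),
]


def build_graph(edges):
    """Index the purchase edges in both directions."""
    bought = defaultdict(set)     # user -> products
    bought_by = defaultdict(set)  # product -> users
    for user, product in edges:
        bought[user].add(product)
        bought_by[product].add(user)
    return bought, bought_by


def also_bought(product, bought, bought_by, top_n=3):
    """Walk product -> buyers -> their other products and count co-purchases."""
    counts = Counter()
    for user in bought_by.get(product, ()):
        for other in bought[user]:
            if other != product:
                counts[other] += 1
    # Sort by count (descending), then by name, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


BOUGHT, BOUGHT_BY = build_graph(PURCHASES)

if __name__ == "__main__":
    print(also_bought("laptop", BOUGHT, BOUGHT_BY))
```

The two-hop walk (product to buyers, buyers to their other products) is the kind of traversal a graph database runs natively; here it is emulated with two dictionaries.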
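As a rough illustration of what centrality and similarity algorithms compute, the following sketch implements degree centrality and Jaccard node similarity in plain Python. It does not use or reproduce Neo4j's API; the edge list and the function names are assumptions made for this example, and a graph database would run such algorithms at a much larger scale.

```python
from collections import defaultdict

# Hypothetical undirected edges between entities, invented for illustration.
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]


def adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = defaultdict(set)
    for left, right in edges:
        adj[left].add(right)
        adj[right].add(left)
    return dict(adj)


def degree_centrality(adj):
    """Share of the other nodes that each node is directly connected to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}


def jaccard_similarity(adj, first, second):
    """Overlap of two nodes' neighbourhoods: |intersection| / |union|."""
    union = adj[first] | adj[second]
    if not union:
        return 0.0
    return len(adj[first] & adj[second]) / len(union)


ADJ = adjacency(EDGES)

if __name__ == "__main__":
    print(degree_centrality(ADJ))
    print(jaccard_similarity(ADJ, "b", "c"))
```

In this toy graph, node `a` is connected to every other node and therefore gets the highest degree centrality.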
A knowledge graph is a good starting point for entering the topic. The following three steps will help you to get started:
- Set up governance for your graph data initiative. A knowledge graph in a company usually spans several departments or organizational units and requires buy-in from a lot of stakeholders.
- Make an inventory of the data sources you want to include. Based on the data sources, build a data model that suits your needs. A graph is very elastic, i.e. future changes to the model can be implemented at any time, but the consumers of the data have to be planned for carefully.
- Start building your knowledge graph with all integration sources and onboard the first users.
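The modeling step above can be sketched as a tiny in-memory property graph in which every node and relationship records the data source it came from, as a simple form of provenance. This is only an illustration under assumed names: `KnowledgeGraph`, the labels, the relationship type, and the source system names ("erp", "pim", "crm") are all hypothetical, and a real initiative would use a graph database rather than Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    start: str
    rel_type: str
    end: str
    properties: dict = field(default_factory=dict)


class KnowledgeGraph:
    """Minimal property graph that keeps the source system of each element."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, source, **props):
        self.nodes[node_id] = Node(node_id, label, {"source": source, **props})

    def add_edge(self, start, rel_type, end, source, **props):
        # Refuse relationships that point at nodes that were never loaded.
        for node_id in (start, end):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.edges.append(Edge(start, rel_type, end, {"source": source, **props}))

    def neighbors(self, node_id, rel_type=None):
        """Follow outgoing relationships, optionally filtered by type."""
        return [
            edge.end
            for edge in self.edges
            if edge.start == node_id and (rel_type is None or edge.rel_type == rel_type)
        ]


# Hypothetical data from three assumed source systems.
kg = KnowledgeGraph()
kg.add_node("s1", "Supplier", source="erp", name="Supplier One")
kg.add_node("p1", "Product", source="pim", name="Product One")
kg.add_node("c1", "Customer", source="crm", name="Customer One")
kg.add_edge("s1", "SUPPLIES", "p1", source="erp")
kg.add_edge("c1", "BOUGHT", "p1", source="crm")

if __name__ == "__main__":
    print(kg.neighbors("s1", "SUPPLIES"))
```

Because properties are plain key-value pairs, new attributes or relationship types can be added later without migrating existing data, which is the elasticity referred to above.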
Thank you for reading this short article. At Acumacon, we help our customers to make their data more intelligent. Talk to us. Your enterprise is a graph too.