System Support for Managing Graph Data
Wednesday November 6, 2013
3:00 pm - Sennott Square - Room 6516
Refreshments at 2:45 pm in room 6323 (Faculty Lounge)
Hosted by Panos Chrysanthis
AbstractMany applications employ large graphs to model their data. Graph data management is hard: Developers want to use a declarative language to express both reachability queries (e.g., what is the shortest path between two nodes) and pattern matching queries (e.g., find all matches of a small graph in a larger graph). Today, we do not have a graph engine that can process both types of queries on a general graph. From the system point of view, the graph query processor visits a large number of nodes and edges in a random access pattern with little locality while executing queries. We therefore manage the graph in main memory to provide low-latency access, and if the graph is too large we partition it to fit in the main memory of a cluster of machines, forming a distributed system.
In this talk I will discuss three techniques which address these challenges. First, I will present a system that processes declarative reachability queries on a distributed graph. Second, I will extend the reachability algorithm to handle pattern matching queries, forming a unified framework for processing both types of queries. Finally, I will show how to support updates to the graph efficiently while providing consistent distributed snapshots to queries in a lock-free manner.