
Project: Implementing the Graph Pattern Matching Language (GPML) Fragment for GQL on AvantGraph

Description

Graph databases have emerged as a strong alternative to traditional relational databases, especially in domains where the data is defined by complex relationships and interconnections, such as social networks and knowledge graphs. This has led to the development of various query languages for graph databases, each designed to leverage the structure of graph data. Among the most prominent are Cypher, used by Neo4j; Gremlin, part of the Apache TinkerPop framework; and SPARQL, which is widely used for querying RDF data in semantic web contexts.

Given this diverse landscape, the Graph Query Language (GQL) has been developed as a new ISO standard [5] to unify and standardize the capabilities of existing languages. One of the core components of GQL is the Graph Pattern Matching Language (GPML), which enables users to express complex graph patterns in queries [3]. GPML provides advanced pattern matching capabilities that are useful in applications such as fraud detection, social network analysis, and recommender systems. It supports path patterns with repetition and with restrictions on the matched paths, such as trails (no repeated edges) and acyclic paths (no repeated nodes). GPML also allows conditions on nodes and edges, enabling complex filtering and the retrieval of subgraphs that meet specific requirements [1, 2].

The aim of this project is to implement the GPML fragment of GQL on AvantGraph [4], a state-of-the-art graph database designed and developed in the database group of the DAI cluster. The focus is on developing a functional and efficient query processor capable of handling a variety of graph pattern matching scenarios.
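To make the path restrictions mentioned above concrete, the following is a minimal Python sketch of the three path modes as described in [3]: TRAIL, ACYCLIC, and SIMPLE. It is an illustration only and is not AvantGraph code; the function names and the representation of a path as sequences of node and edge identifiers are assumptions made for this example.

```python
from typing import Sequence


def is_trail(edges: Sequence[str]) -> bool:
    """TRAIL: no edge is traversed more than once."""
    return len(set(edges)) == len(edges)


def is_acyclic(nodes: Sequence[str]) -> bool:
    """ACYCLIC: no node is visited more than once."""
    return len(set(nodes)) == len(nodes)


def is_simple(nodes: Sequence[str]) -> bool:
    """SIMPLE: no node repeats, except that the first and last node may coincide."""
    if len(nodes) > 1 and nodes[0] == nodes[-1]:
        # A closed path is simple if everything before the final node is distinct.
        return is_acyclic(nodes[:-1])
    return is_acyclic(nodes)


# Hypothetical path a -> b -> c -> a over three distinct edges:
# it is a trail and a simple path, but not acyclic.
cycle_nodes = ["a", "b", "c", "a"]
cycle_edges = ["e1", "e2", "e3"]
```

A query processor would typically enforce these modes during path expansion rather than by filtering complete paths afterwards; the predicates above only state what each mode requires.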


Objectives:

  1. Design: Define the architecture of the GPML query processor based on the current structure of AvantGraph, including components for parsing, optimization, and execution, and ensure support for filtering conditions and for path modes such as simple paths, trails, and acyclic paths.
  2. Implementation: Develop the GPML fragment on AvantGraph according to the architectural design, either by developing a parser, query optimizer, and execution engine [6] or by modifying existing components.
  3. Testing and Validation: Test the GPML implementation against the graph queries currently supported on AvantGraph. Validate the correctness, efficiency, and scalability of the implementation.
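One possible approach to validating correctness is to compare the query processor's results against a naive reference enumerator on small graphs. The sketch below is an assumption about how such an oracle could look, not part of AvantGraph: the `Graph` type, the edge identifiers, and the function `enumerate_trails` are hypothetical names introduced for illustration. It enumerates every trail (a path with no repeated edge) from a start node up to a length bound.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy representation: edge id -> (source node, target node).
Graph = Dict[str, Tuple[str, str]]


def enumerate_trails(graph: Graph, start: str, max_len: int) -> Iterator[List[str]]:
    """Yield every trail starting at `start` with 1..max_len edges, as a list of edge ids."""

    def extend(node: str, used: List[str]) -> Iterator[List[str]]:
        if used:
            yield list(used)  # copy, because `used` is mutated while backtracking
        if len(used) == max_len:
            return
        # Sorted iteration keeps the output order deterministic for comparisons.
        for eid, (src, dst) in sorted(graph.items()):
            if src == node and eid not in used:
                used.append(eid)
                yield from extend(dst, used)
                used.pop()

    yield from extend(start, [])


# A directed triangle a -> b -> c -> a.
triangle: Graph = {"e1": ("a", "b"), "e2": ("b", "c"), "e3": ("c", "a")}
trails_from_a = list(enumerate_trails(triangle, "a", 4))
```

Because every edge may be used at most once, the enumeration terminates on cyclic graphs even without the length bound. The brute-force search is exponential in general, so it is only suitable as a correctness oracle on small test graphs, not for efficiency or scalability measurements.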


Expected Deliverables:

  • A fully functional GPML fragment implementation within the GQL framework on AvantGraph.
  • A detailed project report documenting the design, implementation, testing, and evaluation processes.
  • Source code, datasets used for testing, and any scripts or tools developed during the project.


References:

[1] Francis N, Gheerbrant A, Guagliardo P, Libkin L, Marsault V, Martens W, Murlak F, Peterfreund L, Rogova A, Vrgoc D. “GPC: A pattern calculus for property graphs.” In Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems 2023 Jun 18 (pp. 241-250).
[2] Francis N, Gheerbrant A, Guagliardo P, Libkin L, Marsault V, Martens W, Murlak F, Peterfreund L, Rogova A, Vrgoc D. “A Researcher’s Digest of GQL.” In The 26th International Conference on Database Theory, 2023 Mar 17 (pp. 1-1). Schloss Dagstuhl-Leibniz-Zentrum für Informatik.
[3] Deutsch A, Francis N, Green A, Hare K, Li B, Libkin L, Lindaaker T, Marsault V, Martens W, Michels J, Murlak F. “Graph pattern matching in GQL and SQL/PGQ.” In Proceedings of the 2022 International Conference on Management of Data 2022 Jun 10 (pp. 2246-2258).
[4] https://avantgraph.io/index.html
[5] https://www.gqlstandards.org/
[6] https://ldbcouncil.org/pages/opengql-announce/

Details

Supervisor: Nick Yakovets
Secondary supervisor: Sepehr Sadoughi