SLURP: An Interactive SPARQL Query Planner


Overview

tl;dr

Here you can find the supplemental material for our demo paper "SLURP: An Interactive SPARQL Query Planner", including a demonstration video. Check out the live demo of SLURP here!

Abstract

Triple Pattern Fragments (TPFs) allow for querying large RDF graphs with high availability by offering triple pattern-based access to the graphs. However, the limited expressivity of TPFs leads to higher client-side querying costs and the challenge of devising efficient query plans when evaluating SPARQL queries. Different heuristic- and cost model-based query planning approaches have been proposed to obtain such efficient query plans. Yet, we require means to visualize, modify, and execute alternative query plans to better understand the differences between existing planning approaches and their potential shortcomings. To this end, we propose SLURP, an interactive SPARQL query planner that assists RDF data consumers in visualizing, modifying, and comparing the performance of different query execution plans over TPFs.

Online Demo

Check out the live demo of SLURP here!

Video Demonstration


Application and Architecture

Example Use Case

We consider the following example SPARQL query to demonstrate our tool:

Listing 1: Example Query.

Different query plans can be devised for this SPARQL query. For example, the query processing engine implemented by Verborgh et al. [4] produces a left-linear plan with Nested Loop Joins, shown on the left side of Figure 1. The leaves of these plans represent triple patterns and are identified by their line number in Listing 1. The number of matching triples for each triple pattern and the number of intermediate results are indicated in parentheses.

Other query processing engines [1, 2] use different optimization algorithms that are capable of devising more efficient query plans. The implementation of Acosta and Vidal [1] executes the query from Listing 1 with a bushy tree plan, as depicted on the right side of Figure 1.

Figure 1: Query Plans Example.

The results of executing the example query from Listing 1 are shown in Table 1. The execution of the left-linear plan requires many requests and has a rather long overall execution time. The bushy tree plan enables the simultaneous execution of subtrees and makes use of a different join type, which reduces both the execution time and the number of requests, as shown in Table 1 [1, p. 114].

Table 1: Execution results [1, p. 113]

The following figures illustrate the execution of the above-mentioned query plans with SLURP. Note: We have specified a timeout of 60 seconds for this demonstration. The number of requests is only reported when a query does not reach the timeout.

Figure 2: SLURP Left-linear Plan.
Figure 3: SLURP Bushy Plan.

Our tool enables the visual analysis and modification of query plans generated by client-side query processing engines. Furthermore, SLURP enables the execution of (modified) query plans, whose results and statistics can be reviewed afterwards. This allows researchers to efficiently study the impact of different query plans on the query execution performance of a specific engine.

Architecture

An overview of the SLURP architecture is provided in the following figure. The architecture is separated into the Web application frontend and an API provided by the backend.

Figure 4: SLURP Architecture.

Frontend

The frontend consists of a Main Page, an Editor Page, and a Result Page. The Main Page provides an overview of the recently executed query plans.

On the Editor Page, users specify a new query and the TPF server to execute the query over. As of now, queries with basic graph patterns are supported. SLURP allows the user to choose a query plan optimizer to obtain an initial query plan for the query. This enables users to (i) get an initial starting point when modifying the query plan, and (ii) compare query plans devised by different query planning approaches. For this demonstration, SLURP provides a basic left-linear query planner that uses the triple patterns' count metadata for join ordering. Moreover, SLURP also supports the query planner from nLDE and the planner from CROP with the parameter settings used in [2]. For the initial query plan, users can inspect the cardinalities of the individual triple patterns as well as the join cardinalities estimated by the query planning approach. The Editor Page also provides an interactive execution plan editor that allows users to modify the initial plan or build a new plan in a drag-and-drop fashion. The leaves of the execution plan correspond to triple patterns evaluated over the selected TPF server, and the inner nodes represent join operators. Users can build query plans with an arbitrary join order and place nested loop join (NLJ) or symmetric hash join (SHJ) operators in the execution plan.

Upon modification, the plan can be executed and the user is redirected to the Result Page. On this page, which is automatically refreshed during query execution, the user can analyze the execution plan's performance with respect to its execution time, requests performed, intermediate results (i.e., join cardinalities), and answers produced.
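To make the count-based join ordering of the demonstration's left-linear planner more concrete, the following Python snippet is a minimal sketch of the idea (not SLURP's actual implementation): triple patterns are sorted by their TPF count metadata and folded into a left-deep tree of nested loop joins. The field names loosely mirror the JSON plan format shown further below.

# Minimal sketch of a count-based left-linear planner (illustrative only).
def left_linear_plan(triple_patterns):
    """triple_patterns: list of dicts, e.g.
    {"tpf": "?s <http://example.org/p> ?o .", "cardinality": 42}"""
    # Start with the most selective (smallest) pattern on the far left.
    ordered = sorted(triple_patterns, key=lambda tp: tp["cardinality"])
    plan = {"type": "Leaf", **ordered[0]}
    # Fold the remaining patterns into a left-deep tree of NLJ operators.
    for tp in ordered[1:]:
        plan = {"type": "NLJ",
                "left": plan,
                "right": {"type": "Leaf", **tp}}
    return plan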

Backend

The backend consists of three components that the frontend interacts with via an API. The query planner is used by the frontend to obtain the query plan for a given query, TPF server, and planning approach (GET /plan). The task queue manages the execution plans to be executed by the engine in a first-in-first-out queue (POST /plan). For this demonstration, a single execution plan is executed at a time and we set a timeout of 60 seconds. In a local deployment of the tool, these parameters can be adjusted accordingly. The database stores the queries, execution plans, and statistics of their execution (GET /task).

Given a SPARQL query, a TPF server URL, and the name of an optimizer, the query planning component in the backend obtains an initial query plan, which is serialized as a JSON object and provided to the frontend. The planning component currently implements the nLDE optimizer, the CROP optimizer, and a left-linear optimizer. Further optimizers may be implemented in the planning component and exposed to the frontend. When a user submits a query plan to be executed, the frontend serializes the plan as a JSON object and the SLURP backend creates the corresponding physical query plan. We use the network of Linked Data Eddies (nLDE) [1] as the engine to execute physical query plans.
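The following Python sketch illustrates how a client might interact with this API. Only the endpoint paths (GET /plan, POST /plan, GET /task) come from the description above; the base URL, parameter names, and response fields are assumptions made for illustration, and the example query is assembled from the triple patterns of the plan shown in the Interoperability section.

# Hypothetical API client sketch; endpoint paths follow the text above,
# parameter and response field names are assumed for illustration.
import time
import requests

BASE = "http://localhost:5000"  # assumed local SLURP deployment

query = """SELECT * WHERE {
  ?university <http://www.w3.org/2000/01/rdf-schema#label> "Stanford University"@en .
  ?student <http://dbpedia.org/ontology/almaMater> ?university .
  ?student <http://dbpedia.org/property/thesisTitle> ?title .
  ?student <http://dbpedia.org/ontology/doctoralAdvisor> ?advisor .
}"""

# 1. Obtain an initial plan for a query, TPF server, and optimizer.
plan = requests.get(f"{BASE}/plan", params={
    "query": query,
    "tpf": "http://fragments.dbpedia.org/2016-04/en",  # example TPF server
    "optimizer": "nLDE",
}).json()

# 2. Submit the (possibly modified) plan; the backend queues it for execution.
task = requests.post(f"{BASE}/plan", json=plan).json()

# 3. Poll the stored execution statistics until the task finishes or times out.
while True:
    status = requests.get(f"{BASE}/task", params={"id": task["id"]}).json()
    if status.get("done"):
        break
    time.sleep(2)

print(status)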

Interoperability and Reuse

The source code of SLURP is publicly available on GitHub and licensed under the open source MIT License. SLURP is developed in a containerized fashion using Docker and docker-compose to facilitate installation and deployment. With the SLURP API, any application can interact with the query planner (GET /plan) and the query execution engine (POST /plan). Query plans are represented as binary trees with triple patterns as the leaves and join operators as the inner nodes, and they are serialized in JSON. An example query plan looks like the following:
{
  "estimated_tuples": 2, 
  "left": {
    "estimated_tuples": 2, 
    "left": {
      "estimated_tuples": 2, 
      "left": {
        "cardinality": "2", 
        "tpf": "?university <http://www.w3.org/2000/01/rdf-schema#label> \"Stanford University\"@en .", 
        "type": "Leaf"
      }, 
      "right": {
        "cardinality": "86088", 
        "tpf": "?student <http://dbpedia.org/ontology/almaMater> ?university .", 
        "type": "Leaf"
      }, 
      "type": "NLJ"
    }, 
    "right": {
      "cardinality": "1187", 
      "tpf": "?student <http://dbpedia.org/property/thesisTitle> ?title .", 
      "type": "Leaf"
    }, 
    "type": "NLJ"
  }, 
  "right": {
    "cardinality": "4885", 
    "tpf": "?student <http://dbpedia.org/ontology/doctoralAdvisor> ?advisor .", 
    "type": "Leaf"
  }, 
  "type": "NLJ"
}
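To show how such a serialized plan can be consumed by other applications, the following small helper (a hypothetical example, not part of SLURP) walks the JSON object above and prints the operator tree together with the cardinalities and estimated join sizes.

# Hypothetical helper that pretty-prints a serialized SLURP query plan.
import json

def print_plan(node, depth=0):
    indent = "  " * depth
    if node["type"] == "Leaf":
        print(f"{indent}Leaf ({node['cardinality']}): {node['tpf']}")
    else:  # inner join node, e.g. NLJ or SHJ
        print(f"{indent}{node['type']} (est. {node.get('estimated_tuples', '?')} tuples)")
        print_plan(node["left"], depth + 1)
        print_plan(node["right"], depth + 1)

# Assuming the plan above has been saved to plan.json:
with open("plan.json") as f:
    print_plan(json.load(f))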
    
In future work, we want to investigate a more generic and flexible query plan representation and serialization to support the integration of additional query planning approaches and query execution engines.

Evaluation

We have evaluated SLURP in a small user study in which the participants had to perform two tasks and fill out the System Usability Scale (SUS) [5]. SUS is an effective and inexpensive tool for assessing the usability of interactive systems. The overall SUS score from the study is 83.5, which means the usability of SLURP is well above average. However, only five participants have evaluated SLURP so far. If you want to evaluate SLURP, you can also participate in the study: SLURP User Study
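For context, SUS scores range from 0 to 100, with a commonly cited benchmark average of about 68. The snippet below shows the standard SUS scoring rule (it is generic, not specific to SLURP): odd-numbered items contribute their response minus one, even-numbered items contribute five minus their response, and the sum is scaled by 2.5.

# Standard SUS scoring (generic formula, not SLURP-specific).
def sus_score(responses):
    """responses: the ten Likert answers (1-5) of one participant, item 1 first."""
    assert len(responses) == 10
    total = sum((r - 1) if i % 2 == 0 else (5 - r)  # odd items: r-1, even items: 5-r
                for i, r in enumerate(responses))
    return total * 2.5  # scale the 0-40 sum to the 0-100 range

print(sus_score([5, 2, 4, 1, 5, 2, 5, 1, 4, 2]))  # example participant: 87.5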


About

Code

The source code of SLURP is available on GitHub.

Authors

References