Direct Access for Conjunctive Queries with Negation

Oliver Irwin

14/11/2023 - Speed-Dating Seminar

Context

Join Query : \(Q(x_1, \dots, x_n) = \bigwedge_{i=1}^k R_i(\vec{z_i})\)

where \(\vec{z_i}\) is a tuple over \(X = \{x_1,\dots,x_n\}\)

Example: \(Q(city, country, name, id) = People(id, name, city) \wedge Capitals(city, country)\)

People
id name city
1 Alice Paris
2 Bob Lens
3 Chiara Rome
4 Djibril Berlin
5 Émile Dortmund
6 Francesca Rome

Capitals
city country
Berlin Germany
Paris France
Rome Italy

\(Q(\mathbb{D})\)
city country name id
Paris France Alice 1
Rome Italy Chiara 3
Berlin Germany Djibril 4
Rome Italy Francesca 6
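As a minimal sketch of how the example answer set can be computed, the following Python snippet joins the two tables on city with a hash index. The variable names (`people`, `capitals`, `by_city`, `answers`) are illustrative choices, not part of the talk.

```python
from collections import defaultdict

# The two input relations from the example.
people = [
    (1, "Alice", "Paris"), (2, "Bob", "Lens"), (3, "Chiara", "Rome"),
    (4, "Djibril", "Berlin"), (5, "Émile", "Dortmund"), (6, "Francesca", "Rome"),
]
capitals = [("Berlin", "Germany"), ("Paris", "France"), ("Rome", "Italy")]

# Index Capitals by city so that each People row is matched in constant time.
by_city = defaultdict(list)
for city, country in capitals:
    by_city[city].append(country)

# Q(city, country, name, id) = People(id, name, city) AND Capitals(city, country)
answers = [
    (city, country, name, pid)
    for pid, name, city in people
    for country in by_city.get(city, [])
]

for row in answers:
    print(row)
```

Bob and Émile do not appear in the output because Lens and Dortmund have no matching row in Capitals.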

Direct Access

We want to access the \(k\)-th element of \(Q(\mathbb{D})\) for a given order.

Make \(Q(\mathbb{D})\) an array, sort it and then we have direct access?

\(Q(\mathbb{D})\)
city country name id
Berlin Germany Djibril 4
Paris France Alice 1
Rome Italy Chiara 3
Rome Italy Francesca 6

\(Q(\mathbb{D})[4] = (Rome, Italy, Francesca, 6)\)
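The naive approach can be sketched in a few lines of Python, assuming the answers are already materialised and that the order is lexicographic on (city, country, name, id). Positions are counted from 1, as on the slide; the helper name `direct_access` is hypothetical.

```python
# Materialised answers of the example query, in arbitrary order.
answers = [
    ("Paris", "France", "Alice", 1),
    ("Rome", "Italy", "Chiara", 3),
    ("Berlin", "Germany", "Djibril", 4),
    ("Rome", "Italy", "Francesca", 6),
]

# Preprocessing: sort the whole answer set once.
sorted_answers = sorted(answers)

def direct_access(k):
    """Return the k-th answer (1-indexed) in lexicographic order."""
    return sorted_answers[k - 1]

print(direct_access(4))
```

Access is a single array lookup, but the sort needs the whole answer set in memory, which can be far larger than the database itself.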

\(Q(\mathbb{D})\)
city country name id
\(\dots\) \(\dots\) \(\dots\) \(\dots\)
Berlin Germany Djibril 4
\(\dots\) \(\dots\) \(\dots\) \(\dots\)
Paris France Alice 1
\(\dots\) \(\dots\) \(\dots\) \(\dots\)
Rome Italy Chiara 3
Rome Italy Francesca 6
\(\dots\) \(\dots\) \(\dots\) \(\dots\)

\(Q(\mathbb{D})[1432] =\) ??

Preprocessing : very costly (all of \(Q(\mathbb{D})\) has to be materialised and sorted)

Access : nearly free

We need another way to represent the data

Previous work

Research focuses on algorithms with reasonable preprocessing and fast access time

FO logical formulas

Bagan, Durand, Grandjean, Olive (2008)

Preprocessing : linear
Access : constant

MSO formulas

Bagan (2009)

Preprocessing : linear
Access : constant

Acyclic CQs

Carmeli, Zeevi, Berkholz, Kimelfeld, Schweikardt (2020)

Preprocessing : linear
Access : polylog

for certain lexicographical orders

We propose a new, unifying method for direct access (DA) on acyclic conjunctive queries (ACQs), based on relational circuits

Problem? Solution!

We want to access the \(k\)-th solution to a query for a given database

Make a table, sort it, and done?

\(Q(\mathbb{D})\)
city country name id
Berlin Germany Djibril 4
Paris France Alice 1
Rome Italy Chiara 3
Rome Italy Francesca 6

We need another solution 😢

Approaches with reasonable preprocessing and fast access exist

But only for restricted queries and databases 😢

We propose a new approach!

Unifies former results 🥳

Extends considered query classes 😍

Relational Circuits

\(x_1\) \(x_2\) \(x_3\)
0 0 0
0 0 1
0 1 0
0 1 1
1 0 1
1 0 2
1 1 1
1 1 2
1 2 0
1 2 1
2 0 1
2 0 2
2 2 1
2 2 2

Relational Circuits

factorised representation of relations

circuit with 3 kinds of gates :

  • inputs : \(\top\) & \(\bot\)
  • decision gates
  • \(\times\)-gates

paths from decision gates are labelled by the domain values
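To make the three kinds of gates concrete, here is a small Python sketch that encodes a circuit for the 14-tuple example relation over \(x_1, x_2, x_3\) and expands it back into a table. The tuple encoding of gates and the helper names (`dec`, `times`, `values`, `relation`) are assumptions made for illustration; they are not the definitions used in the talk.

```python
from itertools import product

TOP, BOT = ("top",), ("bot",)  # inputs: one empty tuple / no tuple

def dec(var, edges):
    """Decision gate on `var`; `edges` maps each domain value to a child gate."""
    return ("dec", var, tuple(sorted(edges.items())))

def times(*children):
    """x-gate: Cartesian product of children over pairwise disjoint variables."""
    return ("times", children)

def values(var, *ds):
    """Decision gate whose listed values all lead directly to TOP."""
    return dec(var, {d: TOP for d in ds})

def relation(gate):
    """Expand the relation computed by `gate` into a list of dicts."""
    if gate[0] == "top":
        return [{}]
    if gate[0] == "bot":
        return []
    if gate[0] == "dec":
        _, var, edges = gate
        return [{var: d, **t} for d, child in edges for t in relation(child)]
    out = []
    for combo in product(*(relation(c) for c in gate[1])):
        merged = {}
        for t in combo:
            merged.update(t)
        out.append(merged)
    return out

# Shared sub-circuits: the same gate object is reused under several parents.
x3_01 = values("x3", 0, 1)
x3_12 = values("x3", 1, 2)
root = dec("x1", {
    0: times(values("x2", 0, 1), x3_01),
    1: dec("x2", {0: x3_12, 1: x3_12, 2: x3_01}),
    2: times(values("x2", 0, 2), x3_12),
})

rows = sorted((t["x1"], t["x2"], t["x3"]) for t in relation(root))
print(len(rows), rows[:4])
```

The circuit uses a handful of small gates to represent 14 tuples; the saving comes from the ×-gates and from sharing `x3_01` and `x3_12`.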

Ordered Relational Circuits

relational circuit + order \(\prec\) on the variables

For \(C\) an ordered relational circuit, we can perform direct access tasks in time \(\mathcal{O}(\mathsf{poly}(|X|)\mathsf{polylog}(|D|))\) after a preprocessing in time \(\mathcal{O}(|C|\cdot\mathsf{poly}(|X|)\mathsf{polylog}(|D|))\)

Preprocessing

Idea : for each gate \(v\) over \(x_i\) and for each domain value \(d\)

compute the number of tuples of the relation computed by \(v\) in which \(x_i\) takes a value \(d'\leqslant d\)

Direct Access

Compute the 7th solution \(\to\) 111

Compute the 13th solution \(\to\) 221
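The counting and access steps can be illustrated on the 14-tuple example relation over \(x_1, x_2, x_3\). The Python sketch below makes a simplifying assumption: it uses a plain trie, that is, a circuit with decision gates only, no ×-gates and no sharing. The function names `build_trie`, `annotate` and `access` are hypothetical. Each node stores, for every value \(d\), the number of tuples whose next coordinate is at most \(d\), and access descends by binary search on these cumulative counts.

```python
from bisect import bisect_left

ROWS = [
    (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
    (1, 0, 1), (1, 0, 2), (1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 1),
    (2, 0, 1), (2, 0, 2), (2, 2, 1), (2, 2, 2),
]

def build_trie(rows):
    """Nested dicts: one level per variable, in the order x1, x2, x3."""
    trie = {}
    for row in rows:
        node = trie
        for value in row:
            node = node.setdefault(value, {})
    return trie

def annotate(node):
    """Preprocessing: return (total, values, cumulative counts, children)."""
    values, cums, children, total = [], [], [], 0
    for d in sorted(node):
        child = annotate(node[d])
        total += child[0]
        values.append(d)
        cums.append(total)      # number of tuples with a value <= d here
        children.append(child)
    # An empty node plays the role of TOP and holds exactly one (empty) tuple.
    return (total if values else 1, values, cums, children)

def access(annotated, k):
    """Return the k-th tuple (1-indexed) in lexicographic order."""
    total, values, cums, children = annotated
    if not 1 <= k <= total:
        raise IndexError(k)
    out = []
    while values:
        i = bisect_left(cums, k)        # first value whose cumulative count reaches k
        k -= cums[i - 1] if i else 0    # skip the tuples under smaller values
        out.append(values[i])
        total, values, cums, children = children[i]
    return tuple(out)

annotated = annotate(build_trie(ROWS))
print(access(annotated, 7), access(annotated, 13))
```

On this input the 7th and 13th tuples are (1, 1, 1) and (2, 2, 1), matching the two slides above. Handling ×-gates additionally requires splitting the index between the children of a product, which this sketch leaves out.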

From CQ to circuit

\(Q\) a CQ and \(x_1\prec\dots\prec x_n\) an order over the variable set

\(Q(\mathbb{D}) = \biguplus_{d\in D} Q[x_1 = d](\mathbb{D})\)

\[ \text{if} \begin{cases} Q = Q_1 \land Q_2 \\ \mathsf{var}(Q_1) \cap \mathsf{var}(Q_2) = \emptyset \end{cases} \]

then \(Q(\mathbb{D}) = Q_1(\mathbb{D}) \times Q_2(\mathbb{D})\)

recursive implementation + cache \(\implies\) ordered relational circuit computing \(Q(\mathbb{D})\)
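The two rules above (branch on the first variable, split variable-disjoint sub-queries into a product) can be turned into a short recursive compiler. The Python sketch below is only an illustration under several assumptions: atoms are stored as pairs (variables, rows), each variable occurs at most once per atom, every variable of the order occurs in some atom, and the gate encoding and the names `compile_cq`, `restrict`, `components` and `answers` are hypothetical. The cache is provided by `functools.lru_cache`.

```python
from functools import lru_cache
from itertools import product

TOP, BOT = ("top",), ("bot",)

def restrict(atom, x, d):
    """Atom after setting x = d: keep matching rows and drop the x column."""
    vs, rows = atom
    if x not in vs:
        return atom
    i = vs.index(x)
    return (vs[:i] + vs[i + 1:],
            frozenset(r[:i] + r[i + 1:] for r in rows if r[i] == d))

def components(atoms):
    """Group atoms into sub-queries that share no variable."""
    comps = []
    for atom in atoms:
        vs, group, rest = set(atom[0]), [atom], []
        for cvs, catoms in comps:
            if cvs & vs:
                vs |= cvs
                group += catoms
            else:
                rest.append((cvs, catoms))
        comps = rest + [(vs, group)]
    return comps

@lru_cache(maxsize=None)
def compile_cq(atoms, order):
    if any(not rows for _, rows in atoms):
        return BOT                                   # some atom has no row left
    atoms = frozenset(a for a in atoms if a[0])      # drop fully assigned atoms
    if not atoms:
        return TOP
    comps = components(atoms)
    if len(comps) > 1:                               # Q = Q1 x Q2 x ...
        comps.sort(key=lambda c: min(order.index(v) for v in c[0]))
        kids = tuple(compile_cq(frozenset(g), tuple(v for v in order if v in vs))
                     for vs, g in comps)
        return BOT if BOT in kids else ("times", kids)
    x = order[0]                                     # branch on the first variable
    cols = [{r[vs.index(x)] for r in rows} for vs, rows in atoms if x in vs]
    edges = []
    for d in sorted(set.intersection(*cols)):
        child = compile_cq(frozenset(restrict(a, x, d) for a in atoms), order[1:])
        if child != BOT:
            edges.append((d, child))
    return ("dec", x, tuple(edges)) if edges else BOT

def answers(gate):
    """Enumerate the relation computed by a gate as dicts."""
    if gate == TOP:
        yield {}
    elif gate[0] == "dec":
        for d, child in gate[2]:
            for t in answers(child):
                yield {gate[1]: d, **t}
    elif gate[0] == "times":
        for combo in product(*(list(answers(c)) for c in gate[1])):
            merged = {}
            for t in combo:
                merged.update(t)
            yield merged

people = (("id", "name", "city"), frozenset({
    (1, "Alice", "Paris"), (2, "Bob", "Lens"), (3, "Chiara", "Rome"),
    (4, "Djibril", "Berlin"), (5, "Émile", "Dortmund"), (6, "Francesca", "Rome")}))
capitals = (("city", "country"), frozenset({
    ("Berlin", "Germany"), ("Paris", "France"), ("Rome", "Italy")}))
order = ("city", "country", "name", "id")

circuit = compile_cq(frozenset({people, capitals}), order)
result = [tuple(t[v] for v in order) for t in answers(circuit)]
print(result)
```

Once city is fixed, the remaining query splits into a part over country and a part over (name, id), so the compiler emits a ×-gate. The cache means that identical residual sub-queries are compiled once and shared.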

Recap

Our method:

  1. compile an ordered relational circuit \(C\) computing \(Q(\mathbb{D})\) ;
  2. annotate said circuit with the number of solutions ;
  3. solve DA tasks by induction over the circuit.

For a query \(Q(x_1,\dots,x_n)\) and an order on the variables of “complexity” \(k\), we can solve DA tasks with a preprocessing in time \(\mathcal{O}(|\mathbb{D}|^k\mathsf{poly}(|Q|))\) and access in time \(\mathcal{O}(\mathsf{poly}(|Q|)\mathsf{polylog}(|\mathbb{D}|))\).

Signed CQs

Signed Conjunctive Queries : \(Q = \bigwedge_i R_i(\vec{x_i}) \land \bigwedge_j \lnot S_j(\vec{x_j})\)

\(R_i\)
\(x_1\) \(x_2\) \(x_3\)
0 1 0

\(\lnot R_i\)
\(x_1\) \(x_2\) \(x_3\)
0 0 0
0 0 1
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
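The table for \(\lnot R_i\) can be reproduced with a few lines of Python, assuming the domain is \(\{0, 1\}\) as the example suggests. The names `DOMAIN`, `R` and `not_R` are illustrative.

```python
from itertools import product

DOMAIN = (0, 1)      # assumed domain of the example
R = {(0, 1, 0)}      # the single tuple of R_i

# The complement is taken with respect to all tuples over the domain.
not_R = sorted(set(product(DOMAIN, repeat=3)) - R)

print(len(not_R))
for row in not_R:
    print(row)
```

A negated atom of arity \(a\) over a domain \(D\) stands for \(|D|^a - |R_i|\) tuples, so replacing negated atoms by their materialised complements quickly becomes too expensive.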

Results

the circuit approach recovers the known tractable classes from the literature (for CQ+)

we generalise and unify tractability results about CQ-

Next steps

Going further with circuits

study the tractability of the circuit approach for DA on CQs with aggregation

\(Q(p, c, g, \mathsf{count()}) = \mathsf{Teams}(p, c) \land \mathsf{Games}(g, c, \cdot) \land \mathsf{Tries}(g, p)\)

How should we integrate the aggregation in the lexicographical order?

How does the aggregation fit into the compiled circuits?

generalise the circuit approach to queries over annotated databases (FAQ and AJAR queries)

recent work (Oct. 2023) at the University of Wisconsin-Madison (WI, USA)