Worst-case optimal join algorithms are attractive from a theoretical point of view, as they
offer asymptotically better runtime than binary joins on certain types of queries. In particular,
they avoid enumerating large intermediate results by processing multiple input relations in a
single multi-way join. However, existing implementations incur a sizable overhead in practice,
primarily because they rely on suitable ordered index structures on their input. Systems that support
worst-case optimal joins often focus on a specific problem domain, such as read-only graph analytic
queries, where extensive precomputation allows them to mask these costs.
In this paper, we present a comprehensive implementation approach for worst-case optimal joins that
is practical within general-purpose relational database management systems supporting both
hybrid transactional and analytical workloads. The key component of our approach is a novel
hash-based worst-case optimal join algorithm that relies only on data structures that can be built
efficiently during query execution. Furthermore, we implement a hybrid query optimizer that
intelligently and transparently combines both binary and multi-way joins within the same query
plan. We demonstrate that our approach far outperforms existing systems when worst-case optimal
joins are beneficial while sacrificing no performance when they are not.
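
To make the contrast between binary and multi-way joins concrete, the following is a minimal sketch on the triangle query R(a,b), S(b,c), T(c,a). It is not the algorithm presented in the paper, and it does not attempt to reproduce worst-case optimality guarantees; it only illustrates how a plan of binary joins can materialize a large intermediate result that an attribute-at-a-time multi-way join over hash tables built at query time avoids. All function and variable names are hypothetical.

```python
from collections import defaultdict


def binary_join_triangles(R, S, T):
    """Triangle query as two binary hash joins.

    Returns the result set and the size of the intermediate result
    produced by the first join.
    """
    s_by_b = defaultdict(list)
    for b, c in S:
        s_by_b[b].append(c)
    # Join R(a,b) with S(b,c); this may materialize many (a,b,c) tuples
    # that the later join with T discards.
    intermediate = [(a, b, c) for a, b in R for c in s_by_b.get(b, ())]
    t_set = set(T)
    triangles = {(a, b, c) for a, b, c in intermediate if (c, a) in t_set}
    return triangles, len(intermediate)


def multiway_join_triangles(R, S, T):
    """Attribute-at-a-time multi-way join over hash tables built on the fly."""
    r_by_a = defaultdict(set)  # a -> set of b
    s_by_b = defaultdict(set)  # b -> set of c
    t_by_a = defaultdict(set)  # a -> set of c
    for a, b in R:
        r_by_a[a].add(b)
    for b, c in S:
        s_by_b[b].add(c)
    for c, a in T:
        t_by_a[a].add(c)

    triangles = set()
    # Bind a: it must occur in both R and T.
    for a in r_by_a.keys() & t_by_a.keys():
        # Bind b: it must occur in R (for this a) and in S.
        for b in r_by_a[a]:
            if b not in s_by_b:
                continue
            # Bind c: intersect the two candidate sets by probing the
            # larger one with the elements of the smaller one.
            small, large = sorted((s_by_b[b], t_by_a[a]), key=len)
            for c in small:
                if c in large:
                    triangles.add((a, b, c))
    return triangles
```

On a skewed input such as R = {(a, 0)}, S = {(0, c)} for a, c in 0..n-1 and T = {(0, 0)}, the binary plan above builds n² intermediate tuples to find a single triangle, whereas the multi-way variant never constructs them.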