Exponential growth in data generation, driven by an increase in the number of devices connected to the World Wide Web and by an increase in the precision of the data gathered, poses a threat to data centers while opening new potential for data analysis, provided that processing is kept tractable. To deal with this ever-growing and already enormous amount of data, hardware acceleration is receiving much attention due to its efficiency and parallel capabilities. Hardware acceleration provides best-in-class performance and power consumption; however, it is hindered by high research and development costs and long times-to-market that are incurred anew for each application.
We propose a reconfigurable hardware-based streaming architecture, the flexible query processor (FQP), which comprises a family of stream processing blocks that support dynamic changes to queries and streams, as well as static changes to the processor-internal fabric, in order to maximize performance for given workloads. FQP can accept new queries while it is processing incoming tuples, a key characteristic that distinguishes it from related approaches.
FQP is designed to support parallelization of the stream join, one of the most resource-intensive operations. To that end, it uses a bidirectional data-flow model, which was the only available stream join parallelization model at the time. This model complicated the design of FQP by forcing streams into two separate data paths and by requiring extensive control logic in each processing block to prevent race conditions and ensure correct execution. To address this fundamental issue, we introduce SplitJoin, a novel unidirectional data-flow model for low-latency parallel stream join processing. SplitJoin splits the join operation into independent storing and processing steps that scale gracefully with the number of cores.
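The separation of storing from processing can be illustrated with a small software sketch. It is not the hardware design described in this dissertation: the round-robin storage assignment, the count-based sub-windows, and all names (`JoinCore`, `split_join`, `sub_window_size`) are assumptions made for illustration only. Every incoming tuple is sent to all cores; each core compares it against its local sub-window of the opposite stream (the processing step), and only one core keeps it (the storing step).

```python
from collections import deque


class JoinCore:
    """One join core holding a sub-window of each stream (hypothetical model)."""

    def __init__(self, core_id, num_cores, sub_window_size):
        self.core_id = core_id
        self.num_cores = num_cores
        # Count-based sliding sub-windows; old tuples fall out automatically.
        self.windows = {
            "R": deque(maxlen=sub_window_size),
            "S": deque(maxlen=sub_window_size),
        }

    def process(self, stream, tup, predicate):
        """Processing step: compare the tuple with the opposite sub-window."""
        if stream == "R":
            return [(tup, s) for s in self.windows["S"] if predicate(tup, s)]
        return [(r, tup) for r in self.windows["R"] if predicate(r, tup)]

    def store(self, stream, seq, tup):
        """Storing step: keep the tuple only when it is this core's turn."""
        if seq % self.num_cores == self.core_id:
            self.windows[stream].append(tup)


def split_join(events, num_cores, sub_window_size, predicate):
    """Join a sequence of (stream, tuple) events, with stream in {"R", "S"}."""
    cores = [JoinCore(i, num_cores, sub_window_size) for i in range(num_cores)]
    counters = {"R": 0, "S": 0}
    results = []
    for stream, tup in events:
        seq = counters[stream]
        counters[stream] += 1
        # Same tuple goes to every core, in one direction only; a core's two
        # steps do not depend on any other core.
        for core in cores:
            results.extend(core.process(stream, tup, predicate))
            core.store(stream, seq, tup)
    return results


events = [("R", 1), ("S", 1), ("S", 2), ("R", 2), ("S", 1)]
matches = split_join(events, num_cores=4, sub_window_size=2,
                     predicate=lambda r, s: r == s)
```

In this sketch the cores share no state, so the loop over cores could run in parallel, with the per-core result lists merged afterwards. Under these assumptions, four cores with sub-windows of two tuples cover the same sliding window as a single core with a window of eight.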
In stream processing, queries commonly use multiple data streams simultaneously, which makes the multiway stream join operator one of the essential blocks of our query processor. The main challenge for multiway stream joins is the need to reorder join operators in real time when intermediate results are not materialized (due to their potentially large sizes). We propose a scalable circular pipeline design, Circular-MJ, which realizes the various necessary join trees using a fixed operator ordering and an arbitrary tuple input mechanism. Circular-MJ reduces the operator reordering challenge to an input reordering problem, which is in turn addressed by a pipeline distribution chain. We further propose an optimized pipeline stream join (Stashed-MJ) that uses a best-effort buffering technique to maintain intermediate results. Lastly, we present a parallelized version of our multiway stream join (Parallel-MJ), obtained by integrating our proposed pipelines into a parallel unidirectional flow-based architecture.
In the last part of this dissertation, we present our simplex stream processor (SSP), a successor to our custom flexible query processor that targets real-time stream processing while providing modular components. SSP relies on our stream-customized network-on-chip, which uses a unidirectional data-flow model. Building on our proposed solutions (i.e., SplitJoin, Stashed-MJ, Parallel-MJ, HB-SJ), SSP introduces libraries for the communication network and the processing blocks, with a consistent interface that allows further components to be added. As a proof of concept, we benchmark a modified version of the third TPC-H query on our SSP, realized in VHDL, and present the query mapping, programming, and processing steps in detail.