In the previous part of the walkthrough we learned how to
store the data in object storage (i.e. AWS S3) and to ensure that the data is
highly available and stored securely. In this part we will move away from using
sdb as a query engine and start using the Sneller daemon instead.
Using the Sneller daemon requires a bit more knowledge and some decisions, so this walkthrough is more theoretical and doesn’t provide any sample queries. But don’t worry, the next part is hands-on again.
The Sneller daemon uses the exact same query engine as
sdb, but it can
distribute the work of executing a query across multiple nodes. The daemon
can be split in two parts:
- The daemon part provides an HTTP endpoint that accepts queries. It also plans each query and distributes it across the other daemons. Finally, it combines all the partial results and passes them back to the client.
- The worker part waits for workloads distributed by the daemon part and does the actual data crunching for the query execution.
The Sneller daemon combines both roles in a single executable and all nodes are considered equal. To provide high availability, the Sneller daemons are often placed behind a load balancer that distributes the load across the various instances.
Using multiple Sneller daemons ensures that Sneller is highly available and makes it a cloud-native and robust query engine that scales linearly and can handle huge amounts of data. Another benefit is that each Sneller node caches data in memory to reduce the network I/O traffic between the Sneller nodes and object storage.
Sneller also allows you to run your own cluster. The most convenient method is to use our Kubernetes Helm package to deploy Sneller in your Kubernetes cluster. It’s technically possible to run it on bare-metal servers, but using Kubernetes is the only supported scenario.
A full example of how to run Sneller in your own EKS cluster is shown in this tutorial.
In the next part of the series we will show you how to run a cluster of Sneller daemons using your own infrastructure via Kubernetes.