AVX-512

Integer division is an arithmetic operation that is not provided natively by SIMD instruction set extensions. In this article, we present a vectorized method for dividing signed 64-bit integers using AVX-512.

Branchless Code With AVX-512

by Henk-Jan Lebbink

May 10, 2023

Sneller uses 16 parallel data lanes for almost all tasks, including loading and decompressing data, all without the use of branches. To achieve this, we rely heavily on the predicated instruction execution provided by the AVX-512 instruction set. In this post, we walk through a simple example: converting a string to uppercase, an operation used frequently in our string processing functions.

Accelerating Fuzzy Search using AVX-512

by Henk-Jan Lebbink

May 8, 2023

We present our SQL fuzzy string comparison and containment functions, which process multiple GiB/s without any preprocessing or indexing. Yes, that is right: fuzzy matching with no planning needed!

Sneller: Querying terabytes of JSON per second

by Frank Wessels

May 3, 2023

Learn how Sneller can query terabytes of JSON per second on medium-sized clusters.

Accelerating Regular Expressions with AVX-512

by Henk-Jan Lebbink

April 24, 2023

We present a high-performance regular expression engine that uses 16 parallel lanes and needs no branching or backtracking. The engine was developed for the Intel Ice Lake processor and is written in AVX-512 assembly.

Blazing Fast Unicode-aware ILIKE in AVX-512

by Henk-Jan Lebbink

April 17, 2023

We present a method for case-insensitive comparison of UTF-8 encoded strings using 16 parallel lanes and no branching. We use this method to implement the ILIKE operator in AVX-512 assembly for the Intel Skylake-X processor.

64-bit Integers to Strings with AVX-512

by Petr Kobalicek

March 31, 2023

This article explores the branchless conversion of multiple signed 64-bit integers to strings using AVX-512 extensions. Most research and implementations focus on improving the performance of converting a single value rather than performing multiple conversions at once. At Sneller, we use AVX-512 to process 16 values in parallel, and here we describe how we do this in our query engine.

Building a SQL VM in AVX-512 Assembly

by Phil Hofer

March 22, 2023

One of Sneller’s novel features is a bytecode-based virtual machine written almost entirely in AVX-512 assembly. While Sneller is far from the first project to incorporate SIMD acceleration into a query engine, our interpreter is unusual in that it is implemented entirely in assembly.
