Accelerating Lakehouse Reads with Quanton Engine

Overview

Data engineers spend considerable effort tuning Spark applications—adjusting partition counts, optimizing joins, and refining execution plans. Yet many workloads continue to spend the majority of their runtime simply reading Apache Hudi, Apache Iceberg, or Parquet tables.

As organizations scale their lakehouse environments, the efficiency of the scan path becomes increasingly important. Before a query can execute business logic, it must retrieve data from object storage, decode columnar files, and prepare data for distributed execution. These foundational operations often become the primary source of latency.

Quanton Engine was designed to accelerate this first mile of data processing by optimizing the components that dominate scan-stage performance.

The problem: why table reads remain slow

Even after extensive Spark tuning, data engineers frequently observe long query execution times when reading Hudi, Iceberg, or Parquet datasets. The root cause is often found in three key stages of query execution.

Object storage access

Cloud object stores such as Amazon S3, Google Cloud Storage, and Azure Blob Storage introduce significantly higher latency than local file systems. Engines must perform numerous metadata requests, file listings, and remote reads before processing begins — and as datasets grow to millions of files, these operations become a major contributor to overall query latency.
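To give a rough sense of scale, the sketch below estimates how many paginated list calls it takes to enumerate a table's files. It assumes a page size of 1,000 keys per call (the documented maximum for S3 ListObjectsV2) and a purely illustrative 50 ms round trip. The function names and the latency figure are hypothetical and are not taken from Quanton or from any measurement.

```python
import math

# S3 ListObjectsV2 returns at most 1,000 keys per call.
KEYS_PER_LIST_PAGE = 1000


def list_requests(num_files: int, page_size: int = KEYS_PER_LIST_PAGE) -> int:
    """Number of paginated list calls needed to enumerate num_files objects."""
    return math.ceil(num_files / page_size)


def serial_listing_seconds(num_files: int, round_trip_ms: float = 50.0) -> float:
    """Wall-clock listing time if pages are fetched one after another.

    round_trip_ms is an assumed, illustrative latency, not a measured value.
    """
    return list_requests(num_files) * round_trip_ms / 1000.0


if __name__ == "__main__":
    for n in (10_000, 1_000_000, 5_000_000):
        print(f"{n:>9} files: {list_requests(n):>5} list calls, "
              f"{serial_listing_seconds(n):.1f} s if issued serially")
```

Under these assumptions, listing alone costs time before a single data byte is read, which is why reducing or parallelizing metadata requests matters at large file counts.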

Parquet decoding overhead

Parquet is highly efficient for storage, but decoding it requires substantial CPU: data must be decompressed, decoded, and converted into in-memory structures. For many analytical workloads, scan-stage CPU is dominated by Parquet deserialization rather than the actual query execution.

Shuffle amplification

Once scanned, data must often be repartitioned across the cluster to satisfy joins and aggregations. This shuffle generates additional network traffic, disk I/O, and serialization overhead — so a large share of runtime is spent moving and transforming data rather than executing business logic.

How Quanton solves these challenges

Quanton Engine optimizes the entire scan pipeline end to end instead of treating storage, decoding, and shuffle as independent problems.

01 · STORAGE

Optimized object-store client

Quanton's storage access layer is engineered specifically for cloud object stores. By reducing metadata overhead, improving request efficiency, and optimizing read patterns, it minimizes remote-access latency, so scan stages begin processing sooner and sustain higher throughput throughout execution.

02 · DECODE

Native Rust vectorized Parquet reader

Quanton replaces traditional JVM-based decode paths with a native Rust implementation that processes data in batches. Vectorized execution cuts CPU overhead while improving cache utilization and memory efficiency — dramatically speeding up Parquet decompression and deserialization.
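The difference between value-at-a-time and batch decoding can be illustrated with a small Python analogy. This is not Quanton's Rust reader and it skips decompression and Parquet's encodings entirely. It only contrasts decoding a buffer of 32-bit integers one value per call with decoding the whole buffer in a single call into a contiguous typed array. All names here are hypothetical.

```python
import struct
import sys
import timeit
from array import array


def decode_per_value(buf: bytes) -> list:
    """Decode little-endian int32 values one at a time."""
    return [struct.unpack_from("<i", buf, off)[0] for off in range(0, len(buf), 4)]


def decode_batch(buf: bytes) -> array:
    """Decode the whole buffer in one call into a contiguous typed array."""
    out = array("i")  # C int; 4 bytes on mainstream platforms
    assert out.itemsize == 4, "this sketch assumes a 4-byte C int"
    out.frombytes(buf)
    if sys.byteorder == "big":
        out.byteswap()  # the buffer is little-endian
    return out


if __name__ == "__main__":
    values = list(range(1_000_000))
    buf = struct.pack(f"<{len(values)}i", *values)
    # Both paths must produce the same values.
    assert decode_per_value(buf) == list(decode_batch(buf))
    t_value = timeit.timeit(lambda: decode_per_value(buf), number=3)
    t_batch = timeit.timeit(lambda: decode_batch(buf), number=3)
    print(f"per-value: {t_value:.3f} s, batch: {t_batch:.3f} s")
```

The batch path does less per-value bookkeeping and writes into one contiguous buffer, which is the same general reason vectorized readers tend to use CPU caches more effectively.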

03 · SHUFFLE

Arrow-based columnar shuffle

Beyond the scan phase, Quanton keeps data in a columnar Apache Arrow representation during movement instead of repeatedly serializing and deserializing row-oriented structures. That reduces CPU consumption, network overhead, and memory pressure during repartitioning.
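As a minimal sketch of the idea, the code below repartitions a column-oriented batch by key while keeping every value grouped by column, then serializes each column as one contiguous buffer. It uses only the Python standard library and is not the Arrow format itself, which also carries schema metadata and validity bitmaps. The function names, column names, and the modulo partitioner are assumptions made for illustration.

```python
from array import array


def partition_columnar(batch: dict, key: str, num_partitions: int) -> list:
    """Split a column-oriented batch into per-partition column batches.

    batch maps column name -> array of equal length. Rows are routed by
    key % num_partitions, but values stay grouped by column throughout.
    """
    # Compute the target partition for every row once, from the key column.
    targets = [k % num_partitions for k in batch[key]]
    parts = [
        {name: array(col.typecode) for name, col in batch.items()}
        for _ in range(num_partitions)
    ]
    # Route each column independently; no row objects are ever built.
    for name, col in batch.items():
        for value, p in zip(col, targets):
            parts[p][name].append(value)
    return parts


def to_wire(part: dict) -> dict:
    """Serialize each column as one contiguous byte buffer."""
    return {name: col.tobytes() for name, col in part.items()}


if __name__ == "__main__":
    batch = {
        "customer_id": array("q", [1, 2, 3, 4]),
        "amount": array("d", [10.0, 20.0, 30.0, 40.0]),
    }
    for i, part in enumerate(partition_columnar(batch, "customer_id", 2)):
        wire = to_wire(part)
        print(i, list(part["customer_id"]), {k: len(v) for k, v in wire.items()})
```

The receiving side can rebuild each column with a single `frombytes` call per buffer, so no per-row encode or decode step is needed in either direction.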

Together, these optimizations create a highly efficient path from object storage to query execution.

Benchmarking results

To evaluate performance, we benchmarked Quanton Engine against Apache Spark using TPC-DS Query 23, a representative analytical workload that includes substantial scan and shuffle activity.

The results show that Quanton significantly accelerates the scan phase: across the two major scan stages, execution time drops by a factor of roughly 6 to 7.

Figure: Apache Spark stage execution times on TPC-DS Query 23. The two major scan stages take 3.7 and 3.5 minutes.

Figure: Quanton stage execution times on TPC-DS Query 23. The same two scan stages complete in 34 and 35 seconds, roughly 6 to 7× faster.
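The speedup can be checked directly from the stage times quoted above. Because those times are reported in rounded form, the ratios are approximate. The helper name is illustrative.

```python
def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Stage times as reported above: Spark in minutes, Quanton in seconds.
spark_stage_seconds = [3.7 * 60, 3.5 * 60]
quanton_stage_seconds = [34, 35]

for spark_s, quanton_s in zip(spark_stage_seconds, quanton_stage_seconds):
    print(f"{spark_s:.0f} s -> {quanton_s} s: {speedup(spark_s, quanton_s):.1f}x")
# prints:
# 222 s -> 34 s: 6.5x
# 210 s -> 35 s: 6.0x
```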

These improvements translate directly into faster query completion times, better infrastructure utilization, and lower overall processing costs.

Conclusion

As data lakehouse deployments continue to scale, optimizing the scan path becomes increasingly important. Traditional Spark optimizations can improve downstream execution, but they do not address the fundamental bottlenecks introduced by object storage access, Parquet decoding, and shuffle operations.

Quanton Engine tackles these challenges holistically, delivering substantial performance gains by accelerating every stage of the data read pipeline. The result is a faster, more efficient execution engine that allows organizations to extract greater value from their lakehouse architectures while reducing both latency and infrastructure costs.