Overview
Data engineers spend considerable effort tuning Spark applications: adjusting partition counts, optimizing joins, and refining execution plans. Yet many workloads still spend most of their runtime simply reading Apache Hudi, Apache Iceberg, or Parquet tables.
As organizations scale their lakehouse environments, the efficiency of the scan path becomes increasingly important. Before a query can execute business logic, it must retrieve data from object storage, decode columnar files, and prepare data for distributed execution. These foundational operations often become the primary source of latency.
Quanton Engine was designed to accelerate this first mile of data processing by optimizing the components that dominate scan-stage performance.
The problem: why table reads remain slow
Even after extensive Spark tuning, data engineers frequently observe long query execution times when reading Hudi, Iceberg, or Parquet datasets. The root cause is often found in three key stages of query execution.
Object storage access
Cloud object stores such as Amazon S3, Google Cloud Storage, and Azure Blob Storage have significantly higher per-request latency than local file systems. An engine must issue many metadata requests, file listings, and remote reads before processing begins. As datasets grow to millions of files, these operations become a major contributor to overall query latency.
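To get a feel for why request counts matter, here is a back-of-the-envelope sketch in Python. Every number in it is an assumption chosen for illustration (the per-request latencies, one metadata request per file, and the parallelism); only the 1,000-keys-per-page default mirrors the page limit of S3's ListObjectsV2 call. The function name is hypothetical and not part of any engine.

```python
import math


def estimated_listing_seconds(num_files, files_per_list_page=1000,
                              list_latency_s=0.05, head_latency_s=0.02,
                              parallelism=1):
    """Rough wall-clock estimate for listing and probing files before any data is read.

    All latencies are illustrative assumptions, not measurements.
    """
    # Paginated listing: one request per page of keys.
    list_calls = math.ceil(num_files / files_per_list_page)
    # Assumption: one metadata (HEAD-style) request per file.
    head_calls = num_files
    total_s = list_calls * list_latency_s + head_calls * head_latency_s
    # Assumption: requests spread evenly over `parallelism` concurrent workers.
    return total_s / parallelism


if __name__ == "__main__":
    for files in (10_000, 1_000_000):
        secs = estimated_listing_seconds(files, parallelism=64)
        print(f"{files:>9} files: ~{secs:.1f} s of request latency")
```

Under these assumptions the cost grows linearly with file count, which is why reducing the number of requests per file tends to matter more than shaving latency off any single request.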
Parquet decoding overhead
Parquet is highly efficient for storage, but reading it requires substantial CPU: data must be decompressed, decoded, and converted into in-memory structures. For many analytical workloads, scan-stage CPU time is dominated by Parquet deserialization rather than by the query logic itself.
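The three steps named above (decompress, decode, materialize) can be sketched with a toy column format. This is not the real Parquet layout: it uses zlib and a simple dictionary encoding from the Python standard library purely to show where the CPU work goes, and the function names are hypothetical.

```python
import struct
import zlib


def encode_column(values):
    """Toy writer: dictionary-encode a column, then compress the index page."""
    dictionary = sorted(set(values))
    index_of = {v: i for i, v in enumerate(dictionary)}
    indices = struct.pack(f"<{len(values)}I", *(index_of[v] for v in values))
    return dictionary, zlib.compress(indices), len(values)


def decode_column(dictionary, page, count):
    """Toy reader: each step below costs CPU on every scan."""
    raw = zlib.decompress(page)                  # 1. decompress the page
    indices = struct.unpack(f"<{count}I", raw)   # 2. decode the dictionary indices
    return [dictionary[i] for i in indices]      # 3. materialize in-memory values


if __name__ == "__main__":
    column = ["us", "eu", "us", "apac", "eu", "us"]
    assert decode_column(*encode_column(column)) == column
```

The compact on-disk form is what makes Parquet cheap to store, and the same compactness is what the reader has to undo on every scan.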
Shuffle amplification
Once scanned, data must often be repartitioned across the cluster to satisfy joins and aggregations. This shuffle generates additional network traffic, disk I/O, and serialization overhead, so a large share of runtime is spent moving and transforming data rather than executing business logic.
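As a minimal illustration of what repartitioning means, the sketch below hash-partitions rows by key so that all rows sharing a key land in the same partition, which is what a join or aggregation needs. It is a single-process stand-in, not Spark's or Quanton's shuffle implementation; the function name and the CRC32 hash are choices made for this example.

```python
import zlib
from collections import defaultdict


def shuffle_partition(rows, key, num_partitions):
    """Assign each row to a partition by hashing its key column."""
    partitions = defaultdict(list)
    for row in rows:
        # A stable hash ensures equal keys always map to the same partition.
        pid = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[pid].append(row)
    return dict(partitions)


if __name__ == "__main__":
    rows = [
        {"customer": "a", "amount": 10},
        {"customer": "b", "amount": 5},
        {"customer": "a", "amount": 7},
    ]
    for pid, part in sorted(shuffle_partition(rows, "customer", 4).items()):
        print(pid, part)
```

In a real cluster every row that lands in a partition owned by another node must be serialized, sent over the network, and deserialized, which is where the amplification comes from.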
How Quanton solves these challenges
Quanton Engine optimizes the entire scan pipeline end to end instead of treating storage, decoding, and shuffle as independent problems.
Optimized object-store client
Quanton includes a storage access layer engineered specifically for cloud object stores. By reducing metadata overhead, improving request efficiency, and optimizing read patterns, it minimizes remote-access latency, so scan stages begin processing sooner and sustain higher throughput throughout execution.
Native Rust vectorized Parquet reader
Quanton replaces traditional JVM-based decode paths with a native Rust implementation that processes data in batches. Vectorized execution cuts per-value CPU overhead while improving cache utilization and memory efficiency, which speeds up Parquet decompression and deserialization.
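The difference between value-at-a-time and batched decoding can be shown in a few lines. The sketch below is Python, not Rust, and says nothing about Quanton's internals; it only illustrates the general shape of the technique, assuming a buffer of little-endian 64-bit integers and a hypothetical batch size of 1,024 values.

```python
import struct


def decode_row_at_a_time(buf):
    """Decode one 64-bit integer per call: per-value dispatch overhead."""
    out = []
    for offset in range(0, len(buf), 8):
        (value,) = struct.unpack_from("<q", buf, offset)
        out.append(value)
    return out


def decode_batched(buf, batch_size=1024):
    """Decode a whole batch per call: overhead is amortized across the batch."""
    out = []
    step = batch_size * 8
    for start in range(0, len(buf), step):
        chunk = buf[start:start + step]
        out.extend(struct.unpack(f"<{len(chunk) // 8}q", chunk))
    return out


if __name__ == "__main__":
    data = struct.pack("<5000q", *range(5000))
    assert decode_row_at_a_time(data) == decode_batched(data)
```

Both functions return the same values; the batched version simply crosses the decode boundary far fewer times, which is the property a vectorized reader exploits.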
Arrow-based columnar shuffle
Beyond the scan phase, Quanton keeps data in a columnar Apache Arrow representation during movement instead of repeatedly serializing and deserializing row-oriented structures. That reduces CPU consumption, network overhead, and memory pressure during repartitioning.
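To illustrate why a columnar representation is cheaper to move, the sketch below compares a row-oriented encoding (one JSON record per row) with a columnar one (one contiguous buffer per column). Neither is the Apache Arrow wire format: JSON and `struct` are stand-ins from the Python standard library, and the column names are made up for the example.

```python
import json
import struct


def serialize_rows(ids, amounts):
    """Row-oriented: every record is encoded separately, field names included."""
    return [json.dumps({"id": i, "amount": a}).encode("utf-8")
            for i, a in zip(ids, amounts)]


def serialize_columns(ids, amounts):
    """Columnar: one contiguous fixed-width buffer per column."""
    n = len(ids)
    return {
        "n": n,
        "id": struct.pack(f"<{n}q", *ids),
        "amount": struct.pack(f"<{n}d", *amounts),
    }


def deserialize_columns(payload):
    """Recover both columns with one bulk unpack each."""
    n = payload["n"]
    ids = list(struct.unpack(f"<{n}q", payload["id"]))
    amounts = list(struct.unpack(f"<{n}d", payload["amount"]))
    return ids, amounts


if __name__ == "__main__":
    ids = list(range(1000))
    amounts = [i * 0.5 for i in ids]
    row_bytes = sum(len(r) for r in serialize_rows(ids, amounts))
    cols = serialize_columns(ids, amounts)
    col_bytes = len(cols["id"]) + len(cols["amount"])
    print(f"row-oriented: {row_bytes} bytes, columnar: {col_bytes} bytes")
```

The columnar payload needs one bulk operation per column on each side of the transfer, whereas the row-oriented one repeats parsing work for every record.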
Together, these optimizations create a highly efficient path from object storage to query execution.
Benchmarking results
To evaluate performance, we benchmarked Quanton Engine against Apache Spark using TPC-DS Query 23, a representative analytical workload that includes substantial scan and shuffle activity.
The results show that Quanton significantly accelerates the scan phase of execution, cutting execution time across two major scan stages by a factor of approximately 7.
These improvements translate directly into faster query completion times, better infrastructure utilization, and lower overall processing costs.
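How much a scan-stage speedup shortens the whole query depends on the share of runtime those stages account for, which the summary above does not state. Amdahl's law gives the relationship: with scan fraction f and scan speedup s, the overall speedup is 1 / ((1 - f) + f / s). The sketch below applies it; the scan fractions used are hypothetical values for illustration, not benchmark data.

```python
def overall_speedup(scan_fraction, scan_speedup):
    """Amdahl's law: end-to-end speedup when only the scan share gets faster."""
    if not 0.0 <= scan_fraction <= 1.0:
        raise ValueError("scan_fraction must be between 0 and 1")
    if scan_speedup <= 0:
        raise ValueError("scan_speedup must be positive")
    return 1.0 / ((1.0 - scan_fraction) + scan_fraction / scan_speedup)


if __name__ == "__main__":
    # Hypothetical scan shares; 7.0 is the approximate scan-stage factor reported above.
    for fraction in (0.5, 0.8, 0.95):
        print(f"scan share {fraction:.0%}: {overall_speedup(fraction, 7.0):.2f}x overall")
```

For example, if the scan stages were half of the original runtime, a 7× scan speedup would make the whole query 1.75× faster; the closer the scan share is to 100%, the closer the overall gain gets to 7×.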
Conclusion
As data lakehouse deployments continue to scale, optimizing the scan path becomes increasingly important. Traditional Spark optimizations can improve downstream execution, but they do not address the fundamental bottlenecks introduced by object storage access, Parquet decoding, and shuffle operations.
Quanton Engine tackles these challenges holistically, delivering substantial performance gains by accelerating every stage of the data read pipeline. The result is a faster, more efficient execution engine that allows organizations to extract greater value from their lakehouse architectures while reducing both latency and infrastructure costs.