Announcing support for GROUP BY, SUM, and other aggregation queries in R2 SQL
Cloudflare | Data Engineering
Tags: R2, Data, Edge Computing, Rust, Serverless, SQL
Cloudflare announces support for GROUP BY, SUM, and other aggregation queries in R2 SQL, its serverless analytics query engine over R2 Data Catalog.
- Aggregations run in two phases (scatter-gather): worker nodes compute pre-aggregates, then the coordinator merges them into the final result
- Pre-aggregates allow horizontal scaling: for example, the pre-aggregate for count(*) is a partial row count, while avg(value) stores the sum and the count separately so the average can be computed after merging
- Scatter-gather fails for ORDER BY or HAVING on aggregates when grouping by high-cardinality columns, because each worker's local top-N results can miss the global leaders
- Shuffling solves this with deterministic hash partitioning: every worker routes rows to a destination worker chosen by the hash of the GROUP BY key, so all rows for a given key land on the same worker
- A synchronization barrier ensures all workers finish sending data before any worker computes its final aggregates
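
The two strategies above can be illustrated with a minimal single-process Python sketch. It is not R2 SQL's actual implementation: the names `WORKER_ROWS`, `pre_aggregate`, `merge`, `destination`, and `shuffle_aggregate` are hypothetical, CRC32 merely stands in for whatever hash function the engine uses, and the synchronization barrier is simulated by finishing the routing loop before any final aggregate is computed.

```python
import zlib
from collections import defaultdict

# Hypothetical input: each inner list holds the (group_key, value) rows of one worker.
WORKER_ROWS = [
    [("a", 10.0), ("b", 4.0), ("a", 2.0)],
    [("b", 8.0), ("c", 5.0)],
    [("a", 3.0), ("c", 1.0), ("c", 9.0)],
]


def pre_aggregate(rows):
    """Phase 1 (worker): fold rows into per-group partial states [sum, count]."""
    partial = defaultdict(lambda: [0.0, 0])
    for key, value in rows:
        partial[key][0] += value
        partial[key][1] += 1
    return dict(partial)


def merge(partials):
    """Phase 2: combine partial states; avg is derived only after merging."""
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for key, (total, count) in partial.items():
            merged[key][0] += total
            merged[key][1] += count
    return {
        key: {"count": count, "sum": total, "avg": total / count}
        for key, (total, count) in merged.items()
    }


def scatter_gather(worker_rows):
    """Workers pre-aggregate locally; a coordinator merges the partial states."""
    return merge(pre_aggregate(rows) for rows in worker_rows)


def destination(key, num_workers):
    """Deterministic routing: the same key always maps to the same worker."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def shuffle_aggregate(worker_rows):
    """Route rows by GROUP BY key hash, then aggregate each key on one worker."""
    num_workers = len(worker_rows)
    inboxes = [[] for _ in range(num_workers)]
    for rows in worker_rows:
        for key, value in rows:
            inboxes[destination(key, num_workers)].append((key, value))
    # Barrier stand-in: all routing above completes before any final aggregate.
    results = {}
    for inbox in inboxes:
        # Each key lives in exactly one inbox, so these results are already global.
        results.update(merge([pre_aggregate(inbox)]))
    return results
```

Both functions return the same totals for this input. The difference is where the final value for a group becomes known: in `scatter_gather` only the coordinator sees complete groups, whereas in `shuffle_aggregate` each worker holds complete groups after the barrier and could apply ORDER BY or HAVING locally without missing a global leader.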
This summary was automatically generated by AI based on the original article and may not be fully accurate.