v.21.9Improvement

Apply LIMIT on Shards with Distributed Push Down Limit for Queries

Apply LIMIT on the shards for queries like SELECT * FROM dist ORDER BY key LIMIT 10 w/ distributed_push_down_limit=1. Avoid running Distinct/LIMIT BY steps for queries like SELECT DISTINCT shading_key FROM dist ORDER BY key. Now distributed_push_down_limit is respected by optimize_distributed_group_by_sharding_key optimization. #26466 (Azat Khuzhin).
Enhance query performance by applying LIMIT directly on shards in distributed queries when distributed_push_down_limit=1 is enabled, and optimizing the handling of Distinct and LIMIT BY in distributed queries.

Why it matters

This feature improves efficiency for distributed queries, particularly those using ORDER BY with LIMIT and queries with DISTINCT or LIMIT BY. By pushing down LIMIT to individual shards and respecting this limit in the optimize_distributed_group_by_sharding_key optimization, it reduces unnecessary processing and data transfer, leading to faster execution times and lower resource consumption.

How to use it

Enable the feature by setting the setting distributed_push_down_limit=1 in your query or session. This will allow the LIMIT clause to be pushed down and applied on each shard during distributed query execution. No additional changes to queries are necessary.