v.24.1 Performance Improvement

More Cache-Friendly FINAL Implementation with Behavior Change on Sorting Output

The FINAL implementation is now more cache-friendly. Note the behaviour change: previously, queries with the FINAL modifier that read with a single stream (e.g. max_threads = 1) produced sorted output even without an explicit ORDER BY clause. This is no longer guaranteed when enable_vertical_final = true, which is the default. Queries that depend on the output order should specify ORDER BY explicitly. #54366 (Duc Canh Le).
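As an illustration of the behaviour change, here is a minimal sketch of a query that may have relied on the implicit ordering, followed by the explicit form. The table name events_example, its columns, and the choice of ReplacingMergeTree are assumptions made up for this example; they do not come from the change itself.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE events_example
(
    id      UInt64,
    value   String,
    version UInt32
)
ENGINE = ReplacingMergeTree(version)
ORDER BY id;

-- Two separate inserts create two parts, so FINAL has rows to merge at read time.
INSERT INTO events_example VALUES (3, 'c', 1), (1, 'a', 1);
INSERT INTO events_example VALUES (2, 'b', 1), (1, 'a2', 2);

-- Single-stream read without ORDER BY: with enable_vertical_final = true
-- the rows are no longer guaranteed to come back sorted by id.
SELECT id, value
FROM events_example FINAL
SETTINGS max_threads = 1;

-- Explicit ORDER BY: the output order does not depend on how FINAL is implemented.
SELECT id, value
FROM events_example FINAL
ORDER BY id
SETTINGS max_threads = 1;
```

The second query is the safer pattern: it states the required order in the query instead of depending on how the data happens to be read.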