Why do we need it?

How things work in the traditional way (MySQL, Postgres and so on), the way you might expect:
You execute an UPDATE or DELETE query and it changes the data in the table immediately. It doesn't affect the overall performance of the database because of its row-storage nature, but that's not the case for ClickHouse.

In ClickHouse the data is stored per column, in sorted order, so to update a single column of one record we have to rewrite the whole part, the file where the column data is actually stored.

That is a pretty heavy operation that would definitely degrade the overall performance of ClickHouse, which is meant to handle huge amounts of real-time data.
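To make the cost difference concrete, here is a toy sketch in Python. It is not how ClickHouse is implemented: the names `Part`, `update_row_store` and `update_column_part` are made up for illustration, and a "part" is modelled simply as an immutable set of column arrays. The only point is to count how many values get written when one field of one record changes.

```python
# Toy model only: real ClickHouse parts are far more complex than this.
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """A hypothetical immutable 'part': one tuple of values per column."""
    columns: dict  # column name -> tuple of values


def update_row_store(rows, index, column, value):
    """Row store: change one field in place. Returns the number of values written."""
    rows[index][column] = value
    return 1


def update_column_part(part, index, column, value):
    """Columnar part: the part cannot be edited, so a whole new part is built.

    Returns the new part and the number of values that had to be rewritten.
    """
    new_columns = {}
    written = 0
    for name, values in part.columns.items():
        new_values = list(values)
        if name == column:
            new_values[index] = value
        new_columns[name] = tuple(new_values)
        written += len(new_values)
    return Part(new_columns), written


if __name__ == "__main__":
    rows = [{"id": i, "status": "new"} for i in range(1000)]
    print(update_row_store(rows, 10, "status", "done"))

    part = Part({"id": tuple(range(1000)), "status": tuple(["new"] * 1000)})
    _, written = update_column_part(part, 10, "status", "done")
    print(written)
```

In this model the row store writes a single value, while the columnar part rewrites every value it holds, which is why frequent updates hurt so much more in the second case.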

But the good news is that ClickHouse has a solution for this: Lightweight Mutations.

How do they work?

Best practice

  • Be sure you really understand how many resources each mutation takes. If it's an occasional, from-time-to-time case, mutations are totally fine. But if you have a lot of them, it's better to use a different approach. The original docs state it more categorically: Avoid Mutations.
are deleted in 8 minutes? Why? Because some data can still be in the dirty pages of the OS, and the operating system tries to flush this data to disk in something less than 8 minutes. For this reason we use 8 minutes.
Delete uses a mask column.
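The mask-column idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class `MaskedPart` and the attribute `row_exists` are hypothetical names, and the mask is assumed to hold one boolean per row. It shows the general technique (mark rows as deleted, filter them on read), not the actual ClickHouse code.

```python
# Toy model of a delete that uses a mask column instead of rewriting data.
class MaskedPart:
    """A hypothetical part with column arrays plus a per-row 'exists' mask."""

    def __init__(self, columns):
        self.columns = columns  # column name -> list of values
        row_count = len(next(iter(columns.values())))
        self.row_exists = [True] * row_count  # the mask column

    def row(self, i):
        """Assemble row i from the column arrays."""
        return {name: values[i] for name, values in self.columns.items()}

    def lightweight_delete(self, predicate):
        """Flip the mask for matching rows; column data stays untouched.

        Returns the number of rows marked as deleted.
        """
        deleted = 0
        for i in range(len(self.row_exists)):
            if self.row_exists[i] and predicate(self.row(i)):
                self.row_exists[i] = False
                deleted += 1
        return deleted

    def select(self):
        """Read only the rows whose mask bit is still set."""
        return [self.row(i) for i, alive in enumerate(self.row_exists) if alive]


if __name__ == "__main__":
    part = MaskedPart({"id": [1, 2, 3, 4], "status": ["a", "b", "a", "c"]})
    part.lightweight_delete(lambda r: r["status"] == "a")
    print(part.select())
```

The delete only touches the small mask, so it is cheap at write time. The trade-off in this model is that every read has to apply the mask, and the masked rows still occupy space until the data is rewritten later.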