Automatic Iceberg Table Maintenance in Chango

Kidong Lee
2 min read · Jun 21, 2024

Every time data is committed to an Iceberg table, many new files are created, such as data files, snapshots, and metadata files, all of which need to be maintained afterwards. To maintain Iceberg tables manually, you need, for example, to develop a Spark job that runs Iceberg maintenance queries, or to send those queries to a Spark Thrift Server through JDBC/Thrift.
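To make the manual route concrete, here is a minimal sketch of the kind of job you would otherwise write yourself. It assumes a Spark session configured with an Iceberg catalog. The catalog name `iceberg`, the table `db.events`, and the helper names `expire_snapshots_sql` and `run_job` are hypothetical, and `execute` stands in for something like `spark.sql` or a JDBC cursor's `execute`. Only `expire_snapshots` is the name of a stock Iceberg Spark procedure.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, List


def expire_snapshots_sql(catalog: str, table: str, older_than: datetime,
                         retain_last: int = 1) -> str:
    """Build a call to Iceberg's expire_snapshots Spark procedure."""
    # The literal is interpreted in the Spark session time zone.
    ts = older_than.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', "
        f"older_than => TIMESTAMP '{ts}', "
        f"retain_last => {retain_last})"
    )


def run_job(execute: Callable[[str], object], statements: List[str]) -> List[str]:
    """Send each statement through `execute` and return the ones that ran."""
    done = []
    for sql in statements:
        execute(sql)  # raises on failure, so later statements are skipped
        done.append(sql)
    return done


if __name__ == "__main__":
    cutoff = datetime.now(timezone.utc) - timedelta(days=7)
    # Replace `print` with spark.sql (or a JDBC cursor's execute) in a real job.
    run_job(print, [expire_snapshots_sql("iceberg", "db.events", cutoff)])
```

A job like this also has to be scheduled, monitored, and repeated for every table, which is the work the automatic maintenance is meant to take off your hands.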

Chango provides an Iceberg REST catalog called Chango REST Catalog, which automatically maintains the Iceberg tables in Chango for you by performing the following tasks.

  • Compacts small files.
  • Expires snapshots.
  • Removes old metadata files.
  • Removes orphan files.
  • Rewrites manifest files.
  • Rewrites position delete files.
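For reference, each of the tasks above has a counterpart in stock Iceberg Spark SQL. The sketch below maps them one to one. It is not a description of how Chango REST Catalog implements them internally. The function name, the catalog name `iceberg`, and the table `db.events` are hypothetical, while the procedure names and table properties are the ones documented by Apache Iceberg.

```python
from typing import Dict


def maintenance_statements(catalog: str, table: str, older_than: str,
                           keep_metadata_versions: int = 10) -> Dict[str, str]:
    """Map each maintenance task to a stock Iceberg Spark SQL statement."""
    proc = f"{catalog}.system"
    return {
        "compact small files":
            f"CALL {proc}.rewrite_data_files(table => '{table}')",
        "expire snapshots":
            f"CALL {proc}.expire_snapshots(table => '{table}', "
            f"older_than => TIMESTAMP '{older_than}')",
        # Old metadata files are trimmed on commit via table properties,
        # not through a procedure.
        "remove old metadata files":
            f"ALTER TABLE {catalog}.{table} SET TBLPROPERTIES ("
            "'write.metadata.delete-after-commit.enabled'='true', "
            f"'write.metadata.previous-versions-max'='{keep_metadata_versions}')",
        "remove orphan files":
            f"CALL {proc}.remove_orphan_files(table => '{table}', "
            f"older_than => TIMESTAMP '{older_than}')",
        "rewrite manifest files":
            f"CALL {proc}.rewrite_manifests(table => '{table}')",
        "rewrite position delete files":
            f"CALL {proc}.rewrite_position_delete_files(table => '{table}')",
    }


if __name__ == "__main__":
    stmts = maintenance_statements("iceberg", "db.events", "2024-06-14 00:00:00")
    for task, sql in stmts.items():
        print(f"-- {task}\n{sql};\n")
```

The cutoff passed as `older_than` matters most for orphan file removal: files written by commits that are still in progress can look like orphans, so a cutoff of at least a few days is the usual safe choice.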

Let’s look at Iceberg table maintenance in Chango in more detail.

Move to Iceberg Maintenance Page

To open the Iceberg maintenance page, click the Go to Iceberg Maintenance button in the action section of the Chango REST Catalog page. After you select a schema and a table, the page shows the current table size and record count.
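If you want to cross-check those two numbers yourself, they can be approximated from Iceberg's `files` metadata table. This is a sketch under assumptions: the catalog name `iceberg`, the table `db.events`, and both helper names are hypothetical, and Chango may compute its figures differently. The columns `content`, `record_count`, and `file_size_in_bytes` are part of the standard `files` metadata table.

```python
def table_stats_sql(catalog: str, table: str) -> str:
    """Build a query that sums size and row counts over live data files."""
    return (
        "SELECT count(*) AS data_files, "
        "sum(record_count) AS records, "
        "sum(file_size_in_bytes) AS total_bytes "
        f"FROM {catalog}.{table}.files "
        "WHERE content = 0"  # 0 = data files; delete files are excluded
    )


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units for display."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


if __name__ == "__main__":
    print(table_stats_sql("iceberg", "db.events"))
    print(human_size(5 * 1024 ** 3))
```

Because the query counts rows in data files only, the record count it returns does not subtract rows masked by delete files.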

Show Maintenance History

To see the history of the maintenance that Chango REST Catalog has run automatically on the table, click the Maintenance History tab.

Show Internal Iceberg Table Status and Statistics

You can also see the table's internal status and statistics by clicking the other tabs. For example, click the History tab to show the table history.
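The same kind of information is exposed by Iceberg itself as metadata tables, which can be queried from any Spark SQL session attached to the catalog. The sketch below builds such queries. The helper name, the catalog name `iceberg`, and the table `db.events` are hypothetical, and the list covers a few standard Iceberg metadata tables rather than the exact set of tabs Chango shows.

```python
# A few of the metadata tables that Iceberg exposes for every table.
METADATA_TABLES = (
    "history",
    "snapshots",
    "files",
    "manifests",
    "partitions",
    "metadata_log_entries",
    "refs",
)


def metadata_query(catalog: str, table: str, name: str, limit: int = 20) -> str:
    """Build a SELECT against one of the table's Iceberg metadata tables."""
    if name not in METADATA_TABLES:
        raise ValueError(f"unknown metadata table: {name}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT * FROM {catalog}.{table}.{name} LIMIT {limit}"


if __name__ == "__main__":
    # Roughly what a history view would be based on.
    print(metadata_query("iceberg", "db.events", "history"))
```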

Simple Query Runner

There is also a simple query runner in which you can run any Spark SQL query. Click the Run Query tab to use it.

That’s all.

Written by Kidong Lee

Founder of Cloud Chef Labs | Chango | Unified Data Lakehouse Platform | Iceberg centric Data Lakehouses https://www.cloudchef-labs.com/
