Query Audit in Chango

Kidong Lee
2 min read · Aug 8, 2024

All queries executed by the query engines in Chango (Chango Trino Gateway, Chango Spark Thrift Server, and Chango Spark SQL Runner) are logged so that the history of executed queries can be explored later.

This article describes the architecture of Query Audit in Chango and how it works internally.

Query Audit Architecture

There are two types of queries in Chango: Trino queries and Spark SQL queries.

  • Trino queries are sent to Chango Trino Gateway, which routes them to the backend Trino clusters for execution.
  • Spark SQL queries are sent to Chango Spark Thrift Server through JDBC/Thrift and executed in Spark.
  • Spark SQL queries can also be sent to Chango Spark SQL Runner through REST and executed in Spark.

All queries executed through Chango Trino Gateway, Chango Spark Thrift Server, and Chango Spark SQL Runner are logged to an Iceberg table in Chango.

Internals of Query Audit

Let’s look at how query logs are saved to the Iceberg table in Chango.

  • Query logs for all queries executed through Chango Trino Gateway, Chango Spark Thrift Server, and Chango Spark SQL Runner are sent to Chango Authorizer.
  • Chango Authorizer puts each incoming query log into an internal queue.
  • Every minute, or whenever the queue size reaches 1000, all query logs in the internal queue are inserted into the Iceberg table in a single batch through Chango Spark Thrift Server, using INSERT INTO <table> VALUES (..), (..), (..), ...
  • During the insert, the Iceberg table <table> is locked with a distributed lock managed by ZooKeeper to avoid conflicting Iceberg table commits.
  • Each insert of query logs creates new data and metadata files. Chango REST Catalog automatically performs the Iceberg table maintenance for them, such as small data file compaction, snapshot expiration, and old metadata removal.
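
The queue-and-flush steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Chango's actual implementation: the QueryLog fields, the AuditBuffer class, and the table name used in the usage note are hypothetical, a local threading.Lock stands in for the ZooKeeper distributed lock, and the execute callback stands in for a JDBC call to Chango Spark Thrift Server. The backslash escaping assumes Spark SQL's default string literal handling.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QueryLog:
    # Hypothetical columns; the article does not list the audit table schema.
    engine: str
    user: str
    query: str


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, escaping backslashes and single quotes.

    Assumption: backslash escapes are enabled, as in Spark SQL's default settings.
    """
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_insert(table: str, logs: List[QueryLog]) -> str:
    """Build one multi-row INSERT INTO <table> VALUES (..), (..) statement."""
    rows = ", ".join(
        "(" + ", ".join(sql_literal(v) for v in (log.engine, log.user, log.query)) + ")"
        for log in logs
    )
    return f"INSERT INTO {table} VALUES {rows}"


class AuditBuffer:
    """Hypothetical in-memory queue that flushes by size or by elapsed time."""

    def __init__(
        self,
        table: str,
        execute: Callable[[str], None],  # stand-in for the Thrift Server call
        table_lock,                      # stand-in for the ZooKeeper lock
        max_size: int = 1000,
        interval_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._table = table
        self._execute = execute
        self._table_lock = table_lock
        self._max_size = max_size
        self._interval = interval_seconds
        self._clock = clock
        self._queue: List[QueryLog] = []
        self._queue_lock = threading.Lock()
        self._last_flush = clock()

    def add(self, log: QueryLog) -> None:
        """Enqueue one query log and flush if the size threshold is reached."""
        with self._queue_lock:
            self._queue.append(log)
            full = len(self._queue) >= self._max_size
        if full:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes once the interval has elapsed."""
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Drain the queue and insert the whole batch while holding the table lock."""
        with self._queue_lock:
            batch, self._queue = self._queue, []
            self._last_flush = self._clock()
        if not batch:
            return
        with self._table_lock:
            self._execute(build_insert(self._table, batch))
```

To try it, one could construct AuditBuffer("audit.query_log", print, threading.Lock()), call add() for each query log, and call tick() from a timer thread. Draining the queue before taking the table lock keeps new query logs from blocking while a batch insert is in progress.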

That’s it.

Written by Kidong Lee

Founder of Cloud Chef Labs | Chango | Unified Data Lakehouse Platform | Iceberg centric Data Lakehouses https://www.cloudchef-labs.com/
