Query Audit in Chango
All queries executed by the query engines in Chango, namely Chango Trino Gateway, Chango Spark Thrift Server, and Chango Spark SQL Runner, are logged so that the history of executed queries can be explored later.
This article describes the architecture of Query Audit in Chango and its internals.
Query Audit Architecture
There are two types of queries in Chango: Trino queries and Spark SQL queries.
- Trino queries are sent to Chango Trino Gateway, which routes them to the backend Trino clusters for execution.
- Spark SQL queries are sent to Chango Spark Thrift Server through JDBC/Thrift and executed in Spark.
- Spark SQL queries can also be sent to Chango Spark SQL Runner through REST and executed in Spark.
All queries executed through Chango Trino Gateway, Chango Spark Thrift Server, and Chango Spark SQL Runner are logged to an Iceberg table in Chango.
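The three entry points above can be summarized in a small Python sketch. The component names and transports come from this article; the `QueryLog` fields are purely hypothetical, since the actual audit table schema is not described here.

```python
from dataclasses import dataclass
from enum import Enum


class QueryEngine(Enum):
    """Entry points whose queries are audited.

    Each value is (component name, how queries reach or leave it),
    both taken from the description above.
    """
    TRINO_GATEWAY = ("Chango Trino Gateway", "routed to backend Trino clusters")
    SPARK_THRIFT_SERVER = ("Chango Spark Thrift Server", "JDBC/Thrift")
    SPARK_SQL_RUNNER = ("Chango Spark SQL Runner", "REST")

    @property
    def query_type(self) -> str:
        # Only the gateway handles Trino queries; the other two run Spark SQL.
        return "Trino" if self is QueryEngine.TRINO_GATEWAY else "Spark SQL"


@dataclass(frozen=True)
class QueryLog:
    """One audited query. These fields are assumptions for illustration only."""
    engine: QueryEngine
    user: str
    sql: str


log = QueryLog(QueryEngine.SPARK_SQL_RUNNER, "alice", "SELECT 1")
print(log.engine.value[0], "->", log.engine.query_type)
```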
Internals of Query Audit
Let’s see how query logs are saved to the Iceberg table in Chango.
- All queries executed through Chango Trino Gateway, Chango Spark Thrift Server, and Chango Spark SQL Runner are sent to Chango Authorizer. Chango Authorizer puts each incoming query log into an internal queue.
- Every minute, or whenever the queue size reaches 1000, all query logs in the internal queue are inserted into the Iceberg table through Chango Spark Thrift Server in batch mode, using INSERT INTO <table> VALUES (..), (..), (..), ...
- During the insert, the Iceberg table <table> is locked by a distributed lock controlled by Zookeeper to avoid conflicting Iceberg table commits.
- Each insert of query logs creates new data and metadata files. Iceberg table maintenance for these files, such as small data file compaction, snapshot expiration, and old metadata removal, is done automatically by Chango REST Catalog.
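The batching behaviour described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Chango's actual implementation: the class `QueryLogBuffer`, the row shape, and the table name `audit.query_logs` are hypothetical, the `execute` callback stands in for a connection to Chango Spark Thrift Server, and a local `threading.Lock` stands in for the Zookeeper-controlled distributed lock.

```python
import threading
import time
from typing import Callable, List, Tuple

# Hypothetical row shape: (query_id, engine, user, sql_text).
Row = Tuple[str, str, str, str]


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_insert(table: str, rows: List[Row]) -> str:
    """Build one multi-row INSERT INTO <table> VALUES (..), (..) statement."""
    values = ", ".join(
        "(" + ", ".join(sql_literal(col) for col in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} VALUES {values}"


class QueryLogBuffer:
    """Queue query logs and flush them in one batch by size or by age."""

    def __init__(self, table: str, execute: Callable[[str], None],
                 max_size: int = 1000, max_age_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._table = table
        self._execute = execute          # stand-in for the Spark Thrift Server call
        self._max_size = max_size
        self._max_age = max_age_seconds
        self._clock = clock
        self._rows: List[Row] = []
        self._last_flush = clock()
        self._queue_lock = threading.Lock()
        # Local stand-in for the distributed lock that serializes table commits.
        self._table_lock = threading.Lock()

    def add(self, row: Row) -> None:
        """Enqueue one query log; flush right away if the queue is full."""
        with self._queue_lock:
            self._rows.append(row)
            full = len(self._rows) >= self._max_size
        if full:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes once the maximum age has elapsed."""
        if self._clock() - self._last_flush >= self._max_age:
            self.flush()

    def flush(self) -> int:
        """Drain the queue and insert all rows with a single statement."""
        with self._queue_lock:
            rows, self._rows = self._rows, []
            self._last_flush = self._clock()
        if not rows:
            return 0
        with self._table_lock:           # only one commit to the table at a time
            self._execute(build_insert(self._table, rows))
        return len(rows)
```

In this sketch, `add` covers the size trigger and `tick` covers the time trigger, so a caller would invoke `tick` from a timer thread. Draining the queue before taking the table lock keeps new query logs from blocking while a batch is being committed.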
That’s it.