Hello Kim Ted,
I tried to answer your questions.
1. Was wondering if the deployment could be just made with a yaml.
...
As you referenced in https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1116,
I am not sure, but I think `spark-submit` is already installed in the `gcr.io/spark-operator/spark:v3.0.0` image, so the spark thrift server can be run with a deployment manifest.
You have tested the spark thrift server in `cluster` deploy mode using this deployment manifest, which I am curious about, because as far as I know the spark thrift server in version 3.x cannot be run in `cluster` mode; that is why I have written a wrapper class for it (see the sketch below).
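For reference, here is a minimal sketch of such a wrapper class, assuming the class name `SparkThriftServerRunner` (the name is illustrative). It simply delegates to the thrift server's own entry point, so the thrift server can be submitted like an ordinary application class via `spark-submit --class`:

```java
// Wrapper entry point so the Spark Thrift Server can be launched
// through spark-submit like any user application class.
public class SparkThriftServerRunner {
    public static void main(String[] args) {
        // Delegate to HiveThriftServer2's own main(); master, deploy mode,
        // and other settings are supplied by spark-submit at launch time.
        org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(args);
    }
}
```

The wrapper is packaged into an uber jar, and that jar is what gets submitted to kubernetes.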
2. Are there any reasons you made it with a spark-operator?
...
I have also benchmarked spark-on-k8s-operator before, but I found it is not a good fit for my open source data platform, DataRoaster, because DataRoaster needs more control over the Spark applications run by the operator; that is why I have written the DataRoaster spark operator.
You can certainly run spark applications on kubernetes with the deployment manifest approach you used, but I want DataRoaster to have more control over spark applications like the spark thrift server through custom resources.
3. Parameter change would be a burden since it has to go through mvn and docker build.
...
I think, generally, there are two things to do to run spark applications.
First, you should build a spark container image yourself, or use a prebuilt spark image like `cloudcheflabs/spark:v3.0.3`.
Second, you have to build the spark application itself, for instance packaging an uber jar with maven in the case of java or scala; see the sketch below.
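As a concrete illustration of the second step, here is a minimal spark application sketch (the class name `SimpleApp` is illustrative) of the kind you would package into an uber jar with maven and then submit with `spark-submit`:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SimpleApp {
    public static void main(String[] args) {
        // Master, deploy mode, and kubernetes settings are supplied
        // by spark-submit at launch time, not hard-coded here.
        SparkSession spark = SparkSession.builder()
                .appName("simple-app")
                .getOrCreate();

        // A trivial job to verify the packaged jar runs on the cluster.
        Dataset<Row> df = spark.range(0, 1000).toDF("id");
        System.out.println("row count: " + df.count());

        spark.stop();
    }
}
```

Because the packaged jar is then baked into the container image (or fetched at submit time), a parameter change in the application code does indeed have to go through the mvn and docker build steps you mentioned.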
Cheers,
- Kidong