Tech blog

Scheduling tasks on Google Cloud Platform

At travel audience we use Google Cloud Platform in our data engineering stack. Every day, we process terabytes of data using Apache Beam streaming workflows (on Cloud Dataflow). Our analytic workload is powered by batch jobs using Beam/Dataflow or Spark/Dataproc. All this works well, but some workloads are not continuous streaming applications and instead need to be executed on a regular basis. This demands reliable task scheduling, but at the moment there is no task scheduler service on Google Cloud Platform that fits our workloads. In this article, our Data Engineer Ziyad Muhammed Mohiyudheen describes a use case that demands scheduling, the options we considered, and how (and, more importantly, why) we decided on one of them.

About travel audience

We provide integrated data-driven solutions for travel advertising and connect destinations and travel brands with potential travelers at scale.