This solution assisted with migration of Spark workloads from Cloudera Hadoop Cluster to Cloud Dataproc. Successful activities included:

  • Migration of 1PB data from existing cloud storage to chosen provider using Storage Transfer Service.
  • Implementation of dual writes from Kafka to cloud storage
  • Fully automated infrastructure and code deployment through GitlabCI, Helm, and Terraform Migration of ~4TB
  • Implementation of multiple Redis+Sentinel Clusters on Kubernetes engine (~400k ops per second)
  • Stackdriver Custom Metrics collection using Prometheus on Kubernetes engine
  • Migration of all Analytics services from EC2 to Kubernetes engine
  • Migration of Airflow jobs
  • Snowflake to BigQuery migration
  • Presto to BigQuery migration
  • Periscope to Data Studio migration

Benefits

Successfully prove multiple data migration methods across industry standard tools

Successfully prove multiple data migration methods across industry standard tools


Project Accelerators

  • Team workflow methodology
  • Architecture design templates
  • Dev-ops recipes
  • Concept roadmap templates
  • Product demo feedback loops

Compentencies

  • Hadoop
  • Kubernetes
  • Spark
  • AWS
  • Kafka
  • Terraform
  • CloudSQL
  • BigQuery
  • StackDrivers
  • Dockers
  • Data/Cloud Architect
  • Data/Cloud Engineer
  • Cloud Provider SME
  • DevOps Engineers