Design and architect scalable ETL/ELT pipelines using Apache Airflow, dbt, and Kafka for real-time and batch data processing.
Build and maintain data lakes and data warehouses on platforms such as AWS S3, Snowflake, or Databricks, leveraging Delta Lake for ACID-compliant data management.
Develop and implement data quality frameworks using Great Expectations and monitor data systems using Prometheus and Grafana.
Optimize data infrastructure for performance and cost using Kubernetes, Terraform (Infrastructure as Code), and serverless services such as AWS Lambda and AWS Glue.
Collaborate with cross-functional teams to implement CI/CD pipelines using GitHub Actions or GitLab CI, integrating with dbt Cloud for data transformations.
Develop and manage real-time data processing solutions using Apache Spark (PySpark/Databricks).
Work with vector databases such as Pinecone to support AI and machine learning applications.
Required Skills
Strong programming skills in Python and SQL.
Hands-on experience with cloud platforms such as AWS, GCP, or Azure (e.g., EMR, Redshift, BigQuery).
Experience with modern data tools:
Kafka or Flink for streaming
dbt for data transformation
Airflow for workflow orchestration
Familiarity with containerization technologies like Docker and Kubernetes.
Understanding of observability tools and frameworks such as ELK Stack.
Experience with machine learning pipelines using MLflow.
Knowledge of data governance tools such as Collibra.
Experience working with HIPAA-compliant data environments.
Exposure to Scala or Java for Apache Spark development.
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.