Position type: Contract
Duration: 6 months +
Location: Denver, CO (will consider 100% remote)
As a Data Engineer IV (Scala/Spark/Kafka), you will play a critical role in designing, developing, and maintaining large-scale data pipelines for real-time and batch processing.
Responsibilities:
· Design, develop, and deploy scalable and fault-tolerant data pipelines using Apache Spark and Scala.
· Build streaming applications with Spark Structured Streaming in Scala.
· Utilize Apache Kafka to efficiently ingest and stream real-time data.
· Implement and maintain CI/CD pipelines for automated deployments and testing.
· Leverage AWS services (e.g., S3, EC2, EMR) to build and manage big data infrastructure.
· Collaborate with data scientists and analysts to understand data requirements and translate them into technical solutions.
· Write and maintain clean, well-documented, and efficient code.
· Monitor and troubleshoot data pipelines to ensure optimal performance and data quality.
Qualifications:
· 5+ years of experience in big data engineering, including at least 2-3 years focused on streaming data pipelines.
· Proven experience with Scala and Apache Spark for large-scale data processing.
· Expertise in Apache Kafka for real-time data ingestion and streaming.
· Strong understanding of CI/CD principles and experience with tools like Jenkins or GitLab CI/CD.
· Solid experience with the AWS cloud platform and its services.