Data Engineer (PySpark)

Company: Gsstech Group
Job Type: Full Time

Job Description - Data Engineer (PySpark)

We are seeking a highly skilled Data Engineer with strong expertise in PySpark and the Cloudera Data Platform (CDP). The ideal candidate will design, develop, and maintain scalable data pipelines while ensuring high data quality, performance, and availability across the organisation.

This role requires hands-on experience in big data ecosystems, cloud-native technologies, and advanced data processing frameworks. You will collaborate with cross-functional teams to build reliable and high-performance data solutions that drive business insights.

Key Responsibilities

1. Data Pipeline Development

  • Design, develop, and maintain scalable ETL/ELT pipelines using PySpark on CDP
  • Ensure data integrity, reliability, and performance optimisation

2. Data Ingestion

  • Develop ingestion frameworks to collect data from relational databases, APIs, streaming sources, and file systems
  • Load structured and unstructured data into Data Lake/Data Warehouse environments

3. Data Transformation & Processing

  • Process, cleanse, and transform large-scale datasets using PySpark
  • Build reusable data processing components

4. Performance Optimisation

  • Tune Spark jobs and Cloudera components for optimal performance
  • Optimise memory, partitioning, and execution plans
  • Reduce ETL runtime and improve cluster efficiency

5. Data Quality & Validation

  • Implement data validation checks and monitoring mechanisms
  • Ensure end-to-end data quality and adherence to governance standards

6. Automation & Orchestration

  • Automate workflows using tools such as Apache Oozie, Apache Airflow, or similar orchestration frameworks
  • Maintain CI/CD integration for data pipelines

7. Monitoring & Support

  • Monitor pipeline health and troubleshoot failures
  • Provide production support and continuous improvements

Required Skills & Qualifications

  • 5+ years of experience in Data Engineering
  • Strong hands-on experience in PySpark
  • Experience working with the Cloudera Data Platform (CDP)
  • Strong knowledge of the Hadoop ecosystem (HDFS, Hive, Impala, YARN)
  • Proficiency in SQL and data modelling concepts
  • Experience with workflow orchestration tools (Airflow, Oozie, etc.)
  • Good understanding of data warehousing concepts
  • Experience with performance tuning and optimisation

Good to Have

  • Experience with cloud platforms (AWS, Azure, GCP)
  • Knowledge of streaming tools (Kafka, Spark Streaming)
  • Exposure to DevOps practices and CI/CD pipelines
  • Banking/Financial Services domain experience

Original job Data Engineer (PySpark) posted on GrabJobs.
