Job Title: Data Engineer
Job Description:
• Design and build scalable data pipelines using PySpark and Python
• Develop and optimize complex SQL queries for large datasets
• Implement and maintain ETL/ELT processes ensuring data quality and reliability
• Build and manage data warehouse solutions
• Work with Hadoop/Big Data ecosystems for large-scale data processing
• Collaborate with stakeholders to translate business requirements into data solutions
• Contribute to modern data/AI workflows including vector embeddings and agentic frameworks
• Work with orchestration tools such as Airflow (if applicable)
Required Skills:
• Strong hands-on experience in PySpark and Python
• Advanced proficiency in SQL
• Solid understanding of Data Engineering concepts
• Experience in ETL processes and data warehousing
• Familiarity with Hadoop and Big Data technologies
• Good communication and business understanding
Good to Have:
• Experience with Apache Airflow
• Exposure to modern data platforms / MCP servers
• Understanding of agentic frameworks (LLM workflows)
• Experience with vector embeddings and semantic search
• Exposure to cloud platforms (AWS, GCP, Azure)