Job Description - Sr Data Engineer, Data Governance
Opportunity:

We are seeking a highly skilled and experienced Senior Data Engineer to join our growing data team. In this role, you will be instrumental in designing, building, and maintaining robust and scalable data pipelines and infrastructure that power critical data-driven initiatives across our organization. You will work with vast datasets and cutting-edge technologies, and collaborate closely with AI researchers, other data engineers, data scientists, machine learning engineers, and product teams to deliver insights that shape the future of our products and user experiences.

What You'll Do:

Build and Optimize Data Infrastructure:

 * Develop, construct, test, and maintain large-scale data ingest architecture consisting of diverse cloud-based services (messaging, storage, Kubernetes, persistent data stores, serverless functions, etc.).
 * Create tooling such as SDKs and APIs to enable user self-service.
 * Contribute to the design and evolution of our core data platform, ensuring its scalability, reliability, and cost-effectiveness.
 * Implement robust monitoring, alerting, and logging solutions for data pipelines and infrastructure to proactively identify and resolve issues.

Design and Implement Scalable Data Pipelines:

 * Design and implement highly reliable and efficient ETL/ELT processes to ingest, transform, and load data from diverse sources (e.g., real-time events, third-party APIs, rich media datasets) into our data lake and data warehouses.
 * Use distributed data processing frameworks such as Spark to handle large-scale data volumes with high throughput and low latency.

Ensure Data Quality and Governance:

 * Describe and annotate datasets using industry-standard schemas and internal specifications.
 * Cultivate data catalogs and metadata management solutions to improve data discoverability and understanding across the organization.
 * Implement data validation, cleansing, and reconciliation processes to ensure the accuracy and integrity of our data assets.

Collaborate and Mentor:

 * Work closely with stakeholders (research, engineering, product, and peers more broadly) to translate their data needs into robust data solutions.
 * Provide technical leadership and mentorship to junior data engineers, fostering a culture of technical excellence and continuous learning.

Drive Innovation:

 * Contribute to the evolution of our data architecture and engineering best practices.

What You'll Bring:

 * Extensive Experience: 5+ years of experience in data engineering, with a strong focus on building and maintaining large-scale data pipelines and infrastructure.
 * Programming Proficiency: Expert-level proficiency in at least one major programming language such as Python, Scala, or Java.
 * Distributed Data Processing: Deep experience with distributed data processing frameworks (e.g., Apache Spark, Apache Beam). Strong foundation in event-based approaches and systems, including messaging/topics, pub/sub, queues, etc.
 * Data Warehousing/Lakes: Hands-on experience with data warehousing solutions (e.g., Databricks, Snowflake, Redshift, BigQuery) and data lake technologies (e.g., S3, HDFS). Deep experience with managing large-scale, heterogeneous datasets on Databricks is highly preferred.
 * SQL Mastery: Advanced SQL skills for data manipulation, analysis, and optimization.
 * Cloud Platforms: Strong experience with one or more major cloud providers (AWS, GCP, Azure) and their data-related services.
 * Orchestration and DevOps: Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes). Proficiency in CI/CD-based deployment.
 * Database Knowledge: Solid understanding of relational and NoSQL databases.
 * Data Modeling: Expertise in data modeling, schema design, and data architecture principles.
 * Problem-Solving: Excellent analytical and problem-solving skills, with a track record of tackling complex data challenges.
 * Communication: Strong communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams.
 * Master’s degree in Computer Science, Data Science, or a related quantitative field.

Bonus Points:

 * Strong theoretical understanding of distributed computing concepts such as concurrency, parallelism, queueing, consistency, coordination protocols, etc.
 * Experience with machine learning pipelines and MLOps principles.
 * Contributions to open-source data projects.