JOB DESCRIPTION
Key Responsibilities & Scope of Work
A. Architecture Assessment & Strategic Roadmap
● Evaluate the current data engineering framework end-to-end: medallion architecture layering, naming conventions, ingestion patterns, processing logic, security controls, and data quality mechanisms.
● Benchmark the current state against industry best practices and produce a prioritized improvement roadmap with clear effort-vs-impact trade-offs.
B. Data Estate Governance
● Build and maintain a comprehensive inventory of the data estate, cataloging all source systems (onboarded and prospective) and the subject areas each covers (ingested and not yet ingested).
● Establish this inventory as a living artifact that informs onboarding decisions, coverage analysis, and platform planning.
C. Standards Definition & Enforcement
● Design, integrate, or refactor naming conventions for schemas, tables, views, orchestration jobs, and pipelines, along with the migration approach for transitioning to new standards where needed.
● Define standardized ingestion and processing patterns spanning the full medallion architecture, including sub-layering strategy, format standardization (Parquet, Avro, Delta), secure PII ingestion, data normalization, technical data quality tracking, row- and column-level access controls, late-arriving dimension management, and data export workflows.
● Establish clear pattern selection criteria so engineers know which approach to apply for a given source type or use case.
● Define and operationalize the exception management process for handling justified deviations from established standards.
D. Hands-On Implementation
● Build production-grade boilerplate code for each standardized pattern using the existing GCP toolchain (BigQuery, Cloud SQL, Cloud Composer, Dataflow, Dataproc, Cloud Storage, Pub/Sub, and related services).
● Ensure templates are modular, well-documented, and immediately adoptable by the engineering team.
E. CI/CD & Developer Experience
● Support the integration of data engineering pipelines with the CI/CD solution, aligning with the broader CI/CD modernization initiative's timeline and tooling decisions.
● Contribute to developer experience improvements that reduce friction in pipeline development, testing, and deployment.
F. Knowledge Transfer & Enablement
● Author the "Source Onboarding Playbook", a repeatable, step-by-step guide for bringing new data sources into the platform, covering initial assessment, pattern selection, naming convention application, quality gates, access control setup, and production release.
● Mentor and upskill data engineers on the new standards, patterns, and tooling through documentation, walkthroughs, and hands-on pairing.
Resource Requirements (What We're Looking For)
Must-Have
● Substantial progressive experience in data engineering, data architecture, or analytics platform development, with a significant portion spent in hands-on, code-level roles — not purely advisory or managerial positions.
● Deep, demonstrable expertise in designing and operating large-scale analytical solutions (data warehouses, data lakes, lakehouses) serving enterprise-grade workloads.
● Strong hands-on proficiency with GCP data services: BigQuery, Cloud SQL (federated queries), Cloud Composer (Airflow), Dataflow (Apache Beam), Dataproc (Spark), Cloud Storage, and Pub/Sub.
● Proven track record of implementing medallion architecture (Bronze/Silver/Gold) or equivalent layered data platform patterns at scale.
● Experience defining and enforcing data engineering standards, naming conventions, and governance frameworks across multiple teams and workstreams.
● Experience with dbt, Apache Iceberg, Delta Lake, or similar transformation and open table format technologies.
● Practical experience with PII handling, data masking, tokenization, and implementing row- and column-level security in cloud data platforms.
● Strong background in CI/CD for data pipelines (Terraform, Cloud Build, GitHub Actions, dbt, or equivalent).
● A track record of building reusable templates, frameworks, and boilerplate code that engineering teams actually adopt and rely on.
● Solid understanding of data quality frameworks, data contracts, and pipeline observability.
Nice-to-Have
● Experience in the logistics industry or adjacent supply chain-intensive sectors, with exposure to high-volume transactional data, shipment tracking, fleet management, or warehouse and distribution analytics.
● Familiarity with data cataloging and metadata management tools (Dataplex, Purview, Alation, or equivalent).
● GCP Professional Data Engineer certification or equivalent.
# | Deliverable | Description
1 | Current State Assessment & Gap Analysis | A comprehensive evaluation of the existing data engineering framework, medallion architecture layering, and naming conventions, benchmarked against industry best practices with a prioritized improvement roadmap.
2 | Data Estate Inventory | A complete catalog of source systems (onboarded and not) and subject areas (ingested and not), serving as the single source of truth for coverage and onboarding decisions.
3 | Naming Convention Standards & Migration Plan | Integrated and standardized naming conventions for schemas, tables, views, jobs, and pipelines, with a defined migration approach for transitioning existing assets where applicable.
4 | Standardized Ingestion & Processing Patterns | Documented and codified patterns covering medallion sub-layering, format standards, secure PII ingestion, normalization, data quality tracking, access controls, late-arriving dimensions, and data export, each with clear application criteria.
5 | Exception Management Process | A formal, operationalized process for requesting, reviewing, approving, and documenting deviations from data engineering standards.
6 | GCP Boilerplate Implementation | Production-ready, modular boilerplate code for each standardized pattern, built on the existing GCP toolchain and ready for team adoption.
7 | CI/CD Integration Support | Active contribution to integrating data engineering pipelines with the CI/CD solution, aligned with the modernization initiative's timeline.
8 | Source Onboarding Playbook | A step-by-step, repeatable playbook for onboarding new data sources, from initial assessment through production deployment, including pattern selection, quality gates, and access control setup.
AVENSYS CONSULTING PTE. LTD.
Avensys is among the leaders in providing technology enabled business solutions and services. Since inception, Avensys has helped clients use IT more efficiently to improve their operations and profitability, focus on core competencies and achieve business results such as increased agility, innovati...