Senior Data Engineer
Experience: 4–8 years | 2+ years building data-intensive pipelines
Location: Office – Coimbatore/Bengaluru
About Aivar Innovations
Aivar is an AI-first technology partner where cutting-edge technology meets industry expertise to supercharge your projects. Our AI-augmented teams accelerate development, reduce time-to-market, and deliver exceptional code quality. We bring together the best minds in tech to craft scalable, repeatable solutions that drive real momentum for your business.
Technical Focus
Build the data foundation powering our accelerators’ autonomous agents. Design large-scale ingestion, processing, and feature engineering systems that transform unstructured enterprise data (invoices, documents, transactions, RFQs) into structured, high-quality datasets. Enable agentic AI to make accurate, compliance-aware decisions with full data lineage and auditability.
Functional Expectations
- Design end-to-end data pipelines processing large volumes of unstructured enterprise data (documents, PDFs, transaction records, email)
- Build data ingestion frameworks supporting multiple sources and formats with automated validation and quality checks
- Implement large-scale processing using distributed computing frameworks (Spark, Flink, AWS Glue) handling terabytes efficiently
- Develop advanced feature engineering pipelines: document classification, entity extraction, and semantic tagging from unstructured data
- Design data warehousing architecture supporting both near-real-time operational and analytical queries for agentic AI reasoning
- Build data quality frameworks ensuring the accuracy critical for agent decision-making and regulatory compliance
- Implement data governance: lineage tracking, metadata management, and audit trails for regulated environments
- Lead data security for sensitive information (PII, financial data, healthcare records) with encryption and access controls
Must-Have Technical Skills
- Unstructured data expertise: production ingestion and processing of documents, PDFs, images, text, and logs at scale
- Distributed data processing: Apache Spark, Flink, or AWS Glue at production scale
- Feature engineering: advanced techniques for ML systems, automated feature extraction and transformation
- Expert Python: data processing, ETL pipeline development, data science workflows; not notebook-level
- NLP/text processing: document understanding, entity extraction, semantic processing (spaCy, transformers)
- Data architecture: data warehouses, data lakes, or lakehouse architectures supporting batch and real-time processing
- ETL/ELT pipeline design: production-grade with error handling, retry logic, and monitoring
- AWS data services: S3, Athena, Glue, RDS, DynamoDB, MSK
- Data quality & governance: metadata management, lineage tracking, compliance frameworks (GDPR, HIPAA, SOC2)
Core Tech Stack
Python, Apache Spark/Flink, AWS (S3, Glue, Athena, RDS, DynamoDB, MSK), Kafka/Redis Streams, spaCy/transformers/LangChain, Pinecone/Weaviate/pgvector, dbt, Great Expectations, Terraform/CDK, Prometheus/Grafana/OpenTelemetry
Benefits
Why You’ll Love Working at Aivar
- Learn from Experts: Work directly with former AWS leaders and AI pioneers.
- Direct Ownership: Lead high-impact "greenfield" projects from concept to global launch.
- Modern Tech: Master the latest Generative AI frameworks and cloud-native architectures.
- Real-World Impact: Build mission-critical systems used by major global enterprises.
- Rapid Growth: Scale your career quickly in a high-speed
Diversity and Inclusion
Aivar Innovations is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to gender, gender identity, sexual orientation, religion, disability, age, marital status, caste, or any other protected characteristic, and we are committed to building a diverse, inclusive, and respectful workplace for everyone.