We are looking for an AI Platform & Intelligence Architect to design and scale next-generation intelligent systems powering conversational interfaces, contextual search, and personalized user experiences. This role goes beyond traditional model building: it focuses on architecting production-ready AI platforms that integrate large language models, retrieval systems, and adaptive learning pipelines into high-traffic environments.
You will play a foundational role in shaping scalable AI infrastructure, optimizing model performance, and ensuring reliable, low-latency deployments across multilingual and dynamic use cases.
Key Responsibilities
Intelligent Conversational Systems
Architect multi-turn conversational frameworks with contextual memory and personalization layers.
Design model orchestration strategies including prompt pipelines, dynamic routing, fallback handling, and guardrails.
Build systematic evaluation frameworks to measure response accuracy, relevance, safety, and tone consistency.
Continuously optimize inference workflows for latency, reliability, and cost efficiency.
Retrieval-Augmented Intelligence
Design and implement embedding pipelines and semantic retrieval architectures.
Develop robust RAG frameworks integrating structured databases and unstructured knowledge sources.
Optimize indexing, chunking strategies, and retrieval ranking mechanisms to enhance contextual grounding.
Evaluate and improve hallucination mitigation and answer grounding techniques.
Personalization & Adaptive Learning
Develop ML-driven personalization systems including segmentation, ranking, and content adaptation.
Implement recommendation pipelines and intelligent content generation systems.
Fine-tune and adapt multilingual NLP models for diverse linguistic use cases, including Indian regional languages.
Build feedback-driven learning loops to continuously improve system performance.
Model Optimization & Production Engineering
Deploy and manage open-source and proprietary LLMs in scalable production environments.
Apply fine-tuning, parameter-efficient training, and optimization techniques to balance quality and compute cost.
Implement orchestration pipelines using workflow management tools and event-driven architectures.
Apply MLOps best practices including model versioning, CI/CD integration, monitoring, and rollback strategies.
Design cloud-native AI infrastructure leveraging containerization and distributed systems.
Required Qualifications
5+ years of experience in AI/ML engineering, applied NLP, or large-scale model deployment.
Strong programming expertise in Python and experience with modern ML frameworks (PyTorch, Hugging Face, etc.).
Hands-on experience building and deploying LLM-powered systems in production.
Deep understanding of embeddings, semantic search, vector databases, and RAG architectures.
Experience designing scalable, low-latency AI systems in cloud environments (AWS preferred).
Familiarity with containerization (Docker) and workflow orchestration tools.
Strong foundation in system design, distributed systems, and performance optimization.
Preferred Qualifications
Experience with multilingual conversational systems or culturally contextual AI applications.
Exposure to advanced fine-tuning methods such as parameter-efficient fine-tuning (PEFT, e.g. LoRA) or reinforcement-learning-based optimization.
Experience building recommendation engines or intelligent assistant systems.
Contributions to research, open-source AI projects, or experimentation with frontier generative AI techniques.
What You’ll Gain
Opportunity to architect core AI systems powering intelligent user experiences.
Exposure to frontier technologies in generative AI and retrieval systems.
Ownership of scalable AI infrastructure from experimentation to production.
A collaborative environment focused on innovation, impact, and long-term technical growth.
Core Competencies
AI Systems Architecture · Large Language Models · Retrieval-Augmented Generation · Conversational AI · Multilingual NLP · Vector Search · Personalization Engines · MLOps · Cloud-Native ML Infrastructure · Scalable Production Systems