Responsibilities
• Design and build production-grade Generative AI applications such as AI assistants, enterprise chatbots, document intelligence systems, knowledge copilots, and AI-powered automation platforms.
• Develop and deploy LLM-powered applications using orchestration frameworks such as LangChain, LangGraph, LlamaIndex, Strands, AutoGen, CrewAI, Haystack, DSPy, and Semantic Kernel.
• Build advanced Retrieval-Augmented Generation (RAG) systems including Graph RAG, Hybrid RAG, multi-hop retrieval, Agentic RAG, and knowledge graph–based retrieval pipelines.
• Develop AI agents and multi-agent systems capable of reasoning, tool usage, and task orchestration using frameworks such as LangGraph, AutoGen, CrewAI, and Strands.
• Build Python backend services and scalable APIs using FastAPI and modern backend frameworks while following microservices architecture principles.
• Design scalable backend architectures for AI applications with asynchronous processing, task queues, and distributed workloads.
• Integrate foundation models and LLM providers such as OpenAI, Anthropic Claude, Google Gemini, LLaMA and other open-source LLMs, and Hugging Face models.
• Implement document ingestion pipelines including chunking strategies, embedding generation, metadata enrichment, indexing, and semantic retrieval.
• Build semantic search and vector retrieval systems using vector databases such as Pinecone, Weaviate, Milvus, FAISS, ChromaDB, Qdrant, and OpenSearch vector search.
• Implement embedding pipelines using embedding models from OpenAI, Hugging Face, Sentence Transformers, or similar providers.
• Develop AI pipelines for document processing, summarization, knowledge extraction, and conversational interfaces.
• Deploy AI applications on AWS cloud services including Amazon Bedrock, SageMaker, EC2, Lambda, ECS, EKS, S3, DynamoDB, RDS, OpenSearch, API Gateway, and CloudWatch.
• Build containerized applications using Docker and deploy them using Kubernetes, ECS, or EKS.
• Implement scalable AI inference infrastructure using modern model serving technologies such as vLLM, Hugging Face TGI (Text Generation Inference), Triton Inference Server, or Ray Serve.
• Build robust CI/CD pipelines and automate deployments for AI applications.
• Implement observability, monitoring, and evaluation for AI systems using tools such as LangSmith, LangFuse, TruLens, Arize, Ragas, and DeepEval.
• Optimize AI systems for latency, throughput, cost efficiency, and reliability in production environments.
• Integrate AI applications with enterprise systems, APIs, data platforms, and external services.
• Mentor engineering teams and establish best practices for building scalable AI applications and backend systems.
Required Experience
• At least 2 years of hands-on experience building Generative AI or LLM-based applications in production.
• At least 5 years of experience designing and developing Python applications and backend systems.
• Strong experience developing REST APIs and microservices using FastAPI.
• Hands-on experience integrating backend applications with AWS cloud services.
• Experience building RAG pipelines, AI agents, and LLM orchestration workflows.
Required Skills
• Strong programming expertise in Python.
• Experience with backend frameworks and libraries such as FastAPI and Pydantic, and with asynchronous programming.
• Experience with Generative AI frameworks such as LangChain, LangGraph, LlamaIndex, Strands, AutoGen, CrewAI, Haystack, DSPy, or Semantic Kernel.
• Experience implementing advanced RAG architectures including Graph RAG and hybrid retrieval pipelines.
• Experience working with vector databases and semantic search systems.
• Familiarity with machine learning and AI libraries such as PyTorch, TensorFlow, Hugging Face Transformers, Sentence Transformers, NumPy, Pandas, and scikit-learn.
• Experience deploying applications on AWS cloud infrastructure.
• Experience building containerized services using Docker and deploying them with Kubernetes or other container orchestration platforms.
• Strong understanding of scalable backend architecture, distributed systems, and cloud-native application development.
Nice to Have
• Experience with model serving frameworks such as vLLM, Triton Inference Server, Ray Serve, or Hugging Face TGI.
• Experience building Agentic AI workflows and autonomous AI systems.
• Familiarity with AI evaluation frameworks, guardrails, and LLM safety mechanisms.
• Experience building enterprise AI platforms or internal AI developer tooling.