AI Evaluation Engineer

Company: Weekday AI

Job Type: Full Time

Pune, Maharashtra

Job Description - AI Evaluation Engineer

This role is for one of Weekday's clients.

We are seeking an AI Evaluation Engineer to evaluate, validate, and ensure the quality of AI/ML systems working with complex, real-world data. This role focuses on assessing component mapping, retrieval-augmented generation (RAG) based Q&A systems, and feature extraction from structured and unstructured sources such as repair records, catalogs, free-text inputs, and technical documentation.

This is a hands-on engineering role centered on designing custom evaluation frameworks, datasets, and automated pipelines (including LLM-as-a-judge approaches) to measure quality, detect regressions, and support release readiness. While domain training will be provided, strong ownership in building evaluation intuition and maintaining high-quality test datasets is essential.

Key Responsibilities

AI Evaluation & Quality Assurance

Evaluate ML and LLM outputs using defined metrics, benchmarks, and acceptance criteria.
Design and maintain automated evaluation pipelines to assess model accuracy, consistency, and reliability.
Develop and own high-quality evaluation datasets, golden test cases, and benchmarks.

Testing & Release Validation

Execute evaluation-driven smoke tests and regression tests prior to releases.
Track quality metrics and provide clear go/no-go signals for production deployments.
Detect regressions and unexpected model behavior across releases and data changes.

Analysis & Insights

Analyze evaluation results to identify trends, inconsistencies, and failure patterns.
Provide actionable insights to improve model performance and system behavior.

System & API Validation

Validate AI services at the API level for correctness, robustness, and stability.
Monitor system performance, latency, and error rates under production-like workloads.

Cross-Functional Collaboration

Work closely with ML, backend, and product teams to define expected AI behavior.
Ensure evaluation coverage aligns with real-world use cases and business requirements.

Skills & Experience

Core Skills

Strong proficiency in Python for evaluation scripting and automation.
Solid understanding of Machine Learning and AI systems, including LLM-based workflows.
Experience with data analysis to interpret evaluation metrics and model outputs.

Nice to Have

Experience with LLM evaluation frameworks or LLM-as-a-judge techniques.
Familiarity with RAG pipelines, NLP systems, or large-scale data processing.
Experience building CI/CD-style evaluation or testing pipelines for AI systems.

Skills

Python · Machine Learning · Artificial Intelligence · Data Analytics

Original job AI Evaluation Engineer posted on GrabJobs ©.
