Human Data Evals Lead

Company: Jobgether
Employment type: Full-time
Remote / Work from Home

Job Description - Human Data Evals Lead


This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Human Data Evals Lead based in Brazil.


This role sits at the core of frontier AI data operations, owning how high-quality evaluation datasets and benchmarks are designed, validated, and delivered to leading AI labs. You will be responsible for translating ambiguous evaluation needs into structured, high-signal data proposals and production-ready sample packages that demonstrate model performance with rigor and clarity. The work blends technical judgment, quality design, and commercial awareness, requiring close collaboration with subject-matter experts and research stakeholders. You will shape how “frontier-grade” quality is defined and enforced, ensuring every dataset meets the standards expected by advanced model developers. Acting as a key interface with AI lab partners, you will help convert pilots into scaled production engagements. This is a high-ownership role at the intersection of AI evaluation, data quality, and applied research operations.


Accountabilities:


Own the design, development, and delivery of high-quality AI evaluation data initiatives, from initial proposals through pilot execution and production readiness.



  • Develop data proposals and sample packages based on lab requests, benchmarks, and evaluation targets, translating them into structured, high-signal datasets.

  • Design frontier-grade evaluation samples across reasoning, coding, agents, tool use, and multimodal tasks, ensuring measurable model discrimination and headroom.

  • Define and enforce rigorous quality control frameworks, including expert verification, calibration layers, rubrics, and deterministic validation approaches.

  • Recruit, onboard, and manage subject-matter experts across technical domains, ensuring consistent output quality aligned with benchmark standards.

  • Own pilot engagements end-to-end, including scoping, staffing, SOW definition, QC execution, and final delivery to AI lab partners.

  • Act as a key point of contact for lab stakeholders, aligning expectations and surfacing technical requirements in collaboration with internal leadership.

  • Continuously refine evaluation methodologies and sample design standards to improve signal quality and benchmark reliability.


Requirements:


You are an experienced operator in AI evaluation or technical delivery, with strong expertise in building structured, high-quality data systems for model assessment.



  • 5+ years of experience in technical program management, data operations, quality engineering, or ML evaluation roles.

  • Proven experience working with AI labs or enterprise ML teams, delivering datasets, benchmarks, or evaluation frameworks.

  • Strong understanding of LLM evaluation concepts such as benchmarks, rubrics, pass rates, headroom, and model discrimination.

  • Hands-on experience designing or managing QC processes and ensuring high-quality annotated or evaluated datasets.

  • Demonstrated ability to recruit, manage, and calibrate subject-matter experts or external contributor pools.

  • Strong problem-solving skills in ambiguous environments with evolving requirements and fast iteration cycles.

  • Excellent English communication skills; Spanish is a plus.


Benefits:



  • Competitive compensation aligned with senior-level AI and data roles

  • Remote-first setup with flexibility across LATAM and US time zones

  • Opportunity to work directly with leading AI labs and frontier model development teams

  • High-ownership role with significant influence over evaluation standards and methodologies

  • Collaboration with top-tier subject-matter experts across technical domains

  • Exposure to cutting-edge AI benchmarking and evaluation practices

  • Fast-paced, research-driven environment with strong learning potential

  • Opportunity to shape how frontier model quality is measured and improved


How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!


 

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

 

 

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, assessing responses, and identifying potential inconsistencies or verification signals in application materials based on available information. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Original job Human Data Evals Lead posted on GrabJobs ©. To flag any issues with this job please use the Report Job button on GrabJobs.

Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.