Self-evolving evaluation benchmarks research Internship

Company: AstraZeneca
Job Type: Internship

Job Description - Self-evolving evaluation benchmarks research Internship

Cambridge

AstraZeneca is a global, science-led biopharmaceutical business and its innovative medicines are used by millions of patients worldwide! AstraZeneca Summer Internships introduce you to the world of ground-breaking drug development, embedding you in highly dedicated teams, committed to delivering life-changing medicines to patients. Our 10–12-week program is designed for undergraduate, master's, and doctoral students. We offer exciting opportunities across Research & Development, Operations, and Enabling Units (Corporate functions).
Our internships immerse students in the pharmaceutical industry, allowing the opportunity to contribute to our diverse pipeline of medicines whether in the lab or outside of it. You will feel trusted and empowered to take on new challenges, but with all the help and guidance you need to succeed. This internship will help you develop essential skills, expand your knowledge, and build a network that will set you up for future success. You will be surrounded by curious, passionate, and open-minded professionals eager to learn and follow the science, fostering your growth in a truly collaborative and global team.

Introduction to role
Join us at the Center for Artificial Intelligence (CAI), where we design next‑generation evaluation methods for advanced agentic AI systems used across scientific workflows. In this role, you will contribute to a research project focused on developing self‑evolving benchmarking frameworks, where evaluation criteria continuously adapt based on model behaviour, evidence quality, and observed failure modes. You will explore how dynamic criteria, evidence‑grounded scoring, and adversarial testing can maintain benchmark discriminative power as AI systems improve. Working closely with experts in machine learning, scientific reasoning, and evaluation science, you will gain hands‑on experience building tools that support trustworthy and scalable assessment of AI systems used in multi‑agent scientific workflows.

Accountabilities
As an intern, your key responsibilities will include:
  • Developing a self-evolving benchmarking framework, incorporating dynamic rubric criteria.
  • Designing and implementing evidence-grounded scoring mechanisms, ensuring that model claims and reasoning steps are supported by verifiable traces, tool outputs, or retrieved evidence.
  • Investigating robustness and anti-gaming strategies, including adversarial testing to detect behaviours where models optimize the score without improving real-world quality.
  • Building lightweight benchmarking tools, following solid software engineering practices to ensure reproducibility, traceability, and modularity.
  • Analyzing model behaviour across multiple scientific task families, such as protocol drafting, reasoning chains, and multi-agent planning, to assess the generality of the evolving benchmark.
  • Collaborating with scientists to identify key failure modes, high-value assessment signals, and opportunities to integrate the benchmarking framework into scientific workflows.

Essential Skills/Experience
The ideal candidate will possess the following skills and experience:

Essential:
  • Currently pursuing a PhD in computer science, machine learning, computational sciences, AI evaluation/robustness, or a related field.
  • Strong experience with machine learning and deep learning methods, ideally including evaluation- or alignment-related work.
  • Excellent Python programming skills; familiarity with frameworks such as PyTorch, JAX, or TensorFlow.
  • Strong analytical mindset with enthusiasm for evaluation science, reliability, and AI governance.
  • Ability to work collaboratively in a team environment and communicate scientific ideas effectively.
  • Must be at least 18 years of age at time of application.
  • Must have UK right-to-work status.
  • Must return to schooling at program close (candidates graduating before or during the program are ineligible).

Desirable:
  • Experience with benchmarking, evaluation rubrics, reinforcement learning from human/AI feedback, or model auditing.
  • Familiarity with agentic AI systems, tool-using models, multi-agent workflows, or long-context reasoning analysis.
  • Knowledge of rubric-based scoring, checklists, or structured evaluation frameworks.
  • Experience with adversarial testing, generative model safety, or failure mode taxonomy development.
  • Interest in applying evaluation science to scientific, biomedical, or protocol generation tasks.
This internship is a valuable opportunity to immerse yourself in cutting‑edge research on AI evaluation and robustness, with access to the necessary computational resources and mentorship from leading experts in the field. If you are ready to transform your technical knowledge into real-world applications, we encourage you to apply and become a part of our team driving innovation at AstraZeneca. Our collaborative environment is designed to help you grow professionally and personally, surrounded by passionate individuals eager to make a difference.
AstraZeneca is where you can immerse yourself in groundbreaking work with real patient impact.
Trusted to work on important projects, you’ll have the independence to take on new challenges while receiving all the guidance you need to succeed.
Our mission is to build an inclusive and equitable environment. We want people to feel they belong at AstraZeneca, starting with the recruitment process. We welcome and consider applications from all qualified candidates, regardless of characteristics.
We offer reasonable adjustments/accommodations to help all candidates to perform at their best. If you have a need for any reasonable adjustments/accommodations, please complete the section in the application form.
Ready to make an impact? Apply now and join us on this exciting journey!

#Earlytalent

Date Posted

30-Jan-2026

Closing Date

13-Feb-2026
Original job posted on GrabJobs.