Self-evolving evaluation benchmarks research Internship

Company: AstraZeneca

Job Type: Internship

UK - Cambridge

AstraZeneca is a global, science-led biopharmaceutical business and its innovative medicines are used by millions of patients worldwide! AstraZeneca Summer Internships introduce you to the world of ground-breaking drug development, embedding you in highly dedicated teams, committed to delivering life-changing medicines to patients. Our 10–12-week program is designed for undergraduate, master's, and doctoral students. We offer exciting opportunities across Research & Development, Operations, and Enabling Units (Corporate functions).

Our internships immerse students in the pharmaceutical industry, giving them the opportunity to contribute to our diverse pipeline of medicines, whether in the lab or outside of it. You will feel trusted and empowered to take on new challenges, with all the help and guidance you need to succeed. This internship will help you develop essential skills, expand your knowledge, and build a network that will set you up for future success. You will be surrounded by curious, passionate, and open-minded professionals eager to learn and follow the science, fostering your growth in a truly collaborative and global team.

Introduction to role

Join us at the Center for Artificial Intelligence (CAI), where we design next‑generation evaluation methods for advanced agentic AI systems used across scientific workflows. In this role, you will contribute to a research project focused on developing self‑evolving benchmarking frameworks, where evaluation criteria continuously adapt based on model behaviour, evidence quality, and observed failure modes. You will explore how dynamic criteria, evidence‑grounded scoring, and adversarial testing can maintain benchmark discriminative power as AI systems improve. Working closely with experts in machine learning, scientific reasoning, and evaluation science, you will gain hands‑on experience building tools that support trustworthy and scalable assessment of AI systems used in multi‑agent scientific workflows.

Accountabilities

As an intern, your key responsibilities will include:

Developing a self-evolving benchmarking framework, incorporating dynamic rubric criteria.
Designing and implementing evidence-grounded scoring mechanisms, ensuring that model claims and reasoning steps are supported by verifiable traces, tool outputs, or retrieved evidence.
Investigating robustness and anti-gaming strategies, including adversarial testing to detect behaviours where models optimize the score without improving real-world quality.
Building lightweight benchmarking tools, following solid software engineering practices to ensure reproducibility, traceability, and modularity.
Analyzing model behaviour across multiple scientific task families, such as protocol drafting, reasoning chains, and multi-agent planning, to assess the generality of the evolving benchmark.
Collaborating with scientists to identify key failure modes, high-value assessment signals, and opportunities to integrate the benchmarking framework into scientific workflows.

Essential Skills/Experience

The ideal candidate will possess the following skills and experience:

Essential:

Currently pursuing a PhD in computer science, machine learning, computational sciences, AI evaluation/robustness, or a related field.
Strong experience with machine learning and deep learning methods, ideally including evaluation- or alignment-related work.
Excellent Python programming skills; familiarity with frameworks such as PyTorch, JAX, or TensorFlow.
Strong analytical mindset with enthusiasm for evaluation science, reliability, and AI governance.
Ability to work collaboratively in a team environment and communicate scientific ideas effectively.
Must be at least 18 years of age at time of application.
Must have UK right-to-work status.
Must return to schooling at program close (candidates graduating before or during the program are ineligible).

Desirable:

Experience with benchmarking, evaluation rubrics, reinforcement learning from human/AI feedback, or model auditing.
Familiarity with agentic AI systems, tool-using models, multi-agent workflows, or long-context reasoning analysis.
Knowledge of rubric-based scoring, checklists, or structured evaluation frameworks.
Experience with adversarial testing, generative model safety, or failure mode taxonomy development.
Interest in applying evaluation science to scientific, biomedical, or protocol generation tasks.

This internship is a valuable opportunity to immerse yourself in cutting‑edge research on AI evaluation and robustness, with access to the necessary computational resources and mentorship from leading experts in the field. If you are ready to transform your technical knowledge into real-world applications, we encourage you to apply and become a part of our team driving innovation at AstraZeneca. Our collaborative environment is designed to help you grow professionally and personally, surrounded by passionate individuals eager to make a difference.

AstraZeneca is where you can immerse yourself in groundbreaking work with real patient impact.

Trusted to work on important projects, you’ll have the independence to take on new challenges while receiving all the guidance you need to succeed.

Our mission is to build an inclusive and equitable environment. We want people to feel they belong at AstraZeneca, starting with the recruitment process. We welcome and consider applications from all qualified candidates, regardless of characteristics.

We offer reasonable adjustments/accommodations to help all candidates perform at their best. If you need any reasonable adjustments/accommodations, please complete the relevant section in the application form.

Ready to make an impact? Apply now and join us on this exciting journey!

Date Posted

30-Jan-2026

Closing Date

13-Feb-2026

Original job Self-evolving evaluation benchmarks research Internship posted on GrabJobs ©.
