Job Description - Scientific AI Evaluation & Computational Problem Designer
This role is for one of our clients
Compensation: $45-$100 per hour
We are building a large-scale evaluation benchmark to test advanced AI reasoning across scientific and engineering domains. This role focuses on designing rigorous, research-grade computational problems that assess how effectively AI systems can leverage real scientific software tools to solve complex challenges.
Unlike traditional annotation roles, this position requires creating original, graduate-level problems rooted in real-world scientific workflows. You will iteratively refine these problems through calibration against state-of-the-art AI models, ensuring the right balance of difficulty, depth, and reasoning complexity.
What You’ll Do
Design advanced computational problems requiring the use of domain-specific scientific software
Create tasks that test both precise execution (multi-step workflows, simulations) and strategic reasoning (experiment design, inference from partial data)
Develop problem setups, solution pathways, and validation mechanisms
Calibrate and refine tasks based on model performance to achieve target difficulty levels
Ensure problems emphasize reasoning strategy over brute-force computation
Domains & Tools of Interest We are particularly seeking candidates with hands-on experience in:
All Job Ads are subject to GrabJobs’s Terms of Service. We allow users to flag postings that may be in violation of those terms. Job Ads may also be flagged by GrabJobs moderation team. However, no moderation system is perfect, and flagging a posting does not ensure that it will be removed.
Be the first to receive the latest Others Full-Time Jobs in the US.
Setup your job alert:
By activating job alerts, I agree to GrabJobs Terms & Privacy Policy. I can unsubscribe to job alerts anytime.
Skip
GrabJobs is the no1 job portal in the US, connecting you to thousands of jobs fast!
Find the best jobs in the US, apply in 1 click and get a job today!