Kernel Engineer Scientific Computing (SPU)

Job Type: Full Time

Job Description - Kernel Engineer Scientific Computing (SPU)

Vorticity is building the world’s first Scientific Processing Unit (SPU), a new class of silicon purpose-built to accelerate scientific computing beyond the limits of GPUs. We are designing tightly coupled software–hardware systems around applied mathematics workloads to deliver order-of-magnitude performance gains. Unlocking the SPU’s full potential requires early, deep engagement from applied mathematics–driven software engineers who can translate real-world scientific workloads into executable models, kernels, libraries, and applications that inform both architecture and tooling decisions.

As a Kernel Engineer, you will work at the intersection of applied mathematics, scientific computing, parallel programming, and low-level performance engineering. You will help shape how numerical kernels are implemented, optimized, and eventually mapped onto the SPU. Your work may include building early numerical kernels and libraries, developing prototype applications, and writing Python-based workload models and simulators, all to support and inform the evolving hardware and compiler stack.

This role requires both strong applied math fundamentals and deep low-level implementation ability. You should be comfortable moving from mathematical formulations to efficient kernels, reasoning about accuracy, stability, data movement, memory hierarchy, parallel execution, and compiler behavior along the way. The position is ideal for someone who combines strong scientific computing instincts with the low-level habits of a performance engineer.

 

Responsibilities:

  • Prototyping and implementing core kernels and low-level numerical primitives for the SPU.

  • Translating mathematical formulations into executable, performance-relevant kernel implementations.

  • Analyzing and optimizing memory-access patterns, including coalescing, locality, shared memory usage, cache behavior, register pressure, and host-device data movement.

  • Collaborating closely with hardware architects to evaluate algorithm–architecture tradeoffs around memory hierarchy, synchronization, vector/SIMT execution, instruction behavior, and parallel scheduling.

  • Working with compiler and runtime teams to ensure kernels map cleanly to the SPU programming model.

  • Designing microbenchmarks, correctness tests, numerical accuracy tests, and performance models, then iteratively refining kernels based on hardware evolution, compiler behavior, profiler output, and measured performance.

Core Skills:

  • Strong applied mathematics and scientific computing judgment, with the ability to understand numerical workloads deeply enough to implement them correctly and efficiently.

  • Strong proficiency in C++ and CUDA, HIP, SYCL, or an equivalent accelerator programming model.

  • Experience writing custom kernels, not just using existing frameworks or vendor libraries.

  • Ability to translate mathematical formulations into low-level implementations while balancing accuracy, stability, precision, data movement, and performance.

  • Deep understanding of GPU execution and memory hierarchy, including global memory, shared memory, registers, caches, coalescing, atomics, reductions, scans, warp-level execution, and occupancy.

  • Experience using profiling and performance tools to identify bottlenecks, test hypotheses, and validate improvements.

  • Ability to reason from profiler output to concrete code changes, rather than treating performance debugging as guesswork.

  • Solid concurrency fundamentals, including race conditions, atomicity, synchronization, and thread/process execution behavior.

Nice to Have Skills:

  • Familiarity with performance analysis tools or modeling techniques (profilers, roofline models).

  • Exposure to compilers, runtimes, or code generation frameworks.

  • Experience in applied scientific domains such as physics, geophysics, CFD, climate, materials, fusion, or finance.

  • Experience with low-level GPU assembly or intermediate representations.

  • Familiarity with low-level system software or drivers.

Non-Technical Qualities:

  • Excellent written and verbal communication skills.

  • Strong ability to work both independently and collaboratively within a team.

  • Comfort operating in an early-stage environment where the hardware, compiler, and software stack are evolving together.

  • Willingness to put in the hard work needed to bring the SPU to life.

  • Above all: low ego.

As passionate scientists and engineers, we are well aware of the plethora of critical problems in the world that cannot be solved because humanity simply does not have enough computing power. To address this, Vorticity is developing a radically new silicon chip architecture and system to dramatically accelerate scientific computing problems.

Vorticity’s mission is to expand human ingenuity. To do that, we are building a team of exceptional people to work together on big problems. Join us!

Original job Kernel Engineer Scientific Computing (SPU) posted on GrabJobs.

Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.