Software Engineer, Inference

Salary: $150,000 - $230,000 yearly

Company: Pulse
Job Type: Full Time

Job Description - Software Engineer, Inference

Overview


Pulse is tackling one of the most persistent challenges in data infrastructure: extracting accurate, structured information from complex documents at scale. Our approach to document understanding combines intelligent schema mapping with fine-tuned extraction models, and it works where legacy OCR and other parsing tools consistently fail.

We are a small, fast-growing team of engineers in San Francisco powering Fortune 100 enterprises, YC startups, public investment firms, and growth-stage companies. We are backed by tier 1 investors and growing quickly.

What makes our tech special is our multi-stage architecture:

  • Layout understanding with specialized component detection models

  • Low-latency OCR models for targeted extraction

  • Advanced reading-order algorithms for complex structures

  • Proprietary table structure recognition and parsing

  • Fine-tuned vision-language models for charts, tables, and figures

If you are passionate about the intersection of computer vision, NLP, and data infrastructure, your work at Pulse will directly impact customers and shape the future of document intelligence.

What we are looking for

  • 5 days in-office at our San Francisco office

  • Eager to learn and adapt quickly

  • Prior startup or founding experience is a plus

About the Role
Specialize in low-latency, high-throughput inference for OCR and multimodal models. Own profiling, batching, and autoscaling across single-tenant and multi-tenant environments.

Responsibilities

  • Build inference services with smart batching and caching

  • Optimize kernels, tokenization, and model graphs

  • Evaluate vLLM, TensorRT-LLM, and Triton tradeoffs

  • Implement autoscaling and admission control with clear SLOs

  • Own performance dashboards and capacity planning

Requirements

  • 3+ years in performance engineering or ML systems

  • Strong Python, plus C++ or CUDA exposure

  • Experience with GPU profiling and model serving

Nice to have

  • Experience reducing p95 latency and cost in production ML systems

Sponsorship
Sponsorship available.

Compensation and benefits
Competitive base salary plus equity, performance-based bonus, relocation assistance for Bay Area moves, daily meal stipend, medical, vision, and dental coverage.

Original job posting: Software Engineer, Inference, posted on GrabJobs.