Ripjar specialises in the development of software and data products that help governments and organisations combat serious financial crime. Our technology is used to identify criminal activity such as money laundering and terrorist financing, enabling organisations to enforce sanctions at scale to help combat rogue entities and state actors.
Data infuses everything Ripjar does. We work with a wide variety of datasets of all scales, including an ever-growing archive of billions of news articles covering most languages going back over 30 years, sanctions and watchlist data provided by governments, and vast organisation and ownership datasets.
We are a remote-first team, with a head office based in Cheltenham. This position is open to UK-wide candidates. If you are based near Cheltenham, you are more than welcome to work from our office at any time.
About the Role
We see a Data Engineer as a software engineer who specialises in distributed data systems. You’ll join the Data Engineering team, whose primary responsibility is developing and operating the Data Collection Hub, a platform that ingests data from many sources, processes and enriches it, and distributes it to multiple downstream systems.
We’re looking for someone with 2+ years of industry experience building and operating production software who enjoys working across data pipelines, distributed systems, and operational reliability.
What you’ll do
Engineer distributed ingestion services that reliably pull data from diverse sources, handle messy real-world edge cases, and deliver clean, well-structured outputs to multiple downstream products.
Build high-throughput processing components (batch and/or near-real-time) with a focus on performance, scalability, and predictable cost, using strong profiling and measurement practices.
Design and evolve data contracts (schemas, validation rules, versioning, backward compatibility) so downstream teams can build with confidence.
Own production quality: write maintainable code and strong unit/integration tests, and add the observability you need (metrics/logs/tracing) to diagnose issues quickly.
Improve platform reliability by hardening pipelines against partial failures, retries, rate limits, data drift, and infrastructure issues, then codify those learnings into better tooling and guardrails.
Contribute to CI/CD and developer experience: faster builds, better test signal, safer releases, and automated operational checks.
Participate in design reviews, code reviews, incident retrospectives, and iterative delivery—making pragmatic trade-offs and documenting them clearly.
Technology Stack
Languages: Predominantly Python and Node.js
Distributed/data platforms: HDFS, HBase, Spark, plus increasing use of Kubernetes and cloud services