About the Company

At Redpin we simplify life's most important payments. Buying a new property overseas can be a stressful time, especially when it comes to moving your money. Through our Currencies Direct and TorFX brands we've been helping people do just that for over 25 years. With recent investment we're now on a mission to build a new range of digital products and services that will make moving money internationally for Real Estate purchases even easier.

We’re on a mission to become the solution for Real Estate payments everywhere. To do this, we are transitioning our business from a horizontal FX platform to a verticalized, embedded software company, as we look to the future and Redpin 2.0.

About the Role

As a Staff Site Reliability Engineer (SRE), you will be a technical leader and architect for our production ecosystem. You will operate at the intersection of software engineering and systems architecture, driving the reliability strategy across multiple engineering value streams. This is not just an operational role; it is a leadership position focused on building a resilient, self-healing platform that empowers our developers and protects our customers.

What You'll Do

Technical Leadership & Strategic Influence

Ownership: Lead the end-to-end reliability, availability, and performance strategy for large-scale, business-critical distributed systems.

Advisory: Act as a "Principal Consultant" to engineering and product leadership, identifying systemic risks and defining production-readiness standards.

Cross-Team Impact: Influence architectural decisions across the organization to ensure services are designed for observability and failure tolerance.

Platform & Infrastructure Excellence

Cloud Architecture: Design and evolve highly available, secure, and cost-effective infrastructure on AWS, utilizing Infrastructure as Code (IaC).

Kubernetes Strategy: Lead the evolution of our EKS environment, focusing on multi-tenant scaling, networking, and security.

Incident Mastery: Serve as a lead Incident Commander for high-severity issues, moving beyond "fixes" to drive deep systemic root cause analysis (RCA) and long-term prevention.

Reliability Engineering & "Platform as a Product"

Strategic Automation: Shift from manual scripting to building internal tooling and "golden paths" that allow development teams to self-serve safely.

Governance: Establish and enforce SLOs, SLIs, and Error Budgets, translating technical metrics into meaningful business health indicators.

Deployment Velocity: Work closely with DevOps to standardize CI/CD patterns to improve deployment frequency while maintaining a high change success rate.

Collaboration & Mentorship

Consulting: Partner with Dev teams on design reviews, resilience testing (Chaos Engineering), and planning.

Culture: Foster a blameless post-mortem culture and mentor senior/mid-level engineers to elevate the organization's technical maturity.

What You’ll Need

Experience & Background

Tenure: 10+ years of experience in Site Reliability, Platform, or Systems Engineering in high-growth, distributed environments.

Education: Bachelor’s or Master’s degree in Computer Science, or equivalent deep industry experience.

Scale: Proven track record of managing platforms at scale, preferably in a FinTech organisation.

Technical Core Competencies

Cloud & Orchestration: Knowledge of AWS and Kubernetes (EKS).

Automation: Proficiency in Python or Go (for tool building) and Bash (for systems tasks).

Networking: Deep understanding of VPCs, Load Balancing, DNS, BGP, and TCP/IP performance tuning.

Data Systems: Operational experience with a mix of SQL (PostgreSQL/SQL Server), search (Elasticsearch), and streaming (Kafka) systems.

Application & Tooling Stack

CI/CD & Config: Understanding of Ansible, Jenkins, Artifactory, Terraform/CloudFormation, or similar tools.

Observability: Mastery of the modern monitoring stack—Prometheus, Grafana, CloudWatch, and distributed tracing (e.g., Coralogix).

Polyglot Awareness: While this is not a developer role, you must be comfortable debugging applications built with Java (Spring Boot), .NET, React, and similar stacks.

Legacy Integration: Experience bridging modern containerized workloads with traditional middleware (Tomcat, JBoss, MQ).

Soft Skills & Mindset

Influence without Authority: Ability to drive change and adoption of best practices across disparate teams.

Decision Making: Comfortable making high-stakes decisions under pressure during production outages.

Communication: Able to translate complex technical failures into clear business-impact summaries for executive stakeholders.

We welcome people from all backgrounds who seek the opportunity to help build a future where we connect the dots for international property payments. If you have the curiosity, passion, and collaborative spirit, work with us, and let’s move the world of PropTech forward, together.

Redpin, Currencies Direct and TorFX are proud to be equal opportunity employers. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, colour, religion, national origin, disability, protected veteran status, age, or any other characteristic protected by law.

Staff Engineer – SRE
