AI Platform / DevOps Engineer


Details:
  • Salary: £70,000 - £80,000 per annum
  • Job Type: Permanent
  • Job Status: Full-Time
  • Salary Per: Annum
  • Location: City of London, London
  • Date: 3 weeks ago
Description:

Join an award-winning B2B consultancy at the forefront of enterprise AI, building and owning the cloud-native platform infrastructure that powers production-grade conversational and generative AI products at scale.

The role

This is a platform and infrastructure engineering role - not a data science or ML engineering position. You'll own the runtime, infrastructure, and operational layers that RAG pipelines, LLM orchestration, vector search, and evaluation workflows run on, across AWS and Databricks. The focus is on building scalable, observable, secure, and cost-efficient platform infrastructure that enables AI engineering teams to ship and operate AI products reliably in production.

What you'll do

Design, build, and operate cloud-native AI platform infrastructure across AWS (Lambda, API Gateway, DynamoDB, S3, CloudWatch) and Databricks
Deploy and operate containerised services on Kubernetes using Terraform for infrastructure-as-code
Own and scale vector search infrastructure (OpenSearch, Algolia, AWS Bedrock Knowledge Bases) and embedding pipelines
Build and maintain CI/CD pipelines for inference services, retrievers, ingestion workflows, and RAG components
Implement observability across AI workloads using CloudWatch, MLflow, and OpenTelemetry - covering latency, throughput, cost, and system health
Apply secure-by-design principles including IAM, encryption, network controls, and audit logging
Work closely with AI engineers to translate prototypes and proof-of-concepts into production-ready, well-architected platform components

What we're looking for

Proven experience in platform, infrastructure, or software engineering roles delivering production-grade systems on AWS
Strong hands-on Kubernetes and container orchestration experience, specifically with EKS (Elastic Kubernetes Service) and ECS (Elastic Container Service) in production environments
Strong Terraform experience for infrastructure-as-code, provisioning and managing cloud infrastructure at scale
Experience operating containerised services, managing CI/CD pipelines, and owning observability and reliability
Familiarity with vector databases or search infrastructure (OpenSearch, Algolia) is a strong advantage
Python proficiency for scripting, automation, and deploying production services
Solid grasp of distributed systems, cloud-native architecture, microservices, and API design
Ownership mindset - comfortable operating autonomously across reliability, performance, cost, and security

Why join?

You'll own the foundational platform infrastructure behind a growing suite of generative AI products, working directly with senior AI and engineering leaders. This is a deep technical ownership role with long-term architectural impact, within an organisation investing heavily in AI at scale.

The Portfolio Group are acting on behalf of our client in recruiting for this position