We're looking for an ML Engineer who can ship — from classical pipelines to LLM-powered features — on AWS. You'll design, deploy, and maintain ML systems in production. This is an engineering role first; research experience alone won't be enough.
Responsibilities
- Build end-to-end ML pipelines: data ingestion, training, evaluation, deployment, and monitoring.
- Design and implement RAG pipelines, prompt engineering systems, and LLM-based features with proper evaluation, not vibe-based iteration.
- Fine-tune open-weight models (LoRA/QLoRA) when API calls aren't the right answer.
- Deploy and serve models on AWS: SageMaker, Bedrock, Lambda, or ECS, depending on requirements.
- Write infrastructure as code (CDK or Terraform); no manual console configuration in production.
- Monitor deployed models for drift, quality degradation, and cost; own issues through to resolution.
- Translate ambiguous business problems into concrete ML problem framings.
Must-Have
| Area | Requirement |
|---|---|
| Python | Engineering-level: testable, reviewable code, not just scripts |
| Classical ML | Supervised/unsupervised methods; knows when not to use a neural network |
| LLM Fundamentals | Genuine understanding of transformers, tokenization, context windows, inference behaviour |
| RAG | Has built and evaluated at least one production or near-production RAG system |
| AWS Core | S3, IAM, Lambda, EC2, VPC; comfortable without a handbook |
| AWS ML | SageMaker (Training Jobs + Endpoints) and/or Bedrock |
| Docker | Containerising ML workloads for deployment |
| SQL | Comfortable writing queries for data extraction and validation |

