We are seeking a highly skilled Kubernetes Orchestration Engineer to lead the deployment and management of GPU-optimized Kubernetes environments that power AI/ML and hypercomputing workloads. This role is critical to ensuring scalable, reliable, and high-performance infrastructure across on-premises and hybrid cloud environments. As a core member of our infrastructure engineering team, you will work at the intersection of container orchestration, GPU resource management, and AI application scaling, enabling large-scale distributed training and inference across GPU clusters.
Must Have
- Strong experience with Kubernetes (K8s) and container orchestration in production environments.
- Expertise in managing GPU workloads in Kubernetes using NVIDIA GPU Operator, vGPU, and device plugin configurations (see the scheduling sketch after this list).
- Proficiency with container runtimes such as Docker and CRI-O, and orchestration tools like Helm and Kubernetes Operators.
- Solid understanding of networking within Kubernetes and service mesh integration (e.g., Istio, Linkerd).
- Familiarity with hybrid/multi-cloud Kubernetes platforms (e.g., GKE, EKS, AKS).
- Strong scripting and automation skills (e.g., YAML, Helm templating, Bash, Python).
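To illustrate the device plugin requirement referenced above, here is a minimal sketch, using the official kubernetes Python client, of scheduling a single-GPU pod. It assumes the NVIDIA GPU Operator or device plugin already advertises nvidia.com/gpu on the nodes; the pod name, namespace, and image tag are hypothetical placeholders, not part of this posting.

```python
# Minimal sketch: schedule a pod that requests one GPU via the NVIDIA device plugin.
# Assumes `nvidia.com/gpu` is already a schedulable resource on the cluster.
# Pod name, namespace, and image tag are hypothetical placeholders.
from kubernetes import client, config

def launch_gpu_pod(namespace: str = "default") -> None:
    config.load_kube_config()  # or config.load_incluster_config() when running in-cluster

    container = client.V1Container(
        name="cuda-smoke-test",
        image="nvidia/cuda:12.4.1-base-ubuntu22.04",  # assumed image tag
        command=["nvidia-smi"],
        resources=client.V1ResourceRequirements(
            limits={"nvidia.com/gpu": "1"}  # resource exposed by the NVIDIA device plugin
        ),
    )
    pod = client.V1Pod(
        api_version="v1",
        kind="Pod",
        metadata=client.V1ObjectMeta(name="gpu-smoke-test"),
        spec=client.V1PodSpec(containers=[container], restart_policy="Never"),
    )
    client.CoreV1Api().create_namespaced_pod(namespace=namespace, body=pod)

if __name__ == "__main__":
    launch_gpu_pod()
```

On a GPU-enabled cluster, the pod should land on a node with a free GPU and print the nvidia-smi output, which is a quick way to verify that the device plugin and driver stack are healthy.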
Key Responsibilities
- Design and deploy AI infrastructure on multi-GPU clusters using NVIDIA or AMD platforms.
- Configure GPU environments using CUDA, DGX Systems, and NVIDIA Kubernetes Device Plugin.
- Deploy and manage containerized environments with Docker, Kubernetes, and Slurm.
- Support and optimize AI models across training, fine-tuning, and inference pipelines for LLMs and deep learning models.
- Enable distributed training using DDP, FSDP, and ZeRO, with support for mixed precision (see the sketch after this list).
- Tune infrastructure to optimize model performance, throughput, and GPU utilization.
- Design and operate high-bandwidth, low-latency networks using InfiniBand and RoCE v2.
- Integrate GPUDirect Storage and optimize data flow across Lustre, BeeGFS, and Ceph/S3.
- Support fast data ingestion, ETL pipelines, and large-scale data staging.
- Leverage NVIDIA’s AI stack including cuDNN, NCCL, TensorRT, and Triton Inference Server.
- Conduct performance benchmarking with MLPerf and custom test suites.
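As referenced in the distributed-training bullet above, the following is a minimal, illustrative sketch of data-parallel training with PyTorch DistributedDataParallel and mixed precision. It assumes a torchrun launch (which sets LOCAL_RANK) and NCCL as the GPU communication backend; the model, data, and hyperparameters are toy placeholders, not part of this posting.

```python
# Minimal sketch: PyTorch DDP with mixed precision, one process per GPU.
# Assumes launch via torchrun (sets RANK/LOCAL_RANK/WORLD_SIZE) and NCCL.
# The model and data below are toy placeholders.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main() -> None:
    dist.init_process_group(backend="nccl")           # NCCL for GPU collectives
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])       # gradient all-reduce across ranks
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    scaler = torch.cuda.amp.GradScaler()              # loss scaling for fp16

    for step in range(10):
        x = torch.randn(32, 1024, device=local_rank)
        y = torch.randn(32, 1024, device=local_rank)
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast():               # mixed-precision forward pass
            loss = torch.nn.functional.mse_loss(model(x), y)
        scaler.scale(loss).backward()                 # scaled backward; DDP syncs grads
        scaler.step(optimizer)
        scaler.update()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # e.g. torchrun --nproc_per_node=8 train_sketch.py
```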
Certifications
- Certified Kubernetes Administrator (CKA) – required
- Certified Kubernetes Application Developer (CKAD)
- NVIDIA Certified Kubernetes Specialist
Educational Qualifications
- Bachelor's in Computer Science/Applications, BTech in Computer Science, or MCA