TECHVANTAGE SYSTEMS (P) Ltd

Module L1A, Level -1, Thejaswini, Technopark, Trivandrum, and 2nd Floor, Amstor House, Technopark Campus, Phase 1, Trivandrum, 695581

AI Infrastructure Engineer (DevOps/MLOps)

Closing Date: 14 June 2025
Job Published: 31 May 2025

Brief Description

Techvantage.ai is a next-generation technology and product engineering company at the forefront of innovation in Generative AI, Agentic AI, and autonomous intelligent systems. We build intelligent, cutting-edge solutions designed to scale and evolve with the future of artificial intelligence.

Role Overview:

We are looking for a skilled and versatile AI Infrastructure Engineer (DevOps/MLOps) to build and manage the cloud infrastructure, deployment pipelines, and machine learning operations behind our AI-powered products. You will work at the intersection of software engineering, ML, and cloud architecture to ensure that our models and systems are scalable, reliable, and production-ready.

Key Responsibilities:

  • Design and manage CI/CD pipelines for both software applications and machine learning workflows.
  • Deploy and monitor ML models in production using tools like MLflow, SageMaker, Vertex AI, or similar (a brief illustrative sketch follows this list).
  • Automate the provisioning and configuration of infrastructure using IaC tools (Terraform, Pulumi, etc.).
  • Build robust monitoring, logging, and alerting systems for AI applications.
  • Manage containerized services with Docker and orchestration platforms like Kubernetes.
  • Collaborate with data scientists and ML engineers to streamline model experimentation, versioning, and deployment.
  • Optimize compute resources and storage costs across cloud environments (AWS, GCP, or Azure).
  • Ensure system reliability, scalability, and security across all environments.
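
As a rough, hypothetical illustration of the model deployment and versioning work described in the list above: the sketch below logs a trained model to MLflow and registers it in a model registry. It assumes MLflow and scikit-learn are installed and that a tracking server with a registry backend is reachable at the placeholder URI; the experiment and model names are invented for the example and are not Techvantage systems.

    # Minimal sketch, assuming a reachable MLflow tracking server with a model registry.
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    mlflow.set_tracking_uri("http://localhost:5000")  # placeholder tracking server
    mlflow.set_experiment("demo-experiment")          # placeholder experiment name

    X, y = load_iris(return_X_y=True)

    with mlflow.start_run():
        model = LogisticRegression(max_iter=200).fit(X, y)
        mlflow.log_param("max_iter", 200)
        mlflow.log_metric("train_accuracy", model.score(X, y))
        # Registering the model creates a new version in the registry, which a
        # downstream CI/CD pipeline could then promote and serve.
        mlflow.sklearn.log_model(model, "model", registered_model_name="demo-classifier")

In practice, a step like this would typically sit inside a CI/CD pipeline that promotes registered model versions from staging to production.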

Requirements:

  • 5+ years of experience in DevOps, MLOps, or infrastructure engineering roles.
  • Hands-on experience with cloud platforms (AWS, GCP, or Azure) and services related to ML workloads.
  • Strong knowledge of CI/CD tools (e.g., GitHub Actions, Jenkins, GitLab CI).
  • Proficiency in Docker, Kubernetes, and infrastructure-as-code frameworks.
  • Experience with ML pipelines, model versioning, and ML monitoring tools.
  • Scripting skills in Python, Bash, or similar for automation tasks.
  • Familiarity with monitoring/logging tools (Prometheus, Grafana, ELK, CloudWatch, etc.); a small illustrative example follows this list.
  • Understanding of ML lifecycle management and reproducibility.
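
Purely as an illustrative sketch of the Python scripting and monitoring familiarity listed above (the metric name, port, and values are invented for the example and do not describe any Techvantage service), a small exporter using the prometheus_client library might look like this:

    # Hypothetical example: exposing a custom metric for Prometheus to scrape.
    import random
    import time

    from prometheus_client import Gauge, start_http_server

    # Gauge tracking a simulated inference latency for a model service.
    INFERENCE_LATENCY = Gauge(
        "model_inference_latency_seconds",
        "Simulated latency of model inference requests",
    )

    if __name__ == "__main__":
        start_http_server(8000)  # metrics available at http://localhost:8000/metrics
        while True:
            # A real service would measure this around actual inference calls.
            INFERENCE_LATENCY.set(random.uniform(0.05, 0.25))
            time.sleep(5)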

Preferred Qualifications:

  • Experience with Kubeflow, MLflow, DVC, or Triton Inference Server.
  • Exposure to data versioning, feature stores, and model registries.
  • Certification in AWS/GCP DevOps or Machine Learning Engineering is a plus.
  • Background in software engineering, data engineering, or ML research is a bonus.

What We Offer:

  • Work on cutting-edge AI platforms and infrastructure
  • Cross-functional collaboration with top ML, research, and product teams
  • Competitive compensation package, with no constraints for the right candidate.