TECHVANTAGE SYSTEMS (P) Ltd

Module L1A, Level -1, Thejaswini, Technopark, Trivandrum, and 2nd Floor, Amstor House, Technopark Campus, Phase 1, Trivandrum, 695581

AI Infrastructure Engineer (DevOps/MLOps)

Closing Date: 14 June 2025
Job Published: 31 May 2025

Brief Description

Techvantage.ai is a next-generation technology and product engineering company at the forefront of innovation in Generative AI, Agentic AI, and autonomous intelligent systems. We build intelligent, cutting-edge solutions designed to scale and evolve with the future of artificial intelligence.

Role Overview:

We are looking for a skilled and versatile AI Infrastructure Engineer (DevOps/MLOps) to build and manage the cloud infrastructure, deployment pipelines, and machine learning operations behind our AI-powered products. You will work at the intersection of software engineering, ML, and cloud architecture to ensure that our models and systems are scalable, reliable, and production-ready.

Key Responsibilities:

  • Design and manage CI/CD pipelines for both software applications and machine learning workflows.
  • Deploy and monitor ML models in production using tools like MLflow, SageMaker, Vertex AI, or similar (a brief illustrative sketch follows this list).
  • Automate the provisioning and configuration of infrastructure using IaC tools (Terraform, Pulumi, etc.).
  • Build robust monitoring, logging, and alerting systems for AI applications.
  • Manage containerized services with Docker and orchestration platforms like Kubernetes.
  • Collaborate with data scientists and ML engineers to streamline model experimentation, versioning, and deployment.
  • Optimize compute resources and storage costs across cloud environments (AWS, GCP, or Azure).
  • Ensure system reliability, scalability, and security across all environments.
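
As a rough, hypothetical illustration of the model deployment and versioning work described in the list above: the sketch below logs a trained model to MLflow and registers it in a model registry. It assumes MLflow and scikit-learn are installed and that a tracking server with a registry backend is reachable at the placeholder URI; the experiment and model names are invented for the example and are not Techvantage systems.

    # Minimal sketch, assuming a reachable MLflow tracking server with a model registry.
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    mlflow.set_tracking_uri("http://localhost:5000")  # placeholder tracking server
    mlflow.set_experiment("demo-experiment")          # placeholder experiment name

    X, y = load_iris(return_X_y=True)

    with mlflow.start_run():
        model = LogisticRegression(max_iter=200).fit(X, y)
        mlflow.log_param("max_iter", 200)
        mlflow.log_metric("train_accuracy", model.score(X, y))
        # Registering the model creates a new version in the registry, which a
        # downstream CI/CD pipeline could then promote and serve.
        mlflow.sklearn.log_model(model, "model", registered_model_name="demo-classifier")

In practice, a step like this would typically sit inside a CI/CD pipeline that promotes registered model versions from staging to production.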

Requirements:

  • 5+ years of experience in DevOps, MLOps, or infrastructure engineering roles.
  • Hands-on experience with cloud platforms (AWS, GCP, or Azure) and services related to ML workloads.
  • Strong knowledge of CI/CD tools (e.g., GitHub Actions, Jenkins, GitLab CI).
  • Proficiency in Docker, Kubernetes, and infrastructure-as-code frameworks.
  • Experience with ML pipelines, model versioning, and ML monitoring tools.
  • Scripting skills in Python, Bash, or similar for automation tasks.
  • Familiarity with monitoring/logging tools (Prometheus, Grafana, ELK, CloudWatch, etc.); a small illustrative example follows this list.
  • Understanding of ML lifecycle management and reproducibility.
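
Purely as an illustrative sketch of the Python scripting and monitoring familiarity listed above (the metric name, port, and values are invented for the example and do not describe any Techvantage service), a small exporter using the prometheus_client library might look like this:

    # Hypothetical example: exposing a custom metric for Prometheus to scrape.
    import random
    import time

    from prometheus_client import Gauge, start_http_server

    # Gauge tracking a simulated inference latency for a model service.
    INFERENCE_LATENCY = Gauge(
        "model_inference_latency_seconds",
        "Simulated latency of model inference requests",
    )

    if __name__ == "__main__":
        start_http_server(8000)  # metrics available at http://localhost:8000/metrics
        while True:
            # A real service would measure this around actual inference calls.
            INFERENCE_LATENCY.set(random.uniform(0.05, 0.25))
            time.sleep(5)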

Preferred Qualifications:

  • Experience with Kubeflow, MLflow, DVC, or Triton Inference Server.
  • Exposure to data versioning, feature stores, and model registries.
  • Certification in AWS/GCP DevOps or Machine Learning Engineering is a plus.
  • Background in software engineering, data engineering, or ML research is a bonus.

What We Offer:

  • Work on cutting-edge AI platforms and infrastructure
  • Cross-functional collaboration with top ML, research, and product teams
  • Competitive compensation package, with no constraints for the right candidate.