ML/AI Ops Engineer
full-time · Costa Rica

Summary

Location

Costa Rica

Type

full-time

About this role


Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running. Join us as we move forward together, growing, learning, and making a real impact for some of the world’s biggest brands. The future of data resilience is here - go fearlessly forward with us.


About the Role 



We’re looking for an ML/AI Ops Engineer with 7+ years of experience to own the end-to-end operationalization of our ML/AI solutions, ensuring models move smoothly from development to scalable, reliable production products. In this role, you’ll design and automate CI/CD pipelines, build and optimize model lifecycle workflows, monitor deployed models for performance, drift, and reliability, and integrate intelligence products into various digital tools such as Copilot, Salesforce, and Tableau. You’ll collaborate closely with Data Scientists, Data Engineers, and Data Architects to transform high-quality research models into robust, production-grade products. This is a key role in shaping a modern ML/AI lifecycle—from CI/CD to high governance. 


 


What You’ll Do 



  • Own the end-to-end operationalization of ML and AI solutions, from development to scalable, reliable production systems that integrate seamlessly with other digital tools.
  • Design, automate, and maintain CI/CD pipelines for model training, testing, deployment, and retraining (Azure DevOps, Databricks).
  • Build, optimize, and version model lifecycle workflows, ensuring reproducibility, lineage, and governance across the ML/AI platform.
  • Monitor production models for performance, drift, reliability, and resource usage; implement automated retraining workflows.
  • Optimize compute, storage, and orchestration across the Databricks platform to ensure efficient, cost-effective operations.
  • Collaborate closely with ML/AI Scientists, Data Engineers, and the Data Warehouse (DWH) team to transform research-grade models into production-ready services.
  • Contribute to advancing our ML/AI platform, tooling, automation standards, and best practices.


 


What You’ll Bring 



  • Solid experience in operationalizing ML/AI models, including deployment, automation, monitoring, and lifecycle management.

  • Strong programming skills in Python, PySpark, and SQL, writing clean, efficient, production-ready code.

  • Experience in feature engineering with a practical understanding of data engineering fundamentals: designing, validating, and optimizing feature pipelines and ensuring feature consistency.

  • Experience building vector embeddings and RAG systems.

  • Familiarity with ML and LLM model development and the libraries commonly used for it.

  • Experience with MLflow (or similar tools) for model tracking, registry management, and lifecycle operations.

  • Familiarity with CI/CD pipelines (Azure DevOps preferred).

  • Strong grasp of data versioning, model versioning, reproducibility, and data lineage within governed ML/AI environments.

  • Experience designing, consuming, or integrating REST APIs to expose ML/AI models as services and support real-time or near-real-time inference.

  • Experience monitoring production models, identifying drift or performance issues, and implementing corrective workflows.

  • A collaborative, systems-thinking mindset, working closely with ML/AI Scientists, Data Engineers, and the Data Warehouse team.


 


Bonus Skills 



  • Understanding of data quality frameworks and how they integrate into ML pipelines.

  • Comfort with infrastructure-as-code for provisioning and managing ML/AI platform components. 

  • Working knowledge of Unix environments and general DevOps principles; exposure to Docker/Kubernetes is beneficial.

  • Experience with real-time or near-real-time serving architectures, event-driven systems, or streaming-based inference.

  • Experience with AI agent tools and MCP servers. 

  • Strong interest in contributing to internal ML/AI platform evolution, including tooling, automation, standards, and best practices.


 


What You’ll Get 



  • Two weeks of paid vacation, 12 statutory holidays, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares

  • Paid parental leave: 8 days for fathers, 122 days for birthing parents, 92 days for adoptive parents

  • Medical, dental, and vision coverage fully funded through INS Premium for employees and dependents

  • Mental health support, therapy sessions, and virtual care via our Employee Assistance Program

  • Retirement and social security contributions through Costa Rica’s statutory programs

  • Life insurance equal to 24x monthly salary, plus disability and funeral coverage

  • Daily cafeteria subsidy

  • Fertility, adoption, and surrogacy support

  • Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning



 



Veeam Software is an equal opportunity employer and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential.


Please note that any personal data collected from you during the recruitment process will be processed in accordance with our Recruiting Privacy Notice.  


The Privacy Notice sets out the basis on which the personal data collected from you, or that you provide to us, will be processed by us in connection with our recruitment processes. 


By applying for this position, you consent to the processing of your personal data in accordance with our Recruiting Privacy Notice.

By submitting your application, you acknowledge that the information provided in your job application and any supporting documents is complete and accurate to the best of your knowledge. Any misrepresentation, omission, or falsification of information may result in disqualification from consideration for employment or, if discovered after employment begins, termination of employment.



Other facts

Tech stack
ML/AI Operationalization, CI/CD Pipelines, Model Lifecycle Management, Python, PySpark, SQL, Feature Engineering, Vector Embeddings, MLflow, REST APIs, Monitoring Production Models, Collaboration, Data Versioning, Model Versioning, Reproducibility, Data Lineage

About Veeam Software

Veeam®, the #1 global market leader in data resilience, believes every business should be able to bounce forward after a disruption with the confidence and control of all their data whenever and wherever they need it. Veeam calls this radical resilience, and we’re obsessed with creating innovative ways to help our customers achieve it.

With Veeam, organizations achieve radical resilience through data security, data recovery, and data freedom for their hybrid cloud.

Veeam solutions are purpose-built for powering data resilience by providing data backup, data recovery, data freedom, data security, and data intelligence. With Veeam, IT and security leaders rest easy knowing that their apps and data are protected and always available across their cloud, virtual, physical, SaaS, and Kubernetes environments.

Headquartered in Seattle with offices in more than 30 countries, Veeam protects over 550,000 customers worldwide, including 67% of the Global 2000, that trust Veeam to keep their businesses running.

Radical resilience starts with Veeam.
Learn more at www.veeam.com or follow Veeam on X @veeam.

Team size: 5,001-10,000 employees
Industry: Software Development

What you'll do

  • The ML/AI Ops Engineer will own the end-to-end operationalization of ML and AI solutions, ensuring smooth transitions from development to production. Responsibilities include designing CI/CD pipelines, optimizing model workflows, and monitoring deployed models for performance and reliability.

Frequently Asked Questions

What does an ML/AI Ops Engineer do at Veeam Software?

As an ML/AI Ops Engineer at Veeam Software, you will own the end-to-end operationalization of ML and AI solutions, ensuring smooth transitions from development to production. Responsibilities include designing CI/CD pipelines, optimizing model workflows, and monitoring deployed models for performance and reliability.

Why join Veeam Software as an ML/AI Ops Engineer?

Veeam Software is a leading Software Development company.

Is the ML/AI Ops Engineer position at Veeam Software remote?

The ML/AI Ops Engineer position at Veeam Software is based in Costa Rica. Contact the company through Clera for specific work arrangement details.

How do I apply for the ML/AI Ops Engineer position at Veeam Software?

You can apply for the ML/AI Ops Engineer position at Veeam Software directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process. You can also learn more about Veeam Software on their website.