AI Evaluation & Safety Test Engineer – Conversational AI, Automation, and Responsible AI Standards
Full-time · Pune

Summary

Location

Pune

Type

Full-time


About this role

Job Summary

Synechron is seeking a skilled AI Agent Test Engineer to lead validation efforts for conversational and agentic AI applications. In this role, you will develop evaluation frameworks, test harnesses, and safety controls to ensure AI agents deliver accurate, secure, and compliant experiences for users. Your work will support the organization’s commitment to responsible AI, high-quality outputs, and robust system performance, primarily within financial and banking domains. Your contributions will enhance client trust and operational excellence through rigorous testing and continuous monitoring of AI agent behaviors.


Software Requirements

Required:

  • Experience with QA automation frameworks and tools such as Selenium, TestNG, Maven, Jenkins, and JIRA

  • Strong programming skills in Java and Python, with experience in automating API and UI tests

  • Knowledge of AI evaluation pipelines, including prompt validation, safety checks, and agent output assessment

  • Familiarity with chatbot and conversational AI frameworks and agent architectures

  • Experience designing and executing end-to-end test scenarios and safety protocols for AI systems

  • Experience with CI/CD integration, telemetry, and observability tools


Preferred:

  • Exposure to experiment tracking and version control systems for managing prompts, datasets, and configurations

  • Knowledge of vector databases, embeddings, and retrieval metrics for RAG systems

  • Familiarity with safety tooling, responsible-AI frameworks, and governance standards (e.g., fairness, bias mitigation, PII/privacy protection)


Overall Responsibilities

  • Design, develop, and execute automated evaluation harnesses to validate agent responses, safety, and performance

  • Build test scenarios that evaluate multi-turn conversations, task success, helpfulness, and policy adherence

  • Validate tool and function call schemas, error handling, retries, and resilience to failures

  • Assess retrieval-augmented generation (RAG) quality, including accuracy, grounding, citations, and indexing

  • Conduct safety testing, including prompt injection, jailbreak, content moderation, and escalation logic

  • Monitor runtime KPIs such as accuracy, resolution rate, latency, and token usage; develop dashboards and trend analyses

  • Track prompt, configuration, and safety rule changes, and validate new agent versions via shadow testing and evaluation thresholds

  • Develop and maintain automated tests for APIs, UI, and databases where applicable

  • Participate in Agile ceremonies, including sprint planning, backlog refinement, and retrospectives

  • Document testing strategies, results, and safety audit reports for compliance and governance purposes

  • Support continuous improvement initiatives to strengthen test coverage, reliability, and compliance
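
The evaluation-harness responsibilities above can be sketched in miniature. The transcripts, banned phrases, and pass threshold below are hypothetical illustrations, not Synechron tooling; a production harness would score far richer criteria (helpfulness, policy adherence, multi-turn state):

```python
# Illustrative sketch of a minimal evaluation harness for agent responses.
# All data, rules, and thresholds here are hypothetical examples.

def violates_safety(response: str, banned_phrases: list[str]) -> bool:
    """Flag a response containing any banned phrase (toy safety check)."""
    lowered = response.lower()
    return any(phrase in lowered for phrase in banned_phrases)

def evaluate(conversations, banned_phrases, pass_threshold=0.95):
    """Score each (prompt, response, expected) turn and compare the pass
    rate against a release threshold, as a shadow-testing gate might."""
    passed = 0
    for prompt, response, expected in conversations:
        if expected in response and not violates_safety(response, banned_phrases):
            passed += 1
    pass_rate = passed / len(conversations)
    return pass_rate, pass_rate >= pass_threshold

# Toy records standing in for logged multi-turn transcripts
conversations = [
    ("What is my balance?", "Your balance is $100.", "balance"),
    ("Close my account", "I can help close your account.", "account"),
]
rate, ok = evaluate(conversations, banned_phrases=["ssn", "password"])
print(rate, ok)  # 1.0 True
```

In practice the pass/fail gate would feed a CI/CD pipeline so that a new agent version only promotes when its evaluation score clears the threshold.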


Technical Skills (By Category)

Programming Languages (Essential):

  • Java, Python for automation and evaluation scripting


Preferred:

  • Other languages such as JavaScript, or notebook environments (e.g., Jupyter) for data analysis and report generation


Testing & Evaluation Tools:

  • Selenium, TestNG, Maven, Jenkins for automation pipelines

  • API validation tools, e.g., Postman, RestAssured (preferred)

  • Evaluation frameworks for AI model assessment, version control, and experiment tracking tools


AI & Retrieval Systems:

  • Knowledge of retrieval-augmented generation (RAG) architecture, embeddings, and retrieval metrics

  • Experience testing content grounding, citation correctness, and index coverage
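
As one hedged illustration of the grounding checks mentioned above: the toy function below measures token overlap between an answer and its cited passage. The passage and answers are invented examples, and real RAG pipelines typically use embedding similarity or NLI models rather than raw token overlap:

```python
# Illustrative sketch (not Synechron tooling): a toy grounding check that
# tests whether an agent's cited passage actually supports its answer.

def grounding_score(answer: str, cited_passage: str) -> float:
    """Fraction of answer tokens that also appear in the cited passage --
    a crude proxy for grounding; production systems would use embeddings."""
    answer_tokens = set(answer.lower().split())
    passage_tokens = set(cited_passage.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & passage_tokens) / len(answer_tokens)

passage = "the fee for international transfers is 25 dollars per transaction"
grounded = grounding_score("the fee is 25 dollars", passage)
ungrounded = grounding_score("transfers are always free", passage)
print(grounded > ungrounded)  # True
```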


Data & Monitoring:

  • SQL for database querying and validation

  • Telemetry tools: Prometheus, Grafana, JFR, JMC, or similar for performance monitoring

  • Dashboard creation and trend analysis for runtime KPIs


Security & Compliance:

  • Familiarity with responsible-AI principles, bias mitigation, privacy standards, and content moderation policies


Experience Requirements

  • Minimum 5 years' experience in QA/test automation environments, with a specific focus on AI, NLP, or conversational agents

  • Proven success in designing, implementing, and maintaining evaluation harnesses for AI systems

  • Experience in testing safety, fairness, and compliance aspects of AI functionality

  • Hands-on experience in API and UI automation, with strong scripting and programming capabilities

  • Knowledge of enterprise AI tools, telemetry, and observability in regulated settings


Day-to-Day Activities

  • Develop and enhance automated evaluation and safety testing frameworks for conversational agents

  • Create multi-turn test scenarios, validate outputs, and track performance metrics

  • Investigate and troubleshoot issues related to agent safety, accuracy, and grounding

  • Collaborate with data scientists, product managers, and security teams to ensure high standards

  • Monitor system KPIs and create dashboards for ongoing performance analysis

  • Conduct shadow testing for new agent versions and validate against evaluation thresholds

  • Keep updated on responsible-AI standards, safety techniques, and emerging evaluation metrics

  • Document testing procedures, safety checklists, and compliance reports for audits


Qualifications

  • Bachelor’s or Master’s degree in Computer Science, AI, Data Science, or related fields

  • 5+ years of experience in QA automation, particularly with conversational AI systems

  • Proven expertise in evaluation methodologies, safety testing, and model validation

  • Experience with API, UI, and database automation tools in enterprise environments

  • Certifications or training in AI ethics, safety, or responsible-AI frameworks (preferred)


Professional Competencies

  • Strong analytical and critical thinking skills for complex AI validation tasks

  • Excellent communication skills for cross-team collaboration and documentation

  • Leadership ability to guide junior engineers and foster best testing practices

  • Adaptability to rapidly evolving AI safety standards and regulatory landscapes

  • Detail-oriented approach ensuring thorough testing coverage and compliance

  • Proactive learning attitude towards responsible-AI principles and emerging evaluation tools

SYNECHRON'S DIVERSITY & INCLUSION STATEMENT

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative, 'Same Difference', is committed to fostering an inclusive culture – promoting equality, diversity, and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, more successful businesses as a global company. We encourage applicants from diverse backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and abilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Candidate Application Notice

Other facts

Tech stack
QA Automation, Java, Python, Selenium, TestNG, Maven, Jenkins, JIRA, API Testing, UI Testing, Conversational AI, Safety Testing, CI/CD Integration, Telemetry, Data Analysis, SQL

About Synechron

At Synechron, we believe in the power of digital to transform businesses for the better. Our global consulting firm combines creativity and innovative technology to deliver industry-leading digital solutions. Synechron's progressive technologies and optimization strategies span end-to-end Artificial Intelligence, Consulting, Digital, Cloud & DevOps, Data, and Software Engineering, servicing an array of noteworthy financial services and technology firms. Through research and development initiatives in our FinLabs we develop solutions for modernization, from Artificial Intelligence and Blockchain to Data Science models, Digital Underwriting, mobile-first applications and more. Over the last 20+ years, our company has been honored with multiple employer awards, recognizing our commitment to our talented teams. With top clients to boast about, Synechron has a global workforce of 14,000+, and has 55 offices in 20 countries within key global markets. For more information on the company, please visit our website: www.synechron.com.

Team size: 10,001+ employees
Industry: Technology, Information and Internet

What you'll do

  • The AI Evaluation & Safety Test Engineer will design, develop, and execute automated evaluation frameworks for conversational AI applications. This role involves validating agent responses, conducting safety testing, and monitoring system performance metrics.


Frequently Asked Questions

What does an AI Evaluation & Safety Test Engineer – Conversational AI, Automation, and Responsible AI Standards do at Synechron?

As an AI Evaluation & Safety Test Engineer – Conversational AI, Automation, and Responsible AI Standards at Synechron, you will design, develop, and execute automated evaluation frameworks for conversational AI applications, validate agent responses, conduct safety testing, and monitor system performance metrics.

Why join Synechron as an AI Evaluation & Safety Test Engineer – Conversational AI, Automation, and Responsible AI Standards?

Synechron is a leading Technology, Information and Internet company.

Is the AI Evaluation & Safety Test Engineer – Conversational AI, Automation, and Responsible AI Standards position at Synechron remote?

The AI Evaluation & Safety Test Engineer – Conversational AI, Automation, and Responsible AI Standards position at Synechron is based in Pune, India. Contact the company through Clera for specific work arrangement details.

How do I apply for the AI Evaluation & Safety Test Engineer – Conversational AI, Automation, and Responsible AI Standards position at Synechron?

You can apply for the AI Evaluation & Safety Test Engineer – Conversational AI, Automation, and Responsible AI Standards position at Synechron directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process. You can also learn more about Synechron on their website.