EY - GDS Consulting - AIA - QE - Engineer - Senior

Location: Kolkata
Other locations: Anywhere in Country
Salary: Competitive
Date: Apr 24, 2026

Job description

Requisition ID:  1703516

At EY, we’re all in to shape your future with confidence. 

We’ll help you succeed in a globally connected powerhouse of diverse teams and take your career wherever you want it to go. 

Join EY and help to build a better working world. 

 

Career Family: AIA – AI Evaluation & QE Engineer

Role Type: Full Time

 

The opportunity

We are the only professional services organization that has a separate business dedicated exclusively to the financial services marketplace. Join the Digital Engineering Team and work with multi‑disciplinary teams from around the world to deliver a global perspective. Aligned to key industry groups including Asset Management, Banking and Capital Markets, Insurance and Private Equity, Health, Government, Power and Utilities, we provide integrated advisory, assurance, tax, and transaction services. Through diverse experiences, world‑class learning, and individually tailored coaching, you will experience continuous professional development. That’s how we develop outstanding leaders who deliver on our promises to clients, communities, and each other.

As the firm accelerates enterprise‑scale adoption of Generative AI, we are establishing a formal AI Quality & Evaluation capability as a critical foundation for trustworthy, production‑grade AI systems. This role plays a pivotal part in building that capability and ensuring AI systems are accurate, grounded, safe, reproducible, and compliant before reaching production.

We are looking for a Senior AI Evaluation & QE Engineer (Software Engineer) to join our AI Enabled Automation practice, with 5+ years of experience in Quality Engineering and Test Automation and strong hands-on exposure to GenAI, Large Language Models (LLMs), and RAG-based systems.

This role requires deep expertise in Python-based automation, LLM evaluation, and AI system validation, combined with strong engineering fundamentals to support enterprise AI project deployments across multiple engagements.

 

In this role, you will contribute to two key capacities:

  • Serving as the AI Evaluation & Quality Engineering Owner for GenAI and LLM-powered solutions, with responsibilities including defining AI acceptance criteria, benchmarking model behavior, validating Retrieval-Augmented Generation (RAG) pipelines, and implementing automated evaluation gates to ensure production readiness.
  • Providing support for broader AI software engineering and deployment initiatives, such as CI/CD integration, cloud-native deployments, backend validation, and comprehensive quality engineering across enterprise AI platforms and use cases.

 

EY’s Artificial Intelligence & Automation (AIA) team is a specialized, intelligence-driven business unit that combines advanced AI, automation engineering, and deep domain knowledge to deliver transformative enterprise solutions. The AIA practice partners with clients to analyze, design, and operationalize AI-led automation strategies that accelerate digital transformation and unlock measurable business value. We address organizations’ most critical challenges across process optimization, intelligent workflows, data-driven decisioning, Generative AI adoption, and automation-enabled innovation. With a unique ability to translate AI strategy into actionable architectures, scalable platforms, and production-grade automations, EY’s AIA team helps clients build sustainable competitive advantage, enhance productivity, and navigate disruption by embedding intelligence at the core of their business and technology ecosystems.

 

Your key responsibilities

AI Strategy & Solution Design

  • Act as the AI Evaluation & Quality Engineering owner within AI Factory / POD-based delivery models
  • Partner with AI Engineers, backend teams, product owners, and architects to define AI acceptance criteria and quality gates
  • Support end-to-end validation of GenAI application architectures, including API layers, orchestration logic, and backend services
  • Evaluate AI readiness across environments (DEV / QA / PROD) and ensure quality consistency across deployments
  • Contribute to enterprise AI solution design discussions with a focus on trust and reliability

 

Responsible AI

  • Collaborate closely with product, UX, and platform teams to align AI quality with functional and business goals

 

GenAI & Agentic AI Development

  • Evaluate and benchmark AI outputs across multiple LLMs (model-agnostic), including GPT, Claude, LLaMA, Gemini, and enterprise models
  • Validate Retrieval-Augmented Generation (RAG) pipelines end to end, including retrieval accuracy, grounding, chunking strategies, and edge cases
  • Implement automated AI evaluation pipelines using DeepEval and RAGAs to assess:
    • Accuracy and correctness
    • Faithfulness and hallucination risk
    • Relevance and contextual grounding
    • Consistency and reproducibility
    • Safety, toxicity, and policy compliance
  • Develop LLM regression test suites covering prompt changes, model upgrades, orchestration logic, and workflow variations
  • Support synthetic data generation and adversarial testing to stress-test AI behavior under non-deterministic conditions
  • Apply structured analytical thinking to reason about ambiguous and probabilistic AI outputs
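The evaluation-gate pattern described in the bullets above can be sketched as follows. This is a minimal, self-contained illustration only: a production pipeline would use DeepEval or RAGAs with an LLM judge, whereas here a naive token-overlap score stands in as a faithfulness proxy, and the test cases and threshold are invented for the example.

```python
def faithfulness_proxy(answer: str, context: str) -> float:
    """Fraction of answer tokens that also appear in the retrieved context.
    A crude, deterministic stand-in for an LLM-judged faithfulness metric."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

def evaluation_gate(cases: list[dict], threshold: float = 0.7) -> dict:
    """Score every test case; the gate passes only if all cases clear the threshold."""
    results = []
    for case in cases:
        score = faithfulness_proxy(case["answer"], case["context"])
        results.append({"id": case["id"], "score": score, "passed": score >= threshold})
    return {"passed": all(r["passed"] for r in results), "results": results}

# Tiny illustrative regression suite: one grounded answer, one hallucinated one.
cases = [
    {"id": "grounded", "context": "the policy covers flood damage up to 50000",
     "answer": "the policy covers flood damage"},
    {"id": "hallucinated", "context": "the policy covers flood damage up to 50000",
     "answer": "earthquakes are fully reimbursed worldwide"},
]

report = evaluation_gate(cases, threshold=0.7)
```

In a CI setting the same shape becomes a PyTest assertion on `report["passed"]`, so a hallucinating release candidate fails the build instead of reaching production.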

 

MLOps, Cloud & Observability

  • Build and maintain scalable, modular AI test automation frameworks using Python and PyTest
  • Integrate AI evaluation pipelines into CI/CD workflows (Azure DevOps, GitHub Actions)
  • Establish quality baselines, benchmarks, and regression thresholds for AI systems prior to production rollout
  • Support deployment validation for AI workloads on Azure, AWS, or GCP
  • Contribute to AI observability by validating evaluation metrics, monitoring signals, and responsible AI guardrails
  • Ensure AI systems meet enterprise standards for reliability, traceability, and reproducibility
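As a rough illustration of the "quality baselines, benchmarks, and regression thresholds" bullet, the following sketch compares a candidate release against stored baseline metrics and blocks promotion when any metric degrades beyond its tolerance. The metric names, baseline values, and tolerances are hypothetical; a real pipeline would load them from a metrics store rather than hard-coding them.

```python
# Hypothetical baseline captured at the last approved release.
BASELINE = {"faithfulness": 0.92, "answer_relevancy": 0.88, "toxicity": 0.01}

# Maximum allowed degradation per metric before the build is blocked.
TOLERANCE = {"faithfulness": 0.03, "answer_relevancy": 0.05, "toxicity": 0.0}

# Quality metrics are "higher is better"; toxicity is "lower is better".
LOWER_IS_BETTER = {"toxicity"}

def check_regression(current: dict) -> list[str]:
    """Return the metrics in the candidate run that breach the regression threshold."""
    failures = []
    for metric, baseline in BASELINE.items():
        value = current[metric]
        if metric in LOWER_IS_BETTER:
            if value > baseline + TOLERANCE[metric]:
                failures.append(metric)
        elif value < baseline - TOLERANCE[metric]:
            failures.append(metric)
    return failures

# A candidate release: relevancy dropped too far and toxicity crept up.
candidate = {"faithfulness": 0.91, "answer_relevancy": 0.80, "toxicity": 0.02}
failures = check_regression(candidate)
```

Wired into Azure DevOps or GitHub Actions as a test step, a non-empty `failures` list fails the job, making the regression threshold an enforced quality gate rather than a manual review item.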

 

Team Leadership & Collaboration

  • Mentor QE and AI engineers on GenAI quality engineering best practices
  • Drive quality awareness across engineering teams, shifting AI validation from implicit to governed
  • Participate in sprint planning, testing cycles, release sign-off, and hypercare for AI deployments
  • Communicate evaluation findings, risks, and recommendations clearly to both technical and business stakeholders
  • Stay current with advancements in GenAI evaluation frameworks, LLM testing patterns, and Responsible AI practices

 

Required Skills

  • 5–7+ years of hands-on experience in AI Evaluation Engineering, combining expertise in Quality Engineering and Test Automation for AI and GenAI systems
  • Strong proficiency in Python, with deep experience building PyTest based automation frameworks
  • Hands-on experience with DeepEval and RAGAs for LLM and RAG evaluation
  • Solid understanding of LLM behavior, prompt engineering concepts, and RAG architectures
  • Experience testing web-based AI applications, APIs, and backend services
  • Strong analytical skills to reason about non-deterministic and probabilistic AI outputs
  • Working knowledge of cloud platforms (Azure / AWS / GCP) and CI/CD pipelines; Azure experience is a must

 

Good to Have

  • Backend development exposure (FastAPI, Flask, RESTful services)
  • Experience integrating QE pipelines with Azure DevOps
  • Familiarity with LLMOps / MLOps concepts such as model versioning and rollout validation
  • Exposure to Responsible AI frameworks, safety testing, and governance controls
  • Experience working in AI Factory, POD-based, or agile delivery models
  • Exposure to DevOps tools such as Terraform, GitHub Actions, Docker

 

Preferred Qualifications

  • Prior experience supporting multiple enterprise AI project deployments across domains with the Azure AI stack (including Azure AI Foundry) is a plus
  • Certifications in Azure AI, AWS Machine Learning, or GCP Professional ML Engineer are a plus
  • Certifications or hands-on experience in MLOps, Kubernetes, or DevOps for AI

 

Education

Degree: Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Data Science, Engineering, Mathematics, or a related field, or equivalent practical experience

 

What we offer

EY Global Delivery Services (GDS) is a dynamic and truly global delivery network. We work across six locations – Argentina, China, India, the Philippines, Poland and the UK – and with teams from all EY service lines, geographies and sectors, playing a vital role in the delivery of the EY growth strategy. From accountants to coders to advisory consultants, we offer a wide variety of fulfilling career opportunities that span all business disciplines. In GDS, you will collaborate with EY teams on exciting projects and work with well-known brands from across the globe. We’ll introduce you to an ever-expanding ecosystem of people, learning, skills and insights that will stay with you throughout your career.

  • Continuous learning: You’ll develop the mindset and skills to navigate whatever comes next.
  • Success as defined by you: We’ll provide the tools and flexibility, so you can make a meaningful impact, your way.
  • Transformative leadership: We’ll give you the insights, coaching and confidence to be the leader the world needs.
  • Diverse and inclusive culture: You’ll be embraced for who you are and empowered to use your voice to help others find theirs.

EY | Building a better working world

EY is building a better working world by creating new value for clients, people, society and the planet, while building trust in capital markets.

Enabled by data, AI and advanced technology, EY teams help clients shape the future with confidence and develop answers for the most pressing issues of today and tomorrow.

EY teams work across a full spectrum of services in assurance, consulting, tax, strategy and transactions. Fueled by sector insights, a globally connected, multi-disciplinary network and diverse ecosystem partners, EY teams can provide services in more than 150 countries and territories.
