EY - GDS Consulting - AIA - QE - Engineer - Senior

Location: Kolkata
Other locations: Anywhere in Country
Salary: Competitive
Date: Apr 24, 2026

Job description

Requisition ID:  1703516

At EY, we’re all in to shape your future with confidence. 

We’ll help you succeed in a globally connected powerhouse of diverse teams and take your career wherever you want it to go. 

Join EY and help to build a better working world. 

 

Career Family: AIA – AI Evaluation & QE Engineer

Role Type: Full Time

 

The opportunity

We are the only professional services organization that has a separate business dedicated exclusively to the financial services marketplace. Join the Digital Engineering Team and work with multi‑disciplinary teams from around the world to deliver a global perspective. Aligned to key industry groups including Asset Management, Banking and Capital Markets, Insurance and Private Equity, Health, Government, Power and Utilities, we provide integrated advisory, assurance, tax, and transaction services. Through diverse experiences, world‑class learning, and individually tailored coaching, you will experience continuous professional development. That’s how we develop outstanding leaders who deliver on our promises to clients, communities, and each other.

As the firm accelerates enterprise‑scale adoption of Generative AI, we are establishing a formal AI Quality & Evaluation capability as a critical foundation for trustworthy, production‑grade AI systems. This role plays a pivotal part in building that capability and ensuring AI systems are accurate, grounded, safe, reproducible, and compliant before reaching production.

We are looking for a Senior AI Evaluation & QE Engineer (Software Engineer) to join our AI Enabled Automation practice, with 5+ years of experience in Quality Engineering and Test Automation and strong hands-on exposure to GenAI, Large Language Models (LLMs), and RAG-based systems.

This role requires deep expertise in Python-based automation, LLM evaluation, and AI system validation, combined with strong engineering fundamentals to support enterprise AI project deployments across multiple engagements.

 

In this role, you will contribute to two key capacities:

  • Serving as the AI Evaluation & Quality Engineering Owner for GenAI and LLM-powered solutions, with responsibilities including defining AI acceptance criteria, benchmarking model behavior, validating Retrieval-Augmented Generation (RAG) pipelines, and implementing automated evaluation gates to ensure production readiness.
  • Providing support for broader AI software engineering and deployment initiatives, such as CI/CD integration, cloud-native deployments, backend validation, and comprehensive quality engineering across enterprise AI platforms and use cases.

 

EY’s Artificial Intelligence & Automation (AIA) team is a specialized, intelligence-driven business unit that combines advanced AI, automation engineering, and deep domain knowledge to deliver transformative enterprise solutions. The AIA practice partners with clients to analyze, design, and operationalize AI-led automation strategies that accelerate digital transformation and unlock measurable business value. We address organizations’ most critical challenges across process optimization, intelligent workflows, data-driven decisioning, Generative AI adoption, and automation-enabled innovation. With a unique ability to translate AI strategy into actionable architectures, scalable platforms, and production-grade automations, EY’s AIA team helps clients build sustainable competitive advantage, enhance productivity, and navigate disruption by embedding intelligence at the core of their business and technology ecosystems.

 

Your key responsibilities

AI Strategy & Solution Design

  • Act as the AI Evaluation & Quality Engineering owner within AI Factory / POD-based delivery models
  • Partner with AI Engineers, backend teams, product owners, and architects to define AI acceptance criteria and quality gates
  • Support end-to-end validation of GenAI application architectures, including API layers, orchestration logic, and backend services
  • Evaluate AI readiness across environments (DEV / QA / PROD) and ensure quality consistency across deployments
  • Contribute to enterprise AI solution design discussions with a focus on trust and reliability

 

Responsible AI

  • Collaborate closely with product, UX, and platform teams to align AI quality with functional and business goals

 

GenAI & Agentic AI Development

  • Evaluate and benchmark AI outputs across multiple LLMs (model-agnostic), including GPT, Claude, LLaMA, Gemini, and enterprise models
  • Validate Retrieval-Augmented Generation (RAG) pipelines end to end, including retrieval accuracy, grounding, chunking strategies, and edge cases
  • Implement automated AI evaluation pipelines using DeepEval and RAGAs to assess:
    • Accuracy and correctness
    • Faithfulness and hallucination risk
    • Relevance and contextual grounding
    • Consistency and reproducibility
    • Safety, toxicity, and policy compliance
  • Develop LLM regression test suites covering prompt changes, model upgrades, orchestration logic, and workflow variations
  • Support synthetic data generation and adversarial testing to stress-test AI behavior under non-deterministic conditions
  • Apply structured analytical thinking to reason about ambiguous and probabilistic AI outputs
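The evaluation-gate pattern described in the bullets above can be sketched as follows. This is a minimal, self-contained illustration only: a production pipeline would use DeepEval or RAGAs with an LLM judge, whereas here a naive token-overlap score stands in as a faithfulness proxy, and the test cases and threshold are invented for the example.

```python
def faithfulness_proxy(answer: str, context: str) -> float:
    """Fraction of answer tokens that also appear in the retrieved context.
    A crude, deterministic stand-in for an LLM-judged faithfulness metric."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

def evaluation_gate(cases: list[dict], threshold: float = 0.7) -> dict:
    """Score every test case; the gate passes only if all cases clear the threshold."""
    results = []
    for case in cases:
        score = faithfulness_proxy(case["answer"], case["context"])
        results.append({"id": case["id"], "score": score, "passed": score >= threshold})
    return {"passed": all(r["passed"] for r in results), "results": results}

# Tiny illustrative regression suite: one grounded answer, one hallucinated one.
cases = [
    {"id": "grounded", "context": "the policy covers flood damage up to 50000",
     "answer": "the policy covers flood damage"},
    {"id": "hallucinated", "context": "the policy covers flood damage up to 50000",
     "answer": "earthquakes are fully reimbursed worldwide"},
]

report = evaluation_gate(cases, threshold=0.7)
```

In a CI setting the same shape becomes a PyTest assertion on `report["passed"]`, so a hallucinating release candidate fails the build instead of reaching production.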

 

MLOps, Cloud & Observability

  • Build and maintain scalable, modular AI test automation frameworks using Python and PyTest
  • Integrate AI evaluation pipelines into CI/CD workflows (Azure DevOps, GitHub Actions)
  • Establish quality baselines, benchmarks, and regression thresholds for AI systems prior to production rollout
  • Support deployment validation for AI workloads on Azure, AWS, or GCP
  • Contribute to AI observability by validating evaluation metrics, monitoring signals, and responsible AI guardrails
  • Ensure AI systems meet enterprise standards for reliability, traceability, and reproducibility
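As a rough illustration of the "quality baselines, benchmarks, and regression thresholds" bullet, the following sketch compares a candidate release against stored baseline metrics and blocks promotion when any metric degrades beyond its tolerance. The metric names, baseline values, and tolerances are hypothetical; a real pipeline would load them from a metrics store rather than hard-coding them.

```python
# Hypothetical baseline captured at the last approved release.
BASELINE = {"faithfulness": 0.92, "answer_relevancy": 0.88, "toxicity": 0.01}

# Maximum allowed degradation per metric before the build is blocked.
TOLERANCE = {"faithfulness": 0.03, "answer_relevancy": 0.05, "toxicity": 0.0}

# Quality metrics are "higher is better"; toxicity is "lower is better".
LOWER_IS_BETTER = {"toxicity"}

def check_regression(current: dict) -> list[str]:
    """Return the metrics in the candidate run that breach the regression threshold."""
    failures = []
    for metric, baseline in BASELINE.items():
        value = current[metric]
        if metric in LOWER_IS_BETTER:
            if value > baseline + TOLERANCE[metric]:
                failures.append(metric)
        elif value < baseline - TOLERANCE[metric]:
            failures.append(metric)
    return failures

# A candidate release: relevancy dropped too far and toxicity crept up.
candidate = {"faithfulness": 0.91, "answer_relevancy": 0.80, "toxicity": 0.02}
failures = check_regression(candidate)
```

Wired into Azure DevOps or GitHub Actions as a test step, a non-empty `failures` list fails the job, making the regression threshold an enforced quality gate rather than a manual review item.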

 

Team Leadership & Collaboration

  • Mentor QE and AI engineers on GenAI quality engineering best practices
  • Drive quality awareness across engineering teams, shifting AI validation from implicit to governed
  • Participate in sprint planning, testing cycles, release sign-off, and hypercare for AI deployments
  • Communicate evaluation findings, risks, and recommendations clearly to both technical and business stakeholders
  • Stay current with advancements in GenAI evaluation frameworks, LLM testing patterns, and Responsible AI practices

 

Required Skills

  • 5–7+ years of hands-on experience in AI Evaluation Engineering, combining expertise in Quality Engineering and Test Automation for AI and GenAI systems
  • Strong proficiency in Python, with deep experience building PyTest based automation frameworks
  • Hands-on experience with DeepEval and RAGAs for LLM and RAG evaluation
  • Solid understanding of LLM behavior, prompt engineering concepts, and RAG architectures
  • Experience testing web-based AI applications, APIs, and backend services
  • Strong analytical skills to reason about non-deterministic and probabilistic AI outputs
  • Working knowledge of cloud platforms (Azure / AWS / GCP) and CI/CD pipelines; Azure experience is a must

 

Good to Have

  • Backend development exposure (FastAPI, Flask, RESTful services)
  • Experience integrating QE pipelines with Azure DevOps
  • Familiarity with LLMOps / MLOps concepts such as model versioning and rollout validation
  • Exposure to Responsible AI frameworks, safety testing, and governance controls
  • Experience working in AI Factory, POD-based, or agile delivery models
  • Exposure to DevOps tools such as Terraform, GitHub Actions, Docker

 

Preferred Qualifications

  • Prior experience supporting multiple enterprise AI project deployments across domains with the Azure AI stack (including Azure AI Foundry) is a plus
  • Certifications in Azure AI, AWS Machine Learning, or GCP Professional ML Engineer are a plus
  • Certifications or hands-on experience in MLOps, Kubernetes, or DevOps for AI

 

Education

Degree: Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Data Science, Engineering, Mathematics, or a related field, or equivalent practical experience

 

What we offer

EY Global Delivery Services (GDS) is a dynamic and truly global delivery network. We work across six locations – Argentina, China, India, the Philippines, Poland and the UK – and with teams from all EY service lines, geographies and sectors, playing a vital role in the delivery of the EY growth strategy. From accountants to coders to advisory consultants, we offer a wide variety of fulfilling career opportunities that span all business disciplines. In GDS, you will collaborate with EY teams on exciting projects and work with well-known brands from across the globe. We’ll introduce you to an ever-expanding ecosystem of people, learning, skills and insights that will stay with you throughout your career.

  • Continuous learning: You’ll develop the mindset and skills to navigate whatever comes next.
  • Success as defined by you: We’ll provide the tools and flexibility, so you can make a meaningful impact, your way.
  • Transformative leadership: We’ll give you the insights, coaching and confidence to be the leader the world needs.
  • Diverse and inclusive culture: You’ll be embraced for who you are and empowered to use your voice to help others find theirs.

EY | Building a better working world

EY is building a better working world by creating new value for clients, people, society and the planet, while building trust in capital markets.

Enabled by data, AI and advanced technology, EY teams help clients shape the future with confidence and develop answers for the most pressing issues of today and tomorrow.

EY teams work across a full spectrum of services in assurance, consulting, tax, strategy and transactions. Fueled by sector insights, a globally connected, multi-disciplinary network and diverse ecosystem partners, EY teams can provide services in more than 150 countries and territories.
