TechOps - DE - CloudOps - Infra - Manager

Location: Chennai

Other locations: Primary Location Only

Salary: Competitive

Date: Mar 17, 2026

Job description

Requisition ID: 1678818

At EY, you’ll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture and technology to become the best version of you. And we’re counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all.

Infra Support Manager

The opportunity

As an Infra Support Manager, you will lead a high-performing team of Infra and SRE engineers responsible for delivering stable, secure, resilient, and automated cloud and on-premises environments.
You will be accountable for operational excellence, engineering maturity, stakeholder engagement, and strategic improvements across infrastructure platforms.

You will drive reliability, automation, proactive risk management, and continuous improvement initiatives while empowering your team through coaching, mentoring, and performance excellence.

Your key responsibilities

Leadership & People Management

Lead, mentor, and grow a team of Senior and Junior Infra/SRE engineers.
Own workforce planning, skill development, hiring, performance reviews, and succession planning.
Promote a culture of accountability, collaboration, and continuous learning.
Provide clear direction, remove blockers, and ensure teams deliver high‑quality outcomes.

Operational Ownership & Service Reliability

Own end‑to‑end uptime, performance, and reliability metrics across supported infrastructure.
Define, track, and report SLOs/SLIs, operational KPIs, observability maturity, and incident metrics.
Lead major incident bridges, coordinate cross‑functional resolution, and ensure strong RCAs.
Build proactive maintenance plans, reliability roadmaps, and ops maturity frameworks.

Strategic Planning & Improvements

Identify long‑term operational risks, tech debt, and improvement areas; create mitigation plans.
Drive an automation-first strategy across provisioning, patching, monitoring, and compliance.
Lead modernization initiatives such as cloud migration, DevOps adoption, and infra-as-code.
Partner with architecture teams for platform decisions, capacity planning, and future scaling.

Stakeholder & Cross‑Functional Collaboration

Work closely with application owners, environment management teams, cloud teams, cybersecurity, and leadership.
Act as primary contact for service health, escalations, operational reviews, and client interactions.
Present operational updates, dashboards, improvement actions, and reliability insights to leadership.

Governance, Compliance & Process Excellence

Own adherence to ITSM processes (Incident, Problem, and Change).
Ensure audit readiness, risk controls, and compliance across environments.
Standardize runbooks, SOPs, and operational documentation across the team.
Drive maturity in monitoring, alerting hygiene, RCA quality, and patching/vulnerability compliance.

Innovation, Automation & Engineering Excellence

Advocate for modern engineering practices (SRE, DevOps, GitOps, IaC).
Sponsor automation initiatives that eliminate toil and improve speed/quality.
Encourage PoCs, new tool evaluations, and adoption of best‑in‑class observability solutions.

Skills and attributes for success

Leadership & Influence

Strong people‑management, mentoring, and coaching abilities.
Ability to inspire teams, build trust, and drive results through empowerment.
Skilled in conflict management, team motivation, and performance alignment.

Technical Depth

Strong understanding of Linux/Unix, Cloud (Azure/AWS/GCP), networking, monitoring, and SRE concepts.
Ability to review technical designs, challenge solutions, and provide SME guidance.
Comfortable with automation standards, code reviews, and architecture discussions.

Strategic Thinking

Ability to balance operational needs with long-term engineering improvements.
Skilled in planning, risk management, prioritization, and capacity forecasting.

Operational Excellence

Strong experience with ITSM, MIM leadership, RCA governance, and operational KPIs.
Ability to drive reliability improvements and uplift monitoring/observability.

Communication & Collaboration

Excellent written and verbal communication.
Skilled in presenting insights, leading discussions, and handling stakeholder escalations.
Ability to influence without authority and coordinate across global teams.

Behavioral Strengths

High ownership, proactive mindset, and calmness under pressure.
Structured problem solver with a process‑driven approach.
Operates with integrity, transparency, and a customer‑centric mindset.

To qualify for the role, you must have

9–14 years of experience in Infra Support, Cloud Operations, DevOps, or SRE
At least 3–5 years of people-management or technical‑leadership experience
Strong expertise in Linux/Unix, Cloud Infrastructure (Azure/AWS/GCP), automation & monitoring
Proven experience leading MIM bridges, RCAs, operational governance, and stakeholder interactions
Hands-on exposure to automation, configuration management (Ansible), GitOps or IaC practices
Strong understanding of SRE frameworks (SLIs/SLOs/error budgets)
Experience managing patching, compliance, and vulnerability remediation programs
Ability to handle global teams in a follow‑the‑sun or 24x7 model
Strong communication, decision-making, and leadership capabilities
No location constraints; ability to work across time zones
Experience leading cross‑functional technical discussions and providing SME guidance during design, automation, and troubleshooting sessions
Proven ability to drive end‑to‑end operational improvements, including reducing MTTR, improving automation coverage, and strengthening monitoring maturity
Strong experience in planning and executing environment upgrades, migrations, and lifecycle activities with minimal downtime
Ability to create high‑quality technical documentation such as SOPs, runbooks, RCA reports, and automation playbooks
Hands‑on experience in evaluating new tools, technologies, and cloud-native solutions to recommend improvements in stability, cost, and operational efficiency

Technologies and Tools

Must have

Advanced cloud expertise in Azure/AWS/GCP, including governance, cost optimization, IAM, networking, and platform reliability.
Experience managing multi‑region, multi‑environment cloud infrastructures (production + non‑prod)
Deep Linux/Unix expertise with ability to guide teams on performance, debugging, kernel tuning & security hardening.
Strong understanding of enterprise networking — routing, firewalls, load balancers, VPN, proxies, DNS architecture.
SRE Practitioner certifications
Hands‑on experience with monitoring & observability ecosystems — Splunk, Prometheus, Grafana, ELK, Dynatrace, App Insights, CloudWatch, etc.
Ability to define monitoring standards, alert hygiene policies, dashboards, and service health KPIs
Strong automation skills using Ansible, Git/GitHub, Python/Shell/Bash, CI/CD tools.
Experience implementing infrastructure-as-code or GitOps practices (Terraform, ARM/Bicep, CloudFormation exposure preferred).
Strong understanding of the patching, compliance, security, and vulnerability lifecycle, including interaction with Cyber/CTO teams.
Hands‑on experience with ITSM tools – ServiceNow, Jira, Azure DevOps; ability to define processes and governance.
Experience with virtualization & storage platforms (VMware vSphere, Veeam, NetApp, SAN/NAS concepts).
Understanding of backup & DR planning, replication strategies, RPO/RTO design.
Ability to evaluate new tools for observability, automation, compliance, or productivity.
Good command of log analysis, event correlation, and root cause isolation.
Experience leading environment migrations, upgrades, modernization, or cloud transformation projects.

Good to have

Cloud Architect or DevOps Professional certification
Hands‑on Kubernetes knowledge (AKS/EKS/GKE) and container runtime concepts
Knowledge of service mesh, ingress controllers, and container security
Exposure to configuration‑drift tools like Puppet/Chef/SaltStack
Familiarity with Zero Trust Security, MFA, identity governance, and cloud security tools
Knowledge of SOAR/SIEM integrations in operational contexts
Experience with FinOps practices — cost forecasting, budgeting, optimization reports
Familiarity with ETL operations or data platform reliability (if supporting data workloads)
Experience with SLO/SLI frameworks, reliability modeling, and capacity forecasting tools
Knowledge of secrets management systems (Vault, Azure Key Vault, AWS KMS)
Experience with workflow automation using ServiceNow, Logic Apps, Power Automate, Rundeck, Jenkins, or GitHub Actions
Familiarity with compliance frameworks (ISO 27001, SOC2, NIST, CIS Benchmarks)
Hands-on experience with log pipelines, agents, forwarders, and distributed tracing tools
Understanding of API gateways, reverse proxies, and integration services
Experience guiding teams in code review, standards enforcement, and automation best practices

What we look for

Enthusiastic learners with a passion for cloud technologies and DevOps practices.
Problem solvers with a proactive approach to troubleshooting and optimization.
Team players who can collaborate effectively in a remote or hybrid work environment.
Detail-oriented professionals with strong documentation skills.

What we offer

EY Global Delivery Services (GDS) is a dynamic and truly global delivery network. We work across six locations – Argentina, China, India, the Philippines, Poland and the UK – and with teams from all EY service lines, geographies and sectors, playing a vital role in the delivery of the EY growth strategy. From accountants to coders to advisory consultants, we offer a wide variety of fulfilling career opportunities that span all business disciplines. In GDS, you will collaborate with EY teams on exciting projects and work with well-known brands from across the globe. We’ll introduce you to an ever-expanding ecosystem of people, learning, skills and insights that will stay with you throughout your career.

Continuous learning: You’ll develop the mindset and skills to navigate whatever comes next.
Success as defined by you: We’ll provide the tools and flexibility, so you can make a meaningful impact, your way.
Transformative leadership: We’ll give you the insights, coaching and confidence to be the leader the world needs.
Diverse and inclusive culture: You’ll be embraced for who you are and empowered to use your voice to help others find theirs.

EY | Building a better working world

EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.

Enabled by data and technology, diverse EY teams in over 150 countries provide trust through assurance and help clients grow, transform and operate.

Working across assurance, consulting, law, strategy, tax and transactions, EY teams ask better questions to find new answers for the complex issues facing our world today.
