TTT | DevOps SRE - EY Global Delivery Service
Job description
The EY Foundation teams develop systems and infrastructure for the Reporting & Analysis Platform for Tax and Other Regulations (RAPToR). Our work supports EY's software developers in creating key products for clients.
We seek dedicated Site Reliability Engineers (SREs) to maintain our high service standards. Our services are designed for global scalability, continuous availability, and seamless operation.
The SRE role involves managing and improving our Azure Cloud infrastructure, ensuring our applications are reliable, efficient, and scalable. Key responsibilities include system monitoring, issue resolution, process automation, and collaborating with development teams on cloud operations best practices. Proficiency in Azure, Infrastructure as Code (IaC), CI/CD pipelines, and cloud security is essential.
This role is ideal for those passionate about building and managing systems that benefit thousands of customers. Join us to contribute to reliable, high-performing services.
Key Qualifications
- Azure Cloud Expertise: Extensive knowledge of Azure services, including Azure Virtual Machines, Azure App Service, Azure Kubernetes Service (AKS), Azure SQL Database, Azure Storage, Web Application Firewall (WAF), API Management, Service Bus, Event Hubs, and Log Analytics. Skilled in designing and managing scalable, reliable, and secure cloud infrastructure. Expertise in Kusto Query Language (KQL) queries and familiarity with microservices architecture and container orchestration using Kubernetes and Azure Container Apps (ACA).
- Infrastructure as Code (IaC): Proficient in automating the deployment and management of Azure resources using Azure Resource Manager (ARM) templates, Terraform, and Azure Bicep.
- CI/CD Pipelines: Strong experience in building and managing CI/CD pipelines with Azure DevOps, GitHub Actions, or Jenkins. Capable of automating code testing, integration, and deployment processes to ensure smooth and efficient software delivery.
- Monitoring and Observability: Skilled in using Azure Monitor, Application Insights, and Log Analytics to monitor application performance, identify issues, and ensure system reliability.
- Automation and Scripting: Proficiency in scripting languages such as PowerShell, Python, or Bash to automate operational tasks and enhance system efficiency.
- Security and Compliance: In-depth understanding of Azure security best practices, including Identity and Access Management (IAM), Azure Policy, and Microsoft Defender for Cloud (formerly Azure Security Center), to ensure compliance and protect cloud environments.
- Disaster Recovery and Backup: Experience in designing and implementing backup, disaster recovery, and business continuity plans using Azure Backup, Azure Site Recovery, and other relevant services.
- Collaboration and Communication: Ability to collaborate closely with development teams, architects, and stakeholders to integrate DevOps practices, provide technical guidance, and ensure seamless operation of cloud-based systems.
- Software Development: Proficient in designing, authoring, and releasing code in .NET and C#.
- Problem-Solving: Excellent troubleshooting and problem-solving skills.
- Additional Skills: Experience with scale testing and capacity planning. Skilled in creating dashboards and using Power BI for data visualization.
Description
EY's RAPToR platform is a distributed, cloud-based ecosystem running on Microsoft Azure. It serves a diverse client base across multiple regions, which presents distinct challenges. As a Site Reliability Engineer (SRE) at EY, you will address these challenges through analytical problem-solving, collaboration, and technical skill. SREs are responsible for the RAPToR platform's entire production stack, covering both application functionality and infrastructure resilience.
The RAPToR platform is built on a microservices architecture and uses a combination of open-source, proprietary, and custom-developed tools for provisioning, deployment, logging, and monitoring. In this role, you will master these technologies and drive improvements to them. Our team works cooperatively with development teams to get the best outcomes for EY. On each engineering problem, we aim to balance the optimal solution against pragmatic execution.
Education & Experience
BS/MS in Computer Science, or equivalent experience in software development or production operations in a large-scale environment.
Additional Requirements
Willingness to participate in an on-call rotation.