Senior Site Reliability Engineer


At the core of Epic’s success are talented, passionate people. Epic prides itself on creating a collaborative, welcoming, and creative environment. Whether it’s building award-winning games or crafting engine technology that enables others to make visually stunning interactive experiences, we’re always innovating.

Being Epic means being a part of a team that continually strives to do right by our community and users. We’re constantly innovating to raise the bar of engine and game development.

Cloud Infrastructure

What We Do

Our team’s mission is to keep our games and platform up and running.

Post Incident Review - Systems always find interesting ways to not work as we expect. We focus on learning from these production surprises and improving our systems and processes to be more reliable over time. We work with a diverse set of development teams to help them understand incidents.

Production, Event and Launch Readiness - We run large-scale production events and work with many teams on readiness and operational excellence. We own the process and review for service and product launches and game events.

Development focused on Reliability - While we help with incidents and readiness, we also engineer tooling, services, and other systems and processes that improve the reliability of our systems.


What You'll Do

In the role of a Site Reliability Engineer, you will tackle problems that impact the reliability of our products as a whole. Part of this role is analyzing gaps or risk areas for our products and working with engineering to determine the best course of action. You will participate in post incident reviews, readiness programs, and engineering and development efforts. This role calls for breadth over depth, with one exception: depth in building and running reliable systems.


At Epic we embrace a Service Owner (You build it, you run it) mentality. In this role we are stewards of operational excellence and service owners for the tools, systems and services that we build.


In This Role You Will

  • Write code and develop systems and services that help us with operational excellence. Most of our tools will require web interfaces and APIs.

  • Contribute to services, tools and code across the organization that focus on our team goals.

  • Help develop best practices across our organization, along with tools that help us distribute them.

  • Work with development teams to understand their systems and help them succeed with service ownership.

  • Work on cloud-based services in AWS.


What We're Looking For

  • You have experience working cross-functionally or across a large number of teams in an organization.

  • You have experience working with and building reliable services on AWS.

  • You have a passion for the reliability engineering space.

  • Strong preference for candidates who are already in, or are willing to relocate to, Cary, NC or Seattle, WA.


Epic Games spans 12 countries with 32 studios and 1,800+ employees globally. For over 25 years, we’ve been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before. Epic’s award-winning Unreal Engine technology not only gives game developers the ability to build high-fidelity, interactive experiences for PC, console, mobile, and VR; it is also a tool being embraced by content creators across a variety of industries such as media and entertainment, automotive, and architectural design. As we continue to build our Engine technology and develop remarkable games, we strive to build teams of world-class talent.

Like what you hear? Come be a part of something Epic!

Epic Games deeply values diverse teams and an inclusive work culture, and we are proud to be an Equal Opportunity employer. Learn more about our Equal Employment Opportunity (EEO) Policy here.