
Basic Information

Job Name
Sr Site Reliability Engineer - USA
Country
United States
State
NA
Date Published
01-Apr-2026
Job ID
46551
Travel
You may occasionally be required to travel for business
Additional Locations
Detroit - Michigan, Houston - Texas, San Francisco - California, New York - New York, Washington - DC
This role can be based remotely in the United States.

Description and Requirements

This Is Helix. Powered by You.

At BMC Helix, we don’t do ordinary. We’re the AI-native engine behind the world’s most forward-thinking IT organizations, helping them focus on what matters most. What are we passionate about? We're here to reset the economics of enterprise IT and help others realize the ROI of AI.

We are a mix of curious minds, creative thinkers, and courageous builders who believe tech should change the game—not just play it. We celebrate wins, support each other, and laugh a lot.

We are the change makers. With decades of leadership and established trust in IT service and operations management, we’re scaling with purpose—through organic innovation, strategic acquisitions, and relentless R&D. Our open-first Agentic AI platform empowers autonomous agents to drive real outcomes with speed, accountability, and precision.

We are laser-focused on delivering real value to our customers by accelerating innovation and the application of agentic AI in digital service and operations management for IT organizations around the world.

Our SaaS Ops department focuses on delivering SaaS excellence and a great SaaS experience for our customers. We continuously grow by adopting cutting-edge technologies and investing in innovation. Our team is a global and versatile group of professionals, so if you’re looking for a place where your ideas will be heard, this is the place for you!

Thanks to our ongoing expansion, we have the opportunity to grow our Site Reliability Engineering team. We’re looking for people who are as passionate about solving issues in distributed systems as they are about automating, coding, and collaborating to tackle problems.

We are seeking a highly skilled Performance SRE Engineer to join our team. This role focuses on ensuring the reliability, scalability, and efficiency of our systems through proactive performance optimization, chaos engineering, and cost management initiatives. The ideal candidate will have strong expertise in Kubernetes administration, monitoring enablement, automation, and FinOps practices, with a passion for building highly reliable and efficient systems.

Here is how, through this exciting role, YOU will contribute to BMC's and your own success:

  • System Reliability & Resilience: Maintain uptime of systems and applications within agreed SLOs and reduce Mean Time to Recovery (MTTR).
  • Chaos Testing: Design and execute chaos experiments to validate system reliability and fault tolerance. Collaborate with R&D and Operations teams to identify weaknesses and improve system resilience.
  • Performance Optimization: Analyze system performance metrics and identify bottlenecks across infrastructure and applications. Implement tuning strategies for Kubernetes clusters, workloads, and cloud resources.
  • FinOps Initiatives: Drive cloud cost optimization strategies and implement FinOps best practices. Monitor and report on resource utilization and cost trends, ensuring alignment with business goals and objectives.
  • Monitoring Enablement: Develop and improve observability across the environment, including proactive alerts derived from hyperscaler and Kubernetes metrics, logs, and traces. Enable dashboards for performance, reliability and FinOps KPIs.
  • Kubernetes Administration: Manage and optimize Kubernetes clusters, including upgrades, scaling, and architecture improvements. Implement best practices for container orchestration and workload scheduling.
  • Collaboration: Work closely with DevOps, engineering, and product teams to ensure performance and reliability goals are met. Document processes, standards, and findings for continuous improvement.

To ensure you’re set up for success, you will bring the following skillset & experience:

  • Strong experience with Kubernetes administration and container orchestration.
  • Hands-on experience with chaos engineering tools and implementation of best practices.
  • Proficiency in cloud platforms (AWS, OCI, or GCP) and cost optimization strategies.
  • Familiarity with performance testing tools (e.g., JMeter, Locust, k6).
  • Expertise with core DevOps and SRE technologies such as Ansible, Docker, Kubernetes, Helm, Jenkins, and Terraform, including infrastructure as code (IaC) via Terraform.
  • Ability to review recurring incidents, identify improvement and automation opportunities, and collaborate with product feature development teams.

Why Work Here? Because You’ll Matter.

We’re not hiring for roles—we’re hiring for impact. At Helix, you’ll solve hard problems, build smart solutions, and work with people who challenge and champion you. You’ll see your ideas come to life—and your work make a difference.

We believe in trust, transparency, and grit. Our culture is inclusive, flexible, and built for people who want to stretch themselves and support others doing the same. Whether you’re remote or in-office, you’ll find space to show up fully and contribute meaningfully. You won’t be boxed in; you’ll be backed up.

Make Your Mark At Helix

If Helix excites you but you're unsure if you meet every qualification, apply anyway. We value diverse perspectives and believe the best ideas come from everywhere.

EEOC Statement

Helix is committed to equal opportunity employment regardless of race, age, sex, creed, color, religion, citizenship status, sexual orientation, gender, gender expression, gender identity, national origin, disability, marital status, pregnancy, or status as a disabled veteran or protected veteran. If you need a reasonable accommodation for any part of the application and hiring process, visit the accommodation request page.

BMC Software maintains a strict policy of not requesting any form of payment in exchange for employment opportunities, upholding a fair and ethical hiring process.

The annual base salary range represents the low and high end of the BMC salary range for this position. Actual salaries depend on a wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training, licensure, and certifications; and other business and organizational needs. 

The range listed is just one component of BMC's employee compensation package. Other rewards may include a variable plan and country specific benefits.

At BMC, it is not typical for an individual to be hired at or near the top of the range. A reasonable estimate of the current range is $111,525 - $185,875.

Min salary
111,525
Mid point salary
148,700
Max salary
185,875