Site Reliability Engineer

Basic Information

Country

Israel

State

NA

City

Tel Aviv

Date Published

13-Jun-2021

Job ID

30370

Travel Amount

None

Description and Requirements

BMC is looking for a Site Reliability Engineer to join our SaaS production and delivery team to operate our SaaS services and assure their availability. The production team serves as the center of BMC's advanced SaaS platform and provides fast 24x7 response to ensure maximum service availability and performance.

As part of the SRE team, you will focus on production ownership, introducing new technologies and systems, and taking our production excellence and offering to the next level. Along with modern monitoring solutions such as Datadog, you will use the latest technologies and services, such as K8S, MSK, SQS, and various databases.


In this role you will:

  • Own operational responsibility for the application and platform layers under a critical SLA.
  • Oversee and own overall production: deployments, maintenance, and enhancements.
  • Ensure high availability of the production SaaS platform, working with various teams.
  • Manage production and serve as its technical focal point.
  • Improve our systems, deployments, operations, and overall cloud activities.
  • Take responsibility for deployment, tools, troubleshooting, and performance tuning.
  • Manage incidents and guide our 24x7 SRE team in understanding their importance.
  • Focus on root cause analysis, preventive measures, and knowledge transfer.
  • Develop and maintain processes, documentation, and automation.
  • Support platform maintenance and testing initiatives.


What we are looking for:

  • BSc in engineering or equivalent experience
  • At least 2 years’ experience running solutions in production (e.g., technical lead, L3 support).
  • Hands-on approach with strong troubleshooting and problem-solving skills.
  • Experience with a cloud provider (AWS preferred) is a must.
  • Experience with Docker and K8S.
  • Practical experience with production incident management principles.
  • Understanding of CI/CD concepts and tools (Jenkins, Bitbucket, GitLab).
  • UNIX/Linux experience and system administration knowledge.
  • Experience with major monitoring solutions (e.g. Datadog, logz.io, Prometheus).
  • Ability to write and maintain technical documents and standard operating procedures.
  • Strong verbal and written communication skills in English and Hebrew.
  • Team player with a get-stuff-done attitude and self-learning skills.


It would be an advantage for you to have:

  • Development background and automation scripting experience.
  • Experience with Terraform, Ansible, and Helm.
  • Understanding of security and networking of production environments.


From core to cloud to edge, BMC delivers the software and services that enable over 10,000 global customers, including 84% of the Forbes Global 100, to thrive in their ongoing evolution to an Autonomous Digital Enterprise.
It is the policy of BMC Software to afford equal opportunity for employment to all individuals regardless of race, color, creed, sex, age, sexual orientation, national origin, disability, ancestry, citizenship status, political affiliation, religion, gender, transgender, gender identity, gender expression, marital status, status as a parent, disabled veteran or status as a protected veteran, genetic information or other factors prohibited by law, and to prohibit harassment or retaliation based on any of these factors.

If you need a reasonable accommodation for any part of the application and hiring process, visit the accommodation request page.