Role: Site Reliability Engineer (SRE)
Location: Palo Alto, CA (Onsite from Day 1)
Job Type: Contract (W2)
Skill Matrix:
Name                    Required
Programming             Yes
SRE                     Yes
Grafana                 Yes
Prometheus              Yes
AWS                     Yes
Cloud Infrastructure    Yes
Linux                   Yes
UNIX                    Yes
Top skills required for this role:
Programming: Proficiency in languages like Python, Java, or Go.
System Administration: Strong understanding of Linux/Unix systems.
Cloud Infrastructure: Experience with AWS.
Infrastructure as Code (IaC): Knowledge of tools like Terraform or Ansible.
Monitoring Tools: Proficiency with tools such as Prometheus, Grafana, or Datadog.
Job Description/Responsibilities:
Automation and Tooling: Writing code to automate operational tasks such as provisioning, configuration changes, and system updates to reduce manual work and human error.
System Monitoring and Alerting: Developing and maintaining observability stacks (logs, metrics, tracing) to proactively detect issues before they impact users.
Incident Response and On-Call: Managing a 24/7 on-call rotation to respond to, troubleshoot, and resolve production incidents.
Post-Incident Reviews (Postmortems): Conducting blameless, in-depth reviews of incidents to identify root causes and implement preventive measures.
Capacity Planning: Analyzing system resource utilization to ensure infrastructure can scale to handle future load requirements.
Performance Optimization: Identifying and fixing bottlenecks in software and infrastructure to improve system efficiency and responsiveness.
Error Budget Management: Setting and managing Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to determine if a service is reliable enough to allow new feature deployments.
Chaos Engineering: Testing system resilience by intentionally introducing failures to ensure systems are fault-tolerant.
Years of Experience: 8 years