Director, Site Reliability Engineering at Element Fleet Management
  • Company
    Element Fleet Management
  • Location
    Mississauga, Ontario, Canada
  • Type
    Full-time
  • Date Posted
    January 2, 2025
**Job Title: Director, Site Reliability Engineering**

**Company: Element Fleet Management**

**Location: Mississauga, Ontario, Canada**

**About Us:**
Element Fleet Management is redefining the fleet management industry with a people-first approach, delivering exceptional client experiences every day. We are the largest pure-play fleet manager in the world, committed to fostering a culture where every employee can make a difference.

**Position Summary:**
We are seeking a Director, Site Reliability Engineering to lead our SRE team. This role requires strong customer focus, adaptability, and a proactive approach to problem-solving. You will collaborate with cross-functional teams to implement SRE practices, minimize downtime, and drive automation for increased efficiency.

**Key Responsibilities:**

- **Team Leadership and Development:**
  - Hire, mentor, and develop a high-performing SRE team.
  - Foster collaboration and provide ongoing training opportunities.

- **Incident Management and Response:**
  - Lead incident response efforts and coordinate with stakeholders for timely resolutions.
  - Conduct post-mortems to identify and implement preventive measures.

- **Problem Management:**
  - Analyze and address root causes in applications and systems.
  - Establish processes for tracking and resolving long-term problems.

- **Change Management and Release Engineering:**
  - Oversee safe and reliable release management practices.
  - Collaborate with development and QA teams to optimize deployment pipelines.

- **Service Level Objectives (SLOs) and SLAs:**
  - Establish and monitor SLOs and SLIs in alignment with business needs.

- **Monitoring, Alerting, and Reporting:**
  - Maintain monitoring solutions for system health and performance, and identify areas for improvement.

- **Automation and Tooling:**
  - Drive the adoption of automation to improve efficiency and reduce manual intervention.

- **Capacity Planning and Disaster Recovery:**
  - Conduct capacity planning and manage resources for system demands.

- **Audit and Compliance:**
  - Collaborate with audit teams to ensure compliance with regulatory requirements.

- **Vendor Management:**
  - Manage vendor relationships to ensure performance and service level agreements are met.

**Requirements:**

- Bachelor's degree in computer science, engineering, or a related field; advanced degree preferred.
- 10+ years of experience in IT operations, SRE, or a related field.
- In-depth knowledge of cloud infrastructure (AWS, Azure, or GCP), containerization, and infrastructure as code.
- Strong understanding of SRE principles and practices.
- Experience with automation and CI/CD tools.
- Proficiency in observability tools and monitoring systems.
- Strong incident response and post-incident analysis skills.
- Scripting and programming skills in languages such as Python, Go, or Bash.
- Familiarity with regulatory compliance frameworks.

**Preferred Skills:**

- Relevant certifications (e.g., Google Cloud, AWS DevOps, ITIL).
- Experience with advanced SRE practices.
- Strong project management skills.

**Compensation:**
The hiring base salary range for this position is $162,700 - $223,700 annually. Actual compensation will be based on knowledge, skills, experience, and market data.

**Benefits:**
- Comprehensive health and welfare benefits
- Paid time-off programs

Element Fleet Management is an equal opportunity employer committed to diversity, equity, inclusion, and belonging. We welcome all qualified applicants without regard to race, gender identity, age, sexual orientation, disability, or any other legally protected factors.

**For Accommodations:**
If you require assistance during the hiring process, please contact us at talentacquisition@elementcorp.com or call (800) 665-9744.
