Dropbox - Site Reliability Engineering Manager

Dropbox is the home for your most important stuff—now we're bringing it to life with a growing family of products. As we scale our global brand, there’s plenty of space for you to grow alongside us and simplify life for millions of people around the world.

Our engineering team is architecting a family of products that handle over a billion files a day. We take on the complexities of technology that affect everyday life, so that people can get back to living and doing their best work.

Dropbox's Site Reliability Engineering (SRE) team is a hybrid software/systems group that works with traditional software engineering, capacity engineering, and infrastructure teams to ensure that Dropbox runs smoothly. Managing an SRE team requires a high degree of technical mastery, the ability to prioritize ruthlessly and execute, and a focus on growing teams through both recruiting and mentorship.

Manage engineers working with infrastructure and product engineering teams. Example services may include our metadata storage infrastructure running on MySQL, Go-based processes serving as RPC systems for front-end components, or front-end components themselves, such as the Photos tab
Understand technical architectures, failure domains, tooling/automation, product launch plans, disaster recovery/business continuity plans, and other issues
Create plans for prioritizing technical and resourcing challenges within the infrastructure organization
Partner with product management, network engineering, product engineering, and other related groups
Help engineers develop their careers, assigning them to projects tailored to their skill levels, long-term skill development, personalities, and work styles
Build teams and work closely with Dropbox recruiting team members, which involves sourcing/engaging candidates, interviewing, organizing Dropbox participation in conferences/events, and onboarding new employees
Balance the need to "keep things running" with allocating time to long-term, high-impact projects
Assess employee performance frequently by providing feedback on an ongoing basis, address under-performance, and recognize excellent performance

BS/MS in Computer Science, Engineering, or a related technical discipline, or equivalent experience
At least three years of direct management and leadership experience at a technology company
Previous experience with hiring and performance management, including working with under-performers
Sound knowledge of Linux and TCP/IP networks
Ability to code well in at least one language
Above-average knowledge of basic large-scale internet service architectures (such as load balancing, LAMP, CDNs)
Good understanding of data durability (backups, maximum time to recovery, and generally how to avoid losing data at all costs)

Software Engineer - Product Security, San Francisco, CA

MySQL Site Reliability Engineer, San Francisco, CA

Hadoop Site Reliability Engineer, San Francisco, CA

Quality Assurance Engineer, San Francisco, CA

Site Reliability Engineer, San Francisco, CA

Software Engineer, San Francisco, CA

Software Engineer - University Grad, San Francisco, CA

Software Engineer - Computer Vision, San Francisco, CA

Software Engineer - Android, San Francisco, CA

Software Engineer - iOS, San Francisco, CA

Software Engineer - Productivity, San Francisco, CA