Staff Customer Reliability Engineer (Remote)
You have expert Linux systems and network engineering experience and advanced vSphere experience, and you’ve built, scaled and managed Kubernetes across a diverse set of use cases (edge, bare metal, cloud). You’ve been a Site Reliability Engineer in a former life and you have a passion for working directly with customers. You love to work with cutting-edge technologies, you care about building things reliably, and you have always imagined yourself sharing your deep expertise and experience with others on the team and cross-organizationally. Finally, you are experienced in working remotely within a fully distributed team and, as such, have a keen ability to communicate clearly both verbally and in writing, and you care about building cohesion among the team.
What is the primary need, technical challenge, and/or problem you will be responsible for?
We’ve built a strong Customer Reliability Engineering function with a solid customer base and are now ready to put ourselves on the map. We need someone to represent us with customers (leading customer interactions and handling highly visible technical escalations) as well as internally with Product and other Engineering teams who are building future products and want our input.
- You will work with the Director of Customer Reliability Engineering to set our technical direction and partner with colleagues at the associate, mid and senior Customer Reliability Engineering levels, as well as our Program Managers whose role is to deliver Reliability Engineering to customers.
- You will lead the team in their effort to support large scale Kubernetes clusters for customers.
- You have a fundamental understanding of how Kubernetes works, including but not limited to:
  - How to implement and maintain a CNI provider
  - High Availability cluster design and implementation
  - Cluster capacity planning and resource utilization
  - Application deployment strategies
- You will also join the CRE team’s on-call rotation, and you have a great deal of experience with incident and postmortem processes.
- You will drive improvements to our customer interactions and internal processes to reduce Mean Time to Resolution (MTTR).
- You will evaluate new technologies, concepts and use cases, building a roadmap outlining what’s next for our practice.
- You will develop and execute processes to ensure our Product Engineering teams have direct CRE input for their roadmaps.
- You will create, test, and improve customer-facing technical documentation that aligns with lessons learned from our collective experience with customers.
- You will drive assigned projects to completion, being clear when tradeoffs are needed and deadlines need to be adjusted to accommodate higher-priority work, while enabling others on the team to do the same.
- You will support the team in all levels of our work, including triaging tickets, supporting escalated issues, and driving technical challenges to completion. Once a ticket is resolved, you will synthesize what was learned into actionable follow-ups so we can prevent the issue from happening again. You will do this with MTTR and SLAs in mind at all times.
- You will build and execute on our team’s roadmap in terms of technologies, process improvements, and team enablement.
- You will independently lead Reliability and Production Readiness Reviews of customer environments, ensuring that the information collected is complete, synthesized and actioned, and continuously look for additional ways we can instill reliability practices into our customers’ environments.
- You will partner with Program Managers to ensure the challenges they see with our customer base are addressed systematically.
- You will partner with the Director of Customer Reliability Engineering in a variety of technical aspects including lending expertise on pre-sales calls, collaborating on team direction and mentoring others when needed.
- You will present in front of technical and non-technical audiences, virtually and in-person, across the globe.
- You will lend a hand in creating automations or tooling that make your colleagues more efficient.
- You will advise customers on capacity planning.
The hiring manager for this role is Megan Bigelow, Director of Customer Reliability Engineering. Megan joined VMware as a part of the Heptio acquisition in 2018. Prior to this role, Megan spent over a decade leading both large and small customer-facing engineering teams. Megan currently leads the CRE team, which consists of a mixture of Reliability Engineers and Program Managers with a shared passion for solving Kubernetes problems reliably.
What are the benefits and perks of working at VMware?
You and your loved ones will be supported with a competitive and comprehensive benefits package. Below are some highlights, or you can view the complete benefits package by visiting www.benefits.vmware.com.
- Employee Stock Purchase Plan
- Medical Coverage, Retirement, and Parental Leave Plans for All Family Types
- Generous Time Off Programs
- 40 hours of paid time to volunteer in your community
- Rethink's Neurodiversity program to support parents raising children with learning or behavior challenges, or developmental disabilities
- Financial contributions to your ongoing development (conference participation, trainings, course work, etc.)
- Healthy, locally inspired snacks in all our pantries
Category: Services and Consulting
Subcategory: Solutions Architect
Experience: Business Leadership
Full Time/Part Time: Full Time
Work From Home: Yes
Posted Date: 2020-03-18
VMware Company Overview: At VMware, we believe that software has the power to unlock new opportunities for people and our planet. We look beyond the barriers of compromise to engineer new ways to make technologies work together seamlessly. Our cloud, mobility, and security software form a flexible, consistent digital foundation for securely delivering the apps, services and experiences that are transforming business innovation around the globe. At the core of what we do are our people who deeply value execution, passion, integrity, customers, and community. Shape what’s possible today at http://careers.vmware.com.
Equal Employment Opportunity Statement: VMware is an Equal Opportunity Employer and Prohibits Discrimination and Harassment of Any Kind: VMware is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment. All employment decisions at VMware are based on business needs, job requirements and individual qualifications, without regard to race, color, religion or belief, national, social or ethnic origin, sex (including pregnancy), age, physical, mental or sensory disability, HIV Status, sexual orientation, gender identity and/or expression, marital, civil union or domestic partnership status, past or present military service, family medical history or genetic information, family or parental status, or any other status protected by the laws or regulations in the locations where we operate. VMware will not tolerate discrimination or harassment based on any of these characteristics. VMware encourages applicants of all ages. VMware will provide reasonable accommodation to employees who have protected disabilities consistent with local law.