A world-leading robotics and automation company is looking for a Site Reliability Engineer!
They are developing the world’s first enterprise-level Platform-as-a-Service (PaaS) for robots, creating a rare opportunity for an experienced, product-focused engineering professional. The PaaS aims to offer innovative features covering every part of the product lifecycle required to deliver and support consumer-facing connected machines and services.
Site Reliability Engineering combines software and systems engineering skills. Your key responsibility is to optimize existing systems, build infrastructure, and eliminate manual work through automation, making these systems more reliable and ensuring the highest possible uptime for a cloud-based robotics system.
- Support services before they go live through activities such as system design consulting, capacity planning, and launch reviews
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
- Engage in and improve the whole lifecycle of services from inception and design, through deployment, operation, and refinement
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
- Practice sustainable incident response and postmortems
- Build and evolve the operations handbook
【会社概要 | Company Details】
A growing cloud robotics company that provides a cloud robotics platform to accelerate solution development and operation.
【就業時間 | Working Hours】
Standard Working Hours: 8 hours（Core Time: 11:00 - 15:00 / Mon - Fri）
【休日休暇 | Holidays】
Saturdays, Sundays, National Holidays, Year-end and New Year Holidays, Paid Holidays, and Other Special Holidays
【待遇・福利厚生 | Services / Benefits】
Pension, Social Insurance, Medical Healthcare, Transportation Fee, etc.
【応募資格 | Requirements】
- Experience in product development and/or supporting operations
- Mastery of one or more programming languages, including but not limited to Python, Golang, Ruby, and Bash
- Familiarity with configuration management, Docker, IaaS, PaaS, continuous integration, continuous delivery, DevOps, and ChatOps
- Solid understanding of network fundamentals and practical experience troubleshooting networked services
- Demonstrated proficiency with Linux systems, public cloud platforms, and associated tools and technologies
- Experience in designing, analyzing, and troubleshooting distributed systems
- Ability to debug and optimize code and automate routine tasks