A leading robotics firm’s engineering team is developing the world’s first enterprise-level Platform-as-a-Service (PaaS) for robots, creating a rare opportunity for an experienced, product-focused engineering professional. The PaaS aims to provide innovative features covering every part of the product lifecycle required to support and deliver consumer-facing connected machines and services.
Site Reliability Engineering combines software and systems engineering skills. Your key responsibility is to optimize existing systems, build infrastructure, and eliminate manual work through automation, making the systems more reliable and ensuring the highest possible uptime for all users and developers on the platform.
- Support services before they go live through activities such as system design consulting, capacity planning, and launch reviews
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
- Practice sustainable incident response and postmortems
- Build and evolve the operations handbook
【会社概要 | Company Details】
A growing cloud robotics company that provides a cloud robotics platform to accelerate solution development and operation.
【就業時間 | Working Hours】
Flextime (Mon-Fri)
【休日休暇 | Holidays】
Saturdays, Sundays, national holidays, year-end and New Year holidays, paid holidays, and other special holidays
【待遇・福利厚生 | Services / Benefits】
Full social insurance coverage (employees' pension insurance, health insurance, workers' accident compensation insurance, employment insurance), commuting allowance, no smoking indoors in principle (outdoor smoking area available), etc.
- Experience in product development and/or supporting operations
- Mastery of one or more programming languages, including but not limited to Python, Golang, Ruby, and Bash
- Experience with algorithms, data structures, complexity analysis, and software design
- Familiarity with configuration management, Docker, IaaS, PaaS, Continuous Integration, Continuous Delivery, DevOps, and ChatOps
- Solid understanding of network fundamentals and practical experience troubleshooting networked services
- Demonstrated proficiency with Linux systems, public cloud platforms, and associated tools/technologies
- Experience with container management platforms such as Kubernetes, OpenShift, or Mesos
- Extremely organized, detail-oriented, and thorough in every undertaking
- Ability to balance multiple tasks and projects effectively and quickly adapt to new variables
- Experience designing, analyzing, and troubleshooting distributed systems
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
- Ability to debug and optimize code and automate routine tasks