This position is closed and is no longer accepting applications.

Senior Site Reliability Engineer - Team Lead

Rapyuta Robotics Koto-ku, Tokyo
    💴 No salary range given
    🏡 No remote
    🧪 Minimum years of experience unspecified
    💬 No Japanese required
    🌏 Apply from abroad
    🧳 Relocate to Japan

About Rapyuta Robotics

Building low-cost, lightweight autonomous mobile robots with high-level intelligence distributed in the cloud, and enabling such robots to offload some of their heavy computation and seamlessly learn and share experiences with one another.

Key benefits

  • Open, inclusive, safe environment
  • There's always something to learn
  • Relocation support

About the position

We are developing the world’s first enterprise-level Platform-as-a-Service (PaaS) for robots, creating a rare opportunity for an experienced, product-focused engineering professional. The PaaS aims to offer innovative features that handle every part of the product lifecycle required to support and deliver consumer-facing connected machines and services.

Site Reliability Engineering combines the skills of software and systems engineering. Your key responsibility is to optimize existing systems, build infrastructure, and eliminate manual work through automation, making these systems more reliable and ensuring the highest possible uptime for a cloud-based robotics system.

Your responsibilities

  • Leading the SRE team, mentoring junior engineers and supporting delivery excellence
  • Supporting services before they go live through activities such as system design consulting, capacity planning, and launch reviews
  • Maintaining services once they are live by measuring and monitoring availability, latency, and overall system health
  • Engaging in and improving the whole lifecycle of services, from inception and design through deployment, operation, and refinement
  • Scaling systems sustainably through mechanisms like automation, and evolving systems by pushing for changes that improve reliability and velocity
  • Practicing sustainable incident response and postmortems
  • Building and evolving the operations handbook

Minimum qualifications

  • Bachelor’s degree in Computer Science or a similar technical field of study, or equivalent practical experience with an outstanding track record
  • At least 5 years of experience in product development and/or supporting operations
  • Mastery of one or more programming languages, including but not limited to Python, Golang, Ruby, and Bash
  • Expertise with Configuration Management, Docker, IaaS, PaaS, Continuous Delivery, Continuous Integration, DevOps, ChatOps
  • Solid understanding of network fundamentals and practical experience troubleshooting networked services
  • Demonstrated proficiency with Linux systems, public cloud platforms, and associated tools/technologies

Preferred qualifications

  • Extremely organized, detail-oriented, and thorough in every undertaking
  • Ability to balance multiple tasks and projects effectively and quickly adapt to new variables
  • Experience in designing, analyzing and troubleshooting distributed systems
  • Experience with team management
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
  • Ability to debug and optimize code and automate routine tasks
