As a Senior SRE, you will develop the key processes and procedures that enable the smooth planning, execution, and delivery of our products, and you will implement the requirements that support them. While our company aims to use AI to solve societal issues, your mission will be to solve the problems of our engineering department!
Responsibilities
- Lead the design, implementation, and management of scalable and reliable infrastructure solutions in public cloud environments (e.g., AWS).
- Lead the development and maintenance of Kubernetes clusters, ensuring optimal performance, availability, and security.
- Collaborate with development teams as a trusted advisor: provide expertise in architecture design, consult on infrastructure-related matters, and guide teams toward effective and scalable solutions.
- Monitor system performance, troubleshoot complex issues, and implement proactive measures to ensure high availability and reliability.
- Lead incident response and resolution, conducting post-mortem analyses to identify areas for improvement.
- Lead the professional development initiatives within the team by mentoring junior members, conducting comprehensive code reviews to uphold quality and best practices, and orchestrating training and workshops to enhance overall skill sets.
Requirements
- Extensive expertise in at least one cloud platform (e.g., AWS, Azure, GCP) and experience in designing and leading the management of scalable cloud-based infrastructure
- Strong expertise in infrastructure-as-code solutions such as Terraform
- Strong operational expertise in containerization technologies, especially Kubernetes
- In-depth knowledge of source control, CI/CD, infrastructure automation, orchestration, deployment automation and configuration management
- Solid understanding of networking and security best practices
- Excellent problem-solving skills and the ability to lead collaboratively in a team-oriented environment.
- While our team is mostly English-speaking, you should be comfortable speaking Japanese with other internal stakeholders
Nice to haves
While not strictly required, please tell us if you have any of the following.
- AWS Solutions Architect certifications or equivalent knowledge
- Certified Kubernetes Administrator certification or equivalent knowledge
- Familiarity with scripting languages (Shell, Python, Golang)
- Familiarity with extended infrastructure-related tooling such as Ansible or Chef
- Experience in working with large software systems developed on Unix/Linux
- Experience in working with monitoring and metrics systems (e.g., Grafana, Datadog)
- Experience in leading teams through incident response and post-mortem analysis
- Experience in working closely with development, product, and business teams
- Daily conversational ability in Japanese
Compensation
7 to 13 million JPY annually.