As an engineer, you will build and operate the system infrastructure of a digital bank and be responsible for its high availability and reliability.
Experience Gained
- Challenge from Scratch. You will have the opportunity to build everything from the ground up, including the system infrastructure of a new digital bank, SRE culture, development processes, and on-call systems.
- Modern Technology Stack. You can work with modern and challenging technologies in environments that demand financial-grade reliability, such as AWS multi-region configurations and Aurora Global Database.
- Contribution to the Business. By closely collaborating with business and application development teams, you can directly contribute to the business launch from a technical perspective. Your input will influence the bank’s reliability.
- Flat Organization. In a small, elite team, you will have the autonomy to actively participate in technology selection and architecture design.
Tech Stack
- Backend: Kotlin
- Infrastructure: Amazon ECS, Amazon Aurora Global Database (MySQL/PostgreSQL), Amazon ElastiCache Global Datastore, Red Hat Enterprise Linux, Cloudflare
- Monitoring: Datadog
- IaC: Terraform
Responsibilities
- Ensure stable operation of the system platform with high availability and reliability
- Improve service reliability by defining and operating SLIs/SLOs and continuously improving performance
- Use Datadog for monitoring, observability, and incident response
- Ensure compliance and manage infrastructure and security operations, particularly for AWS multi-region configurations
- Maximize the productivity of the development team through platforms (CI/CD, development environments, and testing environments)
- Conduct performance optimization and create mechanisms for it
- Handle incidents, recovery, and incident management, and foster a postmortem culture
- Address regulatory requirements specific to the financial industry
Requirements
- At least 3 years of hands-on experience in design and operations within SRE, infrastructure, or similar fields.
- Proficiency in working with cloud platforms, especially AWS.
- Familiarity with container orchestration tools such as Kubernetes.
- Experience with monitoring and observability solutions.
- Competence in using Infrastructure as Code tools like Terraform or Ansible.
- Development experience utilizing Git for pull requests and code reviews
- You explore automation and better methods while staying mindful of high security and governance standards
- You have the ability to question the status quo and seek better ways to do things, considering trade-offs with the current state of the organization
- With an understanding of the development team’s pain points, you can propose and implement solutions from a Platform/SRE perspective to improve the overall service
- You contribute to the growth of the entire team by addressing challenges that no one knows the answer to yet, and by collaborating with the team
- Your contributions help connect and enhance the growth of the entire team
- Japanese: N3 level (able to use translation tools for text and conversation)
Nice to haves
While not specifically required, tell us if you have any of the following.
- Practical experience in industries with high security and compliance requirements, such as financial systems
- Expertise in managing and optimizing RDBMS like MySQL or PostgreSQL.
- Experience in designing, building, and operating disaster recovery environments using multi-region setups
- Experience in designing and operating SLIs/SLOs
- Experience in improving deployment times, optimizing performance, and enhancing CI processes for development efficiency
- Experience in web application development
- Experience in managing AWS Organizations
- Experience in AI development and/or experience in using AI tools to improve development processes.
- Money Forward recently announced its AI Strategy roadmap, which focuses on improving AI-driven operational efficiencies and on integrating AI agents into our products to deliver better value to our users.
Compensation
6.408 million to 15 million JPY annually.