We are seeking a Lead Database Reliability Engineer (DBRE). In this role, you will lead reliability, operational quality, and migration strategies for the company-wide database infrastructure, with a focus on PostgreSQL and MySQL. As a DBRE on the Platform Team, you will take ownership of formulating and executing database reliability strategies across all products.
This is a founding role. Your work will define how we operate our relational databases for the next several years.
Your first mission upon joining will be to lay the groundwork for a migration program from MySQL to PostgreSQL based on our database strategy. You will prepare the environment so that product teams and SREs can execute the migration safely and smoothly.
Responsibilities
PostgreSQL Technical Leadership
- Owning the company-wide technical strategy for PostgreSQL
- Leading the design of PostgreSQL workloads in a multi-tenant architecture (schema design patterns, Row-Level Security, connection pooling, tenant isolation strategies)
- Providing hands-on support to product teams adopting PostgreSQL through design reviews, technical guidance, and pair debugging
- Defining and driving company-wide PostgreSQL operational standards (provisioning, backup, observability, performance tuning, disaster recovery)
- Managing PostgreSQL infrastructure as code (IaC) using Terraform, maintaining all environments in a reproducible and version-controlled state
Standardization of MySQL Operations
- Defining and deploying operational best practices for existing MySQL systems
- Providing technical advisory and design reviews to product teams and SREs responsible for day-to-day MySQL operations
- Note: Day-to-day MySQL operations will continue to be handled by SREs and product teams. This role focuses on raising company-wide operational standards rather than directly managing operations.
Migration Support (Pull-Based)
- Providing technical support for MySQL-to-PostgreSQL migrations proposed by product teams and SREs
- Conducting migration design reviews, pre-migration simulations, and cutover planning support
- Compiling lessons learned from each migration project into a knowledge base
Reliability Engineering and Incident Response
- Integrating SRE principles (SLIs/SLOs/error budgets, toil reduction, blame-free post-mortems) into overall database operations
- Providing technical leadership for database-related incidents, conducting post-mortems, and establishing mechanisms to prevent recurrence
- Reducing toil through automation (provisioning, configuration management, backup/restore, failover, routine maintenance)
- Applying observability tools such as Datadog to database systems (query performance monitoring and alert design)
Knowledge Sharing and Technical Standardization
- Providing design reviews, workshops, and architectural guidance as the principal technical SME for relational databases across the company
- Accelerating product teams’ transition from MySQL to PostgreSQL through concept mapping and pattern sharing
- Developing and maintaining high-quality technical documentation, runbooks, and architecture references
Requirements
PostgreSQL Expertise
- Extensive experience operating PostgreSQL in production environments
- Experience operating PostgreSQL in multi-tenant architectures and the ability to analyze trade-offs between schema separation, RLS, and shared schemas
- Understanding of PostgreSQL’s internal behavior
Experience Migrating from MySQL to PostgreSQL
- Experience leading the technical aspects of production-grade migrations from MySQL to PostgreSQL
- Understanding of pitfalls during schema conversion (data types, collation, character sets, default behavior)
- Knowledge of migration tools and experience designing zero-downtime cutover strategies and rollback plans
MySQL Operational Experience
- Extensive experience operating MySQL in production environments
- A level of MySQL understanding sufficient to provide technical advice to teams actively operating MySQL
Technical Leadership
- Proven track record as a technical lead in the database domain
- Experience establishing and promoting technical standards across multiple engineering teams and business units
SRE / Reliability Engineering
- Understanding of and practical experience with SRE principles (SLIs/SLOs/error budgets, reducing toil through automation, blame-free post-mortems, and capacity planning)
- Experience in IaC operations for database infrastructure using tools such as Terraform
- Experience applying observability tools such as Datadog to database systems
- Experience participating in on-call rotations and leading the response to database-related incidents
Communication and Knowledge Sharing
- Business-level or higher English proficiency
- Communication skills to clearly convey complex database concepts to engineers of varying experience levels
- Writing skills to continuously maintain high-quality technical documentation, runbooks, and architecture guides
- Experience mentoring engineers and conducting technical design reviews
Nice to haves
While not specifically required, tell us if you have any of the following.
Preferred Skills and Experience
- Experience participating in the PostgreSQL community (contributing, speaking at conferences, writing technical articles)
- Experience operating large-scale PostgreSQL environments (multi-terabyte databases, high QPS workloads)
- Python/Go programming skills for building operational automation tools
- Experience developing stored procedures and triggers using PL/pgSQL
- Experience in AI development and/or experience using AI tools to improve development processes
- Money Forward recently announced an AI Strategy roadmap, which focuses on improving AI-driven operational efficiency and on integrating AI agents into our products to deliver better value to our users.
Compensation
¥8,004,000 ~ ¥15,000,000 annually.