At PayPay, we’re constantly working on improving our systems and processes to be prepared for PayPay’s exponential growth. As a Data SRE at PayPay, we strive to empower our developers with the right tools and to ensure high availability and top-notch performance so that our users can have a great experience with our services.
Considering PayPay’s growth, we are looking for an experienced SRE who can deliver observability, stability, and smooth operations of our data systems and pipelines, identify bottlenecks, and ensure the reliability of our data pipelines, data lakes, and job schedulers. Specifically, we are looking for someone who brings informed and unique viewpoints, enjoys collaborating with a cross-functional team, and actively pushes boundaries to develop scalable Big Data solutions and positive user experiences.
Responsibilities
- Define SLOs and SLIs based on key indicators such as data freshness and data quality
- Design, support, and improve the availability, scalability, stability, reliability, latency, and monitoring & alerting of PayPay data systems
- Manage day-to-day operations of data services, near real-time and batch data pipelines
- Create new designs, architectures, standards and methods for large-scale distributed systems
Requirements
- Knowledge of data ingestion, modelling, processing, and ETL design
- Demonstrated experience managing large-scale, production-grade data lake, data warehouse, and ETL systems
- Act as the point person for data integrity and quality within data storage systems, performing root cause analysis and triaging issues
- Participate in data storage capacity planning, make forecasts, and tune the systems
- Work with multiple stakeholders across teams to build secure data transfers
Qualifications
- Good understanding of DevOps concepts and implementation
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems like Redis, Elasticsearch, Kafka, Hadoop and MySQL
- Experience with analytics solutions like Looker
- Experience with data lake technologies like Apache Hudi and data warehouses like BigQuery and Redshift
- Knowledge of Spark, Glue, Python and Scala
- In-depth knowledge of and hands-on experience with AWS cloud-native data applications and production workloads
Nice to haves
These aren’t required, but be sure to mention them in your application if you have them.
- Knowledge about Microservices
- Knowledge about observability and how to gather data
- System design experience and capacity planning for large distributed systems
- Understanding of Automation tools and implementation
- Terraform/CloudFormation experience
- Experience managing monitoring tools like CloudWatch, New Relic, etc.
Compensation
9 to 12 million JPY annually.