Mercari Group uses data for business decisions and other processes in a wide variety of areas, including marketing, machine learning, and R&D. As part of Merpay’s Data Platform Team, you will design, develop, and operate the data infrastructure and data pipelines that support data utilization across the entire Mercari Group, covering not only Merpay but also the Mercari marketplace app. You will also be responsible for work on the Mercari Group’s data utilization as a whole.
The Data Platform Team is an engineering organization that builds the systems needed by a wide range of domains that utilize data. This work includes developing an ecosystem that covers data collection, the Data Lake, the DWH, use of collected data, and the other processes that data utilization requires. The following describes the responsibilities of a software engineer on the Data Platform Team, as well as the environment and organization in which they work.
Unique and Bold Challenges
- Opportunity to build large-scale data infrastructure to support the massive amounts of data generated by Mercari Group, including the Mercari marketplace app with its more than 20M monthly users.
- Work on development of an ecosystem to support data-driven business expansion by addressing the data utilization needs of many domains including analytics/decision-making, marketing, machine learning, and R&D.
- Take on the entire data platform creation process, from design to development to operation, to revise existing data infrastructure and develop new features which meet new data utilization needs.
Responsibilities
- Collecting data from microservices
  - Developing batch-based data pipelines
  - Developing streaming-based data pipelines
  - Developing an SDK for logging within microservices
- Data Lake, DWH
  - Constructing Data Lakes for storing collected data
  - Managing partial permissions and resources on DWH
- Data utilization
  - Developing tools to support use of data in Data Lakes and on DWH
  - Developing a platform that provides stream data processing functionality
- Company-wide
  - Developing and operating applications and middleware, and improving performance for low latency/high throughput according to requirements
  - Developing tools to automate operations and/or lower costs
  - Identifying technical issues in the system and solving them through engineering
Requirements
- A shared belief in Mercari and Merpay’s missions and values
- A degree in computer science or a related field, or else five or more years of practical experience in software development
- Experience designing, developing, and operating large-scale services and/or distributed systems
- At least two of the following:
  - Experience developing in the cloud with AWS, GCP, etc.
  - Experience developing systems utilizing container technologies such as Kubernetes
  - Development experience using message queues such as Cloud Pub/Sub and Apache Kafka
  - Experience in data processing development using distributed processing frameworks such as Apache Flink and Apache Spark
  - Experience with ETL systems using workflow engines such as Airflow and Digdag
  - Experience designing application logs for large-scale services
- Japanese ability: Independent (CEFR - B2) - Can exchange complex information in your area of expertise with limited support/accommodation from the other party
Nice to haves
While not specifically required, tell us if you have any of the following.
- Experience working at a financial institution, fintech company, or e-commerce company
- Ability to identify the cause of technical system issues (e.g. drop in performance) and to resolve them
- Knowledge of network protocols such as TCP/IP, HTTP, gRPC, etc.
- Experience developing and operating software using two or more of the following: Go, Java, Scala, or Python
- Experience developing data collection systems using OSS created by Treasure Data (such as Fluentd or Embulk)
- Experience in development using a DWH such as BigQuery, Redshift, or Snowflake
- Experience in development using an RDBMS, including MySQL and RDS
- Experience in development or cluster operation using the Hadoop ecosystem
- Experience developing software based on a microservice architecture
- Experience publishing and contributing to OSS