The Plazma team at Treasure Data owns one of the essential elements of our CDP solution. As part of the Core Services group, we support customer data ingestion and availability at a rate of 70B records per day. We develop and run the storage and query engine components, enabling customers to store and find their data through comprehensive solutions built on OSS and proprietary software. You will help the team shape the future of our Trino query engine and expand from there into Hive/Hadoop and our in-house storage solution. This includes maintaining technical excellence to address challenges that currently lack industry-wide solutions and delivering the roadmap together with your team. Our team consists of Big Data experts across Japan, Korea, and Canada who are passionate about OSS contribution, and we take pride in the quality of service we offer.
Responsibilities
- Work as a member of the team by designing and developing Trino solutions
- Provide solution expertise around Trino technologies, including technology assessment, use-case development, and solution outline and design for modern data architectures
- Establish standards and guidelines for the design & development, tuning, deployment, and maintenance of advanced data access frameworks and distributed systems
- Document architectural and technology advancements
- Work with your team to set up the roadmap for Trino-related products based on operational needs and customer-requested features
- Mentor and train new members in the team
- Own version and release management of Trino products
  - Evaluate, test, and set a base version
  - Backport needed patches from trunk, which carries the latest, but therefore potentially least stable, version of the project
  - Deploy new customer-facing features for Trino
  - Coordinate with support and product teams on product releases
- Make contributions to the Trino open source community
- Automate Trino cluster operations to reduce operational overhead
  - Design, develop, and evaluate metrics to ensure system health and plan infrastructure capacity of clusters
  - Design and develop scripts that automatically start and stop clusters and switch traffic between active clusters to load-balance customers’ workloads
  - Design and develop failure recovery tools that automatically detect faults and recover faulty clusters
- Provide in-depth support services to Trino customers
  - Participate in the on-call rotation to support Trino customers
  - Handle escalations on product defects and performance issues; lead and perform in-depth troubleshooting of Trino-related systems
Requirements
- Working proficiency in English
- A BS or higher in Computer Science or equivalent experience
- At least 5 years’ experience with:
  - Java
  - Operating production-scale deployments
  - MySQL, PostgreSQL, or other open-source distributed databases/key-value stores
- Solid understanding of cloud architecture and services in public clouds like AWS, GCP, or Microsoft Azure
- Deep understanding of distributed systems and their challenges
- Experience working with distributed, scalable Big Data stores or NoSQL systems, such as HDFS, S3, Cassandra, or Bigtable
- Experience in developing use cases, functional specs, design specs, ERDs etc.
- A solid understanding of computer science (algorithms, data structures, etc.)
- Solid understanding of Big Data problems and a proven ability to handle them
- Able to work independently as well as in a team
- Strong capability in implementing new and improved data solutions for multi-tenant environments
- Experience developing and operating distributed data warehouses or query engines such as Trino, Presto, Spark, Snowflake, Databricks, AWS Athena, or Google BigQuery
Nice to haves
While not specifically required, tell us if you have any of the following.
- Deep understanding of the capabilities of Trino
- Kotlin and Scala experience
- Familiar with microservices-based software architecture
- Expertise in Data Integration patterns
- Strong track record of driving rapid prototyping and design for Big Data
- Experience extending Free and Open-Source Software (FOSS) or COTS products
- Strong IT & Security skill sets and knowledge
- Experience with the design and development of multiple object-oriented systems
- Good understanding of ‘infrastructure as code’ and operations