The Machine Learning team is a relatively new team whose initial goal is to productize an Automated Machine Learning (AutoML) platform. This team works closely with the Backend Workflow team within the Core Services group to deliver ML pipelines as a service for all customers. Additional ML solutions already on our roadmap include Text Analysis, Causal Analysis/Discovery, Explainable AI (XAI), Uplift Modeling, and Exploratory Data Analysis (EDA).
Productizing our ML products as a scalable cloud service requires diverse knowledge and experience in not only data science but also software engineering. Container platform Java APIs (AWS ECS), AWS cloud API programming, Python ML libraries, SQL query processing for large data, Pandas DataFrame processing for feature engineering, and AWS infrastructure management using Terraform are all areas you will be directly involved with over the course of your work on the ML team at Treasure Data.
This position is ideal for those with not only Data Science and Machine Learning skills, but also conventional cloud engineering skills for developing, deploying, and operating these critical ML products. We expect candidates for this ML engineer position to understand machine learning algorithms, have experience in data analysis, and want to grow as a software engineer in cloud computing environments.
Success in this role requires a passion for developing and productizing ML products, a strong interest in and knowledge of data science, and strong experience in software engineering.
You do this by collaborating with others in a self-organized team to achieve our shared goals: pursuing autonomy with ownership, while increasing trust and sustainability so that we evolve continuously together. You are able to communicate ideas, software system designs, implementations, and decisions clearly and concisely so that others can readily understand them.
- Work with product managers and engineering colleagues to define and deliver new ML products.
- Continuously learn new ML algorithms and techniques.
- Work with distributed development teams to operate ML as a Service by participating in on-call rotations.
- Proactively and continuously improve existing systems and processes together with team members.
- A BS in Computer Science or a related field, or equivalent work experience.
- 3+ years of experience with software development.
- Hands-on experience building and maintaining machine learning products or services.
- Strong Python coding skills, along with experience in another typed programming language such as Java or C++; that is, more than scripting skills alone.
- Strong data science knowledge including state-of-the-art ML models, libraries, and techniques.
- Experience with SQL query processing and Pandas DataFrame API programming.
- Industry experience using public cloud IaaS providers like AWS.
- Ability to quickly pick up new technologies and company standards.
- Understanding of software development life cycle practices such as mocking, unit testing, and CI (CircleCI, GitHub Actions).
Nice to haves
These aren’t required, but be sure to mention them in your application if you have them.
- Experience with AutoML frameworks such as AutoGluon, H2O AutoML, PyCaret, and FLAML.
- Experience with Explainable AI (XAI) techniques such as SHAP and LIME.
- Experience working with big data technologies such as Hadoop, Hive, Presto, Spark, BigQuery, and Redshift.
- Experience (ideally award-winning) in data science competitions such as Kaggle.
- Experience contributing to open-source software (OSS).
- Strong EDA skills using Pandas and Notebooks for feature engineering and so on.
- Experience with container platforms such as AWS ECS/EKS and Kubernetes.
- Familiarity with Infrastructure as Code using Terraform or CloudFormation.
- Demonstrated ability to work collaboratively in cross-functional teams and a strong track record of delivery as part of a team.
- Familiarity with security best practices, including Security Groups, IAM, and networking.
- Experience with distributed teams across different time zones.