As an AI/ML Engineer at Zeals, you will design, build and optimise the machine learning models and underlying infrastructure that power our Conversational Commerce platform. You are proficient in applying cutting-edge LLMs, RAG pipelines and vector search, alongside classical supervised and unsupervised ML algorithms for intent classification, recommendation and user behavior prediction. You will own the entire model lifecycle: data collection, feature engineering, training, evaluation, deployment and online monitoring. You will also work closely with product, backend and data teams to turn state-of-the-art research into reliable, scalable and secure production systems that keep our chatbots smarter, safer and more natural.
Responsibilities
- AI/ML & NLP Research & Optimisation
- Design, train and maintain LLMs, RAG pipelines, vector retrieval modules, and classical ML models (e.g. intent classification, recommendation, anomaly detection); a minimal RAG retrieval sketch follows this list.
- Build multilingual NLP components such as named entity recognition (NER), sentiment analysis and topic modelling for Japanese, Chinese and English.
- Apply prompt engineering, fine-tuning, quantisation and distillation to maximise accuracy while reducing latency and serving cost.
- Own the full model lifecycle: data collection, feature engineering, training, evaluation and iterative improvement.
- Production Deployment & Continuous Delivery
- Partner with backend, data engineering and product teams to deploy models on AWS/GCP with CI/CD, blue-green or canary releases, online monitoring and A/B testing.
- Build horizontally scalable training, inference and monitoring architectures (Docker/Kubernetes/serverless) to ensure high availability and low latency.
- Orchestrate end-to-end ML pipelines with Airflow, Kubeflow or similar tools, and maintain a model registry and feature store.
- Research & Technology Scouting
- Track cutting-edge AI/ML/NLP papers and open-source projects; quickly prototype, benchmark and translate promising ideas into product value.
- Conduct offline and online experiments, reporting metrics such as precision@k, recall, latency and serving cost to guide roadmap decisions.
- Internal Tooling & Knowledge Sharing
- Build automated experimentation platforms, reproducible notebooks and evaluation dashboards.
- Author best-practice documentation, code samples and tech talks to uplift team skills and drive a culture of continuous learning.
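To make the RAG work above concrete, the sketch below shows the retrieve-then-prompt core of such a pipeline: embed a document corpus, index it with FAISS, and ground a prompt in the top-k passages. It is a minimal sketch assuming the sentence-transformers and faiss packages; the model name, toy corpus and prompt wording are placeholders, not our actual stack, and the downstream LLM call is omitted.

```python
# Minimal RAG retrieval sketch: embed a corpus, index it with FAISS,
# and build a grounded prompt from the top-k passages.
# Model name, corpus and prompt wording are illustrative placeholders.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Orders placed before 14:00 JST ship the same day.",
    "Returns are accepted within 30 days of delivery.",
    "Gift wrapping is available for an extra 300 yen.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model
doc_vectors = encoder.encode(documents).astype(np.float32)
faiss.normalize_L2(doc_vectors)  # L2-normalise so inner product == cosine
index = faiss.IndexFlatIP(int(doc_vectors.shape[1]))
index.add(doc_vectors)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k passages most similar to the query."""
    q = encoder.encode([query]).astype(np.float32)
    faiss.normalize_L2(q)
    _, ids = index.search(q, k)
    return [documents[i] for i in ids[0]]

def build_prompt(query: str) -> str:
    """Assemble a prompt grounded in the retrieved context;
    the actual LLM generation call is omitted in this sketch."""
    context = "\n".join(f"- {p}" for p in retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("When will my order ship?"))
```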
Requirements
- 3+ years of hands-on experience shipping AI/ML/NLP products, covering the full lifecycle from POC to production.
- Proven expertise in RAG system design and implementation, integrating vector databases (FAISS, Pinecone, Qdrant, etc.) with LLM generators.
- Proven expertise in designing, testing and optimising scalable, reusable prompts for conversational AI applications; a template sketch follows this list.
- Strong Python skills plus deep proficiency with PyTorch or TensorFlow.
- Practical experience with cloud deployment and containerisation (Docker/Kubernetes).
- Solid data engineering fundamentals: designing efficient ETL pipelines, feature stores and data quality controls.
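To illustrate the reusable-prompt requirement, here is a minimal stdlib-only sketch of a templated intent-classification prompt with a lightweight check that could run in CI. The brand, intent labels and wording are hypothetical placeholders, not actual Zeals prompts.

```python
# Sketch of a reusable, testable prompt template for a commerce chatbot.
# All field values below are hypothetical examples.
from string import Template

INTENT_PROMPT = Template(
    "You are a shopping assistant for $brand.\n"
    "Classify the user's message into exactly one intent from: $intents.\n"
    "Reply with the intent label only.\n\n"
    "User message: $message"
)

def render_intent_prompt(brand: str, intents: list[str], message: str) -> str:
    """Fill the template; substitute() raises KeyError on a missing field,
    which catches template drift before it reaches production."""
    return INTENT_PROMPT.substitute(
        brand=brand, intents=", ".join(intents), message=message
    )

# A lightweight regression check that can run in CI alongside prompt changes.
prompt = render_intent_prompt(
    brand="ExampleShop",
    intents=["order_status", "returns", "product_question"],
    message="Where is my package?",
)
assert "order_status" in prompt and "Where is my package?" in prompt
print(prompt)
```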
Nice to haves
While not strictly required, let us know if you have any of the following.
- Japanese ability
- End-to-end MLOps experience: model versioning, ML-centric CI/CD, feature stores, monitoring and automated retraining.
- A track record of optimising large-scale LLM inference for speed or cost (quantisation, distillation, LoRA, etc.); see the sketch at the end of this list.
- Background in conversational AI, voice interfaces or multilingual chatbots.
- Hands-on experience with privacy (PII protection), generative AI risk management or security hardening.
- Open source contributions, peer reviewed publications or an active technical blog.
- Familiarity with AWS Bedrock / SageMaker, GCP Vertex AI, serverless architectures or GPU cluster cost optimisation.
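As one concrete example of the inference-cost levers mentioned above, the sketch below applies post-training dynamic quantisation in PyTorch. The tiny Sequential model is a stand-in for a real LLM, and the speed and memory gains will vary by model and hardware.

```python
# Minimal sketch of post-training dynamic quantisation in PyTorch.
# quantize_dynamic converts Linear weights to int8 for faster CPU inference.
import torch
import torch.nn as nn

# Toy model standing in for a much larger network.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 128))
model.eval()

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
with torch.no_grad():
    baseline, fast = model(x), quantized(x)

# Outputs should agree closely; the quantised model trades a little
# accuracy for a smaller footprint and cheaper serving.
print(torch.max(torch.abs(baseline - fast)))
```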