Responsibilities
- Prompt & Pipeline Engineering: Design and implement robust prompt-tuning and orchestration pipelines using frameworks like LangChain or LlamaIndex. This includes techniques such as prompt chaining, few-shot prompting, and developing autonomous agents.
- Model Integration & Deployment: Integrate and deploy LLMs (both proprietary and open-source) into production environments. This involves managing API integrations, handling request/response schemas, and ensuring models are performant and cost-effective.
- Data & Knowledge Systems: Design and build data pipelines for Retrieval-Augmented Generation (RAG) systems. This includes creating vector stores, optimizing embedding strategies, and ensuring data is up-to-date and accessible for real-time model queries.
- Backend & API Development: Develop and maintain high-performance, fault-tolerant backend services and APIs that serve as the core of our AI-centered applications.
- Performance Benchmarking: Conduct quantitative analysis and A/B testing to benchmark the performance, latency, and cost of different LLM prompts and models, and iterate on solutions based on data-driven insights.
- Cloud Infrastructure: Utilize cloud platforms (GCP, AWS, Azure) to deploy, manage, and scale applications and their supporting infrastructure.
- Collaborative Development: Work closely with frontend developers and product managers to integrate LLM-powered features seamlessly into the user experience and deliver value-driven solutions.
Requirements
- Education and Experience: A Bachelor’s degree, or 10 years of industry experience in a software development role.
- Programming Languages: Experience in at least one general-purpose programming language (Python, Go, Java, C++, Rust, etc.).
- Database Experience: Experience with database management and operations, including relational and NoSQL databases, and an understanding of vector databases.
- Cloud Proficiency: Proficiency with setting up, deploying, operating, and maintaining applications with at least one major cloud provider, such as AWS, GCP, or Azure.
- Applied LLM Experience: Demonstrable experience in building and deploying at least one application powered by a large language model. This could include a personal project, a contribution to an open-source tool, or professional work.
- Benchmarking and Performance: Familiarity with benchmarking and quantitative analysis methods to assess and optimize AI and application performance.
- Passion for the Field: Passionate about AI and how it applies today, with a vision for how it will apply in the near future.
- Location and Visa: Must be currently residing in Japan with a valid visa.
- Language Ability: Proficient in English, with the ability to discuss technical topics in depth; familiarity with Japanese (N3 or above) is a plus.