NLP Data Engineer

Yaraku Shibuya-ku, Tokyo
  • 💴 ¥4.2M ~ ¥5.0M annually
  • 🏡 Fully remote (from Japan)
  • 🧪 Minimum years of experience unspecified
  • 💬 Conversational Japanese
  • 🌏 Apply from abroad
  • 🧳 Relocate to Japan

About Yaraku

Yaraku is a small start-up located in Shibuya, Tokyo. Our focus is primarily on our web-based Translation Management System application.

Key benefits

  • Work deeply without interruptions
  • Freedom and responsibility
  • Craft quality code

About the position

We are seeking a talented NLP Data Engineer to join our team. As an NLP Data Engineer, you will be responsible for designing, developing, and maintaining our data infrastructure. You will play a vital role in building crawlers that collect text data from the internet, and in filtering, processing, and overseeing the quality of that data to support our NLP initiatives.

Responsibilities

  • Design, develop, and maintain efficient and scalable data pipelines for collecting text data from various sources, including databases and the internet.
  • Implement data cleaning and preprocessing techniques to enhance the quality and consistency of text data.
  • Collaborate with NLP engineers and researchers to understand their data requirements and ensure the availability and accessibility of high-quality text data.
  • Monitor and optimize data processing workflows to ensure efficient and reliable data delivery.
  • Identify and resolve data quality issues, implementing measures to maintain data accuracy and integrity.
  • Stay up to date with the latest advancements in data engineering technologies, identifying opportunities to enhance our data infrastructure and workflows.

Requirements

  • Bachelor’s degree in a relevant field or a minimum of 10 years of work experience (for visa purposes). If you are qualifying through work experience rather than a degree, you must provide proof of that experience by submitting employment contracts from your former employers.
  • Solid understanding of data processing and data pipeline architectures.
  • Proficiency in Python, including expertise with relevant tools and libraries such as Moses, SentencePiece, and spaCy.
  • Strong problem-solving and analytical skills, with attention to detail and data quality.
  • Intermediate Japanese reading ability, as you’ll be working with Japanese data. You won’t need to write or speak Japanese in this position.

Nice to haves

While not specifically required, tell us if you have any of the following.

  • Experience with web scraping techniques and tools.
  • Knowledge of distributed computing frameworks like Apache Spark.
  • Knowledge of database systems and SQL.
  • Familiarity with text data cleaning and preprocessing techniques.
  • Experience with data governance and compliance in handling sensitive or personal text data.

Compensation

4.2 to 5.04 million JPY annually.

Meet Yaraku's Developers

Building up Yaraku's QA Process from scratch

with Mary Grygjeanne Grace Icay

Grace started her career as a software developer in the Philippines but discovered her passion for QA when she moved to Japan. As Yaraku's first QA Engineer, Grace not only performs QA herself but also designed Yaraku’s official QA process.
