AI Data Engineer
Job type:
Skillhouse Contract
Specialization:
Data and Analytics
Language Level:
English Level - High Intermediate (TOEIC 730), Japanese Level - Advanced (JLPT Level 1)
Location:
Sumida-Ku
Salary:
¥750,000.00 - ¥850,000.00 Monthly
Job Reference:
491019
Our client, one of the world’s largest global insurance service providers, is looking for an AI Data Engineer to join its IT Architecture team.
Responsibilities:
- AI Data Pipelines: Design, build, and maintain scalable ETL/ELT pipelines for structured and unstructured data from multiple sources
- Data Quality & Feature Engineering: Perform cleansing, anomaly detection, deduplication, missing value handling, and feature preparation for AI/ML
- Data Governance & Lineage: Track data lineage, enforce compliance, and ensure reproducibility and transparency
- Dataset Management & MLOps Support: Curate, version-control, and manage Gold datasets; enable smooth integration with ML workspaces
- Metadata & Schema Management: Define metadata schemas, maintain data dictionaries, and handle schema evolution for AI model readiness
- AI Contextualization & Dataset Curation: Enrich datasets with labels, feature descriptions, temporal validity, and usage rights; create high-quality AI-ready datasets
- NLP Integration: Incorporate NLP tools (e.g., spaCy) for preprocessing, tokenization, NER, dependency parsing, and structured feature outputs
- Support AI Applications: Prepare data for autonomous agents, RAG systems, chatbots, and conversational AI
- Collaboration & Communication: Work in agile teams, engage with stakeholders, and manage data asset lifecycle and documentation
Required Skills:
- Programming: Expert proficiency in Python (data processing/engineering) and strong command of SQL
- Domain Understanding: Ability to quickly grasp insurance business logic and data structures to correctly interpret and model raw data
- Cloud & Data Platforms: Extensive experience with cloud platforms (Azure, GCP, AWS) and technologies such as Spark, Databricks, Azure Synapse, Azure Data Factory, or similar
- NLP Tools: Hands-on experience with spaCy or similar NLP libraries for feature extraction and integration
- MLOps & Data Governance: Familiarity with MLOps practices, dataset versioning, and data governance in enterprise environments
Why should you apply:
- Long-term work opportunity, with WFH available
- Great team dynamics and learning opportunities
- Opportunities to learn or brush up on English and Japanese
Company Details:
One of the world’s leading insurance providers, based in the US, offering a broad range of life, health, and retirement solutions to individuals, families, and businesses. The company is heavily invested in digital transformation, using advanced technologies such as cloud computing, data analytics, AI, and cybersecurity to enhance customer experience and streamline operations. As part of its values, it places a strong focus on creating a diverse environment, and in particular on the appointment of women to high-level positions.
Working Hours: 9:00 - 18:00 (Mon-Fri)
Working Style: 4 days in the office and 1 day working from home (more WFH days are possible)
Holidays: Saturday, Sunday, National Holidays, Year-end and New Year Holidays, Paid Holidays
Services/Benefits: Transportation expenses up to 20,000 yen per month, paid leave, social insurance (health insurance, welfare pension, and work-related accident insurance), periodic health examination, and employment insurance



