Senior Research Engineer (Code World Models)

Amsterdam | Netherlands | Belgrade

Technical Requirements

Python, Machine Learning, NLP, Distributed Training, Data Pipelines, Model Pre-training, Deep Learning Frameworks

In this role, you will:

  • Design and run pre-training, continued pre-training, and mid-training experiments for code models.
  • Build and improve data pipelines for large-scale model training, including filtering, deduplication, mixture design, and dataset quality checks.
  • Work with code corpora, repositories, tests, execution traces, and synthetic data.
  • Develop evaluations for complex repository-level code reasoning tasks.
  • Collaborate with researchers and engineers working on ML for code and AI developer tools.

We’ll be happy to have you on our team if you:

  • Have hands-on experience with model pre-training, continued training, or mid-training.
  • Have strong engineering skills in Python and experience with modern ML frameworks.
  • Understand large-scale ML training workflows, including data processing, distributed training, checkpointing, evaluation, experiment tracking, and debugging.
  • Have experience working with large datasets and care about data quality, contamination, sampling, and reproducibility.
  • Have a background in NLP, ML for software engineering, or a similar domain.
  • Enjoy working on research problems with high uncertainty and turning ideas into working experiments.

It would be a plus if you:

  • Have experience training or adapting models for code generation, code understanding, software agents, program repair, test generation, or repository-level reasoning.
  • Have worked with execution-based data, such as unit tests, traces, logs, compiler feedback, runtime states, or sandboxed code execution.
  • Have experience with large-scale distributed training of models with 70B+ parameters.
  • Understand evaluation challenges for code models, including benchmark contamination, flaky tests, execution-based scoring, and long-horizon task evaluation.
  • Have contributed to ML infrastructure, open-source projects, or research systems.
JetBrains