Job Description:
Rakuten Asia, in partnership with the Economic Development Board (EDB) through the Industrial Postgraduate Programme (IPP), is seeking new PhD students. We are looking for individuals with a robust understanding of deep learning, machine learning, and natural language processing to contribute to our innovative research projects.
Essential requirements include proven hands-on expertise and strong engineering skills, particularly in developing and training models in PyTorch.
IPP Programme Benefits
Candidates selected for this programme will receive full sponsorship for their postgraduate studies and will be offered employment at Rakuten Asia upon successful completion of the programme.
Collaboration Model
The collaboration will include joint PhD student supervision, shared access to computational resources for large-scale model compression experiments, and regular research exchanges. Output will include high-impact publications, open-source tools, and demonstrable prototypes of efficient AI.
Project Outline
Introduction
Rakuten is committed to advancing the frontier of AI infrastructure, with a strong focus on optimizing large-scale GPU clusters for training and serving Large Language Models (LLMs). As models grow in size and complexity—ranging from dense architectures to mixture-of-experts (MoE)—achieving efficiency across training, inference, and deployment has become increasingly critical. Our GPU Optimization department combines deep systems expertise with significant computational resources, and we are seeking strategic collaborations with leading universities to jointly tackle these challenges.
Proposed Research Areas
We propose collaborative research in the following areas, with flexibility to refine topics based on mutual expertise:
Design token-aware, load-balanced scheduling algorithms for MoE and hybrid LLM workloads that reduce inter-GPU communication and optimize heterogeneous cluster utilization.
Develop high-throughput, low-latency inference techniques for state space models, leveraging their linear-time properties to outperform traditional attention mechanisms in long-context scenarios.
Explore advanced quantization, memory-efficient checkpointing, offloading strategies, and dynamic memory management techniques to support training and inference of ultra-large models.
Investigate hybrid parallelism (data, model, pipeline, expert) and communication-reduction strategies tailored for scaling LLMs across thousands of GPUs.
Develop compiler, kernel, and data layout optimizations that fully exploit features of modern GPU architectures, improving throughput for both dense and sparse model operations.
Create optimized model serving strategies using speculative decoding, continuous batching, expert routing, and adaptive computation for production-grade LLM applications.
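To give a flavour of the serving techniques listed above, the following is a minimal, self-contained sketch of greedy speculative decoding. All functions here (`target_model`, `draft_model`, and the decoding loops) are hypothetical toy stand-ins for illustration, not Rakuten code: a cheap draft model proposes a block of tokens, the target model verifies them, and under greedy decoding the output is provably identical to decoding with the target model alone.

```python
# Sketch of greedy speculative decoding (illustration only; the toy
# target_model / draft_model below are hypothetical stand-ins).
# A cheap draft model proposes k tokens; the target model verifies them,
# keeps the longest agreeing prefix, then contributes one token of its own.
# Under greedy decoding the result matches target-only decoding exactly;
# the speedup comes from verifying the k proposals in one batched pass.

def greedy_decode(model, prefix, n_tokens):
    """Baseline: generate n_tokens one at a time with `model`."""
    out = list(prefix)
    for _ in range(n_tokens):
        out.append(model(out))
    return out

def speculative_decode(target_model, draft_model, prefix, k, n_steps):
    """Each step: draft proposes k tokens; target accepts a prefix + 1 extra."""
    out = list(prefix)
    for _ in range(n_steps):
        # Draft model proposes k tokens autoregressively (cheap).
        ctx, proposed = list(out), []
        for _ in range(k):
            t = draft_model(ctx)
            proposed.append(t)
            ctx.append(t)
        # Target model verifies the proposals (in a real system, one batched
        # forward pass); keep the longest prefix it agrees with.
        ctx, accepted = list(out), 0
        for t in proposed:
            if target_model(ctx) != t:
                break
            ctx.append(t)
            accepted += 1
        out.extend(proposed[:accepted])
        # Target always contributes one token: the correction on a mismatch,
        # or a bonus token when every proposal was accepted.
        out.append(target_model(out))
    return out

# Toy deterministic models over tokens 0..9: the target continues the
# sequence mod 10; the draft agrees with it except after token 7.
def target_model(ctx):
    return (ctx[-1] + 1) % 10

def draft_model(ctx):
    return 0 if ctx[-1] == 7 else (ctx[-1] + 1) % 10

spec = speculative_decode(target_model, draft_model, [3], k=3, n_steps=4)
base = greedy_decode(target_model, [3], len(spec) - 1)
assert spec == base  # speculative output matches target-only decoding
```

The invariant checked at the end is the key property that makes the technique attractive in production serving: quality is unchanged, and throughput improves whenever the draft model's acceptance rate is high.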
Rakuten Group, Inc. (TSE: 4755) is a global technology leader in services that empower individuals, communities, businesses and society. Founded in Tokyo in 1997 as an online marketplace, Rakuten has expanded to offer services in e-commerce, fintech, digital content and communications to 2 billion members around the world. The Rakuten Group has more than 30,000 employees, and operations in 30 countries and regions. For more information visit https://global.rakuten.com/corp/.
Take the next step in your career journey