Joint effort by OpenDriveLab at The University of Hong Kong, Huawei Inc. and Shanghai Innovation Institute (SII).
- Highlights
- News
- Benchmark
- System Architecture
- Roadmap
- Getting Started
- Citation
- Contributing
- License
- Related Resources
- WorldEngine is a post-training framework for Physical AI that systematically addresses the long-tail safety-critical data scarcity problem in autonomous driving.
- Data-driven long-tail discovery: Failure-prone scenarios are automatically identified from real-world driving logs by the pre-trained agent itself — no manual design, no synthetic perturbations.
- Photorealistic interactive simulation via 3D Gaussian Splatting (3DGS): Each discovered scenario is reconstructed into a fully controllable, real-time-renderable simulation environment with independent dynamic agent manipulation.
- Behavior-driven scenario generation: Leverages Behavior World Model (BWM) to generalize and synthesize diverse traffic variations from existing long-tail scenarios, expanding sparse safety-critical events into a dense, learnable distribution.
- RL-based post-training on synthesized safety-critical rollouts substantially outperforms scaling pre-training data alone — competitive with a ~10× increase in pre-training data.
- Production-scale validation: Deployed on a mass-produced ADAS platform trained on 80,000+ hours of real-world driving logs, reducing simulated collision rate by up to 45.5% and achieving zero disengagements in a 200 km on-road test.
- [2026/04/09] Official dataset released. See OpenDriveLab/WorldEngine (Hugging Face) or OpenDriveLab/WorldEngine (ModelScope).
- [2026/04/10] Official code repository established.
We compare different post-training paradigms on the nuPlan dataset, evaluating on both open-loop and closed-loop metrics across common and rare driving scenarios.
Note: these are early-stage results. Stable checkpoints and the corresponding results are coming soon.
Metric notes:
- Open-loop PDMS is aligned with the NAVSIM v1.1 PDM Score. Common denotes the standard `navtest` split; Rare denotes the `navtest_failures` subset, i.e. failure-prone rare-case scenarios extracted from `navtest`.
- Closed-loop Success Rate is defined as the fraction of simulated driving episodes completed without collision or off-road failure.
- Closed-loop PDMS* is the PDM Score obtained via SimEngine closed-loop testing, where the planner interacts with reactive agents in simulation under real-time rendering.
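The Closed-loop Success Rate definition above can be sketched in a few lines of Python. This is only an illustration of the stated definition: the `EpisodeOutcome` record and its field names are hypothetical and do not reflect SimEngine's actual output schema.

```python
from dataclasses import dataclass


@dataclass
class EpisodeOutcome:
    """Hypothetical per-episode record (illustrative, not SimEngine's schema)."""
    scenario_id: str
    collided: bool
    off_road: bool


def success_rate(episodes):
    """Percentage of episodes finished with neither a collision nor an off-road failure."""
    if not episodes:
        raise ValueError("need at least one episode")
    successes = sum(1 for e in episodes if not (e.collided or e.off_road))
    return 100.0 * successes / len(episodes)


# Made-up outcomes for illustration only.
episodes = [
    EpisodeOutcome("a", collided=False, off_road=False),
    EpisodeOutcome("b", collided=True, off_road=False),
    EpisodeOutcome("c", collided=False, off_road=True),
    EpisodeOutcome("d", collided=False, off_road=False),
]
print(f"{success_rate(episodes):.2f}%")  # 50.00%
```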
Training notes:
- Rare logs are failure-prone scenarios automatically extracted from `navtrain` by the pre-trained agent itself (see Rare Case Extraction).
- Common logs are the standard cases in `navtrain`.
| Method | Open-loop PDMS ↑ (common) | Open-loop PDMS ↑ (rare) | Closed-loop Success Rate ↑ | Closed-loop PDMS* ↑ |
|---|---|---|---|---|
| Base model | 85.62 | 47.15 | 73.61 | 60.28 |
| Supervised fine-tuning on rare logs | 87.03 | 49.68 | 73.26 | 62.26 |
| Post-training on common logs | 86.15 | 51.49 | 64.58 | 56.66 |
| Post-training on rare logs | 89.29 | 62.56 | 74.31 | 62.55 |
| Post-training on rare synthetic replays | 88.01 | 56.62 | 76.39 | 62.11 |
| Post-training on rare rollouts w/o Behavior WM | 88.99 | 59.69 | 85.07 | 68.29 |
| Post-training with WorldEngine | 88.95 | 59.83 | 88.89 | 70.12 |
Key findings:
- Post-training on rare logs significantly outperforms supervised fine-tuning (62.56 vs 49.68 open-loop rare PDMS), demonstrating the advantage of reward-guided optimization over imitation.
- Post-training on common logs provides limited benefit and even degrades closed-loop performance (success rate drops from 73.61% to 64.58%), confirming that long-tail event discovery is essential.
- The full WorldEngine pipeline achieves the best closed-loop performance (88.89% success rate, 70.12 PDMS*), an absolute improvement of 15.28 percentage points in success rate over the base model.
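The closed-loop deltas quoted in these findings follow directly from the benchmark table. A small Python check, using only numbers copied from that table (the dictionary layout and the `delta_vs_base` helper are illustrative):

```python
# (Closed-loop Success Rate, Closed-loop PDMS*) copied from the benchmark table.
closed_loop = {
    "Base model": (73.61, 60.28),
    "Post-training on common logs": (64.58, 56.66),
    "Post-training with WorldEngine": (88.89, 70.12),
}


def delta_vs_base(method):
    """Absolute change relative to the base model, in points."""
    base_sr, base_pdms = closed_loop["Base model"]
    sr, pdms = closed_loop[method]
    return round(sr - base_sr, 2), round(pdms - base_pdms, 2)


for name in closed_loop:
    d_sr, d_pdms = delta_vs_base(name)
    print(f"{name}: success rate {d_sr:+.2f}, PDMS* {d_pdms:+.2f}")
```

Post-training on common logs comes out negative on both closed-loop metrics, while the full pipeline gains on both.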
Each pair shows the Base model vs WorldEngine post-trained model on the same rare-case scenario. Left: front-camera rendering; Right: BEV trajectory visualization.
Zero disengagements in 200 km on-road testing on a mass-produced ADAS platform.
WorldEngine consists of two tightly coupled subsystems:
| Module | Function | Core Technology |
|---|---|---|
| SimEngine | Closed-loop simulation with ego & agents | Hydra, Ray, rendering |
| AlgEngine | End-to-end model training & evaluation | MMDetection3D, UniAD/VADv2/HydraMDP |
- Core platform integration (SimEngine + AlgEngine)
- Multi-GPU distributed simulation and training
- Rare case extraction and fine-tuning pipeline
- Comprehensive documentation and usage guides
- Hugging Face / ModelScope dataset
- Open-source release (code, data, early pre-trained models)
- arXiv preprint
- Behavior World Model integration
- Stable pre-trained models
WorldEngine provides comprehensive guides for each stage of your workflow:
| Guide | Purpose | Key Topics |
|---|---|---|
| Installation | Set up both conda environments | Two-environment setup (simengine + algengine), dependencies, troubleshooting |
| Data Organization | Prepare datasets and checkpoints | Data structure, Hugging Face/ModelScope downloads, symlinks |
| Quick Start | Run your first experiment in 5 min | Quick test tutorial, understanding results, complete pipeline |
| SimEngine Usage | Master closed-loop simulation | Rollout scripts, distributed testing, configuration, metrics |
| AlgEngine Usage | Train and fine-tune models | Training from scratch, evaluation, rare case extraction, RL fine-tuning |
WorldEngine requires two separate conda environments due to different Python requirements.
Full installation guide: docs/installation.md
Verify your installation with a pre-trained model:
```bash
# Set up environment variable
export WORLDENGINE_ROOT=$(pwd)
# Option 1: Single GPU test
bash scripts/closed_loop_test.sh
# Option 2: Multi-GPU test (Default 8 GPUs)
bash scripts/multigpu_closed_loop_test.sh
```
What this does:
- Loads a pre-trained VADv2 model (50% training data, epoch 8)
- Runs closed-loop simulation on 288 rare-case test scenarios
- Evaluates with navsim v1 PDMS (collision avoidance, progress, comfort, etc.)
- Saves results to `experiments/closed_loop_exps/e2e_vadv2_50pct/navtest_failures_NR/`
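To check what the quick test wrote, a minimal Python sketch can list the files under the results directory. It assumes the path is relative to `WORLDENGINE_ROOT` (the README does not state this explicitly) and makes no assumption about file names or formats. The helper names are hypothetical.

```python
import os
from pathlib import Path

# Results location quoted in the steps above; assumed relative to WORLDENGINE_ROOT.
RESULTS_SUBDIR = "experiments/closed_loop_exps/e2e_vadv2_50pct/navtest_failures_NR"


def results_dir(root=None):
    """Resolve the results directory from an explicit root or WORLDENGINE_ROOT."""
    root = root or os.environ.get("WORLDENGINE_ROOT", ".")
    return Path(root) / RESULTS_SUBDIR


def list_outputs(root=None):
    """Return relative paths of all files under the results directory, sorted."""
    directory = results_dir(root)
    if not directory.is_dir():
        return []  # the test has not been run yet, or the root is wrong
    return sorted(
        p.relative_to(directory).as_posix()
        for p in directory.rglob("*")
        if p.is_file()
    )


for name in list_outputs():
    print(name)
```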
Detailed quick start tutorial: docs/quick_start.md
After the quick test, explore each subsystem in detail:
Learn how to run simulations, generate rollouts, and test models:
- Rollout scripts for data generation (no model required)
- Testing scripts for model evaluation (single/multi-GPU)
- Ray distributed simulation for large-scale testing
- Reactive vs non-reactive agent modes
- Configuration guide for all Hydra parameters
Learn how to train models, extract rare cases, and fine-tune:
- Training from scratch
- Open-loop evaluation on test sets
- Rare case extraction from evaluation failures
- RL-based fine-tuning on long-tail scenarios
- Multi-GPU training with distributed data parallel
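As a rough illustration of "rare case extraction from evaluation failures", the sketch below keeps the scenarios whose open-loop score falls below a cutoff. This is an assumption about the general idea, not AlgEngine's actual procedure: the function name, the score dictionary, and the threshold value are all hypothetical.

```python
def extract_rare_cases(scores, threshold):
    """Return tokens of scenarios scoring strictly below `threshold`, worst first.

    `scores` maps a scenario token to a per-scenario score. Both the mapping
    and the cutoff are illustrative assumptions, not AlgEngine's real interface.
    """
    failing = [(score, token) for token, score in scores.items() if score < threshold]
    return [token for _, token in sorted(failing)]


# Made-up scores for illustration only; not real evaluation output.
open_loop_scores = {"scene_a": 0.92, "scene_b": 0.10, "scene_c": 0.47, "scene_d": 0.81}
rare_tokens = extract_rare_cases(open_loop_scores, threshold=0.5)
print(rare_tokens)  # ['scene_b', 'scene_c']
```

The selected tokens would then define the subset used for fine-tuning. See the AlgEngine guide for the real extraction scripts and criteria.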
WorldEngine's simulation environments are powered by 3D Gaussian Splatting (MTGS):
- Multi-traversal reconstruction from nuPlan data
- Photorealistic rendering for closed-loop simulation
- Asset generation for SimEngine scenes
If any parts of our work help your research, please consider citing us and giving a star to our repository:
If you use the Render Assets (MTGS), please also cite:
```bibtex
@article{li2025mtgs,
  title={MTGS: Multi-Traversal Gaussian Splatting},
  author={Li, Tianyu and Qiu, Yihang and Wu, Zhenhua and Lindstr{\"o}m, Carl and Su, Peng and Nie{\ss}ner, Matthias and Li, Hongyang},
  journal={arXiv preprint arXiv:2503.12552},
  year={2025}
}
```
If you use the augmented scenarios data, please cite as well:
```bibtex
@inproceedings{zhou2025nexus,
  title={Decoupled Diffusion Sparks Adaptive Scene Generation},
  author={Zhou, Yunsong and Ye, Naisheng and Ljungbergh, William and Li, Tianyu and Yang, Jiazhi and Yang, Zetong and Zhu, Hongzi and Petersson, Christoffer and Li, Hongyang},
  booktitle={ICCV},
  year={2025}
}

@article{li2025optimization,
  title={Optimization-Guided Diffusion for Interactive Scene Generation},
  author={Li, Shihao and Ye, Naisheng and Li, Tianyu and Chitta, Kashyap and An, Tuo and Su, Peng and Wang, Boyang and Liu, Haiou and Lv, Chen and Li, Hongyang},
  journal={arXiv preprint arXiv:2512.07661},
  year={2025}
}
```
If you find AlgEngine useful, please cite as well:
```bibtex
@ARTICLE{11353028,
  author={Liu, Haochen and Li, Tianyu and Yang, Haohan and Chen, Li and Wang, Caojun and Guo, Ke and Tian, Haochen and Li, Hongchen and Li, Hongyang and Lv, Chen},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Reinforced Refinement With Self-Aware Expansion for End-to-End Autonomous Driving},
  year={2026},
  volume={48},
  number={5},
  pages={5774-5792},
  keywords={Adaptation models;Self-aware;Autonomous vehicles;Pipelines;Planning;Training;Reinforcement learning;Uncertainty;Data models;Safety;End-to-end autonomous driving;reinforced finetuning;imitation learning;motion planning},
  doi={10.1109/TPAMI.2026.3653866}
}
```
We welcome contributions from the community! You can:
- Report bugs - Open an Issue
- Improve documentation - Submit a Pull Request
- Contribute code - Fork, develop, and submit a PR
Please read our contributing guidelines before submitting PRs.
For questions:
- Check the documentation first
- Search existing Issues
All content in this repository is under the Apache-2.0 license.
The released data is based on nuPlan and is under the CC-BY-NC-SA 4.0 license.
We thank the open-source contributors of the following projects, which made this work possible:
If you find WorldEngine useful, please consider giving us a star!
Contact: For research collaboration or questions, visit our Discussions








