FMPose3D: monocular 3D pose estimation via flow matching


This is the official implementation of the approach described in:

FMPose3D: monocular 3D pose estimation via flow matching (CVPR 2026)
Ti Wang, Xiaohang Yu, Mackenzie Weygandt Mathis

🚀 TL;DR

FMPose3D estimates a 3D pose from a single 2D image. It uses fast flow matching to generate multiple plausible 3D poses by integrating an ODE in just a few steps, then aggregates them into one accurate prediction with a reprojection-based Bayesian module (RPEA). It achieves state-of-the-art results on human and animal 3D pose benchmarks.
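
To make the sample-then-aggregate idea concrete, here is a minimal NumPy sketch. It is not the FMPose3D implementation: `toy_velocity` is a hand-written field rather than a trained network, `project` assumes an orthographic camera, and the softmax weighting in `aggregate` is only a simple stand-in for the RPEA module. All function names and parameters are hypothetical.

```python
import numpy as np


def toy_velocity(x, t, target):
    """Hypothetical stand-in for the learned velocity network.

    It simply pulls x toward a known pose (t is unused here). A trained
    model would instead predict the velocity from x, t, and the 2D input.
    """
    return target - x


def sample_hypotheses(target, num_hypotheses=8, num_steps=4, seed=0):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    rng = np.random.default_rng(seed)
    # One Gaussian noise sample per hypothesis, shape (H, J, 3).
    x = rng.standard_normal((num_hypotheses,) + target.shape)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * toy_velocity(x, k * dt, target)
    return x


def project(pose_3d):
    """Orthographic projection (drops depth); a simplifying assumption."""
    return pose_3d[..., :2]


def aggregate(hypotheses, pose_2d, temperature=0.1):
    """Average hypotheses, weighted by how well each reprojects to 2D."""
    # Mean per-joint 2D error for each hypothesis, shape (H,).
    err = np.linalg.norm(project(hypotheses) - pose_2d, axis=-1).mean(axis=-1)
    # Softmax over negative error: lower error gives a higher weight.
    weights = np.exp(-(err - err.min()) / temperature)
    weights /= weights.sum()
    # Weighted sum over the hypothesis axis, result shape (J, 3).
    return np.tensordot(weights, hypotheses, axes=1), weights


if __name__ == "__main__":
    target = np.ones((17, 3))  # 17 joints, an arbitrary toy pose
    hyps = sample_hypotheses(target)
    pose, weights = aggregate(hyps, project(target))
    print(pose.shape, weights.round(3))
```

Because the toy field only pulls toward the target, the hypotheses stay spread around it after four steps, which gives the weighting something to do. In a real pipeline the spread would come from the learned model and the ambiguity of lifting 2D to 3D.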

News!

Feb 2026: The FMPose3D paper was accepted to CVPR 2026! 🔥
Feb 2026: The FMPose3D code and our arXiv paper are released. Check out the demos here or on our project page.
March 2026: This method is integrated into DeepLabCut

Installation

Set up an environment

Make sure you have Python 3.10; the installation and demos are tested with this version. You can set up an environment with:

conda create -n fmpose_3d python=3.10
conda activate fmpose_3d

pip install fmpose3d

For the animal pipeline, install the optional DeepLabCut dependency:

pip install "fmpose3d[animals]"

PyTorch/CUDA note. FMPose3D pins torch>=2.4.1,<2.5 and torchvision>=0.19.1,<0.20, which use CUDA 12.1 wheels by default on Linux. If your driver does not support CUDA 12.1, or if you need a specific CUDA build, install PyTorch first using the matching command from pytorch.org, then install fmpose3d.
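
If you install PyTorch separately, it can help to confirm that the result still falls inside the pinned ranges before installing fmpose3d. The following is a rough standard-library sketch. The helper names are hypothetical, and the comparison looks only at release numbers (it ignores local build tags such as `+cu121` and does not implement full PEP 440 semantics).

```python
from importlib import metadata


def release_tuple(version):
    """Turn '2.4.1+cu121' into (2, 4, 1), ignoring any local build tag."""
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev1'
        parts.append(int(digits))
    return tuple(parts)


def satisfies_pin(version, lower, upper):
    """True if lower <= version < upper, comparing release numbers only."""
    return release_tuple(lower) <= release_tuple(version) < release_tuple(upper)


def check_installed(package, lower, upper):
    """Return (installed version or None, whether it is inside the pin)."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None, False
    return installed, satisfies_pin(installed, lower, upper)


if __name__ == "__main__":
    # Ranges copied from the note above.
    pins = [("torch", "2.4.1", "2.5"), ("torchvision", "0.19.1", "0.20")]
    for name, lower, upper in pins:
        print(name, check_installed(name, lower, upper))
```

A result of `(None, False)` means the package is not installed in the active environment.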

Demos

Testing on in-the-wild images (humans)

This visualization script is designed for the single-frame model and lets you run 3D human pose estimation on any single image.

Pre-trained weights are downloaded automatically from Hugging Face the first time you run inference, so no manual setup is needed.

Alternatively, you can use your own trained weights or download ours from Google Drive, place them in the ./pre_trained_models directory, and set model_weights_path in the shell script (e.g. demo/vis_in_the_wild.sh).

Next, put your test images into the demo/images folder. Then run the visualization script:

sh vis_in_the_wild.sh

The predictions will be saved to the demo/predictions folder.

Training and Inference

Dataset Setup

Setup from original source

You can obtain the Human3.6M dataset from the Human3.6M website, and then set it up using the instructions provided in VideoPose3D.

Setup from preprocessed dataset (Recommended)

You can also download the preprocessed data from here.

Place the downloaded files in the dataset/ folder of this project:

<project_root>/
├── dataset/
│   ├── data_3d_h36m.npz
│   ├── data_2d_h36m_gt.npz
│   └── data_2d_h36m_cpn_ft_h36m_dbb.npz
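
Before launching training or inference, a quick check that the three files are in place can save a failed run. This is a small sketch using only the standard library. The function name `missing_dataset_files` is hypothetical, and the check looks only at file presence, not at file contents.

```python
from pathlib import Path

# File names taken from the layout shown above.
EXPECTED_FILES = (
    "data_3d_h36m.npz",
    "data_2d_h36m_gt.npz",
    "data_2d_h36m_cpn_ft_h36m_dbb.npz",
)


def missing_dataset_files(project_root="."):
    """Return the expected files that are absent from <project_root>/dataset."""
    dataset_dir = Path(project_root) / "dataset"
    return [name for name in EXPECTED_FILES if not (dataset_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files(".")
    if missing:
        print("Missing from dataset/:", ", ".join(missing))
    else:
        print("All expected dataset files found.")
```

Run it from the project root, or pass the project root explicitly.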

Training

The training logs, checkpoints, and related files for each training run are saved in the ./checkpoint folder.

For training on Human3.6M:

sh ./scripts/FMPose3D_train.sh

Inference

Pre-trained weights are fetched automatically from Hugging Face on the first run. You can also use local weights by setting model_weights_path in the shell script (see Demos above for details).

To run inference on Human3.6M:

sh ./scripts/FMPose3D_test.sh

Inference API

FMPose3D also ships a high-level Python API for end-to-end 3D pose estimation from images. See the Inference API documentation for the full reference.

Experiments on non-human animals

For animal training/testing and demo scripts, see animals/README.md. On first run, the animal demo automatically downloads both checkpoints from Hugging Face, so no manual setup is needed: a 26-joint SuperAnimal-Quadruped model fine-tuned on Animal3D for 2D pose, and the FMPose3D animal flow-matching lifter for 3D.

Citation

@misc{wang2026fmpose3dmonocular3dpose,
      title={FMPose3D: monocular 3D pose estimation via flow matching}, 
      author={Ti Wang and Xiaohang Yu and Mackenzie Weygandt Mathis},
      year={2026},
      journal={CVPR},
      url={https://arxiv.org/abs/2602.05755}, 
}

Acknowledgements

We thank the Swiss National Science Foundation (SNSF Project # 320030-227871) and the Kavli Foundation for providing financial support for this project.

Our code is extended from the following repositories. We thank the authors for releasing the code.
