🤔 Your LLM-Wiki Needs to Be Refined
DeepRefine is a general LLM-based reasoning model for agent-compiled knowledge refinement. It refines any pre-constructed knowledge base against user queries, making it better suited to downstream tasks.
- [2026/5/10] Static quants of DeepRefine-v1-8B have been released at 🤗 mradermacher/DeepRefine-v1-8B-GGUF. Thanks to the community!
We collect the raw HotpotQA training data from https://hotpotqa.github.io/ and then construct the data samples for RL training with the following script:
```bash
bash scripts/autograph-r1/data_prepare/hotpotqa_cons.sh
```

Alternatively, you can access the prepared training data under the `data/` folder.
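The construction script consumes the raw HotpotQA release, which is a JSON list of records with `_id`, `question`, `answer`, `context`, and `supporting_facts` fields. As a sanity-check sketch of that schema (the record below is illustrative, not real dataset content):

```python
# One record in the raw HotpotQA schema (e.g. hotpot_train_v1.1.json is a
# JSON list of such records). This example is made up for illustration.
record = {
    "_id": "example-id",
    "question": "Which country is the Eiffel Tower in?",
    "answer": "France",
    "type": "bridge",
    "level": "easy",
    # supporting_facts: [paragraph title, sentence index] pairs
    "supporting_facts": [["Eiffel Tower", 0]],
    # context: [paragraph title, list of sentences] pairs
    "context": [["Eiffel Tower",
                 ["The Eiffel Tower is in Paris, France.",
                  "It was completed in 1889."]]],
}

def supporting_sentences(rec):
    """Resolve supporting_facts against context to recover gold sentences."""
    paragraphs = {title: sents for title, sents in rec["context"]}
    return [paragraphs[title][idx] for title, idx in rec["supporting_facts"]]

print(supporting_sentences(record))
# ['The Eiffel Tower is in Paris, France.']
```

How `hotpotqa_cons.sh` maps these fields into RL training samples is defined by the script itself.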
⚠️ Configuration Reminder: Please make sure to replace all path configurations in the following scripts with your own paths.
Update your config in verl/third_party/autograph_r1/config.ini.
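The actual section and key names in `config.ini` are defined by the repo; purely as an illustration of the kind of path entries to update (the `[paths]` section and keys below are hypothetical, not the repo's real schema), such a file can be read with Python's standard `configparser`:

```python
import configparser

# Hypothetical config.ini fragment -- the real section/key names are
# whatever verl/third_party/autograph_r1/config.ini actually defines.
EXAMPLE = """
[paths]
data_dir = /home/you/DeepRefine/data
model_dir = /home/you/models/Qwen3-8B
output_dir = /home/you/DeepRefine/outputs
"""

cfg = configparser.ConfigParser()
cfg.read_string(EXAMPLE)
print(cfg["paths"]["data_dir"])  # -> /home/you/DeepRefine/data
```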
Train Qwen3-4B:

```bash
bash scripts/train/run_qwen3-4b_graph_refiner.sh
```

Train Qwen3-8B:

```bash
bash scripts/train/run_qwen3-8b_graph_refiner.sh
```

We have also provided our trained models on HuggingFace.
⚠️ Configuration Reminder: Please make sure to replace all path configurations in the following scripts with your own paths.
There are six evaluation modes:
- Graph Retriever, no refinement:

  ```bash
  bash scripts/eval/gr_refine_bench_no_refine.sh
  ```

- Graph Retriever, naive refinement (without training):

  ```bash
  bash scripts/eval/gr_refine_bench_wo_rl.sh
  ```

- Graph Retriever, DeepRefine:

  ```bash
  bash scripts/eval/gr_refine_bench_rl.sh
  ```

- Text Retriever, no refinement:

  ```bash
  bash scripts/eval/tr_refine_bench_no_refine.sh
  ```

- Text Retriever, naive refinement (without training):

  ```bash
  bash scripts/eval/tr_refine_bench_wo_rl.sh
  ```

- Text Retriever, DeepRefine:

  ```bash
  bash scripts/eval/tr_refine_bench_rl.sh
  ```

If you find our work helpful, please cite:

```bibtex
@article{huang2026deeprefine,
  title={DeepRefine: Agent-Compiled Knowledge Refinement via Reinforcement Learning},
  author={Huang, Haoyu and Bai, Jiaxin and Liu, Shujie and Wei, Yang and Tsang, Hong Ting and Gao, Yisen and Xie, Zhongwei and Li, Yufei and Song, Yangqiu},
  journal={arXiv preprint arXiv:2605.10488},
  year={2026}
}
```
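For reference, HotpotQA answers are conventionally scored with exact match and token-level F1 after SQuAD-style normalization; the eval scripts above presumably report similar metrics. A minimal sketch of those two metrics (not the repository's evaluation code):

```python
import re
import string
from collections import Counter

def normalize(s):
    """SQuAD-style normalization: lowercase, drop punctuation and articles."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(pred, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(pred) == normalize(gold))

def f1(pred, gold):
    """Token-level F1 between normalized prediction and gold answer."""
    p, g = normalize(pred).split(), normalize(gold).split()
    common = Counter(p) & Counter(g)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))  # 1.0
print(f1("in Paris France", "Paris"))                   # 0.5
```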