RetPO

Official implementation of "Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search".

Chanwoong Yoon1*, Gangwoo Kim1*, Byeongguk Jeon1, Sungdong Kim2,3, Yohan Jo4, Jaewoo Kang1
Korea University1, NAVER Cloud2, KAIST AI3, Seoul National University4
In Findings of NAACL 2025.

📃 Paper | 🤗 Model | 🤗 RF-Collection

(Figure: overview of the RetPO framework)

Abstract

Conversational search, unlike single-turn retrieval tasks, requires understanding the current question within a dialogue context. The common rewrite-then-retrieve approach aims to decontextualize questions so they are self-sufficient for off-the-shelf retrievers, but most existing methods produce sub-optimal query rewrites due to their limited ability to incorporate signals from the retrieval results. To overcome this limitation, we present a novel framework, RetPO (Retriever's Preference Optimization), designed to optimize a language model (LM) to reformulate search queries in line with the preferences of the target retrieval systems. The process begins by prompting a large LM to produce various potential rewrites and then collecting the retrieval performance of these rewrites as the retrievers' preferences. Through this process, we construct a large-scale dataset called RF-Collection, containing Retrievers' Feedback on over 410K query rewrites across 12K conversations. Furthermore, we fine-tune a smaller LM on this dataset to align it with the retrievers' preferences as feedback. The resulting model demonstrates superiority on two benchmarks, surpassing the previous state-of-the-art performance of rewrite-then-retrieve approaches, including GPT-3.5.
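In code terms, the feedback-collection loop described above amounts to the following minimal sketch. Note that generate_rewrites and retrieval_score are hypothetical placeholders for the LLM prompting and retriever-evaluation steps, not functions from this repository.

from typing import Callable, List, Tuple

def collect_retriever_feedback(
    question: str,
    history: List[str],
    generate_rewrites: Callable[[str, List[str]], List[str]],
    retrieval_score: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Sample candidate rewrites and score each with the target retriever."""
    candidates = generate_rewrites(question, history)
    # A rewrite's retrieval performance (e.g., rank of the gold passage)
    # serves as the retriever's "preference" over that rewrite.
    scored = [(rewrite, retrieval_score(rewrite)) for rewrite in candidates]
    # Ranked lists like this one form the preference data used to
    # fine-tune the smaller query-rewriting LM.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)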

Contents

  1. Installation Instructions
  2. Evaluation
  3. RetPO (Retriever's Preference Optimization)

1. Installation Instructions

Please be aware that we utilize two distinct environments.

  1. retpo_search (retriever indexing and search)
  2. retpo_qr (QR model training and inference)

The base retrieval code uses faiss-gpu, which is tied to specific versions of CUDA and torch. If the versions do not match, errors may occur. Therefore, we use separate environments.

retpo_search

Since our pipeline performs a large amount of dense retrieval, we recommend using faiss-gpu.

# create environment
conda create -n retpo_search python=3.9 && conda activate retpo_search

# install torch
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116

# faiss-cpu or faiss-gpu
# CPU
pip install faiss-cpu==1.7.3
# GPU
pip install https://github.com/kyamagu/faiss-wheels/releases/download/v1.7.3/faiss_gpu-1.7.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

# other requirements
pip install -r requirements.txt
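
As a quick sanity check (our suggestion, not a script shipped with the repository), you can verify that torch and faiss import cleanly and that faiss can see your GPUs:

import faiss
import torch

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
if hasattr(faiss, "get_num_gpus"):
    print("faiss GPUs visible:", faiss.get_num_gpus())
else:
    print("CPU-only faiss build detected")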

retpo_qr

# create environment
cd retpo_qr/
conda create -n retpo_qr python=3.10 && conda activate retpo_qr

# install torch
pip install torch==2.1.0 # this specific version is crucial for reproducibility. you may need to install other variants based on your hardware.

# install dependencies
python -m pip install .

# Flash Attention 2 (optional, but recommended for faster training)
python -m pip install flash-attn --no-build-isolation
# If your machine has less than 96GB of RAM and many CPU cores, reduce MAX_JOBS, e.g.:
# MAX_JOBS=4 python -m pip install flash-attn --no-build-isolation
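
Similarly, a quick check (again, a suggestion rather than a repository script) that the pinned torch version is in place and that the optional flash-attn build is importable:

import torch

print("torch:", torch.__version__)  # expected: 2.1.0, as pinned above
try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)
except ImportError:
    print("flash-attn not installed; training falls back to standard attention")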

2. Evaluation

Preparation

We mainly evaluate our method with two retrievers, BM25 and ANCE, on two conversational QA benchmarks, TopiOCQA and QReCC.

There are well-organized repositories for preprocessing these datasets and indexing passages for retrieval; we recommend using them before running our code. We primarily follow ConvGQR as a reference.

Specifically, to run our code, you need to prepare the following files.

You can find the code to prepare these folders here:
pyserini_index/ # https://github.com/fengranMark/ConvGQR/blob/main/bm25/create_index.sh
tokenized/ # https://github.com/fengranMark/ConvGQR/blob/main/gen_tokenized_doc.py
embeddings/ # https://github.com/fengranMark/ConvGQR/blob/main/gen_doc_embeddings.py

ROOT_DIR/
└── datasets/
    ├── checkpoints/                # Retriever checkpoints
    │   └── ad-hoc-ance-msmarco     # https://huggingface.co/3ricL/ad-hoc-ance-msmarco
    ├── topiocqa/
    │   ├── pyserini_index/
    │   ├── full_wiki_segments.tsv
    │   ├── tokenized/
    │   └── embeddings/
    └── qrecc/
        ├── pyserini_index/
        ├── full_wiki_segments.tsv
        ├── tokenized/
        └── embeddings/
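
Before indexing or retrieval, it may help to verify this layout. The snippet below is a hypothetical convenience check, assuming ROOT_DIR points at your repository root; it is not part of the released code.

from pathlib import Path

ROOT_DIR = Path(".")  # set to your actual root directory
EXPECTED = [
    "datasets/checkpoints/ad-hoc-ance-msmarco",
    "datasets/topiocqa/pyserini_index",
    "datasets/topiocqa/full_wiki_segments.tsv",
    "datasets/topiocqa/tokenized",
    "datasets/topiocqa/embeddings",
    "datasets/qrecc/pyserini_index",
    "datasets/qrecc/full_wiki_segments.tsv",
    "datasets/qrecc/tokenized",
    "datasets/qrecc/embeddings",
]
for rel in EXPECTED:
    status = "ok" if (ROOT_DIR / rel).exists() else "MISSING"
    print(f"[{status:>7}] {rel}")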

Reproduce our performance

For those who'd like to reproduce our reported performance, you can download the queries generated by RetPO from this Google Drive and place them in ROOT_DIR/distill_outputs/.

You can then reproduce our main results by running the following command.

cd eval
bash ./scripts/bm25_topiocqa.sh
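
Under the hood, the BM25 evaluation amounts to querying the pyserini index built during preparation. A minimal illustration is shown below; the index path and query string are placeholders, and older pyserini versions expose the same class as pyserini.search.SimpleSearcher.

from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher("datasets/topiocqa/pyserini_index")
# A rewritten, self-contained query (placeholder text):
hits = searcher.search("when was the first wikipedia article written", k=10)
for rank, hit in enumerate(hits, start=1):
    print(rank, hit.docid, round(hit.score, 3))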

3. RetPO (Retriever's Preference Optimization)

Download RF-Collection

We construct a large-scale dataset called RF-Collection, containing Retrievers' Feedback on over 410K query rewrites across 12K conversations. You can download it from Hugging Face with the following command.

from datasets import load_dataset

ds = load_dataset("dmis-lab/RF-Collection", cache_dir="{ROOT_DIR}/retpo_qr/")
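
To take a quick look at what was downloaded (split and field names follow the dataset card, so treat this as a sketch):

print(ds)                   # available splits and their sizes
first_split = next(iter(ds))
print(ds[first_split][0])   # inspect one example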

Citation

@article{yoon2024ask,
  title={Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search},
  author={Yoon, Chanwoong and Kim, Gangwoo and Jeon, Byeongguk and Kim, Sungdong and Jo, Yohan and Kang, Jaewoo},
  journal={arXiv preprint arXiv:2402.11827},
  year={2024}
}

Contact

For more information or any questions about our work, feel free to contact me (cwyoon99 (at) korea.ac.kr or gmail.com).
