How to fine-tune DiffCoder?

Hi ByteDance! First of all, great job on Stable-DiffCoder! 🚀 

I'm trying to understand the recommended way to fine-tune the model for a custom use case. As far as I understand, traditional autoregressive fine-tuning frameworks like `trl` don't match the pretraining objective, so they either won't work or, worse, will silently degrade performance.
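
To make the mismatch I'm worried about concrete, here is a toy sketch of how I picture the two objectives. It assumes DiffCoder was pretrained with a masked-diffusion-style objective (my reading, not something I've confirmed), and the `<mask>` token, the function names, and the fixed mask ratio are all made up for illustration. It is not meant to reflect the actual training recipe.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not DiffCoder's real vocabulary


def autoregressive_targets(tokens):
    """Next-token prediction: each position is trained to predict the token after it."""
    return list(zip(tokens[:-1], tokens[1:]))


def masked_diffusion_example(tokens, mask_ratio, rng):
    """Masked-diffusion-style corruption (assumed objective): hide a random subset
    of positions and train only on recovering those, with context on both sides."""
    n_mask = max(1, round(mask_ratio * len(tokens)))
    masked_positions = sorted(rng.sample(range(len(tokens)), n_mask))
    corrupted = [MASK if i in masked_positions else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in masked_positions}
    return corrupted, targets


tokens = ["def", "add", "(", "a", ",", "b", ")", ":"]

# What an autoregressive SFT trainer would optimize: (input token, next token) pairs.
ar_pairs = autoregressive_targets(tokens)

# What I assume the diffusion objective looks like: masked input plus targets
# only at the masked positions.
corrupted, targets = masked_diffusion_example(tokens, mask_ratio=0.5, rng=random.Random(0))

print(ar_pairs)
print(corrupted, targets)
```

If that picture is roughly right, a standard causal-LM trainer would be optimizing the first kind of target on a model that was trained on the second, which is why I'd rather not guess at a setup.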

Is there an official or recommended fine-tuning pipeline?