Adding the PRRSV2 nsp3-4-5 dataset from Baby et al. 2026 (under review) #458
LMIVV-medvet wants to merge 2 commits into
Conversation
Here is a preview link: https://master.clades.nextstrain.org/?dataset-server=gh:@LMIVV
Thanks for submitting this PR. I made a preview link: https://master.clades.nextstrain.org/?dataset-server=gh:@LMIVV
I also have a few concrete questions:
- Your dataset name has both BabyV and Baby2026 (community/UdeM-LMIVV/BabyV/PRRSV2/nsp3-4-5/Baby2026), is this intended?
- The lineages/clades defined are not always very well separated and some seem very rare. There might not be a better solution to this, but I thought I'd flag it.
- The tree seems rooted on n68. There is no need to root on a specific strain, and a separate rooting might be closer to what is biologically relevant.
Regarding release: this can be released once the technical and biological questions regarding the dataset are clarified. Once it is released, it can only be updated, not removed.
Hello,
Thank you for reviewing our dataset. Here are the answers to your questions:
- Your dataset name has both BabyV and Baby2026, is this intended?
- The lineages/clades defined are not always very well separated and some seem very rare.
- The tree seems rooted on n68. There is no need to root on a specific strain, and a separate rooting might be closer to what is biologically relevant.
Thank you very much,
Thanks for following up and answering my questions. I posed these mostly to ensure that these were conscious choices. If you are happy with how the dataset performs, that is fine by me. Regarding the root: it is true that Nextclade requires a specific reference to align to, and historically this had to be the root. But we can now separate the two. So if you'd rather root at midpoint, feel free to change it. There is a short note on this in the FAQ. Otherwise, I am happy to keep it as is. Let me know how you want to proceed and what timeline for release you want.
Excellent! We tested the dataset on our local machine and were satisfied with the results. Since it worked well, I would keep the current rooting for the initial release and change it in the next one, since we will have to rebuild the tree with the new data at that time anyway. For the release timeline, the paper is currently under review and we would like to wait until it is accepted before releasing the dataset. If all goes well (fingers crossed), we should receive the reviewers' comments within the next few weeks. We plan to provide the preview link to the reviewers if they ask for it (thank you for creating it, by the way). However, if waiting too long causes problems or conflicts on your side, we can release it whenever you require.


Description of proposed changes
We want to add a new community dataset to Nextclade based on the nsp3-4-5 region of PRRSV2. We would like it to be released upon publication of the paper, which is currently under review, or earlier if the reviewers ask for it.
Checklist