Attempt to run humann (3) with the official container #53
|
Warning Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.5.2. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation. |
|
I promise I ran these tests locally first y'all lol. I actually think this might not be an issue with my changes (?!?) but a nextflow version issue? A bit confused because I think they are all supposed to be running |
nickp60
left a comment
A few things
-- this reverts to the old versions.yml instead of the updated versions topic, so all the versions logic changed here needs to be reverted
-- the meta should all be tagged as single_end at this point, as the tool does not accept paired data. The merging happens upstream and should be changing the single_end tag to true accordingly. Not sure why this is popping up now though...
-- the wrapper script is a nice idea! I hope it works. We might be back into headache territory due to the larger container size, but maybe we cross that bridge later
|
|
Updated the versions and marked everything as single-ended. I think the linting failure is again coming from a Nextflow version update issue, but attempts to set the Nextflow version to a lower version also cause the linter to fail (because the linting.yaml file is no longer the same as the template lol). Bit of a catch-22, so maybe we take this as a prodding to comply, idk. |
|
Thanks! LGTM. Feel free to bump the template if that makes merging easier. |
|
@miraep8 @nickp60 could we close this in favour of nf-core/modules#11201? |
|
@vinisalazar What @miraep8 is proposing here is a way to avoid having to use the patched versions that allow for the config location variable. If this works (and it seems to be!) then we don't have to ask people to use our custom docker images, which should make the review process easier. |
|
I think this should be incorporated into your module PR for humann @vinisalazar. Then we can use the official conda/docker containers and be off the hook for supporting them in the future. |
|
I think that Vini's solution in #11201 also makes use of the official docker images actually! His solution is different than mine ie: If this approach works for humann regroup (and it seems to be) then I would definitely be in favor of closing this in favor of your PR Vini as I think your solution seems more elegant than mine! I have just wanted to try your version out first and haven't had a chance to yet :) but I suspect it will be fine if its working in your hands. |
|
The HUMANN_CONFIG is the stuff I added in the patched version of humann -- that will not work in non-patched containers. The tests pass because (A) it's only being run in -stub mode and (B) it's regrouping to one of the tables included in the package data: it's not using the data in the untarred utility database output from the UNTAR rule. you can confirm this by disabling the you can also add a line to the script section and look for which group options are available: which is missing all the ones from the utility database |
|
Thanks Nick! I had thought for some reason that Vini was doing something slightly different from what you had before. In that case, Vini, maybe it would be worth integrating my changes into your humann module pull request instead (at least for the regroup module)... sorry for the confusion on my part! And thank you both for your work on this :) |
HUMANN_CONFIG env var only works in patched containers. Use Python to set config.utility_mapping_database in-memory before calling humann.tools.regroup_table.main() directly. Also remove HUMANN_CONFIG block from humann/humann (no-op in official containers). Refs nf-core/funcprofiler#53 Co-authored-by: Mirae Baichoo <miraep8@gmail.com>
Add test_genefamilies.tsv with UniRef90 IDs present in utility_nfDEMO mapping. Add non-stub test to verify utility_db is actually used. Fix stub test input (was fastq, now correct TSV). Refs nf-core/funcprofiler#53 Co-authored-by: Mirae Baichoo <miraep8@gmail.com>
|
Just updated nf-core/modules#11201, please have look if possible |
* humann/humann: add humann module with v3/v4 support
  Module originally written by @d4straub on nf-core#1089. Refactored by @nickp60 for the nf-core/funcprofiler pipeline.
  Co-authored-by: Nick Waters <nickp60@gmail.com>
  Co-authored-by: Daniel Straub <daniel.straub@uni-tuebingen.de>
* humann/regroup: add regroup submodule
  Co-authored-by: Nick Waters <nickp60@gmail.com>
  Co-authored-by: Daniel Straub <daniel.straub@uni-tuebingen.de>
* humann/renorm: add renorm submodule
  Co-authored-by: Nick Waters <nickp60@gmail.com>
  Co-authored-by: Daniel Straub <daniel.straub@uni-tuebingen.de>
* humann/humann: edit file extensions to match other modules
* humann/humann: fix nf-core lint failures
  - Move process from base.nf into main.nf (fixes process_exist, main_nf_script_outputs, when_exist lint checks)
  - Rename duplicate val(meta) to val(meta_profile) in input block (fixes nextflow-lint pre-commit hook)
  - Rewrite meta.yml output section to new dict-keyed format; add missing inputs (profile, utility_db); fix glob patterns (*.{tsv.gz})
  - Add missing test tags: modules_nfcore, humann/humann, metaphlan/metaphlan, untar; fix modules_ typo
  - Replace eval()/topic:versions with standard versions.yml output; add process.out.versions to snapshot assertions
* humann/humann: apply lint fixes
  run `nf-core modules lint humann/humann`
* humann/renorm: fix deprecated container and conda directives
  Replace if/else container block with ternary syntax. Remove params.enable_conda conditional.
* humann/renorm,regroup: fix meta.yml schema validation errors
  - Convert output sections from old list format to new dict-keyed format
  - Convert input sections to new nested array format for tuple inputs
  - Fix groups type: value -> type: string
  - Add missing utility_db input to regroup
  - Add versions_humann output and topics section to both
  - Fix licence to block list format
* humann/humann: add @vinisalazar to maintainers
* humann/humann: fix meta_input_names and stub gzip syntax
* humann/humann: replace deprecated 'shell' directive
* humann/renorm: fix container link, add when block, fix test tags
* humann/regroup: move process to main.nf, fix lint failures
* humann: add getHumannVersion helper with v3 fallback and warning
  Replace inline tokenization and local getProcessNamePrefix with shared getHumannVersion() in utils.nf. Defaults to HUMANN3 with a log.warn when process is not aliased to HUMANN3 or HUMANN4.
  Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* humann/renorm: fix stub gzip syntax
* humann/humann: fix test process name HUMANN3/HUMANN4 -> HUMANN_HUMANN
* humann/humann: add nextflow.config with pipelines_testdata_base_path
* humann/regroup: add nextflow.config and wire config directive in test
* humann/utils: suppress getHumannVersion warning for canonical unaliased process names
* humann: support ext.version override for conda/container/ext selection
* humann/humann: add v4 test config with ext.version=HUMANN4, remove .nftignore ref
* humann/humann: update branch based on nf-core/funcprofiler@da2f92c
  Co-authored-by: Nick Waters <nickp60@gmail.com>
* humann/humann: remove ext.version, drop versions.yml emit, switch to biocontainers
  - Remove task.ext.version fallback; version selection relies solely on process name aliasing via getHumannVersion(task.process)
  - Drop versions.yml output and END_VERSIONS blocks (topics output covers versioning)
  - Remove stale versions entry from meta.yml to match main.nf outputs
  - Switch container directive to standard biocontainers format to satisfy main_nf_container lint check (quay.io + depot.galaxyproject.org)
* humann/humann: alias HUMANN4_HUMANN via wrapper for v4 test
  - Add main_v4_wrapper.nf that includes HUMANN_HUMANN as HUMANN4_HUMANN, so getHumannVersion() returns 'HUMANN4' from the process name
  - Update main_v4.nf.test to use the wrapper script and HUMANN4_HUMANN
  - Remove ext.version from nextflow_v4.config; add explicit container override for HUMANN4_HUMANN to use the vdblab v4 image
* humann/regroup: remove ext.version, switch to biocontainers
  - Remove task.ext.version fallback from conda/container directives
  - Switch to standard biocontainers format for lint compliance
* humann/renorm: add utils.nf integration, switch to biocontainers
  - Import getConda/getContainer/getHumannVersion from utils.nf for consistent dynamic conda selection with regroup
  - Switch to standard biocontainers format for lint compliance
* humann/humann: fix MPAHUMANN4 ext.args
  Update snapshots
* humann/humann: fix lint check
  Test wasn't finding version in snapshots due to new topic format
* humann/humann: remove v4 test files
* humann/humann: remove environment_humann4.yml
* humann/humann: rename environment file, drop v3/v4 comments
* humann/humann: remove utils.nf dependency, hardcode v3 values
* humann/regroup: remove utils.nf dependency, hardcode v3 conda
* humann/regroup: bump environment.yml to humann=3.6.1
* humann/renorm: remove utils.nf dependency, hardcode v3 conda
* humann/renorm: bump environment.yml to humann=3.6.1
* humann: remove utils.nf (v3/v4 dispatch no longer needed)
* humann: rename module directory humann → humann3
* humann3: rename process definitions HUMANN_ → HUMANN3_
* humann3: update meta.yml name fields to humann3_*
* humann3: update test file names, process refs, and tags
* humann3: update snapshot keys and process names
* Rename humann3/humann3 as humann3/humann
* humann3/humann: fix linting
* humann3/humann: fix test name
* humann3: pin metaphlan version
  This version of humann doesn't work with the latest versions of metaphlan
* Update snapshot
* humann3: pin metaphlan version
* humann3/humann: use Python 3.11
  Metaphlan 4.0.6 relies on distutils and crashes on Python 3.12
* humann3/humann: remove unnecessary nuc_ext pattern
  Code review for nf-core#11201
  Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>
* humann3/humann: remove unnecessary line - code review (nf-core#11201)
  Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
* humann3/humann: use humann_config to set utility mapping path
* humann3/humann: revert "use humann_config to set utility mapping path"
  This reverts commit 8ac60c1.
* humann3/regroup: replace HUMANN_CONFIG with Python in-memory config
  HUMANN_CONFIG env var only works in patched containers. Use Python to set config.utility_mapping_database in-memory before calling humann.tools.regroup_table.main() directly. Also remove HUMANN_CONFIG block from humann/humann (no-op in official containers).
  Refs nf-core/funcprofiler#53
  Co-authored-by: Mirae Baichoo <miraep8@gmail.com>
* humann3/regroup/tests: add non-stub test with real genefamilies input
  Add test_genefamilies.tsv with UniRef90 IDs present in utility_nfDEMO mapping. Add non-stub test to verify utility_db is actually used. Fix stub test input (was fastq, now correct TSV).
  Refs nf-core/funcprofiler#53
  Co-authored-by: Mirae Baichoo <miraep8@gmail.com>
* humann3/regroup/tests: regroup to uniref90_level4ec to test utility_db
  uniref90_rxn is built-in; uniref90_level4ec requires the custom utility_db mapping, confirming the Python config fix works correctly.
  Co-authored-by: Nick Waters <nickp60@gmail.com>
* humann3/humann: fix MetaPhlAn version capture command
  sed pattern was case-sensitive and didn't match actual output; update snapshot to reflect stripped prefix.
  Co-authored-by: Nick Waters <nickp60@gmail.com>
  Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
* humann3/humann: apply linting
* humann3: capture MetaPhlAn and Python versions in all modules
  Add versions_python and versions_metaphlan outputs to regroup and renorm; add versions_python to humann. Update meta.yml and snapshots to match.
* humann3/regroup/tests: remove test_genefamilies.tsv
* humann3/regroup/tests: fetch genefamilies from test-datasets modules branch
* humann3: pin minor Python version
* humann3: update snapshots
---------
Co-authored-by: Nick Waters <nickp60@gmail.com>
Co-authored-by: Daniel Straub <daniel.straub@uni-tuebingen.de>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>
Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
Co-authored-by: Mirae Baichoo <miraep8@gmail.com>
Hi team!
Given the recent feedback we got, I thought it might be worth taking a stab at running humann3 with the official container. I had initially been planning to upgrade to 3.9.1, but the Singularity index for the Galaxy Project only goes up to 3.9, so I'm using that for now.
For the main step that required our hacky patch in the home-baked container (the humann_regroup submodule), I now have a hacky Python script that modifies the in-memory config Python object (i.e. it doesn't need to change anything written in the container) before calling humann regroup. I think it works OK... let's definitely do a full test before merging; I'm just adding this for now to share what I have been working on.
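For anyone skimming, here is a rough sketch of the general pattern (not the actual script in this PR): set an attribute on an already-imported config module, then call the tool's entry point in the same Python process with a temporary sys.argv. To keep it runnable without humann installed, `fake_config` and `fake_main` are hypothetical stand-ins; per the commit message in this thread, the real targets would be `config.utility_mapping_database` and `humann.tools.regroup_table.main()`. The helper names `patched_argv` and `run_with_override`, the default path, and the argument list are made up for illustration.

```python
import sys
import types
from contextlib import contextmanager


@contextmanager
def patched_argv(argv):
    """Temporarily replace sys.argv so an argparse-style main() sees our arguments."""
    saved = sys.argv
    sys.argv = list(argv)
    try:
        yield
    finally:
        sys.argv = saved


def run_with_override(config_module, attr, value, entry_point, argv):
    """Set config_module.<attr> in memory, then call entry_point() under argv."""
    # Fail loudly if the attribute is missing, so a renamed setting is noticed.
    if not hasattr(config_module, attr):
        raise AttributeError(f"{config_module.__name__} has no attribute {attr!r}")
    setattr(config_module, attr, value)
    with patched_argv(argv):
        return entry_point()


# Hypothetical stand-in for the tool's config module (assumed default path).
fake_config = types.ModuleType("fake_config")
fake_config.utility_mapping_database = "/opt/tool/default_utility_mapping"


def fake_main():
    """Hypothetical stand-in entry point: reads the config and sys.argv at call time."""
    return fake_config.utility_mapping_database, sys.argv[1:]


db_path, seen_args = run_with_override(
    fake_config,
    "utility_mapping_database",
    "utility_db",  # e.g. the untarred utility database directory
    fake_main,
    ["regroup", "--input", "genefamilies.tsv", "--groups", "uniref90_level4ec"],
)
print(db_path, seen_args)
```

One caveat with this pattern: nothing is written to disk, so the override only lasts for that one Python process. Also, if the tool module reads its config at import time rather than at call time, the attribute has to be set before that module is imported, which is worth checking against the real humann code.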