…inks to docs WIP refining
match UI change to 'contribute'
hlapp left a comment
@egrace479 looks pretty good. See two edits for clarification (though I'm not 100% sure these are right).
The PR is marked as draft and it looks like there may still be some placeholders (in the form of ...s), so making this a comment. If instead you meant it to be ready for merging, remove the draft status and re-assign me.
| This method is fine for smaller files (<100MB), or dataset repositories that are distributed, not well organized, have less files. If you are uploading existing files, navigate to the target folder first. |
Suggested change:
| This method is fine for smaller files (<100MB), or for data uploads that come from distributed sources, have a relatively flat structure with few directories, and/or have few files. If you are uploading existing files, navigate to the target folder first. |
| Hugging Face provides a comprehensive Command Line Interface (CLI) and corresponding [docs](https://huggingface.co/docs/huggingface_hub/en/guides/cli). Note that this is installed with the `huggingface_hub` Python package, but can also be installed directly, then called with `hf <command>`. |
| The Hugging Face CLI is the ideal method for larger datasets, with more files. It works directly from HPC clusters, such as OSC. Under the hood, [`hf upload`](https://huggingface.co/docs/huggingface_hub/en/guides/cli#hf-upload) uses the same upload functions described below, under [Upload a Dataset with HfApi](#upload-a-dataset-with-hfapi). |
Suggested change:
| The Hugging Face CLI is the ideal method for uploads that are large in volume, have more than a few files, and/or have a folder structure with many or nested directories. It works directly from HPC clusters, such as OSC. Under the hood, [`hf upload`](https://huggingface.co/docs/huggingface_hub/en/guides/cli#hf-upload) uses the same upload functions described below, under [Upload a Dataset with HfApi](#upload-a-dataset-with-hfapi), but obviates the need to first write a custom script. |
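For context on the "custom script" that the CLI replaces, here is a minimal sketch of what such a script might look like. It assumes the `huggingface_hub` package is installed and that you are already logged in; the helper name `upload_dataset_folder` and the commit message format are hypothetical, not taken from the guide. Only `HfApi.upload_folder` is an actual library call.

```python
from pathlib import Path


def upload_dataset_folder(repo_id, folder_path, path_in_repo=None, api=None):
    """Upload a local folder to a Hugging Face dataset repository.

    Hypothetical helper for illustration only; `api` can be any object
    exposing an `upload_folder` method (defaults to `HfApi()`).
    """
    folder = Path(folder_path)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")

    if api is None:
        # Imported lazily so the helper can be exercised without the package.
        from huggingface_hub import HfApi

        api = HfApi()

    # repo_type="dataset" targets a dataset repo rather than a model repo.
    return api.upload_folder(
        repo_id=repo_id,
        folder_path=str(folder),
        path_in_repo=path_in_repo,
        repo_type="dataset",
        commit_message=f"Upload {folder.name}",  # assumed message format
    )
```

With the CLI, the same upload should reduce to a single `hf upload` call with `--repo-type dataset`, which is the point of the suggested wording above.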
I'm also wondering how much of the |
I opened the draft PR so you could see what I had so far (since you had this issue in mind). I hadn't removed all the old content yet, instead moving most of it further down the page in case I wanted to pull more for the main content. This is also why (as noted in your first comment) there are placeholders. I will re-read tomorrow, but am happy for any feedback. My current plan: I think lines 89-114 ( |
Revises the Hugging Face dataset upload guide to better reflect the preferred methods.
Still ToDo:
Closes #44