Improve `trust_remote_code` by hlky · Pull Request #13448 · huggingface/diffusers

hlky · 2026-04-12T15:05:10Z

What does this PR do?

As per #13446 trust_remote_code fails under several circumstances:

`pretrained_model_name_or_path` is Hub repo A and `custom_pipeline` is Hub repo B: `trust_remote_code` is bypassed and remote code runs from repo B.

from diffusers import DiffusionPipeline

DiffusionPipeline.from_pretrained(
    "google/ddpm-cifar10-32", custom_pipeline="XManFromXlab/diffuser-custom-pipeline", trust_remote_code=False
)

`pretrained_model_name_or_path` is a local directory and `custom_pipeline` is Hub repo B: `trust_remote_code` is never checked and remote code runs from repo B.

from diffusers import DiffusionPipeline
from huggingface_hub import snapshot_download

snapshot_path = snapshot_download(repo_id="google/ddpm-cifar10-32")
DiffusionPipeline.from_pretrained(
    snapshot_path, custom_pipeline="XManFromXlab/diffuser-custom-pipeline", trust_remote_code=False
)

`pretrained_model_name_or_path` is a local directory with custom components: `trust_remote_code` is never checked.

from diffusers import DiffusionPipeline
from huggingface_hub import snapshot_download

snapshot_path = snapshot_download(repo_id="hf-internal-testing/tiny-sdxl-custom-components")
pipeline = DiffusionPipeline.from_pretrained(
    snapshot_path, trust_remote_code=False
)
assert pipeline.config.unet == ("diffusers_modules.local.my_unet_model", "MyUNetModel")
assert pipeline.config.scheduler == ("diffusers_modules.local.my_scheduler", "MyScheduler")

This PR moves the `trust_remote_code` checks into `get_cached_module_file`, where the custom module loading actually takes place. `trust_remote_code` is passed in from the several calling code paths so that coverage is complete.

I've added three separate `ValueError`s in `get_cached_module_file` to account for the different sources of remote code: local, git and hub. The git path could be considered trusted, as those files come from https://huggingface.co/datasets/diffusers/community-pipelines-mirror. The hub path covers "custom pipelines", and the local path is reached for "custom components" (these are added to the allowed files in `download`, so they become local files).
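For readers who want the shape of the idea without opening the diff, here is a minimal sketch of a single gate that handles the three sources. It is not the code in this PR: `CodeSource` and `check_trust` are hypothetical names, and the message wording is only modelled on the errors quoted in this description.

```python
from enum import Enum


class CodeSource(Enum):
    """Hypothetical labels for where a custom .py file comes from."""

    LOCAL = "local"  # already on disk, e.g. custom components
    GIT = "git"      # community pipelines mirror
    HUB = "hub"      # custom pipeline stored in a Hub repo


def check_trust(source: CodeSource, location: str, file_name: str, trust_remote_code: bool = False) -> None:
    """Raise ValueError unless the caller explicitly opted in to running custom code."""
    if trust_remote_code:
        return
    subject = {
        CodeSource.LOCAL: f"The directory {location}",
        CodeSource.GIT: f"The community pipeline for {location}",
        CodeSource.HUB: f"The repository for {location}",
    }[source]
    raise ValueError(
        f"{subject} contains custom code in {file_name} which must be executed to correctly load the model.\n"
        "Please pass the argument `trust_remote_code=True` to allow custom code to be run."
    )


# Opting in passes silently; the default refuses.
check_trust(CodeSource.HUB, "some-user/some-repo", "pipeline.py", trust_remote_code=True)
try:
    check_trust(CodeSource.LOCAL, "/path/to/unet", "my_unet_model.py")
except ValueError as err:
    print(err)
```

Because every source goes through the same function, a new code path cannot reach the import step without hitting the check, which is the property the PR is after.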

With this PR, the three cases above are resolved and now raise:

The repository for XManFromXlab/diffuser-custom-pipeline contains custom code in pipeline.py which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/XManFromXlab/diffuser-custom-pipeline/blob/main/pipeline.py.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.

The repository for XManFromXlab/diffuser-custom-pipeline contains custom code in pipeline.py which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/XManFromXlab/diffuser-custom-pipeline/blob/main/pipeline.py.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.

ValueError: The directory C:\Users\user\.cache\huggingface\hub\models--hf-internal-testing--tiny-sdxl-custom-components\snapshots\ce2b9d3f819e7791f53053646ebe37d7e87d73d3\unet contains custom code in my_unet_model.py which must be executed to correctly load the model. You can inspect the file content at C:\Users\user\.cache\huggingface\hub\models--hf-internal-testing--tiny-sdxl-custom-components\snapshots\ce2b9d3f819e7791f53053646ebe37d7e87d73d3\unet\my_unet_model.py.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.

Fixes #13446

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.


hlky · 2026-04-12T15:26:49Z

src/diffusers/pipelines/pipeline_utils.py

@@ -1674,21 +1678,6 @@ def download(cls, pretrained_model_name, **kwargs) -> str | os.PathLike:
                custom_class_name = config_dict["_class_name"][1]

            load_pipe_from_hub = custom_pipeline is not None and f"{custom_pipeline}.py" in filenames


Just to note this remains to control repo_id:

diffusers/src/diffusers/pipelines/pipeline_utils.py

Line 1699 in dc8d903

repo_id=pretrained_model_name if load_pipe_from_hub else None,

diffusers/src/diffusers/pipelines/pipeline_loading_utils.py

Lines 441 to 458 in dc8d903

def _get_custom_pipeline_class(
    custom_pipeline,
    repo_id=None,
    hub_revision=None,
    class_name=None,
    cache_dir=None,
    revision=None,
):
    if custom_pipeline.endswith(".py"):
        path = Path(custom_pipeline)
        # decompose into folder & file
        file_name = path.name
        custom_pipeline = path.parent.absolute()
    elif repo_id is not None:
        file_name = f"{custom_pipeline}.py"
        custom_pipeline = repo_id
    else:
        file_name = CUSTOM_PIPELINE_FILE_NAME

It helps distinguish between:
a) custom_pipeline is e.g. my_pipeline and that filename exists in pretrained_model_name's files
b) custom_pipeline is Hub repo (and pipeline.py is used)

Maybe `load_pipe_from_hub` could be renamed to `hub_contains_custom_pipeline`?
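To make the (a)/(b) distinction concrete, here is a small standalone sketch that mirrors the three branches quoted above and returns where the file would be looked up. `resolve_custom_pipeline` is a hypothetical helper, not a diffusers function, and `CUSTOM_PIPELINE_FILE_NAME = "pipeline.py"` is assumed from the note that `pipeline.py` is used for Hub repos.

```python
from __future__ import annotations

from pathlib import Path

# Assumed value, based on "pipeline.py is used" for Hub repos.
CUSTOM_PIPELINE_FILE_NAME = "pipeline.py"


def resolve_custom_pipeline(custom_pipeline: str, repo_id: str | None = None) -> tuple[str, str]:
    """Return (location, file_name) following the three branches quoted above."""
    if custom_pipeline.endswith(".py"):
        # Explicit path to a local file: split into folder and file name.
        path = Path(custom_pipeline)
        return str(path.parent.absolute()), path.name
    if repo_id is not None:
        # Case (a): <custom_pipeline>.py lives inside the pretrained repo itself.
        return repo_id, f"{custom_pipeline}.py"
    # Case (b): custom_pipeline names another source and pipeline.py is used.
    return custom_pipeline, CUSTOM_PIPELINE_FILE_NAME


print(resolve_custom_pipeline("my_pipeline", repo_id="hub/repoA"))
print(resolve_custom_pipeline("hub/repoB"))
```

The `repo_id` argument is what selects between (a) and (b), which is why the flag controlling it has to stay even after the trust check moves elsewhere.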

sayakpaul · 2026-04-13T02:58:12Z

tests/pipelines/test_pipelines.py


 class CustomPipelineTests(unittest.TestCase):
    def test_load_custom_pipeline(self):
+        with self.assertRaises(ValueError):


Should we also check the `ValueError` message (it should mention `trust_remote_code` rather than something unrelated)?

And on main, this should not have yielded a ValueError, right? That is how we know, for one instance, that it's broken.
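One way to pin the message down is `assertRaisesRegex`, which fails if a `ValueError` is raised for an unrelated reason. The sketch below uses a hypothetical stand-in loader (`load_custom_pipeline`) rather than the real `DiffusionPipeline.from_pretrained`, so it runs without network access; the pattern is the part that carries over.

```python
import unittest


def load_custom_pipeline(name: str, trust_remote_code: bool = False) -> str:
    """Hypothetical stand-in for the real loader, used only to show the assertion."""
    if not trust_remote_code:
        raise ValueError(
            f"The repository for {name} contains custom code. "
            "Please pass the argument `trust_remote_code=True` to allow custom code to be run."
        )
    return name


class TrustMessageTests(unittest.TestCase):
    def test_error_mentions_trust_remote_code(self):
        # Matches on the message, not just on the exception type.
        with self.assertRaisesRegex(ValueError, "trust_remote_code=True"):
            load_custom_pipeline("hub/repoB")

    def test_opt_in_loads(self):
        self.assertEqual(load_custom_pipeline("hub/repoB", trust_remote_code=True), "hub/repoB")
```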

On main:

| `pretrained` | `custom_pipeline` | `trust_remote_code`? |
| --- | --- | --- |
| hub/repoA | `my_pipeline` | ✅ |
| hub/repoA | `one_step_unet`[1] | ❌ |
| hub/repoA | hub/repoB | ❌ |
| any local directory | any | ❌ |

PR:

| `pretrained` | `custom_pipeline` | `trust_remote_code`? |
| --- | --- | --- |
| hub/repoA | `my_pipeline` | ✅ |
| hub/repoA | `one_step_unet`[1] | ✅ |
| hub/repoA | hub/repoB | ✅ |
| any local directory | any | ✅ |

[1] or any community pipeline name

The community pipeline case [1] is more a question of implicit versus explicit consent, but on main there is potential for misuse by combining the "trusted" nature of community pipeline names with third-party Hub repos.

A user may copy an example like:

pipeline = DiffusionPipeline.from_pretrained("google/ddpm-cifar10-32", custom_pipeline="one_step_unet")

or

pipeline = DiffusionPipeline.from_pretrained("google/ddpm-cifar10-32", custom_pipeline="pipeline_stable_diffusion_xl_controlnet_adapter_inpaint")

or

pipeline = DiffusionPipeline.from_pretrained("google/ddpm-cifar10-32", custom_pipeline="pipeline_stable_diffusion_x/_controlnet_adapter_inpaint")

The first two are harmless and come from diffusers/community-pipelines-mirror. The third is malicious: a user has registered as pipeline_stable_diffusion_x and created a repo named _controlnet_adapter_inpaint. There are many community pipelines, so there are many potential username/repo name combinations that could easily be missed.
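The lookalike works because a single character decides which source is used. A minimal sketch, assuming the source is chosen by whether the string contains a `/` (that rule and the `classify_custom_pipeline` helper are assumptions for illustration, not a quote of the diffusers implementation):

```python
def classify_custom_pipeline(custom_pipeline: str) -> str:
    """Hypothetical classifier: guess which source a custom_pipeline string selects."""
    if custom_pipeline.endswith(".py"):
        return "local file"
    if "/" not in custom_pipeline:
        # Bare name: resolved against the community pipelines mirror.
        return "community pipeline"
    # "user/repo": resolved as a third-party Hub repository.
    return "hub repo"


examples = [
    "one_step_unet",
    "pipeline_stable_diffusion_xl_controlnet_adapter_inpaint",
    "pipeline_stable_diffusion_x/_controlnet_adapter_inpaint",
]
for name in examples:
    print(f"{name!r} -> {classify_custom_pipeline(name)}")
```

The second and third strings differ only by `l` versus `/`, yet under this rule they land in different trust domains.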

Considering that, I think community pipeline names should remain trusted. WDYT? To do so, we can just remove this check:

diffusers/src/diffusers/utils/dynamic_modules_utils.py

Lines 340 to 345 in 78a5028

if not trust_remote_code:
    raise ValueError(
        f"The community pipeline for {pretrained_model_name_or_path} contains custom code which must be executed to correctly "
        f"load the model. You can inspect the repository content at https://hf.co/datasets/{COMMUNITY_PIPELINES_MIRROR_ID}/blob/main/{revision}/{pretrained_model_name_or_path}.py.\n"
        f"Please pass the argument `trust_remote_code=True` to allow custom code to be run."
    )

sayakpaul · 2026-04-13T03:04:29Z

src/diffusers/utils/dynamic_modules_utils.py

    revision: str | None = None,
    local_files_only: bool = False,
    local_dir: str | None = None,
+    trust_remote_code: bool = False,


Would it make sense to add the ValueError to the caller sites of get_cached_module_file instead? Because the function itself isn't specifically tied to custom pipelines, I think.

sayakpaul · 2026-04-13T03:05:33Z

src/diffusers/pipelines/pipeline_utils.py

                custom_class_name = config_dict["_class_name"][1]

            load_pipe_from_hub = custom_pipeline is not None and f"{custom_pipeline}.py" in filenames
-            load_components_from_hub = len(custom_components) > 0


Why is this being removed?

See case 3 - this code is in download which is only reached when we use a Hub path with from_pretrained. It is replaced by the check in get_cached_module_file, specifically the local code path.

Consider this scenario:

hf download rotcasuoicilam/SuperCoolNewModel --local-dir rotcasuoicilam/SuperCoolNewModel

from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("rotcasuoicilam/SuperCoolNewModel")

`rotcasuoicilam/SuperCoolNewModel` contains malicious custom components. The user downloads the Hub repo assuming it is safe, and Diffusers loads the custom components without the user's consent.

I didn't get it.

hf download rotcasuoicilam/SuperCoolNewModel --local-dir rotcasuoicilam/SuperCoolNewModel

is agnostic to DiffusionPipeline.from_pretrained(...).

1. A malicious actor uploads a model with malicious custom components.

2. Either:
   a. The user follows instructions that say to download the model first, then run from the local path.
   b. The user chooses to download the model first out of personal preference.

3. DiffusionPipeline.from_pretrained(the_local_path)

4. pwned

Let's add a test case for this scenario then.

c0e0731

On main:

FAILED tests/pipelines/test_pipelines.py::CustomPipelineTests::test_custom_components_from_local_dir - AssertionError: ValueError not raised

PR:

1 passed

A bit more elaborate explanation for other reviewers (feel free to correct).

The critical branching point is:

if not os.path.isdir(pretrained_model_name_or_path):
    ...  # calls cls.download(), which had the trust_remote_code check
else:
    cached_folder = pretrained_model_name_or_path

When you call from_pretrained("rotcasuoicilam/SuperCoolNewModel"):

os.path.isdir("rotcasuoicilam/SuperCoolNewModel") is checked.

If the user previously ran hf download ... --local-dir rotcasuoicilam/SuperCoolNewModel, that directory exists locally.

So os.path.isdir() returns True, and the code takes the else branch at line 871 — it just sets cached_folder = pretrained_model_name_or_path directly.

The download() method is never called.

The old trust_remote_code check for custom components lived inside download(). Since download() is skipped entirely when the path is a local directory, the check never runs. The custom components in that local folder get loaded without any consent gate.

That's why the fix moves the trust_remote_code check into get_cached_module_file: that's where the actual import of custom .py files happens, and it runs regardless of whether the code came through download() or the local else branch.

Well, just to be clear: using the same local-dir as the Hub repo is one possible trick to hide the attack, since anyone who didn't pre-download wouldn't be affected. But any local directory is affected, and the local directory could come from other sources such as `snapshot_download` or `git clone`.
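The branching described above can be reproduced in a few lines. This is a toy model with hypothetical stand-ins (`download`, `load_custom_module`, and a `gate` switch that chooses where the check lives); it only illustrates why a check inside the download step never fires for a local directory.

```python
import os
import tempfile


def download(repo_id: str, trust_remote_code: bool, gate: str) -> str:
    """Stand-in for the download step: only reached when the name is not a local directory."""
    if gate == "download" and not trust_remote_code:
        raise ValueError("Please pass the argument `trust_remote_code=True`.")
    return os.path.join("cache", repo_id)


def load_custom_module(folder: str, trust_remote_code: bool, gate: str) -> str:
    """Stand-in for the point where custom .py files are actually imported."""
    if gate == "load" and not trust_remote_code:
        raise ValueError("Please pass the argument `trust_remote_code=True`.")
    return f"custom code loaded from {folder}"


def from_pretrained(name_or_path: str, trust_remote_code: bool = False, gate: str = "download") -> str:
    if not os.path.isdir(name_or_path):
        folder = download(name_or_path, trust_remote_code, gate)
    else:
        folder = name_or_path  # download() is skipped entirely
    return load_custom_module(folder, trust_remote_code, gate)


with tempfile.TemporaryDirectory() as local_dir:
    # Check inside download(): the local directory bypasses it.
    print(from_pretrained(local_dir, gate="download"))
    # Check at load time: the same call is refused.
    try:
        from_pretrained(local_dir, gate="load")
    except ValueError as err:
        print("refused:", err)
```

With `gate="load"` both branches converge on the same check, which matches the reasoning for moving it next to the import.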

sayakpaul · 2026-04-13T07:37:39Z

@bot /style

github-actions · 2026-04-13T07:38:04Z

Style bot fixed some files and pushed the changes.

HuggingFaceDocBuilderDev · 2026-04-13T07:43:56Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

hlky · 2026-04-13T16:15:40Z

I was curious how common custom components are on the Hub. The results are limited, but I managed to scrape 9602 Hub repo paths from the model pages. 190 were gated and 4685 actually had model_index.json; the rest are presumably LoRA or mis-tagged. Of those 4685, 58 repos had custom components, for a total of 97 custom components. It is unlikely that any of these are malicious, but it is still interesting that, when loaded from a local path, any of these Hub repos would currently load custom code without requiring trust_remote_code=True.
has_module.json
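For anyone wanting to repeat the count, here is a rough sketch of how a repo could be flagged. It assumes a custom component shows up in model_index.json as a `[module_name, class_name]` pair whose `<component>/<module_name>.py` file exists in the repo; `find_custom_components` is a hypothetical helper, and the example index and file list are made up apart from the `my_unet_model` and `my_scheduler` names quoted in the PR description.

```python
from __future__ import annotations


def find_custom_components(model_index: dict, filenames: set[str]) -> dict[str, str]:
    """Return {component: module_name} for entries backed by a .py file in the repo."""
    custom = {}
    for component, value in model_index.items():
        # Skip metadata keys such as "_class_name" and anything that is not a pair.
        if component.startswith("_") or not isinstance(value, (list, tuple)) or len(value) != 2:
            continue
        module_name = value[0]
        if not isinstance(module_name, str):
            continue
        if f"{component}/{module_name}.py" in filenames:
            custom[component] = module_name
    return custom


# Illustrative data only: the pipeline class and file list are assumptions.
model_index = {
    "_class_name": "StableDiffusionXLPipeline",
    "unet": ["my_unet_model", "MyUNetModel"],
    "scheduler": ["my_scheduler", "MyScheduler"],
    "vae": ["diffusers", "AutoencoderKL"],
}
filenames = {"model_index.json", "unet/my_unet_model.py", "scheduler/my_scheduler.py", "vae/config.json"}
print(find_custom_components(model_index, filenames))
```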

Robust trust check for custom_pipeline parameter of DiffusionPipeline…

70d067a

….from_pretrained method


sayakpaul requested a review from DN6 April 13, 2026 03:06

test_custom_components_from_local_dir

c0e0731


Apply style fixes

78a5028


Merge branch 'main' into trust-remote-code

8ec3f26


