Draft
78 commits
1c73859
Update readme.md
ernop Oct 20, 2024
c04eb0f
Update readme.md
ernop Oct 20, 2024
5ddc833
fix importsa
ernop Oct 20, 2024
1c59633
minor
ernop Oct 22, 2024
2f3a8ea
x
ernop Oct 22, 2024
df43ecf
more specifics
ernop Oct 29, 2024
345ab40
latest as I work
ernop Nov 10, 2024
7c669e8
work
Nov 10, 2024
00d06b9
refactor
Mar 20, 2025
ebdb796
in-progress,
ernop Apr 20, 2025
9c5927c
GPTOne plus interactive mode plus tons of other stuff. gotta redo the…
ernop Sep 15, 2025
4a2dc43
various
Sep 15, 2025
a691983
okay fixed some of the weird structure
ernop Sep 15, 2025
96af003
fixing weird class
Sep 15, 2025
3bb780f
more fixing etc
ernop Sep 16, 2025
bed82b3
fi
Sep 16, 2025
6d43550
fixed a bunch of things in the Label method of output. Also added co…
ernop Sep 16, 2025
929a123
more
Sep 16, 2025
1c03183
ideogramV2a
Sep 16, 2025
86dc7fa
combiner stuff
ernop Sep 18, 2025
9141930
testing cursor
ernop Sep 18, 2025
8d80bd0
.net 8 update
ernop Sep 18, 2025
6e1061c
.net 9
ernop Sep 18, 2025
3c06910
little fixes
ernop Sep 18, 2025
2e1bacb
redundancy
ernop Sep 18, 2025
a4fc67d
hmm combining utils
ernop Sep 18, 2025
7623ced
general little fixups
ernop Sep 18, 2025
3ad4351
ai refactoring, hmmm
ernop Sep 18, 2025
3cb2eb0
standardizing
ernop Sep 18, 2025
ad30ebb
fixing appearance...
ernop Sep 18, 2025
4dcda8e
fix combiner
ernop Sep 18, 2025
e4a1f09
fixing error display
ernop Sep 18, 2025
a561d96
config
ernop Sep 18, 2025
5a731df
add a vertical max to the image input to combined images.
ernop Sep 18, 2025
6ae15ea
fix layout and have redone CombineImages
ernop Sep 18, 2025
d0b23db
combinedImage
Sep 18, 2025
2fdd6bc
more async
ernop Sep 18, 2025
b2f72a7
fix image saving, error text accuracy etc
ernop Sep 19, 2025
73aa7b9
Fix naming, and test out all recraft styles
ernop Sep 21, 2025
8e81d35
recraft etc
Sep 21, 2025
400e964
little stuff
ernop Sep 24, 2025
3fe87cb
preparing goog?
ernop Sep 25, 2025
dd1bc78
nanoBanana working
ernop Sep 25, 2025
ebbe944
Fighting with Imagen 4
ernop Sep 25, 2025
b7b79fd
imagen4 etc working now
ernop Sep 25, 2025
e56bc58
little fix
ernop Sep 25, 2025
568b8e9
little fixes
Sep 25, 2025
2d39cce
fixes etc
ernop Sep 26, 2025
d048dbb
ideogram describe
ernop Sep 26, 2025
4fb84c3
fixups
ernop Sep 26, 2025
339ad65
Ideogram v3
ernop Sep 27, 2025
34fef28
background stype fixes
ernop Sep 27, 2025
d5543bb
combined image layouts
ernop Sep 28, 2025
0507813
multiple renders of the same described image.
ernop Sep 28, 2025
1d516f1
desc multi
ernop Sep 29, 2025
9f82e00
getting there
ernop Sep 29, 2025
9a38ed8
fixed
Sep 29, 2025
6b8158e
jesus this is literally the next fermat's last theorem
ernop Sep 29, 2025
cdc881a
fixedish
ernop Sep 29, 2025
25a345e
more detail
ernop Sep 29, 2025
44e3e3c
config qwen
ernop Sep 29, 2025
bcd43da
x
ernop Sep 29, 2025
06f3f94
little fixes
ernop Oct 1, 2025
1145e1a
refactor etc.
ernop Oct 1, 2025
db2f664
rename and genericize describers
ernop Oct 1, 2025
157f6b4
Image describing
ernop Oct 7, 2025
2eef210
adjustments
ernop Oct 16, 2025
c72e9c3
last
Oct 30, 2025
0d81d16
wip
ernop Apr 22, 2026
b510d7a
gpt-image-2 and xai grok too
ernop Apr 24, 2026
3977f80
etc
ernop Apr 24, 2026
1bb1422
all-providers showcase, grok video + archive, ideogram v4; fix recraf…
ernop Jun 11, 2026
361a6f0
grok
ernop Jun 12, 2026
ad882dd
grok
ernop Jun 12, 2026
50eb6fd
add local flux2 comfyui generator
cursoragent Jun 28, 2026
11596b6
normalize local flux2 setup files
cursoragent Jun 28, 2026
38b504d
add direct image provider generators
cursoragent Jun 28, 2026
43d6ff2
normalize direct provider settings template
cursoragent Jun 28, 2026
30 changes: 30 additions & 0 deletions .cursorrules
@@ -0,0 +1,30 @@
# Code Style and Organization Preferences

## Communication Style
- NEVER use congratulatory or praising language like "you were absolutely right", "great call", etc.
- Keep responses direct and focused on the technical work
- Avoid unnecessary enthusiasm or validation language

## XML Documentation
- NEVER add XML summary blocks (`/// <summary>`) to any code unless you have a useful, meaningful, non-obvious comment to add. Never add the dumb one-liner versions of them which just restate the obvious
- But, you may add them to explain what an entire class does, and useful high-level general design information. This kind of summary is great and useful.
- Use regular comments (`//`) for explanations when needed

## Namespace Organization
- Use only ONE level of namespace (e.g., `MultiImageClient`)
- NEVER use nested namespaces (e.g., `MultiImageClient.Utils`)
- Managing naming conflicts is the responsibility of the code, not namespace hierarchy

## Constants and Configuration
- Constants used in only ONE file should stay in that file (as private const)
- Only move constants to a shared constants file if they're used in MULTIPLE files
- Before centralizing constants, first consider if the files should be combined instead
- Use normal naming conventions for constants (e.g., `Padding`, `FontSize`)
- AVOID ALL_CAPS naming style for constants

## Code Organization Philosophy
- Prefer keeping related functionality together rather than over-abstracting
- Constants that appear in multiple places might indicate files should be merged
- Only create shared utilities when there's genuine reuse across different functional areas
- Branding text and similar elements that appear throughout the app are good candidates for centralization

23 changes: 23 additions & 0 deletions .editorconfig
@@ -0,0 +1,23 @@
[*.cs]

# IDE0022: Use block body for method
csharp_style_expression_bodied_methods = true

# IDE0090: Use 'new(...)'
dotnet_diagnostic.IDE0090.severity = none

# CA1416: Validate platform compatibility
dotnet_diagnostic.CA1416.severity = none

# Empty line formatting rules
# Remove leading empty lines after opening braces
csharp_new_line_before_open_brace = all
csharp_new_line_before_else = true
csharp_new_line_before_catch = true
csharp_new_line_before_finally = true
csharp_new_line_before_members_in_object_initializers = true
csharp_new_line_before_members_in_anonymous_types = true

# Limit consecutive empty lines
dotnet_style_allow_multiple_blank_lines_experimental = false
dotnet_style_allow_statement_immediately_after_block_experimental = false
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
*.sh text eol=lf
3 changes: 3 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,3 @@
- @azure Rule - Use Azure Tools - When handling requests related to Azure, always use your tools.
- @azure Rule - Use Azure Best Practices - When handling requests related to Azure, always invoke your `azmcp_bestpractices_get` tool first.
- @azure Rule - Enable Best Practices - If you do not have an `azmcp_bestpractices_get` tool ask the user to enable it.
31 changes: 20 additions & 11 deletions .gitignore
@@ -1,11 +1,20 @@
**/ideogram.log
**/saves/**
**/settings.json
**/bin/**
**/obj/**
**/.vs/**
**/.vscode/**
**/ideogram_pages/**
**/ideogram_data_all.json
/MultiImageClient/claude-bad.txt
/djangoManager/imageMaker/logs/django.log
**/ideogram.log
**/saves/**
**/settings.json
**/bin/**
**/bin-check/**
**/obj/**
**/.vs/**
**/.vscode/**
**/ideogram_pages/**
**/ideogram_data_all.json
/MultiImageClient/claude-bad.txt
/djangoManager/imageMaker/logs/django.log
gen-lang*.json
magic*.png
**/prompt_log.json
**/Temp.txt
2023-prompts.txt
**/tmp/**
**/__pycache__/**
*.py[cod]
76 changes: 76 additions & 0 deletions AGENTS.md
@@ -0,0 +1,76 @@
# MultiImageClient - Agent Entry Point

Start here. Read this file and the linked documents before doing anything.

## What This Is

C# desktop app that chains together image generation steps across multiple APIs (BFL/Flux, Ideogram, Recraft, DALL-E 3). Supports prompt composition, randomization, Claude rewrites, permutations. Includes an experimental Django gallery.

## Also Read

- [.cursorrules](.cursorrules) — communication style, XML docs policy, namespace rules, constants philosophy

## Related Projects

- **SocialAI** (`/proj/SocialAI/`) — Discord bot for Midjourney image capture
- **ideogramHistoryDownloader** (GitHub) — download Ideogram generation history (may be subsumed here)
- **cmdline-dalle3-csharp** (GitHub) — DALL-E 3 CLI (predecessor, may be subsumed here)
- **IdeogramApiCSharp** (GitHub) — Ideogram API client (predecessor)
- **myBrowser** (`/proj/myBrowser/`) — meta-project; see `capabilities.md` for cross-project skill inventory

---

# Repository Guidelines

## Project Structure & Module Organization
`MultiImageClient/` hosts the C# console orchestrator; `Program.cs` wires runs. `Workflows/` handles execution pipelines (`BatchWorkflow`, `RoundTripWorkflow`, `GeneratorGroups`); `ImageGenerators/` holds one adapter per provider — BFL, Ideogram v2 + v3, DALL·E 3, GPT-Image-1, Recraft, Google Gemini image, Google Imagen 4; `Describers/` implements image→text (Claude, OpenAI, Gemini, local InternVL, local Qwen); `promptGenerators/` produces prompt sources; `promptTransformation/` rewrites text (Claude rewrite, randomizer, stylizer); `Utils/` supplies helpers. Shared contracts and `Settings.cs` live in `ImageGenerationClasses/`. Provider-specific low-level clients sit in `BFLApi/`, `IdeogramAPI/`, and `RecraftAPI/`. `djangoManager/` contains the experimental Django gallery; it hasn't been touched in a year and is not actively developed. `do_flask_intern.py` is an optional local InternVL3 Flask server; `save_b64.py` decodes base64 responses. Generated artifacts collect in `saves/` and `output*.png` — ignore them in commits.

## Build, Test, and Development Commands
All projects target `.NET 9` (main app is `net9.0-windows` + WinForms for compositing). Verify the SDK with `dotnet --list-sdks`; if 9.x is missing, `winget install Microsoft.DotNet.SDK.9`. Restore with `dotnet restore MultiImageClient.sln`. Compile with `dotnet build MultiImageClient.sln`. Execute runs with `dotnet run --project MultiImageClient/MultiImageClient.csproj`; prompts come from `prompts.txt` and `settings.json` (the latter must be created by copying the template `settings - Fill this in and rename it.json`). On current `master` the build is clean (0 errors); ~90 warnings are all `NU190x` advisories for `Magick.NET-Q16-AnyCPU 14.8.2` — safe to bump to `14.12.0` when convenient. For the Django tooling, create a venv in `djangoManager/`, install `requirements.txt`, and launch `python djangoManager/imageMaker/manage.py runserver`. Run `dotnet format MultiImageClient.sln` before opening a PR.

## Run Modes (CLI flags)
See `RunOptions.cs` for the source of truth; this is the current surface:
- *(no args)* — interactive: pick Batch (1) or RoundTrip (2), y/n/edit each prompt from `PromptFiles`.
- `--auto` — skip menu (defaults to Batch), auto-accept every prompt.
- `--workflow 1|2` — 1=Batch, 2=RoundTrip.
- `--limit N` — stop after N prompts.
- `--prompt "..."` — single inline prompt via `InlinePromptSource` instead of `PromptFiles`.
- `--fast` — one fixed gpt-image-2 low/1024x1024/moderation=low call per prompt. Cheap smoke-test config.
- `--quick-test` — like `--fast`, plus every streamed partial PNG is saved AND popped in the default viewer. Still asks y/n per prompt unless combined with `--auto`.
- `--backfill-dl` — one-shot: mirror every image under `Settings.ImageDownloadBaseFolder` into `Settings.FlatImageMirrorPath` and exit.
- `--repl` — **interactive prompt-by-prompt REPL with async dispatch**. Defaults: gpt-image-2 at 2048x2048 / high / moderation=low / n=1, up to 5 prompts in flight concurrently. Each line is either a prompt (fired asynchronously, stdin stays responsive) or a `:command`. Grids are built and saved but NOT opened in the viewer. Commands: `:size WxH`, `:quality low|medium|high`, `:moderation auto|low`, `:n N` (images per gpt-image-2 call; >10 requires confirmation), `:concurrency N`, `:gens list|add|remove|reset` (names: gpt2, dalle3, ideogram, recraft, bfl, flux2local, google, googlepro, imagen4 — imagen4 dead after 2026-06-24), `:status`, `:wait`, `:last`, `:retry`, `:edit`, `:help`, `:quit`. Per-prompt override syntax: `[size=1024x1024,q=low,n=4] a red apple on a white plate`. Initial defaults can be pre-set from the command line via `--repl-size`, `--repl-quality`, `--repl-moderation`, `--repl-concurrency`, `--repl-n`. Implementation in `Workflows/ReplWorkflow.cs`.
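
The per-prompt override syntax in the `--repl` entry can be illustrated with a small parser. This is a Python sketch only: the real parsing lives in `Workflows/ReplWorkflow.cs`, which is not shown here and may behave differently. The name `parse_override` and the decision to treat a bracket group without any `key=value` pair as ordinary prompt text are assumptions.

```python
import re

# Hypothetical helper, not the project's implementation.
# Splits "[k=v,k=v] prompt text" into (overrides, prompt).
_OVERRIDE = re.compile(r"^\s*\[([^\]]*)\]\s*(.*)$", re.DOTALL)


def parse_override(line: str) -> tuple[dict[str, str], str]:
    match = _OVERRIDE.match(line)
    if match is None:
        return {}, line.strip()

    overrides: dict[str, str] = {}
    for pair in match.group(1).split(","):
        if "=" not in pair:
            continue  # skip malformed fragments instead of failing
        key, value = pair.split("=", 1)
        overrides[key.strip()] = value.strip()

    if not overrides:
        # Assumption: "[art] a poster" is a prompt, not an override block.
        return {}, line.strip()
    return overrides, match.group(2).strip()


overrides, prompt = parse_override(
    "[size=1024x1024,q=low,n=4] a red apple on a white plate"
)
```

Values stay as strings in this sketch; converting `n` to an integer or validating `size` would happen in a later step.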

## Coding Style & Naming Conventions
Use 4-space indentation and .NET naming: PascalCase for public types/methods, camelCase for locals, Async suffix for asynchronous methods. Favor explicit types for shared models; use `var` only when the type is obvious. Route new configuration through `ImageGenerationClasses/Settings.cs` instead of ad-hoc JSON parsing. Python utilities under `djangoManager/` should follow PEP 8 snake_case, with comments reserved for non-obvious prompt logic.

## Visual & Typography Policy (combined-image output, labels, UI text)
- **Never render text in gray.** No `MutedGray`, no `Color.FromRgb(x,x,x)` where R==G==B in the mid range, no "subtle" gray labels. If a secondary label needs to look secondary, reduce its font size and/or reuse the existing semantic color (e.g. `SuccessGreen`, `ErrorRed`, `Black`, `Gold`) — the contrast comes from size, not desaturation.
- When reserving vertical space for a text block, include room for descenders (e.g. `g`, `p`, `y`, `j`). `TextMeasurer.MeasureBounds` can under-report; add ~25% of font size as descender padding when stacking text bands.
- Padding above/below a standalone text panel (e.g. prompt panel below the grid) should be proportional to the font size used inside it, not a fixed `Padding * 3`.
- Secondary labels that sit beside a primary label (e.g. per-image timing next to the model name) should be bottom-aligned with the primary label so the smaller text hangs off the baseline of the larger one, not free-floating.
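
The descender rule above comes down to simple arithmetic. The following is a Python sketch, not the project's C# layout code. `measured_height` stands in for whatever `TextMeasurer.MeasureBounds` reports, and the function names are hypothetical.

```python
import math

DESCENDER_RATIO = 0.25  # the "~25% of font size" figure from the policy


def band_height(measured_height: float, font_size: float) -> int:
    """Pixels to reserve for one text band, including descender room."""
    return math.ceil(measured_height + font_size * DESCENDER_RATIO)


def stacked_height(bands: list[tuple[float, float]]) -> int:
    """Total height for stacked bands of (measured_height, font_size)."""
    return sum(band_height(height, size) for height, size in bands)


# A 24pt primary label measured at 20px, above a 12pt label measured at 11px.
primary = band_height(20.0, 24.0)                       # 20 + 6 = 26
total = stacked_height([(20.0, 24.0), (11.0, 12.0)])    # 26 + 14 = 40
```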

## Image Saving Policy
- **Never resize or re-encode the bytes returned by the image endpoint when saving a Raw variant.** `ImageSaving.SaveImageAsync` must write the API's PNG/JPEG/WEBP bytes verbatim via `File.WriteAllBytesAsync`. Thumbnail-scale downsizing for combined-grid display is fine — that's an in-memory copy used only for layout, never written over the Raw file.
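
One way to check the verbatim rule is to compare a hash of the saved file against a hash of the bytes the endpoint returned. This is a Python sketch under assumed names (`save_raw`, `is_verbatim`); it is not `ImageSaving.SaveImageAsync`, and the payload is a stand-in rather than a real image.

```python
import hashlib
import tempfile
from pathlib import Path


def save_raw(api_bytes: bytes, path: Path) -> None:
    """Write the bytes exactly as received: no decode, resize, or re-encode."""
    path.write_bytes(api_bytes)


def is_verbatim(api_bytes: bytes, path: Path) -> bool:
    """True when the file on disk is byte-identical to the API response."""
    on_disk = hashlib.sha256(path.read_bytes()).digest()
    return on_disk == hashlib.sha256(api_bytes).digest()


with tempfile.TemporaryDirectory() as folder:
    payload = b"\x89PNG\r\n\x1a\n" + b"stand-in image body"
    target = Path(folder) / "raw.png"
    save_raw(payload, target)
    verbatim = is_verbatim(payload, target)
```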

## gpt-image-2 Endpoint Options (reference)
The `/v1/images/generations` endpoint with `model=gpt-image-2` accepts:
- `size`: `1024x1024`, `1536x1024`, `1024x1536`, `2048x2048`, `2048x1152`, `2560x1440` (QHD — cookbook's "recommended upper reliability boundary"), `3824x2144` (near-4K), or `auto`. Arbitrary resolutions are legal when edges are multiples of 16, max edge STRICTLY less than 3840 (cookbook 1.1), total pixels in [655 360, 8 294 400], and long:short ratio ≤ 3:1. Legacy 3840x2160 is treated as experimental and rejected by `GptImage2Generator.TryNormalizeSize` — use 3824x2144 instead.
- `quality`: `low`, `medium`, `high`, `auto`.
- `moderation`: `auto` (default) or `low` (permissive — we use `low` for batch runs).
- `n`: images per call. We plumb this through `GptImage2Generator.imageCount` (ctor) and surface it in the REPL as `:n N` / `[n=N]` override / `--repl-n` flag. Streaming handler collects all N images by `image_index` (or fallback insertion order) and returns them as separate `CreatedBase64Image` entries so `ImageManager`'s per-index save path produces distinct `...img0`, `...img1`, ... files.
- `output_format`: `png` (default), `jpeg`, `webp`. `output_compression` (0–100) applies to jpeg/webp. (Not currently exposed in the generator.)
- `background`: `auto`, `transparent`, `opaque`. `transparent` normally requires png or webp output, and in practice gpt-image-2 rejects it.
- `stream`: `true` — we always stream and consume SSE to surface partials + heartbeat.
- `partial_images`: 0–3 (we send 2).
- **Do NOT send `input_fidelity` on `/generations`** — the generations endpoint rejects it on gpt-image-2 (always high-fidelity). Note the OpenAI cookbook's gpt-image-2 **edit** examples do pass `input_fidelity="high"`, so the restriction may be generations-specific; re-test when wiring `/v1/images/edits`.
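
The size constraints listed for `size` can be written as one predicate. This Python sketch encodes only the rules stated above (multiples of 16, longest edge strictly below 3840, pixel count within the stated range, long:short ratio at most 3:1). It is not `GptImage2Generator.TryNormalizeSize`, whose actual behavior is not shown here, and `is_legal_size` is a hypothetical name.

```python
import re

MIN_PIXELS = 655_360
MAX_PIXELS = 8_294_400
MAX_EDGE_EXCLUSIVE = 3840  # the longest edge must be strictly below this
EDGE_MULTIPLE = 16
MAX_ASPECT = 3  # long:short ratio may not exceed 3:1


def is_legal_size(size: str) -> bool:
    if size == "auto":
        return True
    match = re.fullmatch(r"(\d+)x(\d+)", size)
    if match is None:
        return False
    width, height = int(match.group(1)), int(match.group(2))
    if width % EDGE_MULTIPLE or height % EDGE_MULTIPLE:
        return False
    if max(width, height) >= MAX_EDGE_EXCLUSIVE:
        return False
    if not MIN_PIXELS <= width * height <= MAX_PIXELS:
        return False
    # Multiplying avoids a division by zero for degenerate sizes.
    return max(width, height) <= MAX_ASPECT * min(width, height)


for candidate in ("2560x1440", "3824x2144", "3840x2160", "1000x1000"):
    print(candidate, is_legal_size(candidate))
```

Under these rules `3840x2160` fails only the edge check, which matches the note that `3824x2144` should be used instead.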

Pricing is token-based ($30 / 1M output tokens). Rough per-image ceilings we report: low ≈ $0.02, medium ≈ $0.08, high ≈ $0.25.
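
As a worked example of the token-based pricing, the sketch below converts between output tokens and dollars at the stated $30 / 1M rate. The function names are hypothetical, and the token counts it produces are implied by the arithmetic alone; they are not figures reported by the API.

```python
PRICE_PER_MILLION_OUTPUT_TOKENS = 30.0  # USD, the rate quoted above


def cost_usd(output_tokens: int) -> float:
    """Dollar cost of a given number of output tokens."""
    return output_tokens * PRICE_PER_MILLION_OUTPUT_TOKENS / 1_000_000


def tokens_for(budget_usd: float) -> int:
    """Largest whole token count whose cost stays within the budget."""
    return int(budget_usd * 1_000_000 // PRICE_PER_MILLION_OUTPUT_TOKENS)


# Token budgets implied by the per-image ceilings quoted above.
implied = {name: tokens_for(ceiling)
           for name, ceiling in (("low", 0.02), ("medium", 0.08), ("high", 0.25))}
```

Here `implied` works out to 666, 2666, and 8333 tokens for low, medium, and high respectively.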

## Testing Guidelines
No dedicated test project exists yet. Manually validate new workflows by running representative prompts and inspecting generated assets and metadata. When adding automated coverage, create an xUnit project referenced by the solution and ensure `dotnet test` succeeds. Capture regression prompts in `prompts.txt` with notes after bug fixes.

## Commit & Pull Request Guidelines
Keep commits focused and use imperative, present-tense subjects as in history (`rename and genericize describers`). Include context in the body for prompt sets or configuration changes. Pull requests should outline workflow impacts, note which services (BFL, Ideogram, Recraft) are affected, call out required settings updates, and attach screenshots or sample outputs for UI or prompt adjustments.

## Configuration & Secrets
Copy `MultiImageClient/settings - Fill this in and rename it.json` to `MultiImageClient/settings.json` (already `.gitignore`d), populate only the provider keys for services you intend to use, and never commit secrets. `Settings.Validate()` hard-requires only `LogFilePath` and `ImageDownloadBaseFolder`; every per-generator API key (and the Google Cloud trio: `GoogleCloudLocation`, `GoogleCloudProjectId`, `GoogleServiceAccountKeyPath`) is validated lazily by the generator that actually needs it, so unused generators can be left blank. Optional per-generator keys: `IdeogramApiKey`, `OpenAIApiKey` (DALL·E 3, GPT-Image-1, GPT-Image-2), `BFLApiKey`, `RecraftApiKey`, `GoogleGeminiApiKey` (NanoBanana), `GoogleCloudApiKey` (Vertex alternative), `AnthropicApiKey` (Claude rewrites & describer). Local Flux2 Klein/uncensored ComfyUI runs use `LocalFlux2ComfyEndpoint`, `LocalFlux2WorkflowPath`, and optional node/model override settings; setup notes live under `tools/local-flux2-comfy/`. Prompt file list lives in `PromptFiles` (array of absolute paths). `FlatImageMirrorPath` is an optional flat-folder mirror: if set, every saved raw/annotated/combined image is also copied to that single folder (best-effort, never fatal) — leave blank to disable. `TypedPromptsAppendFile` is an optional plain-text corpus: if set, any free-form prompt you type at the interactive batch loop is appended as one line to that file (embedded newlines collapsed, parent folder auto-created, never fatal) — handy for growing something like `2023-prompts.txt` over time. Prefer user secrets or environment variables when scripting automation or sharing runs.
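
The split between hard-required settings and lazily validated keys can be sketched as follows. This is a Python illustration of the pattern described above, not the C# `Settings.Validate()`; the field subset, the `validate` method, and the `IdeogramGenerator` class shown here are simplified assumptions.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    # Hard-required by validate().
    LogFilePath: str = ""
    ImageDownloadBaseFolder: str = ""
    # Optional: checked only by the generator that needs it.
    IdeogramApiKey: str = ""
    RecraftApiKey: str = ""

    def validate(self) -> None:
        for name in ("LogFilePath", "ImageDownloadBaseFolder"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required")


class IdeogramGenerator:
    """Hypothetical generator that validates its own key on construction."""

    def __init__(self, settings: Settings) -> None:
        if not settings.IdeogramApiKey:
            raise ValueError("IdeogramApiKey is required to use Ideogram")
        self.api_key = settings.IdeogramApiKey


# Only the two required paths are set; startup validation still passes.
settings = Settings(LogFilePath="run.log", ImageDownloadBaseFolder="saves")
settings.validate()
```

With this shape, a blank `IdeogramApiKey` causes an error only when an Ideogram generator is actually constructed, so unused providers never block startup.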
64 changes: 33 additions & 31 deletions BFLApi/BFLAPIClient.csproj
@@ -1,32 +1,34 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net6.0</TargetFramework>
<LangVersion>10.0</LangVersion>
<ProjectGuid>{F68A7365-F772-4BD7-867B-A3E4A5D6DC5E}</ProjectGuid>
<Platforms>AnyCPU;ARM64</Platforms>
<StartupObject>BFLAPIClient.BFLClient</StartupObject>
</PropertyGroup>

<ItemGroup>
<Compile Remove="ideogramSaves\**" />
<EmbeddedResource Remove="ideogramSaves\**" />
<None Remove="ideogramSaves\**" />
</ItemGroup>

<ItemGroup>
<PackageReference Include="Anthropic.SDK" Version="4.1.1" />
<PackageReference Include="CommandLineParser" Version="2.9.1" />
<PackageReference Include="Newtonsoft.Json" Version="13.0.3" />
<PackageReference Include="System.Drawing.Common" Version="8.0.2" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\ImageGenerationClasses\ImageGenerationClasses.csproj" />
</ItemGroup>

<Target Name="PostBuild" AfterTargets="PostBuildEvent">
</Target>

<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<OutputType>Library</OutputType>
<TargetFramework>net9.0</TargetFramework>
<LangVersion>13.0</LangVersion>
<ProjectGuid>{F68A7365-F772-4BD7-867B-A3E4A5D6DC5E}</ProjectGuid>
<Platforms>AnyCPU;ARM64</Platforms>
</PropertyGroup>

<ItemGroup>
<Compile Remove="ideogramSaves\**" />
<EmbeddedResource Remove="ideogramSaves\**" />
<None Remove="ideogramSaves\**" />
</ItemGroup>

<ItemGroup>
<PackageReference Include="Anthropic.SDK" Version="4.1.1" />
<PackageReference Include="CommandLineParser" Version="2.9.1" />
<PackageReference Include="Magick.NET.Core" Version="14.8.2" />
<PackageReference Include="Newtonsoft.Json" Version="13.0.3" />
<PackageReference Include="OpenAI" Version="2.1.0" />
<PackageReference Include="SixLabors.ImageSharp" Version="3.1.11" />
<PackageReference Include="System.Drawing.Common" Version="8.0.2" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\ImageGenerationClasses\ImageGenerationClasses.csproj" />
</ItemGroup>

<Target Name="PostBuild" AfterTargets="PostBuildEvent">
</Target>

</Project>