SOLR-15701: Complete configsets api#4264

Open

epugh wants to merge 14 commits into apache:main from epugh:complete_configsets_api

Contributor

epugh commented Apr 4, 2026 (edited)

https://issues.apache.org/jira/browse/SOLR-15701

Description

Add Download, GetFile, and PutFile to the ConfigSets API and SolrJ.

Solution

This was extracted with Copilot's help from the SchemaDesigner code base. We are going to merge this as part of the ConfigSets API. In a future PR, once this is done, we'll finish the in-progress SchemaDesigner work to use these APIs.

The PutFile was broken out of the Upload feature into its own files with its own parameters. Previously it was intertwined with the upload of a zipped configset.

Tests

Added some tests.

epugh added 4 commits

April 3, 2026 17:28


          Add download and get file to configsets api. Update docs.

a5c6ccc

Separated out of the larger PR around SchemaDesigner.


          add ref guide docs

8dc9e80


          Merge remote-tracking branch 'upstream/main' into complete_configsets…

f2c1d8f

…_api


          Document change

c7f7ad4

epugh requested a review from gerlowskija

April 4, 2026 00:04

github-actions bot added the documentation, tests, and cat:api labels

VishnuPriyaChandraSekar reviewed

solr/core/src/java/org/apache/solr/handler/configsets/GetConfigSetFile.java Outdated

VishnuPriyaChandraSekar reviewed

Contributor

VishnuPriyaChandraSekar left a comment

I have just started reviewing the Solr code base.
I left one minor comment; other than that, the PR seems fine.

gerlowskija requested changes

Contributor

gerlowskija left a comment

This looks pretty good overall. I left some inline comments, mostly small things.

I do have one larger question here though: I gather much of this functionality already existed under the aegis of the SchemaDesigner, but I don't see any deletions or modifications to those files in this PR. Shouldn't this PR be deleting the "download-configset" and "get-configset-file" APIs that exist over in schema-designer land?

solr/api/src/java/org/apache/solr/client/api/endpoint/ConfigsetsApi.java Outdated

+                /**
+                 * V2 API definition for downloading an existing configset as a ZIP archive.
+                 *
+                 * <p>Equivalent to GET /api/configsets/{configSetName}/download

Contributor

gerlowskija Apr 7, 2026

[-0] Not sure what you're trying to say here, but this could use some word-smithing. There's no v1 counterpart to this API which is where I typically see the "equivalent to" language.

Alternatively, if you're trying to highlight what the API actually looks like, then maybe "Available at" works better? (Or omit it altogether, since the @Path annotation is just a line or two below.)

Contributor Author

epugh Apr 11, 2026

Argh, copy-and-paste from the old SchemaDesigner.

solr/api/src/java/org/apache/solr/client/api/endpoint/ConfigsetsApi.java

+                @Path("/configsets/{configSetName}")
+                interface Download {
+                  @GET
+                  @Path("/download")

Contributor

gerlowskija Apr 7, 2026

[-1] /download isn't very REST-ful. The fact that we're fetching/downloading the resource is already kind of implied by the HTTP 'GET' verb/method.

Could we drop it and make this "just" GET /api/configsets/{configsetName}?

Contributor Author

epugh Apr 11, 2026

So we could. I was thinking about it and then asked Claude for some suggestions, and it more or less came back to "/download is easy". Here is the write-up:
RESTful Alternatives

Option 1: Keep current design ✅ (Recommended)

GET /api/configsets                          → List (JSON)
GET /api/configsets/{name}/download          → Download ZIP
GET /api/configsets/{name}/{filePath}        → Get single file

Pros:

Explicit and unambiguous
Self-documenting (/download tells you what it does)
No conflicts with file paths
Common pattern (GitHub uses /archive, GitLab uses /repository/archive, etc.)

Cons:

Not purely resource-oriented REST

Option 2: Use `/files` sub-resource

GET /api/configsets/{name}                   → Metadata (JSON) - NEW
GET /api/configsets/{name}/files             → Download all as ZIP
GET /api/configsets/{name}/files/{path}      → Get single file

Pros:

More RESTful hierarchy
Room for metadata endpoint

Cons:

Breaking change to existing API
More complex paths

Option 3: Content negotiation (Pure REST)

GET /api/configsets/{name}
  Accept: application/json  → Metadata
  Accept: application/zip   → ZIP download

Pros:

True REST content negotiation

Cons:

Solr doesn't typically use Accept headers this way
Harder to use with curl/browsers
Still conflicts with /{filePath} pattern

Recommendation

Keep the current design (/download) because:

✅ Unambiguous - no path conflicts
✅ Self-documenting - clear what you get back
✅ Pragmatic - works well without complex content negotiation
✅ Common pattern - many APIs use action-style endpoints for different representations
✅ Backward compatible - if we add metadata later, we can use GET /api/configsets/{name} for JSON

Contributor Author

epugh Apr 11, 2026

If we had other places that did a zip download, then I could see making them all work the same way, with whatever pattern we picked...
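To make the path-conflict point in the options above concrete, here is a minimal sketch of how the "/files" sub-resource layout from Option 2 could be told apart by path alone. The class `ConfigSetRoutes`, its `Kind` enum, and the `classify` method are hypothetical names invented for this illustration; this is not Solr's router and not code from this PR.

```java
import java.util.Optional;

/** Hypothetical classifier for the "/files" layout from Option 2 (not Solr's actual routing). */
final class ConfigSetRoutes {
  enum Kind { LIST, METADATA, ARCHIVE, FILE }

  /** configSet and filePath are null when they do not apply to the route kind. */
  record Route(Kind kind, String configSet, String filePath) {}

  private static final String PREFIX = "/api/configsets";

  static Optional<Route> classify(String path) {
    if (!path.startsWith(PREFIX)) return Optional.empty();
    String rest = path.substring(PREFIX.length());
    if (rest.isEmpty() || rest.equals("/")) return Optional.of(new Route(Kind.LIST, null, null));
    if (!rest.startsWith("/")) return Optional.empty(); // e.g. /api/configsetsX

    // Split into at most three parts: name, "files", and the remaining file path.
    String[] parts = rest.substring(1).split("/", 3);
    String name = parts[0];
    if (name.isEmpty()) return Optional.empty();
    if (parts.length == 1) return Optional.of(new Route(Kind.METADATA, name, null));
    if (!parts[1].equals("files")) return Optional.empty();
    if (parts.length == 2 || parts[2].isEmpty()) {
      return Optional.of(new Route(Kind.ARCHIVE, name, null));
    }
    // Everything after /files/ is the file path, so a file literally named
    // "download" cannot collide with an action-style segment.
    return Optional.of(new Route(Kind.FILE, name, parts[2]));
  }
}
```

Under this layout, `/api/configsets/myset/files/download` resolves to a file named `download`, while `/api/configsets/myset/download` matches nothing.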

solr/api/src/java/org/apache/solr/client/api/endpoint/ConfigsetsApi.java Outdated

+                  @Produces("application/zip")
+                  Response downloadConfigSet(
+                      @PathParam("configSetName") String configSetName,
+                      @QueryParam("displayName") String displayName)

Contributor

gerlowskija Apr 7, 2026

[Q] From looking ahead at code in this PR, it looks like 'displayName' is used primarily (solely?) to inform the "attachment name" part of a "Content-Disposition" response header.

Do I have that right? Or am I missing another usage?

Contributor Author

epugh Apr 11, 2026

That is it... I looked into whether we really need it, and for Schema Designer we do, because it generates crazy "temp" names for the configset.

Contributor Author

epugh Apr 11, 2026

I learned something new! Turns out we can specify the downloaded zip file name from the JavaScript side in modern browsers, which means we can clean this API up!

solr/api/src/java/org/apache/solr/client/api/endpoint/ConfigsetsApi.java Outdated

+                      summary = "Get the contents of a file in a configset.",
+                      tags = {"configsets"})
+                  ConfigSetFileContentsResponse getConfigSetFile(
+                      @PathParam("configSetName") String configSetName, @QueryParam("path") String filePath)

Contributor

gerlowskija Apr 7, 2026

[-0] IMO "filePath" should be a path parameter rather than a query-parameter. That would allow this API to mirror the "update-configset-file" API really well, and also bring it into line with what we've done for other similar APIs in filestore and elsewhere.

To be explicit, I'm suggesting this API be: GET /api/configsets/<configSetName>/files/some/file/path.txt

Contributor Author

epugh Apr 11, 2026

Interesting... That mimics the suggestion of having a /files/ pattern that Claude gave me regarding the download. Makes sense.

Contributor Author

epugh Apr 11, 2026

Okay, done with this.
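For readers following along, a client-side sketch of the path-parameter form suggested above (`GET /api/configsets/<configSetName>/files/some/file/path.txt`) might look like the following. It only builds the request URI, and it assumes the final endpoint keeps that shape. `ConfigSetFileUri`, `encodeSegment`, and `forFile` are hypothetical helper names, not SolrJ API, and the base URL in the usage note is an assumption.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.stream.Collectors;

/** Hypothetical helper that builds the file URI; not part of SolrJ. */
final class ConfigSetFileUri {
  /** Percent-encodes one path segment. */
  static String encodeSegment(String segment) {
    // URLEncoder implements form encoding, which turns spaces into '+';
    // literal '+' characters are already emitted as %2B, so this swap is safe.
    return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
  }

  /** Encodes each segment separately so the '/' separators in filePath stay intact. */
  static URI forFile(String baseUrl, String configSetName, String filePath) {
    String encodedPath = Arrays.stream(filePath.split("/"))
        .map(ConfigSetFileUri::encodeSegment)
        .collect(Collectors.joining("/"));
    return URI.create(
        baseUrl + "/api/configsets/" + encodeSegment(configSetName) + "/files/" + encodedPath);
  }
}
```

For example, `ConfigSetFileUri.forFile("http://localhost:8983", "techproducts", "lang/stopwords_en.txt")` yields a URI whose path mirrors the file's location inside the configset, which is what lets the GET mirror the update-configset-file API.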

solr/api/src/java/org/apache/solr/client/api/endpoint/ConfigsetsApi.java Outdated

+                  @Operation(
+                      summary = "Get the contents of a file in a configset.",
+                      tags = {"configsets"})
+                  ConfigSetFileContentsResponse getConfigSetFile(

Contributor

gerlowskija Apr 7, 2026

[Q] Can you add a bit of context around the decision to return a structured POJO here vs. "just" the verbatim file bytes approach we take in filestore, replication handler, ZooKeeperReadAPI, etc.?

The structured POJO route seems to conflict a bit with the @Produces(TEXT_PLAIN) annotation above, unless I'm missing something?

How does this behave if a user puts a small binary file in their configset?

Contributor Author

epugh Apr 11, 2026

Wow, this is a damn good question that looked simple and then got more complex. Yeah, what if you store a model object that is binary? Or an image? I could imagine that in the near future we will have more binary objects that are part of a configset, and that you might put and get individually... for example, some of the NLP stuff has small binary objects that might be in your configset...

Contributor Author

epugh Apr 11, 2026

Okay, did a pretty big reworking to bring this in line with other APIs in Solr... FYI, FileStore still does weird things with wt=raw, but apparently that is because it supports both v1 and v2. This only does v2, so it's easier.
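The binary-file concern in this thread can be illustrated with a small sketch. It assumes a response that carries file content in a String field decoded as UTF-8; the thread does not confirm that the original `ConfigSetFileContentsResponse` worked this way, so treat that as an assumption. `BinaryRoundTrip` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical demonstration of why text-typed responses are lossy for binary files. */
final class BinaryRoundTrip {
  /** Simulates carrying file bytes through a String field, as a text-oriented response might. */
  static byte[] throughString(byte[] original) {
    // Decoding replaces every malformed UTF-8 sequence with U+FFFD.
    String asText = new String(original, StandardCharsets.UTF_8);
    return asText.getBytes(StandardCharsets.UTF_8);
  }

  /** True when the bytes come back unchanged after the String round trip. */
  static boolean survives(byte[] original) {
    return Arrays.equals(original, throughString(original));
  }
}
```

A UTF-8 text file such as `solrconfig.xml` survives the round trip, but bytes that are not valid UTF-8 (for instance the leading `0x89` of a PNG) do not, which is one argument for streaming the raw bytes instead.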

solr/core/src/java/org/apache/solr/handler/configsets/DownloadConfigSet.java Outdated

+                  final String fileName = safeName + "_configset.zip";
+                  return Response.ok((StreamingOutput) outputStream -> outputStream.write(zipBytes))
+                      .type("application/zip")
+                      .header("Content-Disposition", "attachment; filename=\"" + fileName + "\"")

Contributor

gerlowskija Apr 7, 2026

[Q] "Content-Disposition" is a browser-based header AFAICT - it's primarily a hint for browsers as to whether something's an attachment or not. Is it worth having here? Why would we include such a header here, but not on (e.g.) the filestore API or in ReplicationHandler, or somewhere similar that's downloading file content?

Contributor Author

epugh Apr 11, 2026

I need to confirm that this doesn't cause any issues with how Schema Designer works, which I think may be the only consumer that cares about this, versus the filestore API etc. You know, I may back it out until we migrate Schema Designer over, and if it needs it, we can add it back in, or have a discussion...

Contributor Author

epugh Apr 11, 2026

Okay, turns out this is an older way of handling things. Modern browsers use whatever schema-designer.js specifies as the filename, so we don't need this at all.

Contributor Author

epugh Apr 11, 2026

This also addresses an earlier comment of yours!

solr/core/src/test/org/apache/solr/handler/configsets/UploadConfigSetAPITest.java

+              import org.junit.Test;
+              /** Unit tests for {@link UploadConfigSet}. */
+              public class UploadConfigSetAPITest extends SolrTestCase {

Contributor

gerlowskija Apr 7, 2026

[Q] Can you share a bit of context around this test file please? I'm not going to complain about new tests, but I'm surprised to find that a third of this PR is tests for an API that wasn't touched by the PR 😛

Contributor Author

epugh Apr 11, 2026

Yeah, so I was messing around with this PR and SOLR-16341 (blank file in configset), realized this test coverage was lacking, and figured it fit the "complete" theme of this PR.

solr/solr-ref-guide/modules/configuration-guide/pages/configsets-api.adoc Outdated

+              [[configsets-download]]
+              == Download a Configset
+              The `download` command downloads an entire configset as a zipped file.

Contributor

gerlowskija Apr 7, 2026

[0] The "command" language here is unfortunate. It fits the v1 APIs and their action=LIST|UPLOAD|CREATE|etc syntax well, but doesn't make sense for the v2 API IMO.

Maybe something like the following would be a bit clearer:

The v2 API allows configsets to be downloaded as a single zipped file. This is useful for backing up configsets, sharing ...
The download API takes the following parameters:
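As a rough illustration of what "downloaded as a single zipped file" means on the wire, the sketch below packs a set of configset files into ZIP bytes and reads them back using only `java.util.zip`. It is not the `DownloadConfigSet` implementation from this PR; `ConfigSetZip`, `zip`, and `unzip` are hypothetical names, and the in-memory map stands in for wherever the configset files actually live.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical zip/unzip helper; the map keys are paths relative to the configset root. */
final class ConfigSetZip {
  static byte[] zip(Map<String, byte[]> files) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
      for (Map.Entry<String, byte[]> file : files.entrySet()) {
        zos.putNextEntry(new ZipEntry(file.getKey())); // e.g. lang/stopwords_en.txt
        zos.write(file.getValue());
        zos.closeEntry();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  static Map<String, byte[]> unzip(byte[] archive) {
    Map<String, byte[]> files = new LinkedHashMap<>();
    try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
      for (ZipEntry entry = zis.getNextEntry(); entry != null; entry = zis.getNextEntry()) {
        // readAllBytes stops at the end of the current entry.
        if (!entry.isDirectory()) files.put(entry.getName(), zis.readAllBytes());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return files;
  }
}
```

A client that receives the archive can recover nested paths such as `lang/stopwords_en.txt` directly from the entry names, which is what makes the single-file format convenient for backups and sharing.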

solr/solr-ref-guide/modules/configuration-guide/pages/configsets-api.adoc Outdated

+              [[configsets-get-file]]
+              == Get a Single File from a Configset
+              This command retrieves the contents of a single file from an existing configset.

Contributor

gerlowskija Apr 7, 2026

ditto, re: replacing "command" with "API" or less v1 specific language

solr/solr-ref-guide/modules/configuration-guide/pages/configsets-api.adoc Outdated

+              +
+              The path to the file within the configset (e.g., `solrconfig.xml` or `lang/stopwords_en.txt`).
+              The response will be a JSON object containing:

Contributor

gerlowskija Apr 7, 2026

[-0] Why "JSON" here? v2 APIs in general support javabin, xml, etc. in addition to JSON. Is this API specifically JSON-only in some way? If not, then we might not want to be overly specific here. Maybe drop the word "JSON" and leave the rest as-is?

Contributor Author

epugh Apr 12, 2026

Okay, dealt with in the other refactor...

Contributor Author

epugh commented Apr 11, 2026

I have just started reviewing the Solr code base. I left one minor comment; other than that, the PR seems fine.

@VishnuPriyaChandraSekar by the way, we need more code reviewers for the PRs in Solr. With AI, it's easier than ever to generate code, but reviewing it really isn't any easier. It would be great if you reviewed more PRs, as that would help contributors get some early feedback! I believe you can use the Code Review feature and approve a PR even if you aren't a committer; you just can't merge.

epugh added 3 commits

April 11, 2026 13:53


          Respond to feedback.

c6b4d9c


          Turns out standalone doesn't support uploading a nested file, though …

e77c73e

…the cloud version does.

Don't love that I'm improving standalone when I want to eliminate it.
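The commit above does not say what was missing in standalone mode, so the following is only a guess at what nested-file support on a local filesystem typically involves: creating the parent directories and refusing paths that escape the configset directory. `NestedConfigFileWriter` and `write` are hypothetical names; this is not the code from commit e77c73e.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical writer for a file at a nested path inside a configset directory. */
final class NestedConfigFileWriter {
  static Path write(Path configSetDir, String relativePath, byte[] content) {
    Path root = configSetDir.toAbsolutePath().normalize();
    Path target = root.resolve(relativePath).normalize();
    // Reject "../" tricks and absolute paths that would land outside the configset.
    if (!target.startsWith(root) || target.equals(root)) {
      throw new IllegalArgumentException("Path escapes the configset: " + relativePath);
    }
    try {
      // Without this step, writing lang/stopwords_en.txt fails when lang/ does not exist yet.
      Files.createDirectories(target.getParent());
      return Files.write(target, content);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

In cloud mode the equivalent concern is handled by the ZooKeeper-backed storage rather than by local directories, which may be why the two modes diverged.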


          Move to using a GET that mimics the upload a file pattern

be4717b

github-actions bot added the cat:cloud label

epugh added 2 commits

April 11, 2026 15:06


          Bring in line downloading a file with other apis in solr.

497b5ba

Use a binary format, don't try and provide path metadata...


          Simplify API based on modern js not needing content disposition header

1cf5545

Contributor Author

epugh commented Apr 11, 2026

Lots of progress, I think I am down to doc fixes.

Contributor Author

epugh commented Apr 12, 2026

Working through the docs.. I am going to split uploading a complete configset from the putting of a single file in an existing configset. Right now the methods are on the same upload interface, and it's a bit confusing.

epugh added 5 commits

April 12, 2026 12:50


          Big refactor to break out uploading a single file into it's own set o…

2c6e7bc

…f classes

Keeps the overall pattern, but simplifies docs by having its own docs.


          Merge remote-tracking branch 'upstream/main' into complete_configsets…

f200359

…_api


          Merge remote-tracking branch 'upstream/main' into complete_configsets…

9997a9a

…_api


          Now that single files are separate from zip uploads, clean up tests.

6d2aac9


          several remnants of the "trusted configsets" concept that was removed…

e59a2d2

… in SOLR-17584 that are still around.

Simplifies the tests.

Contributor Author

epugh commented Apr 12, 2026 (edited)

@gerlowskija I couldn't help it, I also found some remnants of the trusted configset concept that I could remove in this... Simplified some of the tests!

I think I'm ready for final review!

epugh requested review from VishnuPriyaChandraSekar and gerlowskija

April 12, 2026 20:14

Labels

cat:api cat:cloud documentation tests