
Support multiple models like DeepSeek and OpenAI at the same time in one application #2221

Closed
sp213 opened this issue Feb 12, 2025 · 15 comments

@sp213 commented Feb 12, 2025

Expected Behavior

Spring AI should support both DeepSeek and OpenAI configurations at the same time in one application.

Current Behavior

I could not find a way to support both models at the same time in Spring AI.

Context

At the moment, DeepSeek reuses the configuration properties defined for OpenAI, which creates a conflict: if I configure DeepSeek, there is no configuration left for OpenAI. Could Spring AI support both models at the same time, so that I can dynamically switch from OpenAI to DeepSeek per client request?

@kevintsai1202 commented

Agreed! Many tools can use multiple models simultaneously within a single workflow. In Spring AI, however, this can only be accomplished through lower-level approaches. A class that could easily switch between multiple sets of configurations would be a great help for application development.

@sp213 (Author) commented Feb 15, 2025

Yes, I really need this feature, and I believe it is a real use case for many Spring AI applications.

@FakeTrader commented

I think you can follow the docs at https://docs.spring.io/spring-ai/reference/api/chatclient.html#_create_a_chatclient_programmatically or create beans for each model.
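
For example, here is a minimal sketch of the bean-per-model approach (assuming DeepSeek exposes an OpenAI-compatible endpoint; the base URL, model names, and bean names are illustrative, not taken from the docs):

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class MultiModelConfig {

    // ChatClient backed by the standard OpenAI endpoint
    @Bean
    ChatClient openAiChatClient() {
        OpenAiApi api = OpenAiApi.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .build();
        OpenAiChatModel model = OpenAiChatModel.builder()
                .openAiApi(api)
                .defaultOptions(OpenAiChatOptions.builder().model("gpt-4o").build())
                .build();
        return ChatClient.builder(model).build();
    }

    // Second ChatClient pointed at DeepSeek's OpenAI-compatible endpoint
    @Bean
    ChatClient deepSeekChatClient() {
        OpenAiApi api = OpenAiApi.builder()
                .baseUrl("https://api.deepseek.com") // assumed endpoint
                .apiKey(System.getenv("DEEPSEEK_API_KEY"))
                .build();
        OpenAiChatModel model = OpenAiChatModel.builder()
                .openAiApi(api)
                .defaultOptions(OpenAiChatOptions.builder().model("deepseek-chat").build())
                .build();
        return ChatClient.builder(model).build();
    }
}

A caller can then inject both clients (e.g. with @Qualifier) and pick one per incoming request, which covers the dynamic-switching case from the issue description.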

@liShuangQ commented

I use oneAPI as a gateway service, and Spring AI talks to it through the OpenAI package. In this way, different models can be used. I'm not sure if this is the correct approach.

this.chatClient.prompt(
        new Prompt(
                p,
                OpenAiChatOptions.builder()
                        .model("xxxx") // target model per request; routed by the oneAPI gateway
                        .build()))
        .call()
        .content();

markpollack added this to the 1.0.0-RC1 milestone Apr 17, 2025
@markpollack (Member) commented Apr 17, 2025

I agree this is a gap. The issue with the example linked by @FakeTrader is that the underlying chat models in this use case need to point to different OpenAI-API-compatible URLs.

I am looking to fix this for RC1, at least for OpenAI, by letting users define multiple client endpoints in application.yml via a map.

Also note that there is a slippery slope with models that claim to be OpenAI API compatible but then add extra fields to the JSON. This pollutes the OpenAI implementation, so I think we need a dedicated DeepSeek module.

@markpollack (Member) commented

As an example of what I'm thinking:

spring:
  ai:
    openai:
      models:
        enabled: true
        instances:
          gpt4:
            apiKey: "your-api-key-for-gpt4"
            baseUrl: "https://api.openai.com"
            organizationId: "your-org-id"
            chatProperties:
              options:
                model: "gpt-4"
                temperature: 0.7
          llama:
            apiKey: "your-api-key-for-llama"
            baseUrl: "https://your-custom-endpoint.com"
            chatProperties:
              options:
                model: "llama-70b"
                temperature: 0.5

And some basic usage (not ChatClient-based, but you get the idea):

@Service
public class MyAIService {
    private final OpenAiChatModelRegistry modelRegistry;
    
    public MyAIService(OpenAiChatModelRegistry modelRegistry) {
        this.modelRegistry = modelRegistry;
    }
    
    public String generateWithGpt4(String prompt) {
        OpenAiChatModel gpt4Model = modelRegistry.getModel("gpt4");
        // Use the model
        return gpt4Model.generate(prompt).getResult().getOutput().getContent();
    }
    
    public String generateWithLlama(String prompt) {
        OpenAiChatModel llamaModel = modelRegistry.getModel("llama");
        // Use the model
        return llamaModel.generate(prompt).getResult().getOutput().getContent();
    }
}

@FakeTrader commented

Replying to @markpollack's point above about a dedicated DeepSeek module: I believe a better approach would be to offer a way to customise the API. The current OpenAiApi is a single class file, and I have to duplicate large sections of code just to modify a single field.

@sun-rui commented Apr 22, 2025

+1 for this feature. I have done a similar hack with a configuration layout like the one @markpollack sketched above.

@avanathan commented Apr 23, 2025

We have a custom endpoint that acts as an orchestrator for multiple LLMs (including ChatGPT) and follows the OpenAI API spec. To consume multiple models we need to configure different base URLs under the openai properties, which the Spring AI configuration as it stands cannot express. I was thinking of an array under the openai namespace, but the setup @markpollack described above works too, provided the instance names are developer-defined/customizable.

@flamezhang commented

I found the API that allows me to create a ChatClient programmatically, but I don't know how to create a ChatModel, and I didn't quite understand the instructions there. Can anyone help me?

@YunKuiLu (Contributor) commented

@flamezhang You can refer to: https://docs.spring.io/spring-ai/reference/api/chat/openai-chat.html#_manual_configuration
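
A minimal sketch of that manual configuration (the base URL, key, and model name are placeholders; the builder calls match the ones used later in this thread):

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.ai.openai.api.OpenAiApi;

// Build the low-level API client yourself instead of relying on auto-configuration
OpenAiApi api = OpenAiApi.builder()
        .baseUrl("https://api.example.com") // any OpenAI-compatible endpoint
        .apiKey(System.getenv("MY_API_KEY"))
        .build();

// Wrap it in a ChatModel with default options
OpenAiChatModel chatModel = OpenAiChatModel.builder()
        .openAiApi(api)
        .defaultOptions(OpenAiChatOptions.builder().model("my-model").build())
        .build();

// The hand-built model can then back a programmatically created ChatClient
ChatClient chatClient = ChatClient.builder(chatModel).build();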

@markpollack (Member) commented May 7, 2025

I'm trying to collect all the different issues around this. It is true that one can do:

var openAiApi = OpenAiApi.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .build();
var openAiChatOptions = OpenAiChatOptions.builder()
            .model("gpt-3.5-turbo")
            .temperature(0.4)
            .maxTokens(200)
            .build();
var chatModel = new OpenAiChatModel(openAiApi, openAiChatOptions);

but the issue is that OpenAiChatModel no longer has such a simple constructor as shown in the docs.

Now it is:

	public OpenAiChatModel(OpenAiApi openAiApi, OpenAiChatOptions defaultOptions, ToolCallingManager toolCallingManager,
			RetryTemplate retryTemplate, ObservationRegistry observationRegistry) {
		this(openAiApi, defaultOptions, toolCallingManager, retryTemplate, observationRegistry,
				new DefaultToolExecutionEligibilityPredicate());
	}

	public OpenAiChatModel(OpenAiApi openAiApi, OpenAiChatOptions defaultOptions, ToolCallingManager toolCallingManager,
			RetryTemplate retryTemplate, ObservationRegistry observationRegistry,
			ToolExecutionEligibilityPredicate toolExecutionEligibilityPredicate) {

So let's assume we start with two auto-configured beans:

@Autowired
OpenAiChatModel baseChatModel;

@Autowired
OpenAiApi baseOpenAiApi;

Suppose you want to create a GPT-4 and a Llama instance, each with its own endpoint and options:

// Derive a new OpenAiApi for GPT-4
OpenAiApi gpt4Api = baseOpenAiApi.mutate()
    .baseUrl("[https://api.openai.com](https://api.openai.com)")
    .apiKey("your-api-key-for-gpt4")
    // .organizationId("your-org-id") // if supported
    .build();

// Derive a new OpenAiApi for Llama
OpenAiApi llamaApi = baseOpenAiApi.mutate()
    .baseUrl("[https://your-custom-endpoint.com](https://your-custom-endpoint.com)")
    .apiKey("your-api-key-for-llama")
    .build();

// Derive a new OpenAiChatModel for GPT-4
OpenAiChatModel gpt4Model = baseChatModel.mutate()
    .api(gpt4Api)
    .options(OpenAiChatOptions.builder()
        .model("gpt-4")
        .temperature(0.7)
        .build())
    .build();

// Derive a new OpenAiChatModel for Llama
OpenAiChatModel llamaModel = baseChatModel.mutate()
    .api(llamaApi)
    .options(OpenAiChatOptions.builder()
        .model("llama-70b")
        .temperature(0.5)
        .build())
    .build();

While there can also be declarative means, I think at the programmatic level this would work?

Trying to get this into RC1. I'll make a spike and a PR to discuss.

@markpollack (Member) commented

I've made a WIP PR for people to review: #3037

The flow in the tests is:

@SpringBootTest(classes = MultiOpenAiClientIT.Config.class)
@EnabledIfEnvironmentVariable(named = "GROQ_API_KEY", matches = ".+")
@EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = ".+")
@ActiveProfiles("logging-test")
class MultiOpenAiClientIT {

    private static final Logger logger = LoggerFactory.getLogger(MultiOpenAiClientIT.class);

    @Autowired
    private OpenAiChatModel baseChatModel;

    @Autowired
    private OpenAiApi baseOpenAiApi;

    @Test
    void multiClientFlow() {
        // Derive a new OpenAiApi for Groq (Llama3)
        OpenAiApi groqApi = baseOpenAiApi.mutate()
            .baseUrl("https://api.groq.com/openai")
            .apiKey(System.getenv("GROQ_API_KEY"))
            .build();

        // Derive a new OpenAiApi for OpenAI GPT-4
        OpenAiApi gpt4Api = baseOpenAiApi.mutate()
            .baseUrl("https://api.openai.com")
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .build();

        // Derive a new OpenAiChatModel for Groq
        OpenAiChatModel groqModel = baseChatModel.mutate()
            .openAiApi(groqApi)
            .defaultOptions(OpenAiChatOptions.builder().model("llama3-70b-8192").temperature(0.5).build())
            .build();

        // Derive a new OpenAiChatModel for GPT-4
        OpenAiChatModel gpt4Model = baseChatModel.mutate()
            .openAiApi(gpt4Api)
            .defaultOptions(OpenAiChatOptions.builder().model("gpt-4").temperature(0.7).build())
            .build();

        // Simple prompt for both models
        String prompt = "What is the capital of France?";

        String groqResponse = ChatClient.builder(groqModel).build().prompt(prompt).call().content();
        String gpt4Response = ChatClient.builder(gpt4Model).build().prompt(prompt).call().content();

        logger.info("Groq (Llama3) response: {}", groqResponse);
        logger.info("OpenAI GPT-4 response: {}", gpt4Response);

        assertThat(groqResponse).containsIgnoringCase("Paris");
        assertThat(gpt4Response).containsIgnoringCase("Paris");

        logger.info("OpenAI GPT-4 response: {}", gpt4Response);

        assertThat(groqResponse).containsIgnoringCase("Paris");
        assertThat(gpt4Response).containsIgnoringCase("Paris");
    }

    @SpringBootConfiguration
    static class Config {

        @Bean
        public OpenAiApi chatCompletionApi() {
            // Placeholder values; the test derives the real endpoints via mutate()
            return OpenAiApi.builder().baseUrl("foo").apiKey("bar").build();
        }

        @Bean
        public OpenAiChatModel openAiClient(OpenAiApi openAiApi) {
            return OpenAiChatModel.builder()
                    .openAiApi(openAiApi)
                    .build();
        }

    }
}

@markpollack (Member) commented

Also note that DeepSeek now has its own model implementation, as its options are starting to differ significantly from OpenAI's.
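
A hypothetical sketch of what using the dedicated module could look like (the DeepSeek class and builder names below are assumptions modeled on the OpenAI module's conventions, not confirmed API):

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.deepseek.DeepSeekChatModel;   // assumed package/class name
import org.springframework.ai.deepseek.DeepSeekChatOptions; // assumed
import org.springframework.ai.deepseek.api.DeepSeekApi;     // assumed

// A dedicated API client, fully independent of any OpenAI configuration
DeepSeekApi deepSeekApi = DeepSeekApi.builder()
        .apiKey(System.getenv("DEEPSEEK_API_KEY"))
        .build();

DeepSeekChatModel deepSeekModel = DeepSeekChatModel.builder()
        .deepSeekApi(deepSeekApi)
        .defaultOptions(DeepSeekChatOptions.builder().model("deepseek-chat").build())
        .build();

// OpenAI and DeepSeek clients can now coexist without competing for one set of properties
ChatClient deepSeekClient = ChatClient.builder(deepSeekModel).build();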

markpollack self-assigned this May 9, 2025
@markpollack (Member) commented

See #3037. Closing this issue for now. We should revisit a more comprehensive declarative solution post-GA in another issue.
