
feat: BMDB integration, system prompt split, LLM speed-ups, dual-DB UI #66

Draft

jcschaff wants to merge 60 commits into main from development
Conversation


jcschaff commented May 6, 2026

Summary

Brings 76 commits from development into main. Major themes:

  • BMDB (BioModels.org) integration — new bmdb_router/bmdb_controller/bmdb_schema, new service functions (fetch_bmdb_models, get_xml_file, get_bmdb_model_info), new BMDB tools wired into the LLM, and a parallel BMDB search path on the frontend.
  • System prompt split — monolithic system_prompt.py carved into a base SYSTEM_PROMPT + per-DB BMDB_SYSTEM_PROMPT / VCDB_SYSTEM_PROMPT, composed at runtime based on the selected database.
  • LLM response speed-ups — should_use_tools() skips tool round-trips for chitchat; select_tools_for_prompt() filters tools into DB_TOOLS / KB_TOOLS / PUB_TOOLS subsets via regex on the user prompt; asyncio.gather runs tool calls concurrently; summarize_tool_result() truncates large tool outputs; default_rows lowered 1000 → 25. Per-stage timing surfaced to the UI as tool_summary.
  • Dual-DB UI — ChatBox gains useVCDB / useBMDB checkboxes, a Stop button (AbortController), conditional quick-action button groups, and BMDB-formatted result rendering.
  • Conversation history — localStorage-backed conversations with deep-linking via ?conversation=<uuid>, real entries in the sidebar.
  • Service rename — app.services.vcelldb_service → app.services.databases_service (all in-tree imports updated).
  • Misc — Suspense wrappers for Next 15 build, BMDB AI Analysis tab on /search/[bmid], settings page link updates, Pydantic v2 SettingsConfigDict(extra="ignore").
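The speed-up bullets above can be sketched roughly as follows. Function names (should_use_tools, select_tools_for_prompt) and the tool-subset names come from this PR; the regex patterns and tool lists here are illustrative stand-ins, not the shipped ones:

```python
import re

# Illustrative subsets -- the real DB_TOOLS / KB_TOOLS / PUB_TOOLS lists
# live in the backend and are larger.
DB_TOOLS = ["fetch_bmdb_models", "get_bmdb_model_info"]
KB_TOOLS = ["search_knowledge_base"]
PUB_TOOLS = ["search_publications"]

# Prompts that look like chitchat skip the tool round-trip entirely.
CHITCHAT_RE = re.compile(r"^\s*(hi|hello|thanks?|how are you)\b", re.IGNORECASE)


def should_use_tools(prompt: str) -> bool:
    """Return False for simple conversational prompts (no tool round-trip)."""
    return CHITCHAT_RE.match(prompt) is None


def select_tools_for_prompt(prompt: str) -> list[str]:
    """Send the LLM only the tool subsets the prompt plausibly needs."""
    selected: list[str] = []
    if re.search(r"\b(model|biomodel|simulation)\b", prompt, re.IGNORECASE):
        selected += DB_TOOLS
    if re.search(r"\b(paper|publication|article)\b", prompt, re.IGNORECASE):
        selected += PUB_TOOLS
    # Fall back to the full tool set when nothing matches.
    return selected or DB_TOOLS + KB_TOOLS + PUB_TOOLS
```

Since both functions are pure string classifiers, they are cheap to unit-test, which is relevant to the test-coverage gap noted below.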

⚠️ Known issues to address before merging

Opened as draft — the following surfaced during review:

Blocking

  • Debug print() / console.log left in llms_service.py (incl. one that dumps the full messages payload), databases_service.py (top-level print on import + several CHECK/DEBUG/RAW JSON prints), tools_utils.py, vcelldb_controller.py, llms_router.py, and ChatBox.tsx (PPPPPP, RRRRRR, AAAAAA, bmkeys x2).
  • bmkeys = [] reset inside the per-tool-call loop in llms_service.py — only the last tool call's keys survive. Move the initialization above the loop.
  • No-tools fast path returns the message object, not the content — direct_text = response.choices[0].message or "" should be .message.content or "" in llms_service.py.
  • Wrong-host connectivity probe — databases_service.get_xml_file calls check_vcell_connectivity() (which DNS-checks vcell.cam.uchc.edu) before hitting biomodels.org.
  • Inconsistent BMDB base URL — backend hardcodes https://biomodels.org/; frontend / docker-compose.yml use https://www.biomodels.org/. Pick one and centralize.
  • Stray imports — from multiprocessing import process in llms_router.py:1 (likely an IDE auto-import); import Suspense from "react" in analyze/[id]/page.tsx:4 is a default-import typo (should be the named import { Suspense }) and is unused anyway.
  • Stray ß character in a comment in llms_service.py ("simple, conversational promptsß").
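The bmkeys fix and the concurrent tool execution can be combined in one place. A schematic of the corrected shape, with the initialization hoisted above the loop (execute_tool and the tool-call objects are placeholders, not the service's real signatures):

```python
import asyncio


async def run_tool_calls(tool_calls, execute_tool):
    # Initialize ONCE, before iterating -- resetting bmkeys inside the
    # per-tool-call loop drops every call's keys except the last one's.
    bmkeys: list[str] = []

    # asyncio.gather executes the tool calls concurrently instead of
    # awaiting them one by one.
    results = await asyncio.gather(*(execute_tool(tc) for tc in tool_calls))

    for _result, keys in results:
        bmkeys.extend(keys)
    return results, bmkeys
```

With this shape, keys from all tool calls accumulate rather than only the final call's surviving.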

Should fix

  • Silent behavior change — LLM tool calls capped at default_rows=25 inside execute_tool regardless of what the model requests; the tool schema still advertises maximum: 50. Either raise the cap or update the schema.
  • Dead code — unused CategoryEnum / OrderByEnum in bmdb_schema.py; empty sanitize_xml_content stub in databases_service.py; commented payload/userMessage blocks duplicated across handleSendMessage and handleSendMessageBMDB in ChatBox.tsx.
  • Refactor near-duplicate send functions — handleSendMessage and handleSendMessageBMDB in ChatBox.tsx are ~140-line copies; parameterize over the database key.
  • Test coverage — test_vcelldb_service.py only got its import path updated for the rename. No tests cover the new fetch_bmdb_models / get_xml_file / get_bmdb_model_info, nor the load-bearing should_use_tools / select_tools_for_prompt regex routing on which the speed claims rest.
  • BMDB_SYSTEM_PROMPT is missing the publications guidance the old monolithic prompt had — BMDB-mode publication questions will degrade.
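On the default_rows cap mismatch: whichever direction is chosen (raise the cap or lower the schema), the clamp in execute_tool and the advertised schema maximum should derive from one constant so they cannot drift apart again. A sketch, with the constant name and schema fragment as assumptions:

```python
from typing import Optional

# Single source of truth for both the runtime clamp and the tool schema.
MAX_ROWS = 50
DEFAULT_ROWS = 25


def clamp_rows(requested: Optional[int], default: int = DEFAULT_ROWS) -> int:
    """Clamp a model-requested row count to the advertised maximum."""
    if requested is None:
        return default
    return max(1, min(requested, MAX_ROWS))


# The tool schema advertises the same bound it actually enforces.
TOOL_SCHEMA_FRAGMENT = {
    "max_rows": {"type": "integer", "maximum": MAX_ROWS, "default": DEFAULT_ROWS},
}
```

This removes the current situation where the schema promises `maximum: 50` but execute_tool silently caps at 25.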

Migration / deployer notes

  • app.services.vcelldb_service is renamed to app.services.databases_service. Any out-of-tree importer (e.g. populate_db.ipynb, CI scripts) must be updated.
  • New env var NEXT_PUBLIC_API_URL_BMDB is consumed in frontend/app/search/page.tsx and frontend/app/search/[bmid]/page.tsx. Confirm it lands in frontend/.env.example.
  • Settings switched to Pydantic v2 SettingsConfigDict(extra="ignore") — masks future env var typos silently.
  • get_llm_response now returns a 3-tuple (result, bmkeys, tool_summary); affected endpoints' JSON gains a tool_summary field.
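For the 3-tuple change, every caller of get_llm_response must unpack the extra value. A minimal sketch of the adjustment; the stub's return values and the endpoint shape are illustrative, only the 3-tuple contract comes from the PR:

```python
def get_llm_response(prompt, tools):
    # Stub standing in for the real service function, which now returns
    # (result, bmkeys, tool_summary) instead of the old 2-tuple.
    return ("answer text", ["BIOMD0000000012"], {"tool_time_ms": 120})


def chat_endpoint(prompt, tools=()):
    # Callers must unpack all three values; the endpoint's JSON response
    # gains a tool_summary field carrying the per-stage timings.
    result, bmkeys, tool_summary = get_llm_response(prompt, tools)
    return {"result": result, "bmkeys": bmkeys, "tool_summary": tool_summary}
```

Any out-of-tree caller still unpacking two values will raise a ValueError at runtime, so this belongs in the deployer notes alongside the module rename.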

Test plan

  • Backend tests pass: `cd backend/app && poetry run pytest tests/`
  • Frontend builds: `cd frontend && npm run build`
  • Frontend lints: `cd frontend && npm run lint`
  • Manual: `/chat` page — VCDB-only, BMDB-only, and both-DBs queries return formatted results
  • Manual: `/search/[bmid]` — both a VCDB id and a BMDB id (e.g. `BIOMD…` / `MODEL…`) render correctly, AI Analysis tab works for each
  • Manual: conversation history — start a chat, refresh, deep-link via `?conversation=`, verify restoration
  • Manual: Stop button aborts an in-flight request

🤖 Generated with Claude Code

reeshapatel12 and others added 30 commits March 24, 2026 15:03