issue/296 - feat: add Worker and ModelRunner for PD disaggregation#304

Open
spike-zhu wants to merge 9 commits into main from issue/296

Conversation

@spike-zhu
Collaborator

Test screenshot:
[image]

@pengcheng888
Collaborator

Move the kvcache config so that the KV cache is initialized as soon as the C++ engine is created.



class Worker(WorkerBase):
"""Default worker for single-device / standalone inference.
Collaborator

Underneath Worker is the model runner, and the model runner holds an InferEngine that is responsible for multiple devices. How can Worker be "for single-device"?
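
For readers following the thread, the layering described in the comment above can be sketched roughly as below. This is only an illustration, not the PR's code: `Worker`, `WorkerBase`, `ModelRunner`, and `InferEngine` are names from the discussion, while every field and method (`device_ids`, `forward`, `execute`, `execute_model`, `num_devices`) is a hypothetical placeholder.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class InferEngine:
    """Hypothetical stand-in for the engine that spans several devices."""

    device_ids: list[int]

    def forward(self, token_ids: list[int]) -> dict[int, int]:
        # Placeholder work: report how many tokens each device would receive.
        return {dev: len(token_ids) for dev in self.device_ids}


@dataclass
class ModelRunner:
    """Hypothetical runner that owns exactly one InferEngine."""

    engine: InferEngine

    def execute(self, token_ids: list[int]) -> dict[int, int]:
        return self.engine.forward(token_ids)


class WorkerBase:
    def execute_model(self, token_ids: list[int]) -> dict[int, int]:
        raise NotImplementedError


class Worker(WorkerBase):
    """One Worker, one ModelRunner, but possibly many devices underneath."""

    def __init__(self, device_ids: list[int]) -> None:
        self.model_runner = ModelRunner(InferEngine(list(device_ids)))

    @property
    def num_devices(self) -> int:
        return len(self.model_runner.engine.device_ids)

    def execute_model(self, token_ids: list[int]) -> dict[int, int]:
        # The Worker only delegates; device fan-out happens in the engine.
        return self.model_runner.execute(token_ids)
```

Under these assumptions, `Worker([0, 1]).num_devices` is 2, which is the reviewer's point: a Worker built this way is not inherently single-device.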

Architecture:

LLMEngine
└── Worker
Collaborator

Does this design mean the Scheduler has to go through Worker and ModelRunner in order to communicate with the KVConnector?

3 participants