make deepseekv3 renderer work with system messages, add renderer that forces thinking #79
base: main
Conversation
    Renderer that forces inclusion of a thinking block for DsV3 models.
    """

def _render_message(self, message: Message) -> tuple[list[int], list[int], list[int]]:
I copied this from the NoThinking renderer. Not sure exactly why it's here.
if new_content.startswith("</think>"):
    new_content = new_content[len("</think>") :]
if not new_content.startswith("<think>"):
    new_content = "<think>" + new_content
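Read on its own, the snippet normalizes assistant content so it always begins with an opening thinking tag. A minimal standalone sketch of the same logic (the function name `force_thinking_prefix` is illustrative, not from the PR):

```python
def force_thinking_prefix(new_content: str) -> str:
    """Normalize content so it starts with an opening <think> tag.

    If the model emitted a stray closing tag first, drop it; then
    prepend <think> whenever it is missing.
    """
    if new_content.startswith("</think>"):
        new_content = new_content[len("</think>") :]
    if not new_content.startswith("<think>"):
        new_content = "<think>" + new_content
    return new_content
```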
Do we want to support/use the Message's 'thinking' field here, like gpt-oss does? If we also want to support `<think>` directly in the content, should we enforce here that there is a matching `</think>` followed by an actual message after it? And for multi-message interactions, should the thinking tokens be rendered in all previous assistant messages, or be kept only in the last message as in gpt-oss?
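If we did enforce the closing-tag invariant raised here, it might look something like the sketch below (the helper name `split_thinking` and the ValueError policy are assumptions, not anything from the PR):

```python
import re


def split_thinking(content: str) -> tuple[str, str]:
    """Split '<think>...</think>answer' into (thinking, answer).

    Hypothetical validation helper: rejects content that lacks a
    closing </think>, or that has no actual message after it.
    """
    match = re.match(r"<think>(.*?)</think>(.*)", content, flags=re.DOTALL)
    if match is None:
        raise ValueError("content must contain <think>...</think>")
    thinking, answer = match.group(1), match.group(2)
    if not answer.strip():
        raise ValueError("expected a non-empty message after </think>")
    return thinking, answer
```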
Good questions. I haven't thought these through yet. No rush to merge this one -- I'm still experimenting with it in my use case.
Re: the thinking field, we might as well prioritize supporting the most advanced tool-use systems being released, which include gpt-oss and kimi-k2-thinking; both use interleaved CoT + tool calls. So I'm in favor of whatever improves the support for those models.
I'm not sure what the right policy is for possibly hiding CoT -- in gpt-oss, I assume we include all the thinking traces from the current turn (i.e., back until the last user message) in cases where we have multiple assistant messages, including tool calls?
Yes, in gpt-oss we include thinking tokens from the current turn, even if there are multiple assistant channel messages.
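The policy described here (keep CoT only for messages at or after the last user message) could be sketched roughly as follows. Plain-dict messages and the name `strip_stale_thinking` are assumptions for illustration, not the renderer's actual API:

```python
import re


def strip_stale_thinking(messages: list[dict]) -> list[dict]:
    """Drop <think>...</think> blocks from assistant messages that come
    before the last user message, keeping CoT only for the current turn
    (the gpt-oss-style policy discussed above)."""
    # Index of the last user message; everything from there on is the
    # "current turn" and keeps its thinking tokens.
    last_user = max(
        (i for i, m in enumerate(messages) if m["role"] == "user"), default=-1
    )
    out = []
    for i, m in enumerate(messages):
        if m["role"] == "assistant" and i < last_user:
            content = re.sub(r"<think>.*?</think>", "", m["content"], flags=re.DOTALL)
            m = {**m, "content": content}
        out.append(m)
    return out
```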