Found a bug in the codellama vllm model_len logic. #380

Merged

sam-scale merged 2 commits into main from ss/two-quick-vllm-bugs on Nov 15, 2023

Conversation

sam-scale (Contributor) commented:

Also, let's just avoid the vLLM error by making sure max_num_batched_tokens >= max_model_len

Pull Request Summary

What is this PR changing? Why is this change being made? Any caveats you'd like to highlight? Link any relevant documents, links, or screenshots here if applicable.

Test Plan and Usage Guide

How did you validate that your PR works correctly? How do you run or demo the code? Provide enough detail so a reviewer can reasonably reproduce the testing procedure. Paste example command line invocations if applicable.
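For context, here is a minimal, hypothetical sketch of the guard this PR describes (the helper name and surrounding code are illustrative, not the repo's actual implementation): vLLM rejects configurations where max_num_batched_tokens is smaller than max_model_len, so the value is clamped before the engine arguments are built.

```python
from typing import Optional


def resolve_max_num_batched_tokens(
    max_model_len: int,
    requested_max_num_batched_tokens: Optional[int] = None,
) -> int:
    """Return a max_num_batched_tokens that satisfies vLLM's requirement
    that max_num_batched_tokens >= max_model_len.

    Hypothetical helper for illustration only; not the code in this PR.
    """
    if requested_max_num_batched_tokens is None:
        # No explicit request: fall back to the model's context length.
        return max_model_len
    # Never pass a value smaller than the context length, otherwise vLLM
    # raises an error at engine construction time.
    return max(requested_max_num_batched_tokens, max_model_len)


# Example: a CodeLlama-style context length of 16384 with a smaller
# requested batch-token budget gets bumped up to 16384.
assert resolve_max_num_batched_tokens(16384, 8192) == 16384
```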

Also, let's just avoid the vLLM error by making sure max_num_batched_tokens >= max_model_len
saiatmakuri (Contributor) left a comment:

thanks for catching this!

sam-scale merged commit 5b6aeff into main on Nov 15, 2023
sam-scale deleted the ss/two-quick-vllm-bugs branch on Nov 15, 2023 at 21:33
yunfeng-scale added a commit that referenced this pull request Nov 17, 2023
yunfeng-scale mentioned this pull request Mar 6, 2024