Improve precision of grammar_strict
#94
Merged
In some cases, because we underapproximate the next token set, the model may behave poorly. This is mainly because grammar-constrained decoding forces the model to generate the whitespace " " and the next word as separate tokens. Current LLMs are not trained to generate words right after a whitespace, so the model's output quality can degrade.
Hence, in this PR we improve the precision of this approach by using accept sequences of length 3 in certain cases. Mainly, when %ignore tokens such as whitespace are present, this enables SynCode to look further ahead.
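For reference, below is a minimal Lark-style grammar sketch (illustrative only, not taken from this repository) in which whitespace is declared via `%ignore`; terminals of this kind are the cases where the longer accept sequences apply.

```python
from lark import Lark

# Illustrative grammar: WS is an %ignore terminal, so the parser skips it between
# WORD tokens. Grammars with such ignored terminals are the cases where this PR
# uses accept sequences of length 3 for additional lookahead.
grammar = r"""
    start: WORD+
    WORD: /[a-zA-Z]+/
    %import common.WS
    %ignore WS
"""

parser = Lark(grammar)
print(parser.parse("I have a dog").pretty())
```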
Consider the case where the input is "I" and the model's next choice is the token " have". SynCode with a single-token lookahead and the underapproximating `grammar_strict` mode would force the model to generate " " first. Models are typically not trained on inputs ending with whitespace, so in the next step, when the input is "I ", the model's behavior tends to be poorer. To fix this, this PR allows longer-lookahead accept sequences in some cases.