Describe the bug
Error pops up when switching the model in the "Answer" tab.

To Reproduce
Steps to reproduce the behavior:
1. Switch the model in the lower-left corner (I tried "Llama-2-7b-chat-hf").
2. Enter a prompt and click "Generate".
3. See the error (a minimal standalone repro sketch follows below).
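For reference, a minimal standalone sketch that should hit the same code path. This is an assumption on my part: the traceback suggests AI Playground's llm_biz.py loads the model through ipex-llm's transformers-style API, and the model path and generation settings here are illustrative, not the app's actual values.

```python
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

# Hypothetical local path; AI Playground manages its own model directory.
model_path = "Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True).to("xpu")

inputs = tokenizer("what is AI?", return_tensors="pt").to("xpu")
# With a transformers release newer than what ipex-llm's patched llama
# forward expects, this generate call raises the same TypeError as the log.
output = model.generate(**inputs, max_new_tokens=32, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```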
Screenshots
## Log captured below
load llm model Llama-2-7b-chat-hf finish. cost 15.6s
No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.
{'input_ids': tensor([[ 1, 1, 29961, 25580, 29962, 3532, 14816, 29903, 6778, 13,
3492, 526, 263, 8444, 13436, 20255, 29889, 3529, 3867, 9109,
29892, 11314, 936, 322, 16232, 2472, 304, 278, 1404, 29889,
3529, 3013, 278, 1962, 1426, 4086, 278, 1021, 408, 278,
1404, 1881, 29889, 13, 29966, 829, 14816, 29903, 6778, 13,
13, 5816, 338, 319, 29902, 29973, 518, 29914, 25580, 29962]],
device='xpu:0'), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], device='xpu:0'), 'streamer': <transformers.generation.streamers.TextIteratorStreamer object at 0x000001C31D15D410>, 'num_beams': 1, 'do_sample': True, 'max_new_tokens': 1024, 'stopping_criteria': [<llm_biz.CustomStopCriteria object at 0x000001C31AB74E50>]}
Traceback (most recent call last):
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\service\llm_biz.py", line 69, in stream_chat_generate
model.generate(**args)
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\ipex_llm\transformers\lookup.py", line 88, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\ipex_llm\transformers\speculative.py", line 109, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\ipex_llm\transformers\pipeline_parallel.py", line 241, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\transformers\generation\utils.py", line 1575, in generate
result = self._sample(
^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\transformers\generation\utils.py", line 2697, in _sample
outputs = self(
^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 1196, in forward
outputs = self.model(
^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\ipex_llm\transformers\models\llama.py", line 155, in llama_model_forward_4_38
return llama_model_forward_4_38_internal(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\byao2\AppData\Local\Programs\AI Playground\resources\env\Lib\site-packages\ipex_llm\transformers\models\llama.py", line 2590, in llama_model_forward_4_38_internal
causal_mask = self._update_causal_mask(attention_mask, inputs_embeds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: LlamaModel._update_causal_mask() missing 1 required positional argument: 'cache_position'
exception:LlamaModel._update_causal_mask() missing 1 required positional argument: 'cache_position'
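This looks like a transformers/ipex-llm version mismatch: the ipex_llm code path is named llama_model_forward_4_38, i.e. written against transformers 4.38, but in transformers 4.39+ LlamaModel._update_causal_mask gained a required cache_position parameter that the 4.38-era call site never passes. A quick way to confirm which signature is installed (a diagnostic sketch, not a fix):

```python
import inspect
import transformers
from transformers.models.llama.modeling_llama import LlamaModel

# In transformers 4.38.x the signature is roughly
#   (self, attention_mask, input_tensor),
# while 4.39+ adds cache_position, which is exactly the argument
# the ipex_llm 4.38 code path fails to supply.
print(transformers.__version__)
print(inspect.signature(LlamaModel._update_causal_mask))
```

If that is the cause, pinning transformers to a 4.38.x release matching ipex-llm's patched forward (e.g. pip install "transformers==4.38.2") may work around it, though I have not verified this against AI Playground's bundled environment.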
Environment (please complete the following information):
OS: Win 11 23H2
GPU: iGPU
CPU: Core Ultra 7 155H
Version: v1.01b-MTL-H
Additional context
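One incidental note from the log: the tokenizer warning means this Llama-2-7b-chat-hf checkpoint ships without a chat_template in its tokenizer config, so transformers falls back to the default Llama template. If the fallback is ever wrong for a model, it can be set explicitly. A sketch using the standard transformers API; the Jinja template below is deliberately minimal and illustrative, not the official Llama-2 one:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Llama-2-7b-chat-hf")

# Minimal illustrative template: concatenate turns as "role: content" lines.
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)

messages = [{"role": "user", "content": "what is AI?"}]
print(tokenizer.apply_chat_template(messages, tokenize=False))
```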