Intel® Optimized AI Frameworks

[Intel oneAPI] Issue Loading VideoLLaMA2.1-7B-AV with Intel Extension for Transformers

HARSHA_ARAVIND
Beginner

import os
import torch
# from videollama2 import model_init, mm_infer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

# Set environment variables for Intel oneAPI
os.environ["DPCPP_COMPATIBILITY_MODE"] = "1"
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# Target the Intel GPU ("xpu") when one is available, otherwise fall back to CPU
device = "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cpu"
print("Using device:", device)

# Load the model using Intel Extension for Transformers
model_path = "DAMO-NLP-SG/VideoLLaMA2.1-7B-AV"
# model, processor, tokenizer = model_init(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
model.to(device)

Traceback (most recent call last):
  File "/home/u722dcf58b7b222d3b6558c78c57e9b4/VideoLLaMA2.1-7B-AV/llama7b.py", line 18, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_path)
  File "/home/u722dcf58b7b222d3b6558c78c57e9b4/.conda/envs/videollama-intel/lib/python3.10/site-packages/intel_extension_for_transformers/transformers/modeling/modeling_auto.py", line 497, in from_pretrained
    config, _ = AutoConfig.from_pretrained(
  File "/home/u722dcf58b7b222d3b6558c78c57e9b4/.local/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 947, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `videollama2_qwen2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
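
For context, the `videollama2_qwen2` model type is registered by the model's own videollama2 package rather than by stock Transformers, which is why AutoConfig rejects the checkpoint. The commented-out `model_init` path in the snippet above is the loader intended for this checkpoint; a minimal sketch, assuming the videollama2 package from the DAMO-NLP-SG/VideoLLaMA2 GitHub repository is installed:

# Minimal sketch, assuming the videollama2 package (DAMO-NLP-SG/VideoLLaMA2
# on GitHub) is installed; model_init registers the custom
# `videollama2_qwen2` architecture before loading the checkpoint.
from videollama2 import model_init

model_path = "DAMO-NLP-SG/VideoLLaMA2.1-7B-AV"
model, processor, tokenizer = model_init(model_path)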

Ying_H_Intel
Moderator

Hi @HARSHA_ARAVIND 

Thanks for raising the ticket. The Intel Extension for Transformers project (intel/intel-extension-for-transformers) was discontinued last year, but the related work has been upstreamed into Transformers and PyTorch, so you can use those directly (see the sketch after the install commands below).

Please refer to the following for Intel GPU support in PyTorch:

Large Language Models (LLM) Optimizations Overview — Intel® Extension for PyTorch* 2.7.10+xpu documentation

intel-extension-for-pytorch/examples/gpu/llm at xpu-main · intel/intel-extension-for-pytorch

To install the XPU builds:

python -m pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/xpu
python -m pip install intel-extension-for-pytorch==2.7.10+xpu oneccl_bind_pt==2.7.0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
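
Once those wheels are installed, the usual flow is to load a model with stock Transformers, move it to the "xpu" device, and apply the ipex.llm.optimize entry point described in the LLM optimization guide linked above. A minimal sketch; the model name is illustrative (VideoLLaMA2.1-AV builds on Qwen2 but needs its own loader for the audio-visual parts, as noted earlier):

# Minimal sketch, assuming the xpu wheels above are installed and an Intel
# GPU is visible; the model name is illustrative, not the VideoLLaMA2 checkpoint.
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
model = model.eval().to("xpu")

# Apply the LLM-specific optimizations documented in the linked guide
model = ipex.llm.optimize(model, dtype=torch.float16, device="xpu")

inputs = tokenizer("Hello, Intel GPU!", return_tensors="pt").to("xpu")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))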

Please also note the hardware requirements and memory consumption of 7B models.
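
For a rough sense of scale: a 7B-parameter model in fp16/bf16 needs about 13 GiB for the weights alone, before the KV cache and activations are counted:

# Back-of-the-envelope weight-memory estimate for a 7B-parameter model
params = 7e9
bytes_per_param = 2                                # fp16/bf16
weights_gib = params * bytes_per_param / 1024**3
print(f"weights alone: ~{weights_gib:.1f} GiB")    # ~13.0 GiB, excluding KV cache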

Thanks
