I am running the sample code for optimization/quantization of Dolly V2, to better understand OpenVINO and to determine if it is applicable to our projects with various models such as FLAN-T5, FLAN-UL2, and MPT. Here are my questions:
First, in the following line
ov_model = OVModelForCausalLM.from_pretrained(model_id, device=current_device, export=True)
the device parameter specifies the target device for inference optimization, as per the documentation. For example, device can be "CPU" or "GPU". Where can I find a list or documentation of the devices that may be specified besides these two? For instance, how do I specify optimizing for ARM or an iGPU?
Second, the sample code does not address quantization to FP16 or INT8. How can this be specified explicitly via the Optimum/OpenVINO API? Do I have to modify the sample code as follows:
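For reference, a minimal sketch of how the device names visible on a given machine could be listed with the OpenVINO runtime. The helper name list_openvino_devices is hypothetical, and the import paths assume a reasonably recent OpenVINO release; if OpenVINO is not installed the helper returns an empty list. Which names appear depends on the installed plugins and drivers.

```python
def list_openvino_devices():
    """Return the device names the local OpenVINO runtime reports, or [] if unavailable."""
    try:
        # Newer releases expose Core at the package top level.
        from openvino import Core
    except ImportError:
        try:
            # Older releases expose it under openvino.runtime.
            from openvino.runtime import Core
        except ImportError:
            return []
    core = Core()
    # available_devices is a list of strings that can be passed as a device name.
    return list(core.available_devices)


if __name__ == "__main__":
    print(list_openvino_devices())
```

Any string in the returned list should be usable as the device argument in the line above, assuming the model is supported on that device.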
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map=current_device, load_in_8bit=True, export=True)
Third, the source code for class OVModel sets self.device = torch.device("cpu"). I interpret this to mean that an OVModelForCausalLM instance runs on CPU only. Is it possible to expedite execution of the sample code on a GPU? It takes an extremely long time on a commodity CPU.
I apologize for asking basic questions, but I have searched through the OpenVINO and Optimum Hugging Face documentation and code without much success or clarity. I am trying to formulate a plan for our team projects on the usability/applicability of the OpenVINO toolkit.
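To make the second and third questions concrete, here is a rough sketch of what selecting the device and requesting reduced precision might look like through Optimum Intel. It is not taken from the sample notebook. The helper names build_load_kwargs and load_ov_model are hypothetical, and the sketch assumes that the installed optimum-intel version accepts load_in_8bit in OVModelForCausalLM.from_pretrained and provides a half() method for FP16 conversion; both should be checked against the installed release.

```python
def build_load_kwargs(device="CPU", int8=False):
    """Hypothetical helper: collect keyword arguments for from_pretrained."""
    # export=True converts the Hugging Face checkpoint to OpenVINO IR on load.
    kwargs = {"export": True, "device": device.upper()}
    if int8:
        # Assumption: this optimum-intel version supports 8-bit weight compression here.
        kwargs["load_in_8bit"] = True
    return kwargs


def load_ov_model(model_id, device="CPU", int8=False, fp16=False):
    """Hypothetical helper: load a causal LM through Optimum Intel on the given device."""
    # Imported lazily so build_load_kwargs works without optimum-intel installed.
    from optimum.intel.openvino import OVModelForCausalLM

    ov_model = OVModelForCausalLM.from_pretrained(
        model_id, **build_load_kwargs(device, int8)
    )
    if fp16:
        # Assumption: half() converts the model weights to FP16.
        ov_model.half()
    return ov_model
```

A call such as load_ov_model(model_id, device="GPU", int8=True) would then target the OpenVINO GPU plugin, provided a supported GPU and its drivers are present.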
Hi Berdy,
As requested, I will close this case and we will continue providing support on your second post.
Regards,
Aznie
