Hi everyone,
I'm currently comparing different embedding models and their compatibility and performance in OpenVINO. As I'm looking for top-performing multilingual models, I tried Alibaba-NLP/gte-Qwen2-1.5B-instruct and BAAI/bge-multilingual-gemma2. I attempted the conversion to OpenVINO format with export_model.py and with optimum-cli.
With export_model.py the conversion aborts partway through. I used these commands:
python export_model.py embeddings --source_model BAAI/bge-multilingual-gemma2 --weight-format fp16 --config_file_path models/config_all.json
python export_model.py embeddings --source_model Alibaba-NLP/gte-Qwen2-1.5B-instruct --weight-format int8 --config_file_path models/config_all.json
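For completeness, the optimum-cli export I ran was along these lines (the task flag and output directory here are from memory, so they may not match exactly what I used):
optimum-cli export openvino --model Alibaba-NLP/gte-Qwen2-1.5B-instruct --task feature-extraction --weight-format int8 models/gte-Qwen2-1.5B-instruct-ov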
With optimum-cli the conversion works, but the models cannot be loaded. This is the error I get when sending a request to the model:
APIStatusError: {"error": "Mediapipe graph precondition failed - FAILED_PRECONDITION: CalculatorGraph::Run() failed in Run:
Calculator::Open() for node "OpenVINOModelServerSessionCalculator_1" failed: ; OpenVINOModelServerSessionCalculator failed to load the model
Calculator::Open() for node "OpenVINOModelServerSessionCalculator_2" failed: ; OpenVINOModelServerSessionCalculator failed to load the model"}
Request:
from openai import OpenAI
import numpy as np

# Client pointing at the local OpenVINO Model Server OpenAI-compatible endpoint
client = OpenAI(
    base_url="http://localhost:8000/v3",
    api_key="unused"
)

model = "Alibaba-NLP/gte-Qwen2-1.5B-instruct"

embedding_responses = client.embeddings.create(
    input=[
        "That is a happy person",
        "That is a very happy person"
    ],
    model=model,
)

embedding_from_string1 = np.array(embedding_responses.data[0].embedding)
embedding_from_string2 = np.array(embedding_responses.data[1].embedding)

# Cosine similarity between the two embedding vectors
cos_sim = np.dot(embedding_from_string1, embedding_from_string2) / (
    np.linalg.norm(embedding_from_string1) * np.linalg.norm(embedding_from_string2)
)
print("Similarity score as cos_sim", cos_sim)
Folder: (screenshot of the model directory layout)
And the graph file:
input_stream: "REQUEST_PAYLOAD:input"
output_stream: "RESPONSE_PAYLOAD:output"
node {
calculator: "OpenVINOModelServerSessionCalculator"
output_side_packet: "SESSION:tokenizer"
node_options: {
[type.googleapis.com / mediapipe.OpenVINOModelServerSessionCalculatorOptions]: {
servable_name: "Alibaba-NLP/gte-Qwen2-1.5B-instruct_tokenizer_model"
}
}
}
node {
calculator: "OpenVINOModelServerSessionCalculator"
output_side_packet: "SESSION:embeddings"
node_options: {
[type.googleapis.com / mediapipe.OpenVINOModelServerSessionCalculatorOptions]: {
servable_name: "Alibaba-NLP/gte-Qwen2-1.5B-instruct_embeddings_model"
}
}
}
node {
input_side_packet: "TOKENIZER_SESSION:tokenizer"
input_side_packet: "EMBEDDINGS_SESSION:embeddings"
calculator: "EmbeddingsCalculator"
input_stream: "REQUEST_PAYLOAD:input"
output_stream: "RESPONSE_PAYLOAD:output"
node_options: {
[type.googleapis.com / mediapipe.EmbeddingsCalculatorOptions]: {
normalize_embeddings: true,
}
}
}
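For context: as far as I understand, the servable names referenced in the graph have to resolve to models declared next to it. Based on the Model Server docs, I would expect a subconfig.json along these lines in the same folder (the relative base_path values are my assumption):
{
    "model_config_list": [
        {
            "config": {
                "name": "Alibaba-NLP/gte-Qwen2-1.5B-instruct_tokenizer_model",
                "base_path": "tokenizer"
            }
        },
        {
            "config": {
                "name": "Alibaba-NLP/gte-Qwen2-1.5B-instruct_embeddings_model",
                "base_path": "embeddings"
            }
        }
    ]
}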
I think there is a problem with the mediapipe graph, which I created manually. Is there any documentation on how to create the folder structure and the mediapipe graph? During the export with optimum, a lot of additional files get exported, unlike with export_model.py. Are there any hints on how to get these models running with OpenVINO?
Thanks!
Hi Florianoli,
Thanks for reaching out to us.
For your information, Alibaba-NLP/gte-Qwen2-1.5B-instruct and BAAI/bge-multilingual-gemma2 are not supported at the moment. Models supported by optimum-intel should be compatible. The following Hugging Face models are included in serving validation:
- nomic-ai/nomic-embed-text-v1.5
- Alibaba-NLP/gte-large-en-v1.5
- BAAI/bge-large-en-v1.5
- BAAI/bge-large-zh-v1.5
- thenlper/gte-small
On the other hand, did you encounter OSError: The paging file is too small for this operation to complete. (os error 1455) while exporting Alibaba-NLP/gte-Qwen2-1.5B-instruct and BAAI/bge-multilingual-gemma2?
Regards,
Wan
Hello Wan,
thank you for your answer. No, I didn't encounter that error.
I just looked up the supported models and found that Qwen2 (1.5B) and Gemma2 are listed as supported. Or does this only apply to the generative models and not the embedding models?
https://huggingface.co/docs/optimum/main/en/intel/openvino/models
Hi Florianoli,
Thanks for the information.
Let me check with the relevant team, and I'll provide an update here as soon as possible.
Regards,
Wan
Hi Florianoli,
Sorry for the delay. Please note that I had no issues loading both the Alibaba-NLP/gte-Qwen2-1.5B-instruct and BAAI/bge-multilingual-gemma2 models with the Model Server, as described in the How to serve Embeddings models via OpenAI API guide. I used the 2024.5 version of the Model Server docker images; I'd expect the newer 2024.6 version to also work fine. Please give it a try, and I hope it helps. If the issue persists on your end, kindly share additional information to help us reproduce it (e.g. how you are loading the model, the OpenVINO version, the optimum version, whether docker is used, etc.).
Exported models:
$ python export_model.py embeddings --source_model Alibaba-NLP/gte-Qwen2-1.5B-instruct --weight-format int8 --config_file_path models/config.json
$ python export_model.py embeddings --source_model BAAI/bge-multilingual-gemma2 --weight-format int8 --config_file_path models/config.json
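For reference, after running export_model.py the models directory should look roughly like this (the exact file names and layout may differ slightly between versions):
models/
├── config.json
└── Alibaba-NLP/
    └── gte-Qwen2-1.5B-instruct/
        ├── graph.pbtxt
        ├── subconfig.json
        ├── tokenizer/
        │   └── 1/
        │       ├── openvino_tokenizer.xml
        │       └── openvino_tokenizer.bin
        └── embeddings/
            └── 1/
                ├── openvino_model.xml
                └── openvino_model.bin
The export script also registers the model in config.json, so the graph and folder structure should not need to be created by hand.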
Testing over serving API:
$ python openai_client.py
Similarity score as cos_sim 0.965433590102649
$ python openai_client-qwen.py
Similarity score as cos_sim 1.0
$ python openai_client-gemma2.py
Similarity score as cos_sim 0.26743674766226727
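In case it helps, the server itself was started roughly like this (the image tag, port mapping, and mount path are examples, not the exact values from my run):
docker run -d --rm -p 8000:8000 -v $(pwd)/models:/workspace:ro openvino/model_server:2024.5 --rest_port 8000 --config_path /workspace/config.json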
Dear Florianoli,
We will proceed with closing this case since we have provided a solution. If you need further assistance, please open a new ticket.
Best regards,
Wan
