
convert embedding models with export_model.py or optimum-intel

Florianoli
Beginner

Hi everyone,

I'm currently comparing different embedding models and their compatibility and performance in OpenVINO. Since I'm looking for top-performing multilingual models, I tried Alibaba-NLP/gte-Qwen2-1.5B-instruct and BAAI/bge-multilingual-gemma2. I attempted the conversion to OpenVINO format both with export_model.py and with optimum-cli.

 

With export_model.py the conversion aborts partway through. I used these commands:

python export_model.py embeddings --source_model BAAI/bge-multilingual-gemma2 --weight-format fp16 --config_file_path models/config_all.json

python export_model.py embeddings --source_model Alibaba-NLP/gte-Qwen2-1.5B-instruct --weight-format int8 --config_file_path models/config_all.json
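For reference, the file passed via --config_file_path is just a standard OVMS configuration that, as far as I understand, export_model.py appends an entry to. After a successful export I'd expect it to contain something roughly like this (my understanding, not verified against the script):

{
    "model_config_list": [],
    "mediapipe_config_list": [
        {
            "name": "Alibaba-NLP/gte-Qwen2-1.5B-instruct",
            "base_path": "Alibaba-NLP/gte-Qwen2-1.5B-instruct"
        }
    ]
}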

 

With optimum-cli the conversion works, but the models cannot be loaded afterwards. This is the error I get when sending a request to the model:

APIStatusError: {"error": "Mediapipe graph precondition failed - FAILED_PRECONDITION: CalculatorGraph::Run() failed in Run: 
Calculator::Open() for node "OpenVINOModelServerSessionCalculator_1" failed: ; OpenVINOModelServerSessionCalculator failed to load the model
Calculator::Open() for node "OpenVINOModelServerSessionCalculator_2" failed: ; OpenVINOModelServerSessionCalculator failed to load the model"}
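
For completeness, my optimum-cli export looked roughly like this (reconstructed from memory; the feature-extraction task and the output path are my assumptions for an embedding model):

optimum-cli export openvino --model Alibaba-NLP/gte-Qwen2-1.5B-instruct --task feature-extraction --weight-format int8 models/Alibaba-NLP/gte-Qwen2-1.5B-instruct/embeddings/1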

 

Request:

from openai import OpenAI
import numpy as np

# Client pointing at the local OpenVINO Model Server OpenAI-compatible endpoint
client = OpenAI(
    base_url="http://localhost:8000/v3",
    api_key="unused"
)

model = "Alibaba-NLP/gte-Qwen2-1.5B-instruct"
embedding_responses = client.embeddings.create(
    input=[
        "That is a happy person",
        "That is a happy very person"
    ],
    model=model,
)

# Cosine similarity between the two returned embeddings
embedding_from_string1 = np.array(embedding_responses.data[0].embedding)
embedding_from_string2 = np.array(embedding_responses.data[1].embedding)
cos_sim = np.dot(embedding_from_string1, embedding_from_string2) / (
    np.linalg.norm(embedding_from_string1) * np.linalg.norm(embedding_from_string2)
)
print("Similarity score as cos_sim", cos_sim)

 

 

Folder:

(screenshot attached: Florianoli_0-1736363808242.png, showing my exported model folder structure)

 

And the graph file:

input_stream: "REQUEST_PAYLOAD:input"
output_stream: "RESPONSE_PAYLOAD:output"
node {
  calculator: "OpenVINOModelServerSessionCalculator"
  output_side_packet: "SESSION:tokenizer"
  node_options: {
    [type.googleapis.com/mediapipe.OpenVINOModelServerSessionCalculatorOptions]: {
      servable_name: "Alibaba-NLP/gte-Qwen2-1.5B-instruct_tokenizer_model"
    }
  }
}
node {
  calculator: "OpenVINOModelServerSessionCalculator"
  output_side_packet: "SESSION:embeddings"
  node_options: {
    [type.googleapis.com/mediapipe.OpenVINOModelServerSessionCalculatorOptions]: {
      servable_name: "Alibaba-NLP/gte-Qwen2-1.5B-instruct_embeddings_model"
    }
  }
}
node {
  input_side_packet: "TOKENIZER_SESSION:tokenizer"
  input_side_packet: "EMBEDDINGS_SESSION:embeddings"
  calculator: "EmbeddingsCalculator"
  input_stream: "REQUEST_PAYLOAD:input"
  output_stream: "RESPONSE_PAYLOAD:output"
  node_options: {
    [type.googleapis.com/mediapipe.EmbeddingsCalculatorOptions]: {
      normalize_embeddings: true,
    }
  }
}
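
Since the graph refers to the two servables by name, my understanding is that the server config also has to register both of them as regular models, roughly like this (my guess at the required entries; the base paths reflect my local layout):

{
    "model_config_list": [
        {
            "config": {
                "name": "Alibaba-NLP/gte-Qwen2-1.5B-instruct_tokenizer_model",
                "base_path": "/models/Alibaba-NLP/gte-Qwen2-1.5B-instruct/tokenizer"
            }
        },
        {
            "config": {
                "name": "Alibaba-NLP/gte-Qwen2-1.5B-instruct_embeddings_model",
                "base_path": "/models/Alibaba-NLP/gte-Qwen2-1.5B-instruct/embeddings"
            }
        }
    ]
}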

I think there is a problem with the MediaPipe graph, which I created manually. Is there any documentation on how to create the folder structure and the MediaPipe graph? During the export with optimum, a lot of additional files get exported, unlike with export_model.py. Are there any hints on how to get these models running with OpenVINO?

 

Thanks!

5 Replies
Wan_Intel
Moderator

Hi Florianoli,

Thanks for reaching out to us.


For your information, Alibaba-NLP/gte-Qwen2-1.5B-instruct and BAAI/bge-multilingual-gemma2 are not supported at the moment. Models supported by optimum-intel should be compatible. The following Hugging Face models are included in serving validation:

  • nomic-ai/nomic-embed-text-v1.5
  • Alibaba-NLP/gte-large-en-v1.5
  • BAAI/bge-large-en-v1.5
  • BAAI/bge-large-zh-v1.5
  • thenlper/gte-small


On the other hand, did you encounter OSError: The paging file is too small for this operation to complete. (os error 1455) while exporting Alibaba-NLP/gte-Qwen2-1.5B-instruct and BAAI/bge-multilingual-gemma2?



Regards,

Wan


Florianoli
Beginner

Hello Wan,

thank you for your answer. No, I didn't encounter that error.

 

I just looked up the supported models and found that Qwen2/Qwen1.5 and Gemma2 are listed. Or does this only apply to the generative models and not the embedding models?

 

https://huggingface.co/docs/optimum/main/en/intel/openvino/models

Wan_Intel
Moderator

Hi Florianoli,

Thanks for the information.


Let me check with the relevant team, and I'll provide an update here as soon as possible.



Regards,

Wan


Luis_at_Intel
Moderator

Hi Florianoli,

 

Sorry for the delay. Please note that I had no issues loading both the Alibaba-NLP/gte-Qwen2-1.5B-instruct and BAAI/bge-multilingual-gemma2 models with the Model Server, following the How to serve Embeddings models via OpenAI API guide. I used the 2024.5 version of the Docker images for the Model Server; I'd expect the newer 2024.6 version to work as well. Please give it a try, and I hope it helps. If the issue persists on your end, kindly share additional information to help us reproduce it (i.e., how you are loading the model, the OpenVINO version, the optimum version, whether Docker is used, etc.)

 

Exported models:

$ python export_model.py embeddings --source_model Alibaba-NLP/gte-Qwen2-1.5B-instruct --weight-format int8 --config_file_path models/config.json

$ python export_model.py embeddings --source_model BAAI/bge-multilingual-gemma2 --weight-format int8 --config_file_path models/config.json
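
For reference, I started the Model Server roughly like this (a sketch; adjust the mounted path to wherever your models directory and config.json live):

$ docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models openvino/model_server:2024.5 --rest_port 8000 --config_path /models/config.json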

 

Testing over the serving API:

$ python openai_client.py

Similarity score as cos_sim 0.965433590102649

$ python openai_client-qwen.py

Similarity score as cos_sim 1.0

$ python openai_client-gemma2.py

Similarity score as cos_sim 0.26743674766226727


Wan_Intel
Moderator

Dear Florianoli,

We will proceed with closing this case since we have provided a solution. If you need further assistance, please open a new ticket.



Best regards,

Wan

