Intel® Developer Cloud
Help connecting to or getting started on Intel® Developer Cloud

Intel LLM Fine Tuning with Hugging Face

Sid5
New Contributor I
4,060 Views

 

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Define the path to the checkpoint
checkpoint_path = r"./results/checkpoint-1000" # Replace with your checkpoint folder

# Load the model
model = AutoModelForSequenceClassification.from_pretrained("Training/AI/GenAI/ai-innovation-bridge/workshops/ai-workloads-with-huggingface/results/checkpoint-1000")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
from transformers import cached_file

# Replace with the actual values
path_or_repo_id = "username/repository"
filename = "model_file.bin"
cache_dir = "path/to/cache_dir"

# Download the file
resolved_file = cached_file(
    path_or_repo_id=path_or_repo_id,
    filename=filename,
    cache_dir=cache_dir,
    force_download=True,  # Set to True to force download
    resume_download=True,  # Set to True to resume download if it was interrupted
    proxies=None,  # Set to your proxy settings if needed
    token=None,  # Set to your Hugging Face token if the repository is gated
    revision=None,  # Set to a specific commit hash or tag if needed
    local_files_only=True,  # Set to True to only look for local files
    subfolder=None,  # Set to a subfolder if the file is in a subfolder
    repo_type=None,  # Set to "dataset" or "model" if necessary
    user_agent=None,  # Set to a custom user agent if needed
    _raise_exceptions_for_gated_repo=True,  # Set to False to not raise exceptions for gated repositories
    _raise_exceptions_for_missing_entries=True,  # Set to False to not raise exceptions for missing entries
    _raise_exceptions_for_connection_errors=True,  # Set to False to not raise exceptions for connection errors
    _commit_hash=None,  # Set to a specific commit hash if needed
)

# Use the resolved_file path for further processing

This code produces the following error, and I have been stuck on it for quite some time now. Please suggest a quick fix so I can proceed.



File ~/.local/lib/python3.11/site-packages/transformers/utils/hub.py:385, in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    383 try:
    384     # Load from URL or cache if already cached
--> 385     resolved_file = hf_hub_download(
    386         path_or_repo_id,
    387         filename,
    388         subfolder=None if len(subfolder) == 0 else subfolder,
    389         repo_type=repo_type,
    390         revision=revision,
    391         cache_dir=cache_dir,
    392         user_agent=user_agent,
    393         force_download=force_download,
    394         proxies=proxies,
    395         resume_download=resume_download,
    396         token=token,
    397         local_files_only=local_files_only,
    398     )
    399 except GatedRepoError as e:

File ~/.local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py:110, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
    109 if arg_name in ["repo_id", "from_id", "to_id"]:
--> 110     validate_repo_id(arg_value)
    112 elif arg_name == "token" and arg_value is not None:

File ~/.local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py:158, in validate_repo_id(repo_id)
    157 if repo_id.count("/") > 1:
--> 158     raise HFValidationError(
    159         "Repo id must be in the form 'repo_name' or 'namespace/repo_name':"
    160         f" '{repo_id}'. Use `repo_type` argument if needed."
    161     )
    163 if not REPO_ID_REGEX.match(repo_id):

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'Training/AI/GenAI/ai-innovation-bridge/workshops/ai-workloads-with-huggingface/results/checkpoint-1000'. Use `repo_type` argument if needed.

The above exception was the direct cause of the following exception:

OSError                                   Traceback (most recent call last)
Cell In[3], line 7
      4 checkpoint_path = r"./results/checkpoint-1000" # Replace with your checkpoint folder
      6 # Load the model
----> 7 model = AutoModelForSequenceClassification.from_pretrained("Training/AI/GenAI/ai-innovation-bridge/workshops/ai-workloads-with-huggingface/results/checkpoint-1000")
      9 # Load the tokenizer
     10 tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

File ~/.local/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:488, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    485 if commit_hash is None:
    486     if not isinstance(config, PretrainedConfig):
    487         # We make a call to the config file first (which may be absent) to get the commit hash as soon as possible
--> 488         resolved_config_file = cached_file(
    489             pretrained_model_name_or_path,
    490             CONFIG_NAME,
    491             _raise_exceptions_for_missing_entries=False,
    492             _raise_exceptions_for_connection_errors=False,
    493             **hub_kwargs,
    494         )
    495         commit_hash = extract_commit_hash(resolved_config_file, commit_hash)
    496 else:

File ~/.local/lib/python3.11/site-packages/transformers/utils/hub.py:450, in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    448     raise EnvironmentError(f"There was a specific connection error when trying to load {path_or_repo_id}:\n{err}")
    449 except HFValidationError as e:
--> 450     raise EnvironmentError(
    451         f"Incorrect path_or_model_id: '{path_or_repo_id}'. Please provide either the path to a local folder or the repo_id of a model on the Hub."
    452     ) from e
    453 return resolved_file

OSError: Incorrect path_or_model_id: 'Training/AI/GenAI/ai-innovation-bridge/workshops/ai-workloads-with-huggingface/results/checkpoint-1000'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
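
Reading the traceback above: the string passed to from_pretrained was apparently not found as a local folder relative to the notebook's working directory, so it was validated as a Hub repo id and rejected because it contains more than one "/". A minimal sketch of a pre-check follows, assuming the checkpoint really exists somewhere on disk and was saved with a config.json; resolve_checkpoint_dir is a hypothetical helper written for this note, not part of transformers.

```python
from pathlib import Path


def resolve_checkpoint_dir(path_str: str) -> Path:
    """Return an absolute path to an existing checkpoint folder, or raise.

    Hypothetical helper: it only checks the filesystem, it does not load a model.
    """
    path = Path(path_str).expanduser().resolve()
    if not path.is_dir():
        # Relative paths are resolved against the current working directory,
        # so report it to make a wrong starting folder easy to spot.
        raise FileNotFoundError(
            f"{path} is not a directory; current working directory is {Path.cwd()}"
        )
    if not (path / "config.json").is_file():
        raise FileNotFoundError(
            f"{path} has no config.json, so it does not look like a saved model folder"
        )
    return path


# Usage (assumes transformers is installed and the folder exists):
# from transformers import AutoModelForSequenceClassification
# ckpt = resolve_checkpoint_dir("./results/checkpoint-1000")
# model = AutoModelForSequenceClassification.from_pretrained(str(ckpt))
```

If the helper raises, the message shows which absolute path was actually checked, which is usually enough to tell whether the notebook is running from a different folder than expected.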
30 Replies
Sid5
New Contributor I
791 Views

Hi Peh,

 

I just tried that but still got the same error:
mv ./checkpoint-1000 ./ai-workloads-with-huggingface
mv: cannot stat './checkpoint-1000': No such file or directory
u48b8a1541f69e0a0aba9dc17773a666@idc-beta-batch-pvc-node-20:~$ mv checkpoint-1000 ai-workloads-with-huggingface
mv: cannot stat 'checkpoint-1000': No such file or directory

 

Regards,

Siddhartha Sharma

Sid5
New Contributor I
784 Views

Hi Peh,

I tried doing that but got the following error. I also replied about 10 minutes after your reply, but that post seems to have vanished. The result was the same each time; please let me know what went wrong here.

mv ./checkpoint-1000 ai-workloads-with-huggingface
mv: cannot stat './checkpoint-1000': No such file or directory
u48b8a1541f69e0a0aba9dc17773a666@idc-beta-batch-pvc-node-20:~$ mv checkpoint-1000 ai-workloads-with-huggingface
mv: cannot stat 'checkpoint-1000': No such file or directory
u48b8a1541f69e0a0aba9dc17773a666@idc-beta-batch-pvc-node-20:~$ mv ¬/checkpoint-1000 ¬/ai-workloads-with-huggingface
mv: cannot stat '¬/checkpoint-1000': No such file or directory
u48b8a1541f69e0a0aba9dc17773a666@idc-beta-batch-pvc-node-20:~$ mv ./checkpoint-1000 ./ai-workloads-with-huggingface
mv: cannot stat './checkpoint-1000': No such file or directory
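
The "cannot stat" messages above indicate that no entry named checkpoint-1000 exists in the directory where mv was run (the home directory, judging by the prompt). A small sketch for locating the folder first, assuming it lives somewhere under the home directory; find_dirs_named is a hypothetical helper, not an existing tool.

```python
from pathlib import Path


def find_dirs_named(root: Path, name: str) -> list[Path]:
    """Return every directory called `name` below `root`, sorted by path."""
    # rglob walks the tree recursively; plain files with the same name are skipped.
    return sorted(p for p in root.rglob(name) if p.is_dir())


# Usage: print each candidate, then pass the full path to mv or from_pretrained.
# for match in find_dirs_named(Path.home(), "checkpoint-1000"):
#     print(match)
```

Scanning a large home directory can take a while; narrowing root to the workshop folder would be faster if that location is known.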

 

 

Thanks & Regards

Siddhartha Sharma

Peh_Intel
Moderator
758 Views

Hi Siddhartha Sharma,


In this case, please share your pre-trained model (the relevant files in the checkpoint-1000 folder), your Jupyter Notebook script, and any other relevant files so that I can reproduce the issue on my end.



Regards,

Peh


Peh_Intel
Moderator
709 Views

Hi Siddhartha Sharma,


I am unable to view your files via the shared link because we are on different IDC accounts.


You can export your Jupyter Notebook as an HTML file and upload it here. Then download all the files in the checkpoint-1000 folder and upload them here as well.



Regards,

Peh


Sid5
New Contributor I
686 Views

Hi Peh,

I tried downloading and uploading the files yesterday, but the upload showed an error. The zipped folder seems to contain different files from the ones I tried to zip just now, and my laptop does not have enough space to download the rest. The zipping also keeps stopping partway through and does not complete correctly, which is very strange. There has to be some other way for you to access these files, using a key or something similar.

Sid5_0-1717143279332.png

 

I even tried downloading the files separately (1262 files) and then uploading them to convert into a zip, but the page got stuck.

Sid5_0-1717157066624.png

 

 

 

 

 

Best Regards

Siddhartha Sharma

Sid5
New Contributor I
659 Views

Here's the zipped checkpoint-1000 folder, or at least as many of its files as I could download and zip for now. I would have to repeat this at least 5 times to get all the files.
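
One possible way to avoid downloading 1262 files individually is to build a single archive on the machine that holds the checkpoint and download only that file. The sketch below uses the standard library's shutil.make_archive; zip_folder and the output name are hypothetical, and it assumes there is enough free disk space next to the checkpoint for the archive.

```python
import shutil
from pathlib import Path


def zip_folder(folder: Path, archive_stem: Path) -> Path:
    """Create <archive_stem>.zip containing `folder` and return the archive path.

    Keep `archive_stem` outside `folder` so the archive is not written into
    the directory that is being zipped.
    """
    folder = folder.resolve()
    archive_stem = archive_stem.resolve()
    archive = shutil.make_archive(
        base_name=str(archive_stem),   # ".zip" is appended automatically
        format="zip",
        root_dir=str(folder.parent),   # paths in the archive are relative to this
        base_dir=folder.name,          # only this folder is included
    )
    return Path(archive)


# Usage (paths are placeholders):
# zip_folder(Path("results/checkpoint-1000"), Path("checkpoint-1000-export"))
```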

Sid5
New Contributor I
552 Views

Hi Peh,

Hope you are doing well! Here's the warning (from the screenshot you shared earlier) that I fixed while fine-tuning the pre-trained model about three months ago. It does not show up in my notebook, and I have the files for it. I saved the files for this warning in the vvv folder, and I suppose that is what the resolved file is about. The folder also contains the HTML file for the Jupyter notebook along with the warning resolution files.

 

 

Thanks & Regards

Siddhartha Sharma

 

 

 

Peh_Intel
Moderator
487 Views

Hi Siddhartha Sharma,

 

I've sent you a private message. Please check the inbox in your profile.

Peh_Intel_0-1717541074208.png

 

 

Regards,

Peh

Peh_Intel
Moderator
64 Views

Hi Siddhartha Sharma,



For your information, SLURM is locked down for security reasons, so you are not able to run your custom model on SLURM.


Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.



Regards,

Peh

