Intel® oneAPI Data Analytics Library

Serialization of DataSourceDictionary

Harvey_S_
Beginner
678 Views

Hi

I'm using a StringDataSource to import categorical data (text labels) into NumericTables for SVM modelling. The idea is to import the data, let the DataSource work out the category labels, create the model, and then save both the DataSourceDictionary and the model for later prediction.

I can serialize/deserialize the DataSourceDictionary OK, but the contained DataSourceFeatures don't seem to serialize the CategoricalFeatureDictionary, which means I've lost the data labels. Should this work, or have I missed something?

Kind Regards

4 Replies
Ilya_B_Intel
Employee

Thank you, Harvey, for your report.

That is indeed an issue; we will fix it at the earliest release opportunity.

Harvey_S_
Beginner

OK, thanks. As a temporary workaround I'll probably tokenize it myself and build up the NumericTable by hand. I see that the StringDataSource also builds up the stats for the columns; can you tell me if this is necessary for a NumericTable bound for SVM training?
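
For anyone following the same workaround, here is a minimal sketch in plain Python of the hand-rolled tokenizing step. It makes no DAAL calls: the helper names (fit_label_codes, encode_rows), the sample rows, and the use of JSON to store the label mapping are all assumptions for illustration, and loading the encoded rows into a NumericTable is left out.

```python
import json


def fit_label_codes(columns):
    """Build one label -> integer code mapping per column.

    Codes are assigned in order of first appearance (hypothetical helper,
    not part of any library).
    """
    dictionaries = []
    for column in columns:
        codes = {}
        for label in column:
            if label not in codes:
                codes[label] = len(codes)
        dictionaries.append(codes)
    return dictionaries


def encode_rows(rows, dictionaries):
    """Replace each text label with its numeric code.

    Raises KeyError if a label was not seen when the mapping was built.
    """
    return [
        [float(dictionaries[j][label]) for j, label in enumerate(row)]
        for row in rows
    ]


# Made-up training data: two categorical columns.
rows = [["red", "small"], ["blue", "large"], ["red", "large"]]
columns = list(zip(*rows))

dictionaries = fit_label_codes(columns)
encoded = encode_rows(rows, dictionaries)

# Store the mapping as text next to the model, then load it back
# before prediction so new rows get the same codes.
saved = json.dumps(dictionaries)
restored = json.loads(saved)
```

At prediction time, encode_rows(new_rows, restored) would reuse the stored mapping, so the labels are not lost even though the dictionary itself is kept outside the library.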

Ilya_B_Intel
Employee

No, SVM does not use the stats from the NumericTable.

Harvey_S_
Beginner

Good. Thanks for your help.