Force the use of one thread with PyDAAL

velvia

Hi,

I would like to run some benchmarks and force PyDAAL to use only one thread for k-means clustering. How can I do that?

Best regards,

Francois

Christophe_H_Intel2

Hi Francois,

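The library ships with an example, set_number_of_threads.py, that shows how to do this. The key call is Environment.getInstance().setNumberOfThreads(nThreads). The example sets nThreads = 2, so change it to nThreads = 1 for a single-threaded benchmark. I'll paste it here for convenience.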

from os.path import join as jp
from os import environ

import daal.algorithms.kmeans as kmeans
import daal.algorithms.kmeans.init as init
from daal.data_management import FileDataSource, DataSourceIface
from daal.services import Environment

# Input data set parameters
datasetFileName = jp('..', 'data', 'batch', 'kmeans_dense.csv')

# K-Means algorithm parameters
nClusters = 20
nIterations = 5
nThreads = 2
nThreadsInit = None
nThreadsNew = None

if __name__ == "__main__":

    # Get the number of threads that is used by the library by default
    nThreadsInit = Environment.getInstance().getNumberOfThreads()

    # Set the maximum number of threads to be used by the library
    Environment.getInstance().setNumberOfThreads(nThreads)

    # Get the number of threads that is used by the library after changing
    nThreadsNew = Environment.getInstance().getNumberOfThreads()

    # Initialize FileDataSource to retrieve the input data from a .csv file
    dataSource = FileDataSource(
        datasetFileName, DataSourceIface.doAllocateNumericTable,
        DataSourceIface.doDictionaryFromContext
    )

    # Retrieve the data from the input file
    dataSource.loadDataBlock()

    # Get initial clusters for the K-Means algorithm
    initAlg = init.Batch(nClusters)

    initAlg.input.set(init.data, dataSource.getNumericTable())
    res = initAlg.compute()
    centroids = res.get(init.centroids)

    # Create an algorithm object for the K-Means algorithm
    algorithm = kmeans.Batch(nClusters, nIterations)

    algorithm.input.set(kmeans.data, dataSource.getNumericTable())
    algorithm.input.set(kmeans.inputCentroids, centroids)

    # Run computations
    unused_result = algorithm.compute()

    print("Initial number of threads:        {}".format(nThreadsInit))
    print("Number of threads to set:         {}".format(nThreads))
    print("Number of threads after setting:  {}".format(nThreadsNew))
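
For the benchmarking part of the question, here is a minimal sketch of a timing harness. The names benchmark and pydaal_set_threads are hypothetical helpers, not part of PyDAAL; the only library call used is Environment.getInstance().setNumberOfThreads() from the example above. The harness assumes the thread limit is set once, before the timed runs start, and reports the best of several runs.

```python
import time


def benchmark(set_threads, workload, n_threads=1, repeats=5):
    """Pin the thread count, then return the best wall-clock time of workload().

    set_threads: callable taking the thread count (hypothetical hook)
    workload:    zero-argument callable that runs the computation to time
    """
    set_threads(n_threads)  # set the limit once, before any timed run
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return min(timings)  # best-of-N reduces noise from other processes


def pydaal_set_threads(n):
    """Thread setter built on the Environment call shown in the example."""
    # Imported lazily so the harness itself can be loaded without PyDAAL.
    from daal.services import Environment
    Environment.getInstance().setNumberOfThreads(n)
```

You would call it as benchmark(pydaal_set_threads, run_kmeans, n_threads=1), where run_kmeans is your own function wrapping the k-means steps from the example. Reading the value back with getNumberOfThreads(), as the example does, is a quick way to confirm the setting took effect.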

Thanks,

Chris

velvia

Thanks for your help. I appreciate it.
