Solved: System crash when getting blocks from a CSRNumericTable

ray_l_1 · 12-10-2017

Hi,
I found that PyDAAL crashes when I try to print a CSRNumericTable. Here is the example:

import numpy as np
from utils import printNumericTable

### import Available Modules for CSRNumericTable###
from daal.data_management import (CSRNumericTable,readOnly,BlockDescriptor)
# Non zero elements of the matrix
values = np.array([1,2,7,8,5,3,9,6,4], dtype=np.intc)
# Column indices "colIndices" corresponding to each element in "values" array
colIndices = np.array([0, 1, 1, 2, 0, 2, 3, 1,3], dtype=np.uint64)
# Row offsets for every first non zero element encountered in each row
rowOffsets = np.array([0,2,4,7,9], dtype=np.uint64)
# Creation of CSR numeric table with the arguments discussed above
nObservations = 3  # Number of rows in the numpy array
nFeatures = 3  # Number of columns in the numpy array

CSR_nT = CSRNumericTable(values, colIndices, rowOffsets, nFeatures, nObservations)
#printNumericTable(CSR_nT)
block=BlockDescriptor(ntype=np.intc)
print("start")
CSR_nT.getBlockOfRows(0, CSR_nT.getNumberOfRows(), readOnly, block)
print("end")
print(block.getArray())

However, only "start" is printed in the console; "end" never appears. The Windows Event Viewer reports an error naming "C:\IntelPython3\lib\site-packages\daal\data_management\_data_management.cp36-win_amd64.pyd". Calling the printNumericTable function from utils in the examples directly gives the same result.

VictoriyaS_F_Intel · 12-11-2017

Hello Ray,

Intel DAAL supports only one-based indexing for CSRNumericTable, but your example uses zero-based indexing. To convert the zero-based indices to one-based, add the following lines to the example:

colIndices = colIndices + 1
rowOffsets = rowOffsets + 1

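From the colIndices and rowOffsets you provided, I assume that the numeric table size is 4 x 4: rowOffsets has 5 entries, which means 4 rows, and the largest zero-based column index is 3, which means 4 columns. That is why you need to change the nObservations and nFeatures values in the example as follows: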

nObservations = rowOffsets.size - 1
nFeatures = 4                       # must be greater than or equal to np.max(colIndices) for one-based colIndices
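
As a rough check of the 4 x 4 assumption, the following NumPy-only sketch applies the one-based shift to the arrays from the question and rebuilds the dense matrix they describe. It does not call DAAL at all; the `dense` array and the loop are illustrative additions, not part of the original example.

```python
import numpy as np

# Zero-based CSR arrays copied from the question.
values = np.array([1, 2, 7, 8, 5, 3, 9, 6, 4], dtype=np.intc)
colIndices = np.array([0, 1, 1, 2, 0, 2, 3, 1, 3], dtype=np.uint64)
rowOffsets = np.array([0, 2, 4, 7, 9], dtype=np.uint64)

# Shift both index arrays to the one-based convention.
colIndices = colIndices + np.uint64(1)
rowOffsets = rowOffsets + np.uint64(1)

# Derive the table dimensions from the one-based arrays.
nObservations = rowOffsets.size - 1
nFeatures = int(colIndices.max())

# Rebuild the dense matrix (illustration only) to see what the arrays encode.
dense = np.zeros((nObservations, nFeatures), dtype=values.dtype)
for row in range(nObservations):
    start = int(rowOffsets[row]) - 1      # back to zero-based for NumPy slicing
    end = int(rowOffsets[row + 1]) - 1
    for k in range(start, end):
        dense[row, int(colIndices[k]) - 1] = values[k]

print(nObservations, nFeatures)
print(dense)
```

If the derived dimensions differ from the values passed to the CSRNumericTable constructor, as they do with 3 and 3 in the original example, the arrays and the declared size disagree.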

Can you please make the suggested changes and reply whether this helps to resolve the issue?

Best regards,

Victoriya

ray_l_1 · 12-11-2017

Hi Victoriya,
Thanks a lot for the help. I changed my code from zero-based to one-based indexing, and it works now. I am still surprised that NumPy uses zero-based indices while CSRNumericTable uses one-based ones.
I also found a typo in the code snippet in the White Paper. In the CSRNumericTable example, rowOffsets is printed as np.array([1,4,6,9,2,14], dtype=np.uint64), which crashes the system. The correct line should be np.array([1,4,6,9,12,14], dtype=np.uint64). From this experience, the numeric table does not seem very robust: a simple typo can crash the system immediately, which could be bad when working with large data.
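
One way to guard against this kind of typo is to sanity-check the arrays in plain NumPy before they reach CSRNumericTable. The sketch below is an assumption-laden illustration: `check_one_based_csr` is a hypothetical helper, not a DAAL function, and the checks are general invariants of one-based CSR rather than a statement of what DAAL validates internally. The `values` and `colIndices` arrays are placeholders, since the White Paper's actual data is not quoted in this thread.

```python
import numpy as np

def check_one_based_csr(values, colIndices, rowOffsets, nFeatures):
    """Hypothetical helper: list problems in one-based CSR arrays (empty if none)."""
    problems = []
    values = np.asarray(values)
    cols = np.asarray(colIndices, dtype=np.int64)
    offs = np.asarray(rowOffsets, dtype=np.int64)

    if offs.size == 0 or offs[0] != 1:
        problems.append("rowOffsets must start at 1")
    if np.any(np.diff(offs) < 0):
        problems.append("rowOffsets must be non-decreasing")
    if offs.size and offs[-1] != values.size + 1:
        problems.append("last rowOffsets entry must equal len(values) + 1")
    if cols.size != values.size:
        problems.append("colIndices and values must have the same length")
    if cols.size and (cols.min() < 1 or cols.max() > nFeatures):
        problems.append("colIndices must lie in [1, nFeatures]")
    return problems

# Placeholder data (NOT the White Paper's): 13 non-zero values in 3 columns.
values = np.ones(13, dtype=np.float64)
colIndices = np.array([1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3, 1, 2], dtype=np.uint64)

# rowOffsets as printed in the White Paper (with the typo) and as corrected.
typo_offsets = np.array([1, 4, 6, 9, 2, 14], dtype=np.uint64)
fixed_offsets = np.array([1, 4, 6, 9, 12, 14], dtype=np.uint64)

typo_problems = check_one_based_csr(values, colIndices, typo_offsets, nFeatures=3)
fixed_problems = check_one_based_csr(values, colIndices, fixed_offsets, nFeatures=3)
print(typo_problems)
print(fixed_problems)
```

Running a check like this first turns a hard crash into an ordinary Python-level error report, which is easier to debug on large data.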