tbb::task_scheduler_init::default_num_threads() returns the number of threads that the task scheduler will use by default. Typically this is one thread per logical processor, so on a hyper-threaded CPU it can be larger than the number of physical cores.
As I explained in the answer to your other post, you most probably do not need to know the number of cores at all, since the degree of partitioning should depend on the amount of work required to process each subgraph. And if your sparse matrices are really large, you should partition the graph into many more parts than the number of cores.
In fact, partitioning into 16 (or even more) subgraphs on a dual-core machine usually makes perfect sense, because the work distribution across the subgraphs is normally very nonuniform (this is one of the problems with running the MPI version of this algorithm on grids). TBB applications running on shared-memory systems benefit from such fine-grained parallelism, since the scheduler's work stealing evens out the imbalance. Follow the link above for more details.