Well, the simple answer is: you can't get there from here. Maps are sorted associative containers in the STL, and they only offer bidirectional iterators. All the forms of parallel for that I'm aware of rely on pointer arithmetic to trivially subdivide the problem into sub-problems that the worker threads can address, which means they expect what the STL calls random access containers.
Instead you might look into something like the Intel TBB parallel_do, which only requires an input iterator over the map in order to process it. An added complication in your example, though, is that it uses a map of maps, a compound container. Something needs to unpack those inner maps and feed them through a device like parallel_do. Also bear in mind that parallel_do has limited parallel scaling, because the inputs have to be pulled sequentially through the iterator.
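If pulling in TBB is not an option, one workaround is to do the unpacking yourself: walk the map of maps once, sequentially, and collect pointers to the values in a vector. The vector is random access, so an ordinary OpenMP parallel for can split it by index. This is only a rough sketch and not your actual code: the Outer and Inner types, the flatten and scale_all helpers, and the "multiply every value" work are all made up for illustration, and I'm assuming each element can be processed independently and that nothing is erased from the maps while the loop runs.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical types standing in for the map of maps in the question.
using Inner = std::map<std::string, double>;
using Outer = std::map<std::string, Inner>;

// Step 1 (sequential): visit both levels once and remember where each value lives.
// Pointers to map elements stay valid as long as those elements are not erased.
std::vector<double*> flatten(Outer& outer) {
    std::vector<double*> slots;
    for (auto& outer_entry : outer)
        for (auto& inner_entry : outer_entry.second)
            slots.push_back(&inner_entry.second);
    return slots;
}

// Step 2 (parallel): index over the vector, which the runtime can subdivide trivially.
// Each iteration touches a different element, so no locking is needed here.
void scale_all(Outer& outer, double factor) {
    std::vector<double*> slots = flatten(outer);
    const long n = static_cast<long>(slots.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        *slots[i] *= factor;
}
```

Build with your compiler's OpenMP switch (for example -fopenmp on GCC); without it the pragma is ignored and the loop simply runs serially. The flattening pass is still sequential, so this only pays off when the per-element work is expensive compared with walking the maps.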