If you link with the multi-threaded runtime libraries, then new/delete/malloc/free, etc., will be thread safe.
The other gotchas are the standard multi-threading issues that arise in both regular and irregular data applications. Shared variables may require the std::atomic<...> variant, the Interlocked...(&var, ...) functions, and/or critical sections. Also check the documentation for std:: classes and member functions, as these are not always thread safe.
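As a minimal sketch of the shared-variable point (the function name and counts are my own, not from the original): several threads increment one counter. With a plain long this would be a data race and updates could be lost; std::atomic makes each increment indivisible.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical example: nthreads threads each bump a shared counter
// iters_per_thread times. std::atomic<long> guarantees no lost updates.
long count_in_parallel(int nthreads, int iters_per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < nthreads; ++t)
        threads.emplace_back([&counter, iters_per_thread] {
            for (int i = 0; i < iters_per_thread; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& th : threads)
        th.join();
    return counter.load();  // equals nthreads * iters_per_thread
}
```

The relaxed memory order is enough here because only the final count matters; if the counter also ordered other memory operations you would want the default (sequentially consistent) ordering.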
Memory allocations and deallocations pass through a critical section (when linking with the mt/md libraries). If you perform a large number of allocations and deallocations (new/delete) of same- or similar-sized objects, look into using either a scalable allocator (TBB, possibly Boost, ...) or write your own pool-of-available-objects allocator. This is relatively easy to do.
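A pool of available objects can be as simple as a free list, along these lines (the Node type and class names are illustrative; this sketch assumes one pool per thread, so the pool itself needs no locking):

```cpp
#include <vector>

// Fixed-size objects to be recycled instead of repeatedly new'd/delete'd.
struct Node { double payload[4]; Node* next = nullptr; };

// Minimal free-list pool: released nodes are chained for reuse, so the
// common case never enters the heap's critical section.
class NodePool {
    Node* free_ = nullptr;          // head of the recycled-node list
    std::vector<Node*> owned_;      // every node ever allocated, for cleanup
public:
    Node* acquire() {
        if (free_) {                // reuse a previously released node
            Node* n = free_;
            free_ = n->next;
            return n;
        }
        Node* n = new Node();       // slow path: go to the heap once
        owned_.push_back(n);
        return n;
    }
    void release(Node* n) {         // push onto the free list
        n->next = free_;
        free_ = n;
    }
    ~NodePool() { for (Node* n : owned_) delete n; }
};
```

For cross-thread allocation patterns you would instead reach for a scalable allocator such as TBB's, which handles per-thread caching for you.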
If the algorithm is truly recursive (in that each iteration depends on the result of the previous one), it may not be amenable to parallelization unless each iteration is itself parallelizable.
Then again, at times a subroutine calls itself (recursively) to perform operations such as partitioning (tiling) a larger problem set. Although this is not a dependent recursive algorithm in the sense above, it shares the self-reference that recursive algorithms have.