Hi Jason,
Based on your description, your program spends a lot of time reading and writing files. You could run the LocksAndWaits analysis to see whether the other cores are doing any work during that time.
As an improvement idea, you could do other work in parallel while the code is reading or writing files: create new threads for work that doesn't depend on the results of those reads and writes.
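Here is a rough sketch of that idea. It is in Python only for illustration, since I don't know what language your program is written in, and the names read_file, independent_work and run are made up for the example. It starts the file read on a worker thread, does unrelated work on the main thread in the meantime, and waits for the read only when the data is actually needed.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    # Blocking file read; this runs on the worker thread.
    with open(path, "rb") as f:
        return f.read()


def independent_work(n):
    # Hypothetical stand-in for work that does not need the file contents.
    return sum(i * i for i in range(n))


def run(path, n):
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(read_file, path)  # start the read in the background
        partial = independent_work(n)           # overlaps with the read
        data = pending.result()                 # block only when the data is needed
    return len(data), partial


if __name__ == "__main__":
    # Create a small temporary file so the example is self-contained.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 1024)
    try:
        print(run(tmp.name, 1000))
    finally:
        os.remove(tmp.name)
```

Whether this helps depends on how much independent work you really have. If everything needs the file contents first, the extra thread will just wait as well.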
Regards, Peter