I don't know the relative processing times of the components you describe, but I'll take a stab at this anyway.
General comment: this design is hard-wired to two hardware threads, so it may not scale when more threads are available.
B offers the prospect that the buffer(s) filled by read() will still be cache resident when processData() is invoked, which could reduce thrashing in the memory hierarchy. How much this helps depends, of course, on the network load and the MTU supported.
Perhaps all three tasks should run in a single thread, with a pool of such threads taking turns doing the accept()?