Suppose the altmemphy is configured with burst length = 8, full rate, and a DDR2 data width of 16 bits. One read command then triggers the altmemphy to perform a burst-of-8 read, which returns 8 * 16 = 128 bits of data.

But the altmemphy only connects a 32-bit ctl_rddata bus to the controller, so how does the altmemphy know which 32 bits of the 128-bit data to put on the ctl_rddata bus?

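It does not have to choose. At full rate the 16-bit DDR2 interface transfers data on both clock edges, so 2 * 16 = 32 bits arrive per clock cycle. The controller will feed the 128 bits out over 4 clock cycles (32 bits each), known as beats, starting with the lowest address first.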
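To make the arithmetic concrete, here is a minimal Python sketch of how an 8-word burst maps onto four 32-bit beats. It is only an illustration, not the altmemphy implementation: the function name `burst_to_beats` is made up, and placing the first-captured 16-bit word in the low half of each beat is an assumption, so check the altmemphy documentation for the real bit ordering.

```python
def burst_to_beats(words, dq_width=16):
    """Pack a DDR burst into full-rate beats, two DQ words per clock cycle.

    Hypothetical helper for illustration only.
    """
    if len(words) % 2 != 0:
        raise ValueError("a DDR burst must contain an even number of words")

    mask = (1 << dq_width) - 1
    beats = []
    for i in range(0, len(words), 2):
        first = words[i] & mask       # word captured on the first clock edge
        second = words[i + 1] & mask  # word captured on the second clock edge
        # ASSUMPTION: the first word lands in the low half of the beat.
        beats.append((second << dq_width) | first)
    return beats


# Burst length 8, data width 16: eight 16-bit words, lowest address first.
burst = [0x0000, 0x1111, 0x2222, 0x3333, 0x4444, 0x5555, 0x6666, 0x7777]
beats = burst_to_beats(burst)

for cycle, beat in enumerate(beats):
    print(f"cycle {cycle}: ctl_rddata = 0x{beat:08X}")
```

Eight 16-bit words become four 32-bit values, one per clock cycle, which is why a 32-bit ctl_rddata bus is wide enough to carry the whole 128-bit burst.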