As a comparison, on a Titan X, forward takes 0.74 ms, backward with respect to the input takes 3.09 ms, and backward with respect to the weights takes 0.76 ms. For forward, the Titan X is only a little faster than the KNL7250, but for backward the KNL7250 is much slower. The results are similar for other W, H, C configurations.
Can anyone explain why? Is it because MKL has not been optimized much for backward yet? It seems that mkl-dnn (https://github.com/01org/mkl-dnn) only supports forward operations for now.