Advisor always flags gather (indirect addressing) as a potentially inefficient memory access pattern. How much it actually costs depends on the data and the platform, and Advisor does not measure that cost directly; the reported timings are the only indication. If you care to profile under VTune, you could try to assess how much time goes to cache events, but that might not lead to any coding improvements.
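To make "gather" concrete, here is a minimal sketch of the kind of loop that gets this diagnosis. The names `gather_sum`, `data`, and `idx` are hypothetical and are not taken from your code. The point is only that each load address comes from an index array, so the access pattern is not known at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example of indirect addressing (gather).
// The address of each load depends on idx[i], which is only known at run time,
// so a vectorizer has to emit gather instructions or scalar loads.
float gather_sum(const std::vector<float>& data,
                 const std::vector<std::size_t>& idx) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < idx.size(); ++i) {
        assert(idx[i] < data.size());  // guard against out-of-range indices
        sum += data[idx[i]];           // indirect load: data[idx[i]]
    }
    return sum;
}
```

Whether this is slow depends on what `idx` contains: nearly sequential indices stay in cache, while scattered ones may not, which is why the cost is data dependent.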
Your memory accesses spaced by im.cols appear to account for the vertical stride diagnosis. If the generated code is successfully vectorized with unit-stride parallel stores, that may be the best you can do.
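As a rough illustration of that situation, here is a sketch of a vertical 3-tap sum over a row-major image. This is an assumption about the general shape of such code, not your actual loop; `vertical_sum3`, `rows`, and `cols` are hypothetical names, with `cols` playing the role of im.cols. The three source rows are `cols` elements apart, but within the inner loop every load and store is unit stride in `x`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertical 3-tap sum on a row-major image (rows x cols floats).
// Border rows are left at zero to keep the sketch short.
std::vector<float> vertical_sum3(const std::vector<float>& in,
                                 std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<float> out(in.size(), 0.0f);
    for (std::size_t y = 1; y + 1 < rows; ++y) {
        // The three input rows are separated by cols elements (the "vertical stride").
        const float* above = in.data() + (y - 1) * cols;
        const float* cur   = in.data() + y * cols;
        const float* below = in.data() + (y + 1) * cols;
        float* dst = out.data() + y * cols;
        // Inner loop runs along the row: unit-stride loads and unit-stride stores.
        for (std::size_t x = 0; x < cols; ++x) {
            dst[x] = above[x] + cur[x] + below[x];
        }
    }
    return out;
}
```

If the inner loop instead ran over `y` for a fixed `x`, every access would jump by `cols` elements, which is the less favorable ordering for vectorization.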