For our embedded products we use the InterNiche (iNiche) stack. We have several systems running, and we occasionally see network packets being dropped by the stack. The drops seem to be related to Windows 7 machines: we have never seen a dropped packet when an XP PC is controlling our embedded devices.

I tracked down an actual dropped packet. Its IP header checksum has the value 0xFFFF. It is not entirely clear to me from the standard how this case should be handled. The standard describes a one's complement sum, and the final result is the complement of that sum. For the packet shown in the Wireshark window below, the checksum in the packet differs from the one calculated by the InterNiche stack.

In file ipdemux.c, function ip_rcv(), the following is coded:

    csum = pip->ip_chksum;
    pip->ip_chksum = 0;
    hdrlen = ip_hlen(pip);
    tempsum = ~cksum(pip, hdrlen >> 1);

The iNiche stack therefore calculates the checksum with the ip_chksum field set to zero:

    4500 01e8
    7304 0000
    8011 0000   <== checksum word, set to zero by iNiche (see code above)
    C0A8 2201
    C0A8 22AF
    ---------
    one's complement sum = FFFF
    ~FFFF = 0000   <== compared against the checksum sent in the packet, which is 0xFFFF, so the check fails

I found a note (see below) saying that -0 (i.e. 0xFFFF) should be used when the checksum is +0. This is exactly what Windows 7 does, and this packet is dropped by the InterNiche stack. Is this an error in the stack?
Hmmm, I just read the thread http://www.ietf.org/mail-archive/web/ietf/current/msg21280.html

The general view is that both 0x0000 and 0xffff are valid (and equivalent) for IP header checksums. Most code that generates them will not generate 0xffff. The InterNiche stack should just remove all the code that zeroes the checksum field, sum the entire header including the checksum, and verify that the result is 0xffff. Note that the UDP checksum is a special case: there a transmitted 0x0000 means "not computed", so a computed checksum of zero is sent as 0xffff.
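To make the suggestion above concrete, here is a minimal sketch of both styles of check. It is not the InterNiche code: the names ones_sum16, ip_header_ok and ip_header_ok_by_compare are made up for illustration, and the header is assumed to be passed as an array of 16-bit words already in host order (real code would read the words from the packet buffer).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One's complement sum of nwords 16-bit words, with end-around carry. */
static uint16_t ones_sum16(const uint16_t *words, size_t nwords)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < nwords; i++)
        sum += words[i];
    while (sum >> 16)                    /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Hypothetical helper: sum the whole header, checksum field included.
 * A valid header sums to 0xFFFF whether the sender stored 0x0000 or 0xFFFF. */
static int ip_header_ok(const uint16_t *hdr, size_t nwords)
{
    assert(nwords >= 10 && nwords <= 30); /* IP header is 20..60 bytes */
    return ones_sum16(hdr, nwords) == 0xFFFFu;
}

/* Hypothetical helper, for contrast: zero the checksum word, recompute,
 * and compare. This is the style of check described in the question and
 * it rejects a stored 0xFFFF when the recomputed value is 0x0000. */
static int ip_header_ok_by_compare(const uint16_t *hdr, size_t nwords)
{
    uint16_t tmp[30];
    assert(nwords >= 10 && nwords <= 30);
    memcpy(tmp, hdr, nwords * sizeof tmp[0]);
    uint16_t stored = tmp[5];            /* checksum is the 6th 16-bit word */
    tmp[5] = 0;
    return (uint16_t)~ones_sum16(tmp, nwords) == stored;
}
```

Fed with the header from the question (4500 01e8 7304 0000 8011 FFFF C0A8 2201 C0A8 22AF), the whole-header sum accepts the packet while the zero-and-compare check rejects it. With 0000 in the checksum word, both checks accept it.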
Below is the response we received via our EBV FAE:

Section 4 of RFC 1624 gives an example where a checksum is computed using the equation from RFC 1141, and as such incorrectly generates a checksum of 0xFFFF, whereas the checksum when calculated from scratch would be 0x0000. This would indicate that the Windows 7 system is possibly generating a checksum using the methods in RFC 1141 rather than the updated recommendations for incremental checksums in RFC 1624, derived from the base specification RFC 793. Having said that, Section 5 of this document acknowledges that systems may not agree on the checksum in this case, with a recommendation to work around systems that generate 0xFFFF by modifying the checksum checking methodology used.

In summary, I can see this issue having two main points:

(a) The Windows 7 machine does not generate RFC 793-compliant checksums. It may use the methods in RFC 1141 which was updated to correct for 0xFFFF checksum "