I'm working with a fairly basic server/client setup, where both are located on the same network. They communicate via Winsock2 blocking sockets over TCP/IP, and are doing so perfectly fine.
However, in the scenario described below, the client sometimes sees an abortive connection termination (RST). It works roughly 99 times out of 100, but that one failure breaks some tests and therefore my whole build. It is completely unpredictable when and where it happens, so reproducing the problem has so far eluded me.
If I understand the relevant MSDN page correctly, the nominal connection termination sequence for blocking sockets can be summarized as:
Client               |  Server
---------------------+-------------------------
shutdown(SD_SEND)    |
                     |  send() response data
i=recv() until i==0  |  shutdown(SD_SEND)
closesocket()        |  closesocket()
In my setup it is necessary to
- do a relatively expensive operation (let's call it `expensive_operation()`) depending on whether a portion of the received data (say, 512 bytes) contains a trigger value. The server is single-threaded, so `expensive_operation()` effectively stops `recv()`ing the data stream until it is complete
- initiate a server shutdown sequence if the client sends a particular sentinel value, let's call it `0xDEADBEEF`.
My client is implemented such that the sentinel value is always sent last, so after sending it, no other data is sent:

1. `send( "data data data 0xDEADBEEF" )` to server
2. `shutdown(SD_SEND)`   <------- FAILURE OCCURS HERE
3. `recv()` until 0 bytes received
4. `closesocket()`
Whenever the server receives `0xDEADBEEF`, it confirms the shutdown request and continues termination:

1. `recv()` 512 bytes of data, or until 0 bytes are returned
2. Check for trigger. If a trigger is found, perform `expensive_operation()` and go back to step 1, otherwise continue
3. Check for sentinel value. If the sentinel is not found, go back to step 1.
4. If the sentinel is found:
   - `send( confirmation )` to client
   - `shutdown(SD_SEND)`
   - `closesocket()`
   - all the normal server shutdown stuff
I can understand that if the client intends to send more data after the sentinel, this will result in abortive connection termination -- because the server actively terminates the connection. This is completely expected and by design, and I can indeed reliably reproduce this behavior.
However, in the nominal case the sentinel is always last in the sequence (the relevant log entries confirm this), and graceful connection termination does happen as expected most of the time. But not always...
As I said, it happens randomly and sporadically, so I can't produce a code snippet that reliably reproduces the problem. The only thing that's consistent is that the failure always occurs when calling `shutdown()` in the client...
I suspect it's more of a design flaw, or some synchronization issue I'm not handling yet, rather than a problem with the code (although I'd be happy to provide the relevant code snippets).
So is there anything obvious I'm overlooking here?
Copyright License:
Author: Rody Oldenhuis. Reproduced under the CC BY-SA 4.0 license, with link to the original source and disclaimer.
Link: https://stackoverflow.com/questions/29963619/unpredictable-abortive-shutdown-on-client-side