Need Help Troubleshooting Network Issue (Client TCP Connections Stuck in FIN_WAIT_2)











(Note: I originally asked this question on the "Network Engineering" site, but a moderator there rejected it as "off topic" and told me to ask here instead.)



I am running a video surveillance server called ZoneMinder (version 1.26.5) on a Fedora 18 Linux box. ZoneMinder has a web-based user interface and uses a CGI executable called "zms" to transmit an MJPEG video stream to a web browser over TCP. The problem is that sometimes the video stream connection does not terminate properly; if I am viewing a video stream and close the browser window, the underlying TCP connection remains open and the zms process on the server continues to send video frames across the network. This occurs even if I terminate ALL instances of the browser on the Windows machine (verified using Task Manager). My expectation is that Windows should immediately shut down the TCP connection once the browser process terminates, but for some unknown reason that doesn't always happen, and Windows continues to accept packets on the connection indefinitely. When this problem occurs, the zms process on the server still sees the connection as open and will continue to stream video until either the Windows machine is powered down or the zms process is killed (manually, from the command shell). When reviewing surveillance events it's not uncommon to accumulate a dozen or more of these "zombie" zms processes; if I don't log on to the ZoneMinder server machine via SSH and kill these processes manually they will continue to run indefinitely, consuming disk and network I/O bandwidth and bogging down the rest of the system.



Once in the failed state, running netstat on the Windows machine shows the TCP connection is in the FIN_WAIT_2 state. A Wireshark capture shows that the Windows machine is still acknowledging segments on the connection even though there is no longer a running process receiving that data.
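For anyone who wants to watch for the stuck connections without reading netstat output by eye, here is a minimal sketch that filters saved `netstat -n` text for FIN_WAIT_2 entries. It assumes the state is the last column and the local and remote addresses are the two columns before it; the addresses in `SAMPLE` are made up for illustration and are not taken from the asker's network.

```python
def find_fin_wait_2(netstat_output):
    """Return (local, remote) address pairs for TCP connections in FIN_WAIT_2."""
    stuck = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row layout: Proto ... LocalAddress ForeignAddress State
        if len(fields) < 4 or not fields[0].lower().startswith("tcp"):
            continue
        # Windows prints FIN_WAIT_2; Linux netstat prints FIN_WAIT2.
        if fields[-1] in ("FIN_WAIT_2", "FIN_WAIT2"):
            stuck.append((fields[-3], fields[-2]))
    return stuck


# Hypothetical sample output, for illustration only.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    192.168.1.20:49213     192.168.1.10:80        FIN_WAIT_2
  TCP    192.168.1.20:49300     192.168.1.10:22        ESTABLISHED
"""

if __name__ == "__main__":
    for local, remote in find_fin_wait_2(SAMPLE):
        print(local, "->", remote)
```

On the Windows machine the input could come from something like `netstat -n > out.txt`; if extra columns are requested (for example the PID column), the column positions assumed above would need adjusting.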



I have 3 Windows machines: One desktop running Windows 7 Pro SP1, one desktop running Win 7 Home Premium SP1, and one laptop running Win 7 Home Premium SP1. Of these three, the two desktop machines exhibit the problem intermittently, whereas the laptop never exhibits the problem.



I normally use the Firefox browser, but I also tried Chrome. Both work 100% of the time on the laptop and fail intermittently on the desktops. Firefox and Chrome on the other platforms I have tried, such as Linux and Android, never exhibit the problem.



One of the Windows machines that fails is connected to the same gigabit switch as the ZoneMinder server box; the Windows laptop that always works is connected to a WiFi AP and reaches the ZoneMinder server through a second GigE switch. The Android devices connect both from inside and from outside beyond the firewall with no issues.



To eliminate the possibility of a network driver issue, on one of the desktop machines I tried swapping out the Realtek network card for an Intel NIC, but the failure still occurs.



I've now run out of ideas; how can I troubleshoot this further? I can provide Wireshark captures if that would be helpful (they are large - ~100MB - so I've left them off for now).



Thanks for your help!










  • Interesting - I knew that the CRC calculation could be offloaded to the hardware but not the rest of that stuff. In any case, I don't think this applies to my situation, as I am able to capture everything in Wireshark, not just the 3-way handshake. I added the registry keys anyway (they did not previously exist) but still no success. I think there are at least 2 problems here: 1. Why the server does not send a FIN in response to the client's FIN (which is ACKed by the server), and 2. Why does Windows not send a RST segment to the server when the client process terminates.
    – dvarapala
    Jun 20 '14 at 15:39

















tcp






asked Jun 16 '14 at 19:20









dvarapala

























1 Answer
The TCP state FIN_WAIT_2 means that the client has sent a FIN to the server (typically because the application closed the connection) and the server has ACKed it. At that point the server's TCP stack should tell the server application that the client has closed, and once the application closes its end, the server should send its own FIN to the client. Your client is waiting for the server to send that FIN.
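The half-closed situation described above can be reproduced in a few lines. The sketch below uses a local `socket.socketpair()` rather than a real TCP connection between two hosts, so it only illustrates the socket-level behavior: after one side shuts down its sending direction, the other side can still send data, and it only learns about the close if it checks for end-of-file on its read side. The helper name `peer_sent_fin` is hypothetical, and nothing in this thread establishes whether zms performs a check like this.

```python
import select
import socket


def peer_sent_fin(sock):
    """Return True if the peer has closed its sending side (a read would hit EOF)."""
    readable, _, _ = select.select([sock], [], [], 0)  # poll, do not block
    if not readable:
        return False
    # MSG_PEEK inspects pending input without consuming it; b"" means EOF.
    return sock.recv(1, socket.MSG_PEEK) == b""


def demo():
    client, server = socket.socketpair()
    try:
        before = peer_sent_fin(server)
        # Half-close: on a TCP socket this is the point where the FIN goes out.
        client.shutdown(socket.SHUT_WR)
        # The "server" can still write after the client's close.
        server.sendall(b"frame")
        received = client.recv(16)
        after = peer_sent_fin(server)
        return before, received, after
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo())
```

A sender that only ever writes and never looks at its read side has no reason to notice the client's FIN, which is one possible explanation for a server that ACKs the FIN but never sends its own.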



Your Windows machines exhibiting the behavior may be using TCP Chimney offloading, which offloads some TCP housekeeping chores to the NIC, e.g. ACKing data and closing connections. Once the application closes, the NIC takes over the final closing of the connection. This could be why your machine continues to ACK data even though the browser is closed.



You could try mitigating the problem by disabling TCP Chimney on Windows. Instructions are here.
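The linked instructions are not reproduced in this thread. As a rough sketch, on Windows 7 the global Chimney setting is normally changed with `netsh int tcp set global chimney=disabled` from an elevated prompt, and `netsh int tcp show global` prints the current values. The Python wrapper below is only a convenience around those two commands; the function name and the `runner` parameter are assumptions made for illustration.

```python
import subprocess

# netsh commands for the global TCP settings (run from an elevated prompt).
DISABLE_CHIMNEY = ["netsh", "int", "tcp", "set", "global", "chimney=disabled"]
SHOW_GLOBAL = ["netsh", "int", "tcp", "show", "global"]


def disable_tcp_chimney(runner=subprocess.run):
    """Disable TCP Chimney offload, then display the global TCP settings.

    `runner` is injectable so the command sequence can be checked
    without actually invoking netsh.
    """
    for argv in (DISABLE_CHIMNEY, SHOW_GLOBAL):
        runner(argv, check=True)


if __name__ == "__main__":
    # Only meaningful on Windows, and requires administrator rights.
    disable_tcp_chimney()
```

Running the two netsh commands by hand is equivalent; the change can be reverted by setting the value back to what `show global` reported beforehand.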



However, that doesn't address the root cause of why the server doesn't send a FIN. With traffic captures on both client and server, you can:




  1. Verify that the client does send a FIN

  2. Verify that the server receives the FIN

  3. Verify that the server sends a FIN

  4. Verify that the client receives the FIN


Likely, there's a gap in one of those steps. If all steps complete, then the problem is with the client and it might be TCP Chimney offloading.
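The four checks above can be expressed as a small script once the relevant packets have been exported from both captures. This is a sketch under assumptions: the `Packet` record (source, destination, flag names) is a hypothetical simplified format, not something Wireshark produces directly, and the addresses in the example are invented.

```python
from collections import namedtuple

# Hypothetical simplified packet record: addresses plus a tuple of TCP flag names.
Packet = namedtuple("Packet", "src dst flags")


def has_fin(packets, src, dst):
    """True if any packet from src to dst carries the FIN flag."""
    return any(p.src == src and p.dst == dst and "FIN" in p.flags for p in packets)


def check_teardown(client_capture, server_capture, client, server):
    """Evaluate the four teardown steps, in order, against both captures."""
    return [
        ("client sends FIN", has_fin(client_capture, client, server)),
        ("server receives FIN", has_fin(server_capture, client, server)),
        ("server sends FIN", has_fin(server_capture, server, client)),
        ("client receives FIN", has_fin(client_capture, server, client)),
    ]


def first_gap(steps):
    """Name of the first step that did not happen, or None if all completed."""
    return next((name for name, ok in steps if not ok), None)


if __name__ == "__main__":
    c, s = "192.168.1.20", "192.168.1.10"  # made-up addresses
    client_cap = [Packet(c, s, ("FIN", "ACK")), Packet(s, c, ("ACK",))]
    server_cap = [Packet(c, s, ("FIN", "ACK")), Packet(s, c, ("ACK",))]
    print(first_gap(check_teardown(client_cap, server_cap, c, s)))
```

In the example data the client's FIN is seen on both sides but the server never sends one, so the first gap reported is the third step, which matches the symptom described in the question.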






        answered Jun 17 '14 at 23:20









        karyhead
