Windows socket receive hangs

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/6775065/

Time: 2020-09-15 17:31:56  Source: igfitidea

Sockets receive hangs

Tags: c#, .net, windows, sockets, screen-scraping

Asked by Milan Solanki

I am trying to download the search pages of Bing and Ask. I have decided to use sockets instead of WebClient.

socket.Receive() hangs after a few loops in the case of Bing, Yahoo, and Google, but works for Ask. For Google, the loop receives 4-5 times, then freezes on the call.

I am not able to figure out why.

public string Get(string url)
{
    Uri requestedUri = new Uri(url);
    string fulladdress = requestedUri.Host;
    IPHostEntry entry = Dns.GetHostEntry(fulladdress);
    StringBuilder sb = new StringBuilder();

    try
    {
        using (Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.IP))
        {
            socket.Connect(entry.AddressList[0], 80);

            NetworkStream ns = new NetworkStream(socket);

            string part_request = string.Empty;
            string build_request = string.Empty;
            if (jar.Count != 0)
            {
                part_request = "GET {0} HTTP/1.1\r\nHost: {1} \r\nUser-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nCookie: {2}\r\nConnection: keep-alive\r\n\r\n";
                build_request = string.Format(part_request, requestedUri.PathAndQuery, requestedUri.Host, GetCookies(requestedUri));
            }
            else
            {
                part_request = "GET {0} HTTP/1.1\r\nHost: {1} \r\nUser-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nConnection: keep-alive\r\n\r\n";
                build_request = string.Format(part_request, requestedUri.PathAndQuery, requestedUri.Host);
            }

            byte[] data = Encoding.UTF8.GetBytes(build_request);
            socket.Send(data, data.Length, 0);

            byte[] bytesReceived = new byte[102400];
            int bytes = 0;

            do
            {
                bytes = socket.Receive(bytesReceived, bytesReceived.Length, 0);
                sb.Append(Encoding.ASCII.GetString(bytesReceived, 0, bytes));
            }
            while (bytes > 0);

            List<String> CookieHeaders = new List<string>();
            foreach (string header in sb.ToString().Split("\n\r".ToCharArray(), StringSplitOptions.RemoveEmptyEntries))
            {
                if (header.StartsWith("Set-Cookie"))
                {
                    CookieHeaders.Add(header.Replace("Set-Cookie: ", ""));
                }
            }

            this.AddCookies(CookieHeaders, requestedUri);

            socket.Close();
        }
    }
    catch (Exception ex)
    {
        string errorMessage = ex.Message;
    }

    return sb.ToString();
}

CookieContainer jar = new CookieContainer();

public string GetCookies(Uri _uri)
{
    StringBuilder sb = new StringBuilder();
    CookieCollection collection = jar.GetCookies(_uri);

    if (collection.Count != 0)
    {
        foreach (Cookie item in collection)
        {
            sb.Append(item.Name + "=" + item.Value + ";");
        }
    }
    return sb.ToString();
}

Answered by War

It's because you've reached the end of the content and yet you are still requesting more...

do
{
   bytes = socket.Receive(bytesReceived, bytesReceived.Length, 0);
   sb.Append(Encoding.ASCII.GetString(bytesReceived, 0, bytes));
}
while (bytes > 0);

This assumes that as long as the last call returned more than 0 bytes, there is more available. In fact, when a network stream reaches the end, chances are you'll only partially fill your buffer on the last loop (i.e. bytes > 0 but nothing more to get). The loop then calls Receive again, which blocks until the server closes the connection.

Try something like this instead...

do
{
   bytes = socket.Receive(bytesReceived, bytesReceived.Length, 0);
   sb.Append(Encoding.ASCII.GetString(bytesReceived, 0, bytes));
}
while (bytes == bytesReceived.Length);

Servers evidently differ in how they handle closing the connection (Ask is probably one that behaves differently from the others), which is why the original loop won't always fail.

EDIT:

My test sample:

Load Visual Studio, create a new console app, then paste the following into the generated Program class (in place of all existing code):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
using System.Net.Sockets;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string test = Get("http://www.google.co.uk/search?q=test&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-GB:official&client=firefox-a");
            Console.Read();
        }

        public static string Get(string url)
        {
            Uri requestedUri = new Uri(url);
            string fulladdress = requestedUri.Host;
            IPHostEntry entry = Dns.GetHostEntry(fulladdress);
            StringBuilder sb = new StringBuilder();

            try
            {
                using (Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.IP))
                {
                    socket.Connect(entry.AddressList[0], 80);

                    NetworkStream ns = new NetworkStream(socket);

                    string part_request = string.Empty;
                    string build_request = string.Empty;
                    if (jar.Count != 0)
                    {
                        part_request = "GET {0} HTTP/1.1\r\nHost: {1} \r\nUser-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nCookie: {2}\r\nConnection: keep-alive\r\n\r\n";
                        build_request = string.Format(part_request, requestedUri.PathAndQuery, requestedUri.Host, GetCookies(requestedUri));
                    }
                    else
                    {
                        part_request = "GET {0} HTTP/1.1\r\nHost: {1} \r\nUser-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nConnection: keep-alive\r\n\r\n";
                        build_request = string.Format(part_request, requestedUri.PathAndQuery, requestedUri.Host);
                    }

                    byte[] data = Encoding.UTF8.GetBytes(build_request);
                    socket.Send(data, data.Length, 0);

                    byte[] bytesReceived = new byte[4096];
                    int bytes = 0;
                    string currentBatch = "";

                    do
                    {
                        bytes = socket.Receive(bytesReceived);
                        currentBatch = Encoding.ASCII.GetString(bytesReceived, 0, bytes);
                        Console.Write(currentBatch);
                        sb.Append(currentBatch);
                    }
                    while (bytes == bytesReceived.Length);

                    List<String> CookieHeaders = new List<string>();
                    foreach (string header in sb.ToString().Split("\n\r".ToCharArray(), StringSplitOptions.RemoveEmptyEntries))
                    {
                        if (header.StartsWith("Set-Cookie"))
                        {
                            CookieHeaders.Add(header.Replace("Set-Cookie: ", ""));
                        }
                    }

                    //this.AddCookies(CookieHeaders, requestedUri);

                    socket.Close();
                }
            }
            catch (Exception ex)
            {
                string errorMessage = ex.Message;
            }

            return sb.ToString();
        }

        static CookieContainer jar = new CookieContainer();

        public static string GetCookies(Uri _uri)
        {
            StringBuilder sb = new StringBuilder();
            CookieCollection collection = jar.GetCookies(_uri);

            if (collection.Count != 0)
            {
                foreach (Cookie item in collection)
                {
                    sb.Append(item.Name + "=" + item.Value + ";");
                }
            }
            return sb.ToString();
        }
    }
}

I reduced the buffer to ensure that it was filled more than once... seems OK from my end. This post comes with the typical "works on my PC" guarantee :)

Answered by Christophe Bouin

Testing the number of bytes received should work most of the time.

However, what happens if the last chunk of data exactly matches the buffer length?

byte[] requestBuffer = new byte[100];
int bytesRead;
do
{
    bytesRead = socket.Receive(requestBuffer);
    //do something
}
while (bytesRead == requestBuffer.Length); // Pretend bytesRead == 100

In the example above, if bytesRead == 100 on the final chunk, the loop calls Receive again although the socket has no more content to receive.

I suggest using the Socket.Available property to ensure the program stops reading when there is no more content available.

byte[] requestBuffer = new byte[100];
int bytesRead;
while (socket.Available > 0)    
{
    bytesRead = socket.Receive(requestBuffer);
    //do something
}
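
One caveat: Socket.Available only counts bytes that have already arrived, so it can be 0 while the rest of the response is still in transit. Another way to know when to stop is to read the Content-Length header and then read exactly that many body bytes. The following is only a minimal sketch of that idea, under the assumption that the server sends Content-Length and does not use chunked transfer encoding. The class and method names (ContentLengthSketch, ReadResponse, ParseContentLength) are made up for illustration and are not part of the original posts.

```csharp
using System;
using System.IO;
using System.Text;

static class ContentLengthSketch
{
    // Hypothetical helper: reads one HTTP response from the stream.
    // Stops once headers plus Content-Length body bytes have arrived,
    // or when the peer closes the connection (Read returns 0).
    public static string ReadResponse(Stream stream)
    {
        var received = new MemoryStream();
        byte[] buffer = new byte[4096];
        int headerEnd = -1;      // index just past the "\r\n\r\n" separator
        int contentLength = -1;  // -1 means "unknown": fall back to reading until close

        while (true)
        {
            int n = stream.Read(buffer, 0, buffer.Length);
            if (n == 0) break;   // peer closed the connection
            received.Write(buffer, 0, n);

            if (headerEnd < 0)
            {
                // ASCII maps one byte to one char, so string indexes equal byte offsets.
                string text = Encoding.ASCII.GetString(received.GetBuffer(), 0, (int)received.Length);
                int idx = text.IndexOf("\r\n\r\n", StringComparison.Ordinal);
                if (idx >= 0)
                {
                    headerEnd = idx + 4;
                    contentLength = ParseContentLength(text.Substring(0, idx));
                }
            }

            if (headerEnd >= 0 && contentLength >= 0 &&
                received.Length >= headerEnd + contentLength)
            {
                break;           // the whole body has arrived
            }
        }

        int total = (int)received.Length;
        if (headerEnd >= 0 && contentLength >= 0)
        {
            total = Math.Min(total, headerEnd + contentLength);
        }
        return Encoding.ASCII.GetString(received.GetBuffer(), 0, total);
    }

    // Returns the Content-Length value, or -1 if the header is absent or malformed.
    public static int ParseContentLength(string headers)
    {
        foreach (string line in headers.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries))
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue;  // status line or malformed line
            string name = line.Substring(0, colon).Trim();
            if (name.Equals("Content-Length", StringComparison.OrdinalIgnoreCase) &&
                int.TryParse(line.Substring(colon + 1).Trim(), out int value))
            {
                return value;
            }
        }
        return -1;
    }
}
```

With the question's code, this could be called as ContentLengthSketch.ReadResponse(new NetworkStream(socket)). If the server answers with chunked encoding, no Content-Length is found and the sketch falls back to reading until the connection closes.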

Answered by foxy

You're reading more content from the stream than you've been given.

  1. So, you open up a connection to Google and ask for the homepage.
  2. Google will give you its homepage, say, 10KB in size.
  3. You allocate a buffer 102400 bytes large (i.e. 100KB), 10 times more than you need.

Now, this is where the problem occurs.

  1. You've been reading the homepage, a few bytes at a time, and you've now hit the 10KB mark. Google has fed you the entire homepage, BUT you keep trying to read, asking for more data! Now you're just waiting for more data that never comes, and you keep waiting until a timeout hits or the server closes the connection. Because your loop only stops when Receive returns 0, and the server has sent its 10KB but kept the connection open (the request asks for keep-alive), the loop seems to hang!
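
To keep such a loop from waiting forever, a receive timeout can act as a safety net. The sketch below is an illustration only: the helper name ReceiveAll and the timeout value passed to it are assumptions, not something from the original posts. It relies on Socket.ReceiveTimeout, which makes a blocking Receive throw a SocketException once the timeout elapses.

```csharp
using System;
using System.Net.Sockets;
using System.Text;

static class TimeoutSketch
{
    // Hypothetical helper: reads until the peer closes the connection
    // or until no data arrives for timeoutMs milliseconds.
    public static string ReceiveAll(Socket socket, int timeoutMs)
    {
        socket.ReceiveTimeout = timeoutMs;
        var sb = new StringBuilder();
        byte[] buffer = new byte[4096];

        try
        {
            int bytes;
            // Receive returns 0 only when the peer has closed the connection.
            while ((bytes = socket.Receive(buffer)) > 0)
            {
                sb.Append(Encoding.ASCII.GetString(buffer, 0, bytes));
            }
        }
        catch (SocketException ex) when (ex.SocketErrorCode == SocketError.TimedOut)
        {
            // Nothing arrived within the timeout: treat it as the end of the response.
        }

        return sb.ToString();
    }
}
```

A timeout only bounds the wait; it does not tell you whether the response was complete, so it works best combined with a real end-of-response check.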

The solution?

Check if you've received any bytes.

bytes = socket.Receive(...);
if (bytes == 0)
{
    // no more data, exit loop. you can `break;` or use a while loop, as demonstrated below
}

This could be how you implement it cleanly:

do
{
   bytes = socket.Receive(...);
   // Process your data
}
while (bytes > 0);
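
Note that Receive returns 0 only after the server closes the connection, and the request in the question sends Connection: keep-alive, which invites the server to keep it open. A minimal sketch of a request that asks the server to close the connection instead is shown here. The helper name BuildRequest is hypothetical, and the header set is trimmed down for brevity.

```csharp
using System;
using System.Text;

static class RequestSketch
{
    // Hypothetical helper: builds a GET request that asks the server to
    // close the connection once the response has been sent.
    public static string BuildRequest(Uri uri, string cookies)
    {
        var sb = new StringBuilder();
        sb.Append("GET ").Append(uri.PathAndQuery).Append(" HTTP/1.1\r\n");
        sb.Append("Host: ").Append(uri.Host).Append("\r\n");

        // Only send a Cookie header when there is something to send.
        if (!string.IsNullOrEmpty(cookies))
        {
            sb.Append("Cookie: ").Append(cookies).Append("\r\n");
        }

        // "close" instead of "keep-alive": the server ends the connection
        // after the response, so a "while (bytes > 0)" loop can terminate.
        sb.Append("Connection: close\r\n\r\n");
        return sb.ToString();
    }
}
```

With this request the loop above should end once the whole response has been read. The body may still arrive in chunked transfer encoding, in which case the chunk size markers remain in the text and would need to be parsed out.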