C# Why is this WebRequest code slow?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me) on StackOverflow.
Original URL: http://stackoverflow.com/questions/754333/
Why is this WebRequest code slow?
Asked by
I requested 100 pages that all return 404. I wrote:
{
    var s = DateTime.Now;
    for (int i = 0; i < 100; i++)
        DL.CheckExist("http://google.com/lol" + i.ToString() + ".jpg");
    var e = DateTime.Now;
    var d = e - s;
    d = d;
    Console.WriteLine(d);
}
static public bool CheckExist(string url)
{
    HttpWebRequest wreq = null;
    HttpWebResponse wresp = null;
    bool ret = false;
    try
    {
        wreq = (HttpWebRequest)WebRequest.Create(url);
        wreq.KeepAlive = true;
        wreq.Method = "HEAD";
        wresp = (HttpWebResponse)wreq.GetResponse();
        ret = true;
    }
    catch (System.Net.WebException)
    {
    }
    finally
    {
        if (wresp != null)
            wresp.Close();
    }
    return ret;
}
Two runs show it takes 00:00:30.7968750 and 00:00:26.8750000. Then I tried Firefox and used the following code:
<html>
<body>
<script type="text/javascript">
for(var i=0; i<100; i++)
document.write("<img src=http://google.com/lol" + i + ".jpg><br>");
</script>
</body>
</html>
Timing it by hand on my computer, it took roughly 4 seconds. That is 6.5-7.5x faster than my app. I plan to scan through thousands of files, so taking 3.75 hours instead of 30 minutes would be a big problem. How can I make this code faster? I know someone will say Firefox caches the images, but I want to point out that 1) it still needs to check the headers from the remote server to see if the image has been updated (which is what I want my app to do), and 2) I am not receiving the body; my code should only be requesting the header. So, how do I solve this?
Accepted answer by Artelius
Firefox probably issues multiple requests at once, whereas your code does them one by one. Perhaps adding threads will speed up your program.
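A minimal sketch of that idea, assuming .NET 4 or later (for Parallel.ForEach), that DL.CheckExist is the method from the question, and that a degree of parallelism of 10 is a reasonable starting point. On .NET Framework you also have to raise the default limit of 2 connections per host, otherwise the parallel requests get serialized anyway:

using System;
using System.Diagnostics;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

class ParallelCheck
{
    static void Main()
    {
        // .NET Framework only allows 2 concurrent connections per host by default.
        ServicePointManager.DefaultConnectionLimit = 10;

        var urls = Enumerable.Range(0, 100)
            .Select(i => "http://google.com/lol" + i + ".jpg")
            .ToList();

        var sw = Stopwatch.StartNew();

        // Issue the HEAD requests concurrently instead of one by one.
        Parallel.ForEach(urls,
            new ParallelOptions { MaxDegreeOfParallelism = 10 },
            url => DL.CheckExist(url));

        Console.WriteLine(sw.Elapsed);
    }
}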
Answered by Srikar Doddi
Answered by Fung
Have you tried opening the same URL in IE on the machine that your code is deployed to? If it is a Windows Server machine, then sometimes it's because the URL you're requesting is not in IE's list of secure sites (which HttpWebRequest works off of). You'll just need to add it.
Do you have more info you could post? I've done something similar and have run into tons of problems with HttpWebRequest before, all unique. So more info would help.
BTW, calling it using the async methods won't really help in this case. It doesn't shorten the download time; it just doesn't block your calling thread, that's all.
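To make that distinction concrete, here is a rough sketch of the asynchronous pattern (BeginGetResponse/EndGetResponse) against one of the URLs from the question; the single request still takes the same wall-clock time, and the benefit only appears if you keep several requests in flight at once:

using System;
using System.Net;
using System.Threading;

class AsyncHead
{
    static void Main()
    {
        var done = new ManualResetEvent(false);

        var wreq = (HttpWebRequest)WebRequest.Create("http://google.com/lol0.jpg");
        wreq.Method = "HEAD";

        // Returns immediately; the network round trip happens off the calling thread.
        wreq.BeginGetResponse(ar =>
        {
            try
            {
                using (var wresp = (HttpWebResponse)wreq.EndGetResponse(ar))
                    Console.WriteLine("Exists: " + wresp.StatusCode);
            }
            catch (WebException)
            {
                Console.WriteLine("Does not exist (or the request failed)");
            }
            finally
            {
                done.Set();
            }
        }, null);

        // The calling thread is free to do other work until the callback fires.
        done.WaitOne();
    }
}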
Answered by Max
I noticed that an HttpWebRequest hangs on the first request. I did some research, and what seems to be happening is that the request is configuring or auto-detecting proxies. If you set
request.Proxy = null;
on the web request object, you might be able to avoid an initial delay.
With proxy auto-detect:
using (var response = (HttpWebResponse)request.GetResponse()) //6,956 ms
{
}
Without proxy auto-detect:
request.Proxy = null;
using (var response = (HttpWebResponse)request.GetResponse()) //154 ms
{
}
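If you don't want to set this on every request, one alternative (assuming your environment really needs no proxy at all) is to disable proxy detection process-wide:

// Applies to every WebRequest created afterwards in this process.
WebRequest.DefaultWebProxy = null;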
Answered by Alterin
The answer for me was simply changing HttpWebRequest/HttpWebResponse to WebRequest/WebResponse. That fixed the problem.
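For reference, a sketch of what that change could look like inside CheckExist from the question; note that KeepAlive is specific to HttpWebRequest, so it is dropped here:

WebRequest wreq = WebRequest.Create(url);
wreq.Method = "HEAD";
using (WebResponse wresp = wreq.GetResponse())
{
    // Reaching this point means no WebException was thrown, i.e. the resource exists.
    ret = true;
}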
Answered by Sarvesh
Close the response stream when you are done. In your CheckExist(), add wresp.Close() after wresp = (HttpWebResponse)wreq.GetResponse();
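A possible rewrite of the same idea with a using block, which guarantees the response is closed even if later code throws, and which would replace the explicit finally in CheckExist:

using (var wresp = (HttpWebResponse)wreq.GetResponse())
{
    ret = true;
}   // the response (and its underlying stream) is closed/disposed here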
Answered by Tejaswi Pandava
OK, if you are getting status code 404 for all web pages, then it is due to not specifying credentials. So you need to add:
wreq.Credentials = CredentialCache.DefaultCredentials;
Then you may also come across status code 500; for that you need to specify a User-Agent, which looks something like the line below:
wreq.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0";
"A WebClient instance does not send optional HTTP headers by default. If your request requires an optional header, you must add the header to the Headers collection. For example, to retain queries in the response, you must add a user-agent header. Also, servers may return 500 (Internal Server Error) if the user agent header is missing."
reference: https://msdn.microsoft.com/en-us/library/system.net.webclient(v=vs.110).aspx
To improve the performance of the HttpWebRequest you need to add:
wreq.Proxy = null;
Now the code will look like:
static public bool CheckExist(string url)
{
    HttpWebRequest wreq = null;
    HttpWebResponse wresp = null;
    bool ret = false;
    try
    {
        wreq = (HttpWebRequest)WebRequest.Create(url);
        wreq.Credentials = CredentialCache.DefaultCredentials;
        wreq.Proxy = null;
        wreq.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0";
        wreq.KeepAlive = true;
        wreq.Method = "HEAD";
        wresp = (HttpWebResponse)wreq.GetResponse();
        ret = true;
    }
    catch (System.Net.WebException)
    {
    }
    finally
    {
        if (wresp != null)
            wresp.Close();
    }
    return ret;
}