C#: Getting the redirected URL from the original URL

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/704956/

Date: 2020-08-04 14:08:12  Source: igfitidea

Getting the Redirected URL from the Original URL

c# .net

Asked by user85594

I have a table in my database which contains the URLs of some websites. I have to open those URLs and verify some links on those pages. The problem is that some URLs get redirected to other URLs. My logic is failing for such URLs.

Is there some way through which I can pass my original URL string and get the redirected URL back?

Example: I am trying with this URL: http://individual.troweprice.com/public/Retail/xStaticFiles/FormsAndLiterature/CollegeSavings/trp529Disclosure.pdf

It gets redirected to this one: http://individual.troweprice.com/staticFiles/Retail/Shared/PDFs/trp529Disclosure.pdf

I tried to use the following code:

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(Uris);
req.Proxy = proxy;
req.Method = "HEAD";
req.AllowAutoRedirect = false;

HttpWebResponse myResp = (HttpWebResponse)req.GetResponse();
if (myResp.StatusCode == HttpStatusCode.Redirect)
{
  MessageBox.Show("redirected to:" + myResp.GetResponseHeader("Location"));
}

When I execute the code above, it gives me HttpStatusCode.OK. I am surprised that it does not treat this as a redirection. If I open the link in Internet Explorer, it redirects to another URL and opens the PDF file.

Can someone help me understand why it is not working properly for the example URL?

By the way, I checked with Hotmail's URL (http://www.hotmail.com) and it correctly returns the redirected URL.

Thanks,

Answer by Can Berk Güder

The URL you mentioned uses a JavaScript redirect, which will only redirect a browser. So there's no easy way to detect the redirect.

For proper redirects (a 3xx HTTP status code plus a Location: header), you might want to remove

req.AllowAutoRedirect = false;

and get the final URL using

myResp.ResponseUri

as there can be more than one redirect.

UPDATE: More clarification regarding redirects:

There's more than one way to redirect a browser to another URL.

The first way is to use a 3xx HTTP status code, and the Location: header. This is the way the gods intended HTTP redirects to work, and is also known as "the one true way." This method will work on all browsers and crawlers.

And then there are the devil's ways. These include meta refresh, the Refresh: header, and JavaScript. Although these methods work in most browsers, they are definitely not guaranteed to work, and occasionally result in strange behavior (for example, breaking the back button).

Most web crawlers, including the Googlebot, ignore these redirection methods, and so should you. If you absolutely have to detect all redirects, then you would have to parse the HTML for META tags, look for Refresh: headers in the response, and evaluate JavaScript. Good luck with the last one.
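To make the META and Refresh: cases a little more concrete, here is a minimal sketch of pulling the target URL out of a refresh value such as "5; url=/new/page". The class and method names (RefreshRedirects, TargetFromRefreshValue, FindMetaRefresh) are hypothetical, and the regular expression is a deliberate simplification: it assumes http-equiv appears before content and that the URL itself is not quoted inside the attribute. A real HTML parser would be more robust, and none of this helps with JavaScript redirects.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RefreshRedirects
{
    // Simplified, hypothetical pattern for <meta http-equiv="refresh" content="5; url=...">.
    // Assumes http-equiv comes before content inside the tag.
    private static readonly Regex MetaRefresh = new Regex(
        @"<meta[^>]*http-equiv\s*=\s*[""']?refresh[""']?[^>]*content\s*=\s*[""']([^""']*)[""']",
        RegexOptions.IgnoreCase);

    // Extracts the target from a refresh value such as "5; url=/new/page".
    // Works for both the META content attribute and the value of a Refresh: header.
    public static string TargetFromRefreshValue(string value)
    {
        if (string.IsNullOrEmpty(value)) return null;
        Match m = Regex.Match(value, @"url\s*=\s*(.+)$", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value.Trim().Trim('\'', '"') : null;
    }

    // Returns the absolute refresh target found in the HTML, or null if there is none.
    public static Uri FindMetaRefresh(Uri pageUri, string html)
    {
        Match m = MetaRefresh.Match(html ?? string.Empty);
        if (!m.Success) return null;

        string target = TargetFromRefreshValue(m.Groups[1].Value);
        // The target may be relative, so resolve it against the page's own URL.
        return target == null ? null : new Uri(pageUri, target);
    }
}
```

For a Refresh: response header, the header value can be passed straight to TargetFromRefreshValue and the result resolved against the request URL in the same way.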

Answer by Alex

You could check Request.UrlReferrer.AbsoluteUri to see where it came from. If that doesn't work, can you pass the old URL as a query string parameter?

Answer by Code.Town

I made this method using your code and it returns the final redirected URL.

    public string GetFinalRedirectedUrl(string url)
    {
        Uri uri = new Uri(url);

        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(uri);
        //req.Proxy = proxy;
        req.Method = "HEAD";
        req.AllowAutoRedirect = false; // inspect each redirect ourselves

        // Dispose the response so the connection is released before recursing
        string location = null;
        using (HttpWebResponse myResp = (HttpWebResponse)req.GetResponse())
        {
            if (myResp.StatusCode == HttpStatusCode.Redirect)
            {
                location = myResp.GetResponseHeader("Location");
            }
        }

        if (!string.IsNullOrEmpty(location))
        {
            // Recursive call; Location may be relative, so resolve it against the current URL
            return GetFinalRedirectedUrl(new Uri(uri, location).ToString());
        }

        return url;
    }

Note: myResp.ResponseUri does not return the final URL here, because AllowAutoRedirect is false and the request never follows the redirect.

Answer by jayson.centeno

This code works for me

var request = (HttpWebRequest)HttpWebRequest.Create(url);
request.Method = "POST";
request.AllowAutoRedirect = true;
request.ContentType = "application/x-www-form-urlencoded";
var response = request.GetResponse();

// After the request is sent and redirected to some page of your website,
// response.ResponseUri.AbsoluteUri contains that URL, including the query string
// (www.yourwebsite.com/returnulr?r=""... and so on).

Redirect(response.ResponseUri.AbsoluteUri); //then just do your own redirect.

Hope this helps

Answer by Prithvi Raj Nandiwal

Use this code to get the redirect URL:

public void GrtUrl(string url)
{
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
    webRequest.AllowAutoRedirect = false;  // IMPORTANT: otherwise redirects are followed silently
    webRequest.Timeout = 10000;            // timeout 10s
    webRequest.Method = "HEAD";

    // Get the response; the using block closes it, so no explicit Close() is needed
    using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
    {
        // Now look to see if it's a redirect (any 3xx status)
        int status = (int)webResponse.StatusCode;
        if (status >= 300 && status <= 399)
        {
            string uriString = webResponse.Headers["Location"];
            // Parentheses matter: + binds tighter than ??
            Console.WriteLine("Redirect to " + (uriString ?? "NULL"));
        }
    }
}

Answer by Marcelo Calbucci

This function returns the final destination of a link, even if there are multiple redirects. It doesn't account for JavaScript-based redirects or META redirects. Note that the previous solution didn't deal with absolute and relative URLs: the Location header can return something like "/newhome", which you need to combine with the URL that served the response to get the full destination URL.

    public static string GetFinalRedirect(string url)
    {
        if(string.IsNullOrWhiteSpace(url))
            return url;

        int maxRedirCount = 8;  // prevent infinite loops
        string newUrl = url;
        do
        {
            HttpWebRequest req = null;
            HttpWebResponse resp = null;
            try
            {
                req = (HttpWebRequest) HttpWebRequest.Create(url);
                req.Method = "HEAD";
                req.AllowAutoRedirect = false;
                resp = (HttpWebResponse)req.GetResponse();
                switch (resp.StatusCode)
                {
                    case HttpStatusCode.OK:
                        return newUrl;
                    case HttpStatusCode.Redirect:
                    case HttpStatusCode.MovedPermanently:
                    case HttpStatusCode.RedirectKeepVerb:
                    case HttpStatusCode.RedirectMethod:
                        newUrl = resp.Headers["Location"];
                        if (newUrl == null)
                            return url;

                        if (newUrl.IndexOf("://", System.StringComparison.Ordinal) == -1)
                        {
                            // No URL scheme, so it's a relative reference (e.g. "/newhome"); resolve it against the current URL
                            Uri u = new Uri(new Uri(url), newUrl);
                            newUrl = u.ToString();
                        }
                        break;
                    default:
                        return newUrl;
                }
                url = newUrl;
            }
            catch (WebException)
            {
                // Return the last known good URL
                return newUrl;
            }
            catch (Exception ex)
            {
                return null;
            }
            finally
            {
                if (resp != null)
                    resp.Close();
            }
        } while (maxRedirCount-- > 0);

        return newUrl;
    }
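The relative-versus-absolute handling above can also be isolated into a small helper. The following is only a sketch under assumptions: the class and method names (LocationResolver, Resolve) are hypothetical and not part of the answer. It relies on the Uri(Uri, string) constructor, which resolves a relative reference against the base URI and uses the second argument on its own when it is already an absolute URI, so no explicit "://" check is needed.

```csharp
using System;

public static class LocationResolver
{
    // Combines the URL that served the response with the value of its Location header.
    // Returns null when the header is missing or blank.
    public static Uri Resolve(Uri requestUri, string location)
    {
        if (string.IsNullOrWhiteSpace(location)) return null;

        // Relative values ("/newhome", "next.html") are resolved against requestUri;
        // absolute values ("https://other.example/x") are used as they are.
        return new Uri(requestUri, location);
    }
}
```

For example, resolving "/newhome" against http://example.com/old/page gives http://example.com/newhome, while an absolute Location value comes back unchanged.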

Answer by Armin

I had the same problem, and after trying a lot I couldn't get what I wanted with HttpWebRequest, so I used the WebBrowser class to navigate to the first URL, and then I could get the redirected URL!

WebBrowser browser = new WebBrowser();
browser.Navigating += new System.Windows.Forms.WebBrowserNavigatingEventHandler(this.browser_Navigating);
string urlToNavigate = "your url";
browser.Navigate(new Uri(urlToNavigate));

Then, in the Navigating handler, you can get your redirected URL. Be careful: the first time the browser_Navigating event handler fires, e.Url is the same URL you used to start browsing, so the redirected URL arrives on the second call.

private void browser_Navigating(object sender, WebBrowserNavigatingEventArgs e)
{
    Uri uri = e.Url;
}

Answer by Haddad

string url = ".......";
var request = (HttpWebRequest)WebRequest.Create(url);
var response = (HttpWebResponse)request.GetResponse();

string redirectUrl = response.ResponseUri.ToString();

Answer by Konstantin S.

Async HttpClient versions:

// works in .Net Framework and .Net Core
public static async Task<Uri> GetRedirectedUrlAsync(Uri uri, CancellationToken cancellationToken = default)
{
    using var client = new HttpClient(new HttpClientHandler
    {
        AllowAutoRedirect = false,
    }, true);
    using var response = await client.GetAsync(uri, cancellationToken);

    return new Uri(response.Headers.GetValues("Location").First());
}

// works in .Net Core
public static async Task<Uri> GetRedirectedUrlAsync(Uri uri, CancellationToken cancellationToken = default)
{
    using var client = new HttpClient();
    using var response = await client.GetAsync(uri, cancellationToken);

    return response.RequestMessage.RequestUri;
}

P.S. handler.MaxAutomaticRedirections = 1 can be used if you need to limit the number of attempts.
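As a rough illustration of the P.S., the handler could be configured as in the sketch below. The class and method names (RedirectLimitedClient, CreateHandler) are hypothetical. AllowAutoRedirect and MaxAutomaticRedirections are existing HttpClientHandler properties; the limit must be greater than zero.

```csharp
using System.Net.Http;

public static class RedirectLimitedClient
{
    // Builds a handler that follows redirects automatically, but only up to maxRedirects hops.
    public static HttpClientHandler CreateHandler(int maxRedirects)
    {
        return new HttpClientHandler
        {
            AllowAutoRedirect = true,
            MaxAutomaticRedirections = maxRedirects,
        };
    }
}
```

The handler would then be passed to the client, for instance new HttpClient(RedirectLimitedClient.CreateHandler(1), true), where the second argument tells the client to dispose the handler along with itself.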