Original question: http://stackoverflow.com/questions/758135/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
C# code to linkify urls in a string
Asked by Vance Smith
Does anyone have any good C# code (and regular expressions) that will parse a string and "linkify" any URLs that may be in the string?
Accepted answer by Konstantin Tarkus
It's a pretty simple task; you can achieve it with Regex and a ready-to-go regular expression.
Something like:
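// "input" holds the text to linkify. The ^ and $ anchors are dropped so that
// URLs embedded in longer text match, and $0 inserts the whole matched URL.
var html = Regex.Replace(input,
    @"(http|https|ftp)\://[a-zA-Z0-9\-\.]+" +
    @"\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?" +
    @"([a-zA-Z0-9\-\._\?\,\'/\\\+&%$#\=~])*",
    "<a href=\"$0\">$0</a>");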
You may also be interested not only in creating links but in shortening URLs. Here is a good article on this subject:
See also:
- Regular Expression Workbench at MSDN
- Converting a URL into a Link in C# Using Regular Expressions
- Regex to find URL within text and make them as link
- Regex.Replace Method at MSDN
- The Problem With URLs by Jeff Atwood
- Parsing URLs with Regular Expressions and the Regex Object
- Format URLs in string to HTML Links in C#
- Automatically hyperlink URL and Email in ASP.NET Pages with C#
Answer by M4N
It's not that easy, as you can read in this blog post by Jeff Atwood. It's especially hard to detect where a URL ends.
For example, is the trailing parenthesis part of the URL or not:
- http://en.wikipedia.org/wiki/PCTools(CentralPointSoftware)
- a URL in parentheses (http://en.wikipedia.org) more text
In the first case, the parentheses are part of the URL. In the second case they are not!
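One possible heuristic for telling these two cases apart is to match greedily and then strip a trailing ")" only while the URL contains more closing than opening parentheses. The sketch below illustrates that idea; it is an assumption about how one might approach the problem, not something taken from the answer. The names UrlParens, TrimUnbalancedParen and FirstUrl are hypothetical, and the pattern is deliberately simplistic.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class UrlParens
{
    // Deliberately simple pattern: a scheme followed by any run of non-space characters.
    private static readonly Regex UrlPattern =
        new Regex(@"https?://\S+", RegexOptions.IgnoreCase);

    // Drops trailing ')' characters while the URL has more ')' than '('.
    public static string TrimUnbalancedParen(string url)
    {
        while (url.EndsWith(")", StringComparison.Ordinal) &&
               url.Count(c => c == ')') > url.Count(c => c == '('))
        {
            url = url.Substring(0, url.Length - 1);
        }
        return url;
    }

    // Returns the first URL found in the text, or an empty string if there is none.
    public static string FirstUrl(string text)
    {
        Match m = UrlPattern.Match(text);
        return m.Success ? TrimUnbalancedParen(m.Value) : string.Empty;
    }
}
```

With this sketch, FirstUrl keeps the parentheses in http://en.wikipedia.org/wiki/PCTools(CentralPointSoftware) because they are balanced, and returns http://en.wikipedia.org for the second example because the closing parenthesis has no partner inside the URL. It still fails on inputs such as a URL followed by a period, so treat it as a starting point only.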
Answer by Vance Smith
protected string Linkify( string SearchText ) {
    // this will find links like:
    // http://www.mysite.com
    // as well as any links with other characters directly in front of it like:
    // href="http://www.mysite.com"
    // you can then use your own logic to determine which links to linkify
    Regex regx = new Regex( @"\b(((\S+)?)(@|mailto\:|(news|(ht|f)tp(s?))\://)\S+)\b", RegexOptions.IgnoreCase );
    // normalize non-breaking space entities so they do not end up inside a match
    SearchText = SearchText.Replace( "&nbsp;", " " );
    MatchCollection matches = regx.Matches( SearchText );
    foreach ( Match match in matches ) {
        if ( match.Value.StartsWith( "http" ) ) { // if it starts with anything else then dont linkify -- may already be linked!
            SearchText = SearchText.Replace( match.Value, "<a href='" + match.Value + "'>" + match.Value + "</a>" );
        }
    }
    return SearchText;
}
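One thing to watch with the approach above: calling SearchText.Replace once per match rewrites every occurrence of that text, so a URL that appears twice ends up wrapped twice. A possible alternative, sketched here under assumptions, is Regex.Replace with a MatchEvaluator, which builds each link exactly once; WebUtility.HtmlEncode escapes the URL before it goes into the markup. The class name LinkifySketch and the simplified pattern are hypothetical and are not the author's code.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class LinkifySketch
{
    // Simplified pattern (an assumption): http/https URLs preceded by whitespace
    // or the start of the text, so href="http://..." is left alone.
    private static readonly Regex UrlPattern =
        new Regex(@"(?<=^|\s)https?://[^\s<>""']+", RegexOptions.IgnoreCase);

    public static string Linkify(string text)
    {
        // The evaluator runs once per match, so repeated URLs are each wrapped once.
        return UrlPattern.Replace(text, m =>
        {
            string url = WebUtility.HtmlEncode(m.Value);
            return "<a href=\"" + url + "\">" + url + "</a>";
        });
    }
}
```

This sketch does not try to solve the trailing punctuation problem: a URL followed directly by a period or a closing parenthesis still includes that character.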
Answer by josefresno
Well, after a lot of research on this, and several attempts to fix cases where
- people enter http://www.sitename.com and www.sitename.com in the same post
- URLs involve parentheses, as in (http://www.sitename.com) and http://msdn.microsoft.com/en-us/library/aa752574(vs.85).aspx
- long urls like: http://www.amazon.com/gp/product/b000ads62g/ref=s9_simz_gw_s3_p74_t1?pf_rd_m=atvpdkikx0der&pf_rd_s=center-2&pf_rd_r=04eezfszazqzs8xfm9yd&pf_rd_t=101&pf_rd_p=470938631&pf_rd_i=507846
we are now using this HtmlHelper extension. I thought I would share it and get any comments:
private static Regex regExHttpLinks = new Regex(@"(?<=\()\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\))|(?<=(?<wrap>[=~|_#]))\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\k<wrap>)|\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]", RegexOptions.Compiled | RegexOptions.IgnoreCase);
public static string Format(this HtmlHelper htmlHelper, string html)
{
    if (string.IsNullOrEmpty(html))
    {
        return html;
    }
    html = htmlHelper.Encode(html);
    html = html.Replace(Environment.NewLine, "<br />");
    // replace periods on numeric values that appear to be valid domain names
    var periodReplacement = "[[[replace:period]]]";
    html = Regex.Replace(html, @"(?<=\d)\.(?=\d)", periodReplacement);
    // create links for matches
    var linkMatches = regExHttpLinks.Matches(html);
    for (int i = 0; i < linkMatches.Count; i++)
    {
        var temp = linkMatches[i].ToString();
        if (!temp.Contains("://"))
        {
            temp = "http://" + temp;
        }
        html = html.Replace(linkMatches[i].ToString(), String.Format("<a href=\"{0}\" title=\"{0}\">{1}</a>", temp.Replace(".", periodReplacement).ToLower(), linkMatches[i].ToString().Replace(".", periodReplacement)));
    }
    // Clear out period replacement
    html = html.Replace(periodReplacement, ".");
    return html;
}
Answer by Yauhen.F
I found the following regular expression: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
It looks very good to me. Jeff Atwood's solution doesn't handle many cases. josefresno's seems to me to handle all cases, but when I tried to understand it (in case of any support requests), my brain boiled.
Answer by Berezh
Here is a class:
using System.Text.RegularExpressions;

public class TextLink
{
    #region Properties
    // optional scheme and optional "www." prefix (the dot is escaped so it only matches a literal dot)
    public const string BeginPattern = @"((http|https)://)?(www\.)?";
    public const string MiddlePattern = @"([a-z0-9\-]*\.)+[a-z]+(:[0-9]+)?";
    public const string EndPattern = @"(/\S*)?";
    public static string Pattern { get { return BeginPattern + MiddlePattern + EndPattern; } }
    public static string ExactPattern { get { return string.Format("^{0}$", Pattern); } }
    public string OriginalInput { get; private set; }
    public bool Valid { get; private set; }
    private bool _isHttps;
    private string _readyLink;
    #endregion

    #region Constructor
    public TextLink(string input)
    {
        this.OriginalInput = input;
        // strip one leading and one trailing whitespace character
        var text = Regex.Replace(input, @"(^\s)|(\s$)", "", RegexOptions.IgnoreCase);
        Valid = Regex.IsMatch(text, ExactPattern);
        if (Valid)
        {
            _isHttps = Regex.IsMatch(text, "^https:", RegexOptions.IgnoreCase);
            // clear begin (anchored so only the leading scheme and "www." are removed):
            _readyLink = Regex.Replace(text, "^" + BeginPattern, "", RegexOptions.IgnoreCase);
            // HTTPS
            if (_isHttps)
            {
                _readyLink = "https://www." + _readyLink;
            }
            // Default
            else
            {
                _readyLink = "http://www." + _readyLink;
            }
        }
    }
    #endregion

    #region Methods
    public override string ToString()
    {
        return _readyLink;
    }
    #endregion
}
Use it in this method:
public static string ReplaceUrls(string input)
{
    // ToSafeString() is an extension method that is not shown in this answer
    var result = Regex.Replace(input.ToSafeString(), TextLink.Pattern, match =>
    {
        var textLink = new TextLink(match.Value);
        return textLink.Valid ?
            string.Format("<a href=\"{0}\" target=\"_blank\">{1}</a>", textLink, textLink.OriginalInput) :
            textLink.OriginalInput;
    });
    return result;
}
Test cases:
[TestMethod]
public void RegexUtil_TextLink_Parsing()
{
    Assert.IsTrue(new TextLink("smthing.com").Valid);
    Assert.IsTrue(new TextLink("www.smthing.com/").Valid);
    Assert.IsTrue(new TextLink("http://smthing.com").Valid);
    Assert.IsTrue(new TextLink("http://www.smthing.com").Valid);
    Assert.IsTrue(new TextLink("http://www.smthing.com/").Valid);
    Assert.IsTrue(new TextLink("http://www.smthing.com/publisher").Valid);
    // port
    Assert.IsTrue(new TextLink("http://www.smthing.com:80").Valid);
    Assert.IsTrue(new TextLink("http://www.smthing.com:80/").Valid);
    // https
    Assert.IsTrue(new TextLink("https://smthing.com").Valid);
    Assert.IsFalse(new TextLink("").Valid);
    Assert.IsFalse(new TextLink("smthing.com.").Valid);
    Assert.IsFalse(new TextLink("smthing.com-").Valid);
}

[TestMethod]
public void RegexUtil_TextLink_ToString()
{
    // default
    Assert.AreEqual("http://www.smthing.com", new TextLink("smthing.com").ToString());
    Assert.AreEqual("http://www.smthing.com", new TextLink("http://www.smthing.com").ToString());
    Assert.AreEqual("http://www.smthing.com/", new TextLink("smthing.com/").ToString());
    Assert.AreEqual("https://www.smthing.com", new TextLink("https://www.smthing.com").ToString());
}
Answer by Muhammad Awais
This works for me:
// $1 is the outer capture group, i.e. the whole matched URL
str = Regex.Replace(str,
    @"((http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)",
    "<a target='_blank' href='$1'>$1</a>");