Original source: http://stackoverflow.com/questions/7167895/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Rails: What's a good way to validate links (URLs)?
Asked by jay
I was wondering how I would best validate URLs in Rails. I was thinking of using a regular expression, but am not sure if this is the best practice.
And, if I were to use a regex, could someone suggest one to me? I am still new to Regex.
Answered by Simone Carletti
Validating a URL is a tricky job. It's also a very broad request.
What do you want to do, exactly? Do you want to validate the format of the URL, the existence, or what? There are several possibilities, depending on what you want to do.
A regular expression can validate the format of the URL. But even a complex regular expression cannot ensure you are dealing with a valid URL.
For instance, if you take a simple regular expression, it will probably reject the following host
http://invalid##host.com
but it will allow
http://invalid-host.foo
that is a valid host, but not a valid domain if you consider the existing TLDs. Indeed, the solution would work if you want to validate the hostname, not the domain because the following one is a valid hostname
http://host.foo
as well as the following one
http://localhost
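To make the distinction concrete, here is a minimal sketch of the kind of simple format check described above. The pattern `SIMPLE_HTTP_URL` and the helper `looks_like_url?` are made up for illustration and do not come from any library; the point is only that such a pattern judges the shape of the host and knows nothing about which TLDs exist.

```ruby
# Hypothetical, deliberately simple pattern: an http(s) scheme, then
# alphanumeric labels separated by dots or hyphens, then an optional path.
SIMPLE_HTTP_URL = %r{\Ahttps?://[a-z0-9]+(?:[-.][a-z0-9]+)*(?:/.*)?\z}i

# Returns true when the string has the shape described by the pattern.
def looks_like_url?(string)
  SIMPLE_HTTP_URL.match?(string)
end

puts looks_like_url?('http://invalid##host.com') # rejected: "#" cannot appear in a label
puts looks_like_url?('http://invalid-host.foo')  # accepted: the pattern does not check the TLD
puts looks_like_url?('http://host.foo')          # accepted
puts looks_like_url?('http://localhost')         # accepted: a single label is a valid hostname
```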
Now, let me give you some solutions.
If you want to validate a domain, then you need to forget about regular expressions. The best solution available at the moment is the Public Suffix List, a list maintained by Mozilla. I created a Ruby library to parse and validate domains against the Public Suffix List, and it's called PublicSuffix.
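The idea behind list-based domain validation can be sketched in a few lines. This is not the PublicSuffix API: `SAMPLE_SUFFIXES` is a tiny hand-written stand-in for the real list, and `listed_domain?` is a hypothetical helper, shown only to illustrate why a lookup can answer a question that a regular expression cannot.

```ruby
require 'set'

# Assumption: a hand-picked sample. The real Public Suffix List is far
# larger and changes over time, which is why a maintained library is used.
SAMPLE_SUFFIXES = Set.new(%w[com org net co.uk]).freeze

# Returns true when the host ends in a listed suffix and has at least
# one non-empty label in front of that suffix.
def listed_domain?(host)
  labels = host.downcase.split('.')
  return false if labels.any?(&:empty?)

  # For "a.b.co.uk" this checks "b.co.uk", then "co.uk", then "uk".
  (1...labels.length).any? do |index|
    SAMPLE_SUFFIXES.include?(labels[index..-1].join('.'))
  end
end

puts listed_domain?('example.com')   # suffix "com" is in the sample list
puts listed_domain?('example.co.uk') # suffix "co.uk" is in the sample list
puts listed_domain?('host.foo')      # "foo" is not in the sample list
puts listed_domain?('localhost')     # no suffix at all
```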
If you want to validate the format of a URI/URL, then you might want to use regular expressions. Instead of searching for one, use the built-in Ruby URI.parse method.
require 'uri'

def valid_url?(url)
  # Parse first, then check the host. Doing both in one assignment would
  # call #host on the original string because of operator precedence.
  uri = URI.parse(url)
  !uri.host.nil?
rescue URI::InvalidURIError
  false
end
You can even decide to make it more restrictive. For instance, if you want the URL to be an HTTP/HTTPS URL, then you can make the validation more accurate.
require 'uri'

def valid_url?(url)
  uri = URI.parse(url)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end
Of course, there are tons of improvements you can apply to this method, including checking for a path or a scheme.
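As one example of such an improvement, the scheme check can be made configurable. The sketch below is an assumption about how that might look, not part of the original answer: `valid_url_with_scheme?` and its `schemes` parameter are hypothetical names.

```ruby
require 'uri'

# Hypothetical variant of valid_url? that takes the accepted schemes
# as an argument instead of hard-coding HTTP/HTTPS.
def valid_url_with_scheme?(url, schemes = %w[http https])
  uri = URI.parse(url)
  # Require a listed scheme and a non-empty host.
  schemes.include?(uri.scheme) && !uri.host.nil? && !uri.host.empty?
rescue URI::InvalidURIError
  false
end

puts valid_url_with_scheme?('http://example.com')         # http is allowed by default
puts valid_url_with_scheme?('ftp://example.com')          # ftp is not in the default list
puts valid_url_with_scheme?('ftp://example.com', %w[ftp]) # ftp explicitly allowed
puts valid_url_with_scheme?('www.example.com')            # no scheme, so no host either
```

Passing the schemes in keeps the parsing logic in one place while letting each caller decide what counts as acceptable.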
Last but not least, you can also package this code into a validator:
class HttpUrlValidator < ActiveModel::EachValidator
  def self.compliant?(value)
    uri = URI.parse(value)
    uri.is_a?(URI::HTTP) && !uri.host.nil?
  rescue URI::InvalidURIError
    false
  end

  def validate_each(record, attribute, value)
    unless value.present? && self.class.compliant?(value)
      record.errors.add(attribute, "is not a valid HTTP URL")
    end
  end
end

# in the model
validates :example_attribute, http_url: true
Answered by Matteo Collina
I use a one-liner inside my models:
validates :url, format: URI::regexp(%w[http https])
I think it is good enough and simple to use. Moreover, it should be theoretically equivalent to Simone's method, as it uses the very same regexp internally.
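One thing worth checking with this approach is anchoring. To my understanding, the pattern built for a list of schemes carries no start or end anchors, so a string that merely contains a URL also matches. The sketch below builds the pattern through `URI::RFC2396_Parser#make_regexp` (the constants `UNANCHORED` and `ANCHORED` are names chosen here for illustration) and wraps it in `\A` and `\z`.

```ruby
require 'uri'

# Build the scheme-restricted pattern explicitly through the RFC 2396 parser.
UNANCHORED = URI::RFC2396_Parser.new.make_regexp(%w[http https])

# Wrap it so the whole string has to be a URL, not just part of it.
ANCHORED = /\A#{UNANCHORED}\z/

text = 'see http://example.com now'

puts UNANCHORED.match?(text)                 # matches the URL inside the sentence
puts ANCHORED.match?(text)                   # the sentence as a whole is not a URL
puts ANCHORED.match?('http://example.com')   # the bare URL still matches
```

In a model, the anchored pattern could then be passed as `format: { with: ANCHORED }`, assuming the constant is defined somewhere the model can see it.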
Answered by jlfenaux
Following Simone's idea, you can easily create your own validator.
class UrlValidator < ActiveModel::EachValidator
  def validate_each(record, attribute, value)
    return if value.blank?

    begin
      uri = URI.parse(value)
      resp = uri.kind_of?(URI::HTTP)
    rescue URI::InvalidURIError
      resp = false
    end

    unless resp == true
      # errors.add works across Rails versions; appending with << to
      # errors[attribute] was deprecated in Rails 6.1.
      record.errors.add(attribute, options[:message] || "is not an url")
    end
  end
end
and then use
validates :url, :presence => true, :url => true
in your model.
Answered by dolzenko
There is also the validate_url gem (which is just a nice wrapper for the Addressable::URI.parse solution).
Just add
gem 'validate_url'
to your Gemfile, and then in models you can
validates :click_through_url, url: true
Answered by Stefan Pettersson
This question is already answered, but what the heck, I'll propose the solution I'm using.
The regexp works fine with all the URLs I've met. The setter method handles the case where no protocol is given (let's assume http://).
And finally, we try to fetch the page. Maybe I should accept redirects and not only HTTP 200 OK.
# app/models/my_model.rb
validates :website, :allow_blank => true, :uri => { :format => /(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix }

def website= url_str
  unless url_str.blank?
    unless url_str.split(':')[0] == 'http' || url_str.split(':')[0] == 'https'
      url_str = "http://" + url_str
    end
  end
  write_attribute :website, url_str
end
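The normalization step in the setter can also be pulled out into a plain function, which makes it easy to try outside a model. This is a sketch of a slightly different check than the one above (it looks for a leading `http://` or `https://` rather than splitting on `:`); the name `with_default_scheme` and the `default_scheme` parameter are hypothetical.

```ruby
# Hypothetical helper: prefix a default scheme when the string does not
# already start with http:// or https://. Blank input is returned as-is.
def with_default_scheme(url_str, default_scheme = 'http')
  return url_str if url_str.nil? || url_str.strip.empty?
  return url_str if url_str.match?(%r{\Ahttps?://}i)

  "#{default_scheme}://#{url_str}"
end

puts with_default_scheme('example.com')          # gets the default scheme
puts with_default_scheme('https://example.com')  # left untouched
puts with_default_scheme('localhost:3000')       # a port is not mistaken for a scheme
```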
and...
# app/validators/uri_validator.rb
require 'net/http'

# Thanks Ilya! http://www.igvita.com/2006/09/07/validating-url-in-ruby-on-rails/
# Original credits: http://blog.inquirylabs.com/2006/04/13/simple-uri-validation/
# HTTP Codes: http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTPResponse.html
class UriValidator < ActiveModel::EachValidator
  def validate_each(object, attribute, value)
    raise(ArgumentError, "A regular expression must be supplied as the :format option of the options hash") unless options[:format].nil? or options[:format].is_a?(Regexp)
    configuration = { :message => I18n.t('errors.events.invalid_url'), :format => URI::regexp(%w(http https)) }
    configuration.update(options)

    if value =~ configuration[:format]
      begin # check header response
        case Net::HTTP.get_response(URI.parse(value))
        when Net::HTTPSuccess then true
        else object.errors.add(attribute, configuration[:message]) and false
        end
      rescue # Recover on DNS failures..
        object.errors.add(attribute, configuration[:message]) and false
      end
    else
      object.errors.add(attribute, configuration[:message]) and false
    end
  end
end
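On the open question of redirects: one way to accept them would be to widen the `case` so that it also matches `Net::HTTPRedirection`. The helper below is only a sketch of that idea (the name `reachable_response?` is made up here), and it classifies a response object without following the redirect.

```ruby
require 'net/http'

# Hypothetical helper: treat both 2xx and 3xx responses as "reachable".
# It only inspects the response class; it does not follow redirects.
def reachable_response?(response)
  case response
  when Net::HTTPSuccess, Net::HTTPRedirection then true
  else false
  end
end

# Response objects built by hand so the example runs without a network call.
ok        = Net::HTTPOK.new('1.1', '200', 'OK')
found     = Net::HTTPFound.new('1.1', '302', 'Found')
not_found = Net::HTTPNotFound.new('1.1', '404', 'Not Found')

puts reachable_response?(ok)
puts reachable_response?(found)
puts reachable_response?(not_found)
```

Inside the validator, the same helper could be given the result of `Net::HTTP.get_response(URI.parse(value))`.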
Answered by Roman Ralovets
You can also try the valid_url gem, which allows URLs without the scheme and checks the domain zone and IP hostnames.
Add it to your Gemfile:
gem 'valid_url'
And then in the model:
class WebSite < ActiveRecord::Base
  validates :url, :url => true
end
Answered by heriberto perez
The solution that worked for me was:
validates_format_of :url, :with => /\A(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\.-]*)*\/?\Z/i
I did try some of the examples posted here, but I need to support URLs like the ones listed below.
Notice the use of \A and \Z: if you use ^ and $ instead, you will get a security warning from the Rails validators.
Valid ones:
'www.crowdint.com'
'crowdint.com'
'http://crowdint.com'
'http://www.crowdint.com'
Invalid ones:
'http://www.crowdint. com'
'http://fake'
'http:fake'
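The lists above can be checked against the pattern directly. This is a small sketch that runs outside Rails; the constant name `URL_FORMAT` is chosen here for illustration, while the pattern and the sample URLs are the ones from this answer.

```ruby
# The pattern from the validates_format_of call above.
URL_FORMAT = /\A(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\.-]*)*\/?\Z/i

valid_urls = ['www.crowdint.com', 'crowdint.com',
              'http://crowdint.com', 'http://www.crowdint.com']
invalid_urls = ['http://www.crowdint. com', 'http://fake', 'http:fake']

# Print each sample together with the result of the match.
(valid_urls + invalid_urls).each do |url|
  puts "#{url.inspect}: #{URL_FORMAT.match?(url)}"
end
```

Each entry in the first list matches and each entry in the second does not, which is the behavior the answer describes.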
Answered by lafeber
Just my 2 cents:
before_validation :format_website
validate :website_validator

private

def format_website
  self.website = "http://#{self.website}" unless self.website[/^https?/]
end

def website_validator
  # errors.add works across Rails versions; << on errors[:website] was deprecated in Rails 6.1.
  errors.add(:website, I18n.t("activerecord.errors.messages.invalid")) unless website_valid?
end

def website_valid?
  !!website.match(/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-=\?]*)*\/?$/)
end
EDIT: changed the regex to match URLs with parameters.
Answered by severin
I ran into the same problem lately (I needed to validate urls in a Rails app) but I had to cope with the additional requirement of unicode urls (e.g. http://кц.рф)...
I researched a couple of solutions and came across the following:
- The first and most suggested thing is using URI.parse. Check the answer by Simone Carletti for details. This works OK, but not for Unicode URLs.
- The second method I saw was the one by Ilya Grigorik: http://www.igvita.com/2006/09/07/validating-url-in-ruby-on-rails/ Basically, he tries to make a request to the URL; if it works, it is valid...
- The third method I found (and the one I prefer) is an approach similar to URI.parse but using the addressable gem instead of the URI stdlib. This approach is detailed here: http://rawsyntax.com/blog/url-validation-in-rails-3-and-ruby-in-general/
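The limitation of the first approach is easy to reproduce with the standard library alone. In the sketch below, `stdlib_parses?` is a hypothetical helper that reports whether `URI.parse` accepts a string; the Unicode URL is the one mentioned above.

```ruby
require 'uri'

# Hypothetical helper: true when the stdlib parser accepts the string.
def stdlib_parses?(url)
  URI.parse(url)
  true
rescue URI::InvalidURIError
  false
end

puts stdlib_parses?('http://example.com') # plain ASCII URL is accepted
puts stdlib_parses?('http://кц.рф')       # non-ASCII host is rejected by the stdlib parser
```

This is the gap that the addressable-based approach is meant to close.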
Answered by JJD
Here is an updated version of the validator posted by David James. It has been published by Benjamin Fleischer. Meanwhile, I pushed an updated fork which can be found here.
require 'addressable/uri'

# Source: http://gist.github.com/bf4/5320847
# Accepts options[:message] and options[:allowed_protocols]
# spec/validators/uri_validator_spec.rb
class UriValidator < ActiveModel::EachValidator
  def validate_each(record, attribute, value)
    uri = parse_uri(value)
    if !uri
      # errors.add works across Rails versions; << on errors[attribute] was deprecated in Rails 6.1.
      record.errors.add(attribute, generic_failure_message)
    elsif !allowed_protocols.include?(uri.scheme)
      record.errors.add(attribute, "must begin with #{allowed_protocols_humanized}")
    end
  end

  private

  def generic_failure_message
    options[:message] || "is an invalid URL"
  end

  def allowed_protocols_humanized
    allowed_protocols.to_sentence(:two_words_connector => ' or ')
  end

  def allowed_protocols
    @allowed_protocols ||= [(options[:allowed_protocols] || ['http', 'https'])].flatten
  end

  # Returns the parsed URI when it has both a scheme and a host, otherwise nil.
  def parse_uri(value)
    uri = Addressable::URI.parse(value)
    uri.scheme && uri.host && uri
  rescue URI::InvalidURIError, Addressable::URI::InvalidURIError, TypeError
  end
end
...
require 'spec_helper'

# Source: http://gist.github.com/bf4/5320847
# spec/validators/uri_validator_spec.rb
describe UriValidator do
  subject do
    Class.new do
      include ActiveModel::Validations
      attr_accessor :url
      validates :url, uri: true
    end.new
  end

  it "should be valid for a valid http url" do
    subject.url = 'http://www.google.com'
    subject.valid?
    subject.errors.full_messages.should == []
  end

  ['http://google', 'http://.com', 'http://ftp://ftp.google.com', 'http://ssh://google.com'].each do |invalid_url|
    it "#{invalid_url.inspect} is a invalid http url" do
      subject.url = invalid_url
      subject.valid?
      subject.errors.full_messages.should == []
    end
  end

  ['http:/www.google.com','<>hi'].each do |invalid_url|
    it "#{invalid_url.inspect} is an invalid url" do
      subject.url = invalid_url
      subject.valid?
      subject.errors.should have_key(:url)
      subject.errors[:url].should include("is an invalid URL")
    end
  end

  ['www.google.com','google.com'].each do |invalid_url|
    it "#{invalid_url.inspect} is an invalid url" do
      subject.url = invalid_url
      subject.valid?
      subject.errors.should have_key(:url)
      subject.errors[:url].should include("is an invalid URL")
    end
  end

  ['ftp://ftp.google.com','ssh://google.com'].each do |invalid_url|
    it "#{invalid_url.inspect} is an invalid url" do
      subject.url = invalid_url
      subject.valid?
      subject.errors.should have_key(:url)
      subject.errors[:url].should include("must begin with http or https")
    end
  end
end
Please notice that there are still strange HTTP URIs that are parsed as valid addresses.
http://google
http://.com
http://ftp://ftp.google.com
http://ssh://google.com
Here is an issue for the addressable gem which covers the examples.

