How to ignore accents in the highlight function
Date: 2020-03-06 14:48:26 Source: igfitidea
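I have a tiny search engine in a Rails application that highlights the search terms. The search ignores accents, and the highlighting is case-insensitive. Almost perfect.
But if, for example, I have a record with the text "pão de queijo" and search for "pao de queijo", the record is returned but the text is not highlighted. Similarly, if I search for "pão de queijo", the record is returned but it is not highlighted properly.
My code is as simple as: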
<%= highlight(result_pessoa.observacoes, search_string, '<span style="background-color: yellow;">\1</span>') %>
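Solutions
It sounds like you are using two different methods to decide whether a match has occurred: one for the search and another for the highlight. Use the same method for highlighting that you use for searching and it should pick the match up, shouldn't it?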
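One way to read this advice, sketched in plain Ruby: pass both the stored text and the query through a single folding helper, and let the search and the highlighter share it. The names `fold` and `folded_match?` are hypothetical, and `String#unicode_normalize` assumes Ruby 2.2 or later.

```ruby
# Hypothetical helper, not part of Rails: reduce a string to a form in which
# accents and letter case no longer matter.
def fold(str)
  decomposed = str.unicode_normalize(:nfd)   # "ã" becomes "a" + combining tilde
  decomposed.gsub(/\p{Mn}/, '').downcase     # drop the combining marks, ignore case
end

# Both the search and the highlighter can ask the same question.
def folded_match?(text, query)
  fold(text).include?(fold(query))
end

record = "P\u00E3o de queijo"                       # "Pão de queijo"
puts folded_match?(record, 'pao de queijo')         # true
puts folded_match?(record, "p\u00E3o de queijo")    # true
puts folded_match?(record, 'queijo coalho')         # false
```

If the database does the searching, the same effect comes from making the highlighter apply an equivalent folding rule, so that the two steps cannot disagree about what counts as a match.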
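Maybe you are searching a UTF-8 string directly against a MySQL database?
A properly configured MySQL server (and probably any other mainstream database server) will perform case-insensitive and accent-insensitive comparisons correctly.
That is not the case for Ruby, though. As of version 1.8, Ruby does not support Unicode strings. So you get correct results from the database server, but Rails' highlight function, which uses gsub, fails to find the search string. You need to reimplement highlight using a Unicode-aware string library such as ICU4R.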
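The mismatch is easy to reproduce outside Rails. The snippet below is a minimal sketch, not the Rails implementation: `naive_highlight` is a hypothetical stand-in for a `gsub`-based highlighter, and the `<b>` wrapper is only illustrative. Even on Ruby versions with Unicode-aware strings, a plain `gsub` does no accent folding, so "a" and "ã" remain different characters.

```ruby
# Hypothetical stand-in for a gsub-based highlighter: wraps literal,
# case-insensitive occurrences of +query+ in <b> tags.
def naive_highlight(text, query)
  text.gsub(/(#{Regexp.escape(query)})/i, '<b>\1</b>')
end

accented = "p\u00E3o de queijo"   # "pão de queijo"

# The unaccented query never matches, so the text comes back unchanged.
puts naive_highlight(accented, 'pao de queijo') == accented   # true

# A query without accented letters still works as expected.
puts naive_highlight(accented, 'QUEIJO')   # pão de <b>queijo</b>
```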
I have just submitted a patch to Rails that solves this problem.
http://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/3593-patch-support-for-highlighting-with-ignoring-special-chars
# Highlights one or more +phrases+ everywhere in +text+ by inserting it into
# a <tt>:highlighter</tt> string. The highlighter can be specialized by passing <tt>:highlighter</tt>
# as a single-quoted string with \1 where the phrase is to be inserted (defaults to
# '<strong class="highlight">\1</strong>')
#
# ==== Examples
#   highlight('You searched for: rails', 'rails')
#   # => You searched for: <strong class="highlight">rails</strong>
#
#   highlight('You searched for: ruby, rails, dhh', 'actionpack')
#   # => You searched for: ruby, rails, dhh
#
#   highlight('You searched for: rails', ['for', 'rails'], :highlighter => '<em>\1</em>')
#   # => You searched <em>for</em>: <em>rails</em>
#
#   highlight('You searched for: rails', 'rails', :highlighter => '<a href="search?q=\1">\1</a>')
#   # => You searched for: <a href="search?q=rails">rails</a>
#
#   highlight('šumné dievčatá', ['šumňe', 'dievca'], :ignore_special_chars => true)
#   # => <strong class="highlight">šumné</strong> <strong class="highlight">dievča</strong>tá
#
# You can still use <tt>highlight</tt> with the old API that accepts the
# +highlighter+ as its optional third parameter:
#   highlight('You searched for: rails', 'rails', '<a href="search?q=\1">\1</a>')
#   # => You searched for: <a href="search?q=rails">rails</a>
def highlight(text, phrases, *args)
  options = args.extract_options!
  unless args.empty?
    options[:highlighter] = args[0] || '<strong class="highlight">\1</strong>'
  end
  options.reverse_merge!(:highlighter => '<strong class="highlight">\1</strong>')

  if text.blank? || phrases.blank?
    text
  else
    haystack = text.clone
    match = Array(phrases).map { |p| Regexp.escape(p) }.join('|')
    if options[:ignore_special_chars]
      # Decompose accented characters into base letter + combining marks.
      haystack = haystack.mb_chars.normalize(:kd)
      # Strip non-ASCII from the pattern, then let every word character be
      # followed by any run of non-ASCII characters (the combining marks).
      match = match.mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]+/n, '').gsub(/\w/, '\0[^\x00-\x7F]*')
    end
    # The negative lookahead skips matches that fall inside an HTML tag.
    highlighted = haystack.gsub(/(#{match})(?!(?:[^<]*?)(?:["'])[^<>]*>)/i, options[:highlighter])
    # Recompose the text so the accents are restored in the output.
    highlighted = highlighted.mb_chars.normalize(:kc) if options[:ignore_special_chars]
    highlighted
  end
end
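The core idea of the patch (decompose, strip the accents from the pattern, tolerate combining marks in the text, recompose) can be tried without ActiveSupport. What follows is a rough sketch under stated assumptions, not the patch itself: `highlight_ignoring_accents` and `DEFAULT_HIGHLIGHTER` are hypothetical names, `String#unicode_normalize` (Ruby 2.2 or later) stands in for `mb_chars.normalize`, and the HTML-tag lookahead is left out for brevity.

```ruby
# Hypothetical standalone helper mirroring the idea of :ignore_special_chars.
DEFAULT_HIGHLIGHTER = '<strong class="highlight">\1</strong>'

def highlight_ignoring_accents(text, phrases, highlighter = DEFAULT_HIGHLIGHTER)
  return text if text.nil? || text.empty?

  # Decompose the text so "ã" becomes "a" followed by a combining tilde.
  haystack = text.unicode_normalize(:nfkd)

  patterns = Array(phrases).compact.map do |phrase|
    # Decompose the phrase too, then keep only its ASCII characters.
    ascii = phrase.unicode_normalize(:nfkd).gsub(/[^\x00-\x7F]/, '')
    # After each word character, tolerate any run of non-ASCII characters
    # (the combining marks that are still present in the haystack).
    ascii.each_char.map { |ch|
      (ch =~ /\w/) ? "#{ch}[^\\x00-\\x7F]*" : Regexp.escape(ch)
    }.join
  end.reject(&:empty?)
  return text if patterns.empty?

  # Wrap the matches, then recompose so the accents come back intact.
  haystack.gsub(/(#{patterns.join('|')})/i, highlighter).unicode_normalize(:nfc)
end

puts highlight_ignoring_accents("p\u00E3o de queijo", 'pao de queijo')
# <strong class="highlight">pão de queijo</strong>
```

One caveat of this approach: the tolerant pattern accepts any non-ASCII character after a letter, not only combining marks, so it can over-match in text that mixes scripts.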
Here is my article, which explains an elegant solution that needs neither Rails nor ActiveSupport.