Matching accented characters with JavaScript regular expressions

Question

Asked by nickf

Here's a fun snippet I ran into today:

/\ba/.test("a") --> true
/\bà/.test("à") --> false

However,

/à/.test("à") --> true

Firstly, wtf?

Secondly, if I want to match an accented character at the start of a word, how can I do that? (I'd really like to avoid using over-the-top selectors like /(?:^|\s|'| ....)

Answer 1

Answered by Wak

This worked for me:

/^[a-z\u00E0-\u00FC]+$/i

With help from here
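
A quick check of what this pattern accepts, as a minimal sketch. The helper name isLatin1Word is hypothetical and only used for illustration. Note that the range U+00E0 to U+00FC also contains the division sign (U+00F7), and that letters outside Latin-1 such as "œ" (U+0153) are rejected.

```javascript
// Wak's pattern: ASCII letters plus the range U+00E0..U+00FC, case-insensitive.
const latin1Word = /^[a-z\u00E0-\u00FC]+$/i;

// isLatin1Word is a hypothetical helper name used only in this sketch.
function isLatin1Word(text) {
  return latin1Word.test(text);
}

console.log(isLatin1Word("caf\u00E9")); // "café": true
console.log(isLatin1Word("\u00DCnal")); // "Ünal": true, the i flag covers uppercase
console.log(isLatin1Word("\u00F7"));    // "÷": true, the range is not letters only
console.log(isLatin1Word("\u0153uf"));  // "œuf": false, U+0153 is outside the range
```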

Answer 2

Answered by Riimu

The reason /\bà/.test("à") doesn't match is that "à" is not a word character. The escape sequence \b matches only at a boundary between a word character and a non-word character. /\ba/.test("a") matches because "a" is a word character, so there is a boundary between the beginning of the string (which is not a word character) and the letter "a". In "à", both sides of that position are non-word, so there is no boundary.

Word characters in JavaScript's regex are defined as [a-zA-Z0-9_].
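
The behaviour described above can be checked directly. This is a small sketch; it writes "à" as the escape \u00E0 so the example does not depend on how the file is encoded or normalized.

```javascript
const aGrave = "\u00E0"; // precomposed "à"

// \w only covers [a-zA-Z0-9_], so "à" is not a word character.
const plainIsWord = /\w/.test("a");
const accentIsWord = /\w/.test(aGrave);

// No boundary at the start of "à": both sides are non-word.
const boundaryBeforeAccent = /\b\u00E0/.test(aGrave);
// \B (not a word boundary) therefore does match there.
const nonBoundaryBeforeAccent = /\B\u00E0/.test(aGrave);
// Between "a" (word) and "à" (non-word) there is a boundary.
const boundaryInsideWord = /a\b\u00E0/.test("a" + aGrave);

console.log(plainIsWord);             // true
console.log(accentIsWord);            // false
console.log(boundaryBeforeAccent);    // false
console.log(nonBoundaryBeforeAccent); // true
console.log(boundaryInsideWord);      // true
```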

To match an accented character at the start of a string, just use the ^ character at the beginning of the regex (e.g. /^à/). That character means the beginning of the string (unlike \b, which matches at any word boundary within the string). It's the most basic and standard regular expression, so it's definitely not over the top.
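
For the start of a word rather than the start of the string, one possible approach (not from the answers here) is a negative lookbehind with Unicode property escapes. This sketch assumes an engine with ES2018 support for lookbehind and \p{...} under the u flag. The helper name hasWordStartingWithAGrave is hypothetical, and the NFC normalization step is an added assumption so that a decomposed "a" plus combining grave accent is also handled.

```javascript
// "à" not preceded by a letter, digit, or underscore, i.e. at the start of a word.
// Requires the u flag for \p{L} and \p{N}.
const wordStartAGrave = /(?<![\p{L}\p{N}_])\u00E0/u;

// Hypothetical helper: normalizes to NFC first so "a" + U+0300 becomes U+00E0.
function hasWordStartingWithAGrave(text) {
  return wordStartAGrave.test(text.normalize("NFC"));
}

console.log(hasWordStartingWithAGrave("\u00E0 la carte"));     // true
console.log(hasWordStartingWithAGrave("c'est \u00E0 vous"));   // true
console.log(hasWordStartingWithAGrave("voil\u00E0"));          // false, "à" follows "l"
console.log(hasWordStartingWithAGrave("a\u0300 la carte"));    // true after NFC
```

The lookbehind plays the role that \b would play if \w were Unicode-aware, without spelling out every separator as in /(?:^|\s|'| ....).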

Answer 3

Answered by stema

Stack Overflow also had an issue with non-ASCII characters in regexes; you can find it here. It doesn't deal with word boundaries, but it may give you useful hints anyway.

There is another page, but its author wants to match whole strings rather than words.

I don't know of an anchor that solves your problem, and I couldn't find one just now. But having seen the monster regexes used in my first link, I think the group you want to avoid is not over the top; in my opinion, it is your solution.

Answer 4

Answered by Craig1123

const regex = /^[\-/A-Za-z\u00C0-\u017F ]+$/;
const test1 = regex.test("à");
const test2 = regex.test("Martinez-Cortez");
const test3 = regex.test("Leonardo da vinci");
const test4 = regex.test("?");

console.log('test1', test1);
console.log('test2', test2);
console.log('test3', test3);
console.log('test4', test4);

Building off of Wak's and Cœur's answers:

/^[\-/A-Za-z\u00C0-\u017F ]+$/

Works for spaces and dashes too.

Example: Leonardo da vinci, Martinez-Cortez
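
One thing to be aware of: the range U+00C0 to U+017F also contains the multiplication sign (U+00D7) and the division sign (U+00F7), and the character class above includes "/". As a rough sketch of an alternative, assuming an engine that supports Unicode property escapes (ES2018), \p{L} restricts the match to letters. The names craigPattern and lettersOnly are hypothetical and used only for this comparison.

```javascript
// The pattern from this answer, unchanged.
const craigPattern = /^[\-/A-Za-z\u00C0-\u017F ]+$/;

// Assumed alternative: any Unicode letter, plus space and hyphen (needs the u flag).
const lettersOnly = /^[\p{L} \-]+$/u;

for (const name of ["\u00E0", "Martinez-Cortez", "Leonardo da vinci"]) {
  console.log(name, craigPattern.test(name), lettersOnly.test(name)); // true true
}

console.log(craigPattern.test("\u00D7")); // "×": true, it sits inside the range
console.log(lettersOnly.test("\u00D7"));  // "×": false, it is a symbol, not a letter
console.log(craigPattern.test("a/b"));    // true, "/" is in the character class
console.log(lettersOnly.test("a/b"));     // false
```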