Disclaimer: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2150205/

Time: 2020-11-03 23:54:23  Source: igfitidea

Can somebody explain a money regex that just checks if the value matches some pattern?

Tags: python, regex, currency

Asked by orokusaki

There are multiple posts on here that capture value, but I'm just looking to check to see if the value is something. More vaguely put; I'm looking to understand the difference between checking a value, and "capturing" a value. In the current case the value would be the following acceptable money formats:

.50
50
50.00
50.0
$5000.00
$.50

Here is a post that explains some about a money regex, but I don't understand it a bit.

I don't want commas (people should know that's ridiculous).

The things I'm having trouble with are:

  1. Allowing for a $ at the starting of the value (but still optional)
  2. Allowing for only 1 decimal point (but not allowing it at the end)
  3. Understanding how it's working inside
  4. Understanding how to get a normalized version (only digits and the optional decimal point) out of it, with the dollar sign stripped.

My current regex (which obviously doesn't work right) is:

# I'm checking the Boolean of the following:
re.compile(r'^[$][\d\.]$').search(value)

(Note: I'm working in Python)

Answered by Greg Bacon

Assuming you want to allow $5. but not 5., the following will accept your language:

import re

money = re.compile('|'.join([
  r'^\$?(\d*\.\d{1,2})$',  # e.g., $.50, .50, $1.50, $.5, .5
  r'^\$?(\d+)$',           # e.g., $500, $5, 500, 5
  r'^\$(\d+\.?)$',         # e.g., $5.
]))

Important pieces to understand:

  • ^ and $ match only at the beginning and end of the input string, respectively (in Python, $ also matches just before a trailing newline)
  • \. matches a literal dot
  • \$ matches a literal dollar sign
    • \$? matches a dollar sign or nothing (i.e., an optional dollar sign)
  • \d matches any single digit (0-9)
    • \d* matches runs of zero or more digits
    • \d+ matches runs of one or more digits
    • \d{1,2} matches any single digit or a run of two digits
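
The pieces listed above can be tried in isolation. The following is a minimal sketch, not part of the original answer; the helper name matches and the sample strings are assumptions chosen for illustration.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if `text` matches `pattern` from the start."""
    return re.match(pattern, text) is not None

# Optional dollar sign: both '$' and the empty string are accepted.
print(matches(r'^\$?$', '$'), matches(r'^\$?$', ''))

# \d* accepts an empty run of digits; \d+ needs at least one digit.
print(matches(r'^\d*$', ''), matches(r'^\d+$', ''))

# \d{1,2} accepts one or two digits, but not three.
print(matches(r'^\d{1,2}$', '50'), matches(r'^\d{1,2}$', '500'))

# An escaped dot matches only a literal '.'; an unescaped dot matches any character.
print(matches(r'^\.$', '5'), matches(r'^.$', '5'))
```

Because every pattern is anchored with ^ and $, each call checks the whole sample string rather than a prefix of it.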

The parenthesized subpatterns are capture groups: all text in the input matched by the subexpression in a capture group will be available in matchobj.group(index). The dollar sign won't be captured because it's outside the parentheses.

Because Python doesn't support multiple capture groups with the same name (!!!), we must search through matchobj.groups() for the one that isn't None. This also means that when modifying the pattern, you have to be careful to use (?:...) for every group except the amount.
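
As a rough sketch of that search, the lookup can be wrapped in a small function. The name extract_amount is hypothetical and not from the answer; the pattern is the three-alternative one given earlier.

```python
import re

# The three alternatives from the answer, each with its own capture group.
MONEY = re.compile('|'.join([
    r'^\$?(\d*\.\d{1,2})$',
    r'^\$?(\d+)$',
    r'^\$(\d+\.?)$',
]))

def extract_amount(value):
    """Hypothetical helper: return the captured amount, or None if no match."""
    m = MONEY.match(value)
    if m is None:
        return None
    # Only one alternative can match, so only one group is not None.
    return next(g for g in m.groups() if g is not None)

for sample in ['$.50', '$5.', '5.', '$1,00']:
    print(repr(sample), '->', repr(extract_amount(sample)))
```

If a helper group were added inside one of the alternatives without (?:...), groups() would contain an extra entry and this lookup could return the wrong piece, which is the risk the paragraph above warns about.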

Tweaking Mark's nice test harness, we get:

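# `tests` is the list of (input, expected) pairs from Mark Byers's answer,
# extended with ('5.', False) and ('.5', True) as the output below shows
for test, expected in tests:
    result = money.match(test)
    is_match = result is not None
    if is_match == expected:
        status = 'OK'
        if result:
            # only one alternative matched, so take its non-None group
            amt = [x for x in result.groups() if x is not None].pop()
            status += ' (%s)' % amt
    else:
        status = 'Fail'
    print(test + '\t' + status)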

Output:

.50     OK (.50)
50      OK (50)
50.00   OK (50.00)
50.0    OK (50.0)
$5000   OK (5000)
$.50    OK (.50)
$5.     OK (5.)
5.      OK
$1.000  OK
5000$   OK
$5.00$  OK
$-5.00  OK
$1,00   OK
        OK
$       OK
.       OK
.5      OK (.5)

Answered by Mark Byers

Here's a regex you can use:

regex = re.compile(r'^\$?(\d*(\d\.?|\.\d{1,2}))$')

Here's a test-bed I used to test it. I've included all your tests, plus some of my own. I've also included some negative tests, as making sure that it doesn't match when it shouldn't is just as important as making sure that it does match when it should.

import re

# (input, expected) pairs: True means the input should be accepted
tests = [
    ('.50', True),
    ('50', True),
    ('50.00', True),
    ('50.0', True),
    ('$5000', True),
    ('$.50', True),
    ('$5.', True),
    ('$1.000', False),
    ('5000$', False),
    ('$5.00$', False),
    ('$-5.00', False),
    ('$1,00', False),
    ('', False),
    ('$', False),
    ('.', False),
]

regex = re.compile(r'^\$?(\d*(\d\.?|\.\d{1,2}))$')
for test, expected in tests:
    result = regex.match(test)
    is_match = result is not None
    print(test + '\t' + ('OK' if is_match == expected else 'Fail'))

To get the value without the $, you can use the captured group:

# `result` must be a successful match (not None)
print(result.group(1))
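
To make the difference between checking and capturing concrete, here is a small sketch built on the same regex. The function names is_money and normalize are hypothetical, not part of the answer.

```python
import re

REGEX = re.compile(r'^\$?(\d*(\d\.?|\.\d{1,2}))$')

def is_money(value):
    """Checking: only report whether the value fits the pattern."""
    return REGEX.match(value) is not None

def normalize(value):
    """Capturing: return the amount without the dollar sign, or None."""
    m = REGEX.match(value)
    return m.group(1) if m else None

for sample in ['$.50', '50.00', '$5000', '5000$']:
    print(repr(sample), is_money(sample), repr(normalize(sample)))
```

Both functions run the same match; the only difference is whether the caller looks at the match object's groups or just at whether a match exists.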

Answered by Anon.

"Also understanding how to get a normalized version (only digits and the optional decimal point) out of it that strips the dollar sign."

This is also known as "capturing" the value ;)

Working off Aaron's base example:

/^\$?(\d+(?:\.\d{1,2})?)$/

Then the amount (without the dollar sign) will be in capture group 1.
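
The slash-delimited form above is not Python syntax. A possible Python rendering, offered as a sketch with the hypothetical names PATTERN and amount, might look like this:

```python
import re

# (?:...) groups the optional decimal part without capturing it,
# so the amount is the only capture group.
PATTERN = re.compile(r'^\$?(\d+(?:\.\d{1,2})?)$')

def amount(value):
    """Hypothetical helper: return capture group 1, or None if no match."""
    m = PATTERN.match(value)
    return m.group(1) if m else None

for sample in ['$50.00', '50', '.50']:
    print(repr(sample), '->', repr(amount(sample)))
```

Note that, as written, this pattern requires at least one digit before the decimal point, so an input such as .50 from the question is not accepted.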

Answered by Aaron

I believe the following regex will meet your needs:

/^\$?(\d*(\.\d\d?)?|\d+)$/

It allows for an optional '$'. It allows for an optional decimal, but requires at least one but not more than two digits after the decimal if the decimal is present.

Edit: The outer parentheses will capture the whole numeric value for you.
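
A quick way to see what this pattern accepts is to run it over a few inputs. This is only a sketch; the sample list and the names PATTERN and results are assumptions for illustration.

```python
import re

PATTERN = re.compile(r'^\$?(\d*(\.\d\d?)?|\d+)$')

samples = ['$.50', '50.0', '5.', '', '$']
# Map each sample to whether the pattern accepts it.
results = {s: PATTERN.match(s) is not None for s in samples}

for s in samples:
    print(repr(s), results[s])

# Group 1 (the outer parentheses) holds the amount without the dollar sign.
print(PATTERN.match('$.50').group(1))
```

Because every part after the optional dollar sign is itself optional, the empty string and a lone $ are accepted too. If those should be rejected, which is an assumption about the requirements, an extra check on group 1 being non-empty would be needed.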
