Python: how to extract a floating-point number from a string

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original: http://stackoverflow.com/questions/4703390/

Date: 2020-08-18 17:03:18  Source: igfitidea

How to extract a floating number from a string

Tags: python, regex, floating-point, data-extraction

Asked by Ben Keating

I have a number of strings similar to "Current Level: 13.4 db." and I would like to extract just the floating point number. I say floating and not decimal as it's sometimes whole. Can RegEx do this or is there a better way?

Accepted answer by miku

If your float is always expressed in decimal notation something like

>>> import re
>>> re.findall(r"\d+\.\d+", "Current Level: 13.4 db.")
['13.4']

may suffice.

A more robust version would be:

>>> re.findall(r"[-+]?\d*\.\d+|\d+", "Current Level: -13.2 db or 14.2 or 3")
['-13.2', '14.2', '3']
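
The matches come back as strings, so a conversion step is usually still needed. A minimal sketch that reuses the pattern above; the helper name `extract_floats` is hypothetical and not part of the original answer:

```python
import re

# Pattern from the answer above: optional sign, then either a decimal
# number (digits before the point are optional) or a plain integer.
FLOAT_RE = re.compile(r"[-+]?\d*\.\d+|\d+")

def extract_floats(text):
    """Return every number found in text as a float (hypothetical helper)."""
    return [float(match) for match in FLOAT_RE.findall(text)]

print(extract_floats("Current Level: -13.2 db or 14.2 or 3"))
# [-13.2, 14.2, 3.0]
```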

If you want to validate user input, you could alternatively check for a float by attempting the conversion directly:

user_input = "Current Level: 1e100 db"
for token in user_input.split():
    try:
        # if this succeeds, you have your (first) float
        print(float(token), "is a float")
    except ValueError:
        print(token, "is something else")

# => Would print ...
#
# Current is something else
# Level: is something else
# 1e+100 is a float
# db is something else
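
The same idea can be wrapped in a small function that returns the first token `float()` accepts. This is only a sketch; the name `first_float` is an assumption made for illustration:

```python
def first_float(text):
    """Return the first whitespace-separated token that float() accepts, else None."""
    for token in text.split():
        try:
            return float(token)
        except ValueError:
            continue  # not a number, try the next token
    return None

print(first_float("Current Level: 1e100 db"))  # 1e+100
print(first_float("no numbers here"))          # None
```

Two caveats with this approach: `float()` also accepts tokens such as "nan" and "inf", and a number glued to its unit (for example "13.4db") is not recognised because the whole token has to parse.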

Answer by Tim McNamara

Another approach that may be more readable is simple type conversion. I've added a replacement function to cover instances where people may enter European decimals:

>>> for possibility in "Current Level: -13.2 db or 14,2 or 3".split():
...     try:
...         str(float(possibility.replace(',', '.')))
...     except ValueError:
...         pass
'-13.2'
'14.2'
'3.0'

This has disadvantages too, however. If someone types in "1,000", it will be converted to 1. It also assumes that people separate words with whitespace, which is not the case in some languages, such as Chinese.
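
The thousands-separator pitfall is easy to reproduce. A small sketch, with `to_float_european` as a hypothetical helper name:

```python
def to_float_european(token):
    """Treat a comma as the decimal separator (hypothetical helper)."""
    return float(token.replace(",", "."))

print(to_float_european("14,2"))   # 14.2
print(to_float_european("1,000"))  # 1.0, not one thousand
```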

Answer by John Machin

You may like to try something like this which covers all the bases, including not relying on whitespace after the number:

>>> import re
>>> numeric_const_pattern = r"""
...     [-+]? # optional sign
...     (?:
...         (?: \d* \. \d+ ) # .1 .12 .123 etc 9.1 etc 98.1 etc
...         |
...         (?: \d+ \.? ) # 1. 12. 123. etc 1 12 123 etc
...     )
...     # followed by optional exponent part if desired
...     (?: [Ee] [+-]? \d+ ) ?
...     """
>>> rx = re.compile(numeric_const_pattern, re.VERBOSE)
>>> rx.findall(".1 .12 9.1 98.1 1. 12. 1 12")
['.1', '.12', '9.1', '98.1', '1.', '12.', '1', '12']
>>> rx.findall("-1 +1 2e9 +2E+09 -2e-9")
['-1', '+1', '2e9', '+2E+09', '-2e-9']
>>> rx.findall("current level: -2.03e+99db")
['-2.03e+99']
>>>

For easy copy-pasting:

numeric_const_pattern = r'[-+]? (?: (?: \d* \. \d+ ) | (?: \d+ \.? ) )(?: [Ee] [+-]? \d+ ) ?'
rx = re.compile(numeric_const_pattern, re.VERBOSE)
rx.findall("Some example: Jr. it. was .23 between 2.3 and 42.31 seconds")
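
For reference, this is what the one-liner returns and how the matches convert to floats. The pattern is the same one, written as a raw string; the variable name `matches` is mine:

```python
import re

# Same pattern as the one-liner above. re.VERBOSE makes the regex engine
# ignore the spaces inside the pattern.
numeric_const_pattern = r'[-+]? (?: (?: \d* \. \d+ ) | (?: \d+ \.? ) )(?: [Ee] [+-]? \d+ ) ?'
rx = re.compile(numeric_const_pattern, re.VERBOSE)

matches = rx.findall("Some example: Jr. it. was .23 between 2.3 and 42.31 seconds")
print(matches)                      # ['.23', '2.3', '42.31']
print([float(m) for m in matches])  # [0.23, 2.3, 42.31]
```

The dots in "Jr." and "it." are skipped because neither alternative matches a dot that has no digit next to it.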

Answer by Martin

re.findall(r"[-+]?\d*\.?\d+|\d+", "Current Level: -13.2 db or 14.2 or 3")

as described above, works really well! One suggestion though:

re.findall(r"[-+]?\d*\.?\d+|[-+]?\d+", "Current Level: -13.2 db or 14.2 or 3 or -3")

will also return negative int values (like -3 in the end of this string)
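
To see where the sign on the second branch matters, the variants can be run side by side. This is a quick comparison; the labels are mine and not from the answers:

```python
import re

text = "Current Level: -13.2 db or 14.2 or 3 or -3"

patterns = {
    "strict point, unsigned int":   r"[-+]?\d*\.\d+|\d+",
    "strict point, signed int":     r"[-+]?\d*\.\d+|[-+]?\d+",
    "optional point, unsigned int": r"[-+]?\d*\.?\d+|\d+",
    "optional point, signed int":   r"[-+]?\d*\.?\d+|[-+]?\d+",
}

results = {label: re.findall(pattern, text) for label, pattern in patterns.items()}
for label, found in results.items():
    print(label, found)
# strict point, unsigned int ['-13.2', '14.2', '3', '3']
# strict point, signed int ['-13.2', '14.2', '3', '-3']
# optional point, unsigned int ['-13.2', '14.2', '3', '-3']
# optional point, signed int ['-13.2', '14.2', '3', '-3']
```

Once the decimal point is optional (`\.?`), the first branch already matches a signed integer, so the extra `[-+]?` only changes the result for the variant that requires the point.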

Answer by eyquem

I think you'll find interesting material in the following answer of mine to an earlier, similar question:

https://stackoverflow.com/q/5929469/551449

In this answer, I proposed a pattern that allows a regex to catch any kind of number. Since I have nothing else to add to it, I think it is fairly complete.

Answer by IceArdor

The Python docs have an answer that covers +/- and exponent notation:

scanf() Token      Regular Expression
%e, %E, %f, %g     [-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?
%i                 [-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)

This regular expression does not support international formats where a comma is used as the separator between the whole and fractional parts (3,14159). In that case, replace every \. with [.,] in the float regex above.

                        Regular Expression
International float     [-+]?(\d+([.,]\d*)?|[.,]\d+)([eE][-+]?\d+)?
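
These patterns contain capturing groups, so `re.findall` would return tuples of groups rather than whole matches. One way around that, sketched here with illustrative names (`INTL_FLOAT`, `raw`, `values`), is to take `group(0)` from `re.finditer` and normalise the comma before calling `float()`:

```python
import re

# International float pattern from the table above (has capturing groups).
INTL_FLOAT = re.compile(r"[-+]?(\d+([.,]\d*)?|[.,]\d+)([eE][-+]?\d+)?")

text = "pi is 3,14159 and e is 2.71828"

# group(0) is the whole match, regardless of the capturing groups.
raw = [m.group(0) for m in INTL_FLOAT.finditer(text)]
print(raw)     # ['3,14159', '2.71828']

# Normalise the decimal comma before handing the text to float().
values = [float(s.replace(",", ".")) for s in raw]
print(values)  # [3.14159, 2.71828]
```

Note that with `[.,]` in the pattern, a comma used as punctuation directly after a number (as in "3, 4") becomes part of the match ("3,").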

Answer by user3613331

You can use the following regex to get integer and floating values from a string:

re.findall(r'[\d\.\d]+', 'hello -34 42 +34.478m 88 cricket -44.3')

['34', '42', '34.478', '88', '44.3']
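
A note on what this pattern actually matches: `[\d\.\d]+` is a character class, so it behaves like `[\d.]+`, meaning any run of digits and dots in any order. The short check below uses a second input string of my own to show the side effects:

```python
import re

# The pattern from the answer: signs are not part of the class, so they are dropped.
first = re.findall(r'[\d\.\d]+', 'hello -34 42 +34.478m 88 cricket -44.3')
print(first)   # ['34', '42', '34.478', '88', '44.3']

# Equivalent class; version strings and stray dots also match.
second = re.findall(r'[\d.]+', 'v1.2.3 ... done.')
print(second)  # ['1.2.3', '...', '.']
```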

Thanks Rex
