Convert float to string in positional format (without scientific notation and false precision)
Original question: http://stackoverflow.com/questions/38847690/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Antti Haapala
I want to print some floating point numbers so that they're always written in decimal form (e.g. 12345000000000000000000.0 or 0.000000000000012345), not in scientific notation, yet I'd want the result to have up to ~15.7 significant figures of an IEEE 754 double, and no more.
Ideally, the result should be the shortest string in positional decimal format that still results in the same value when converted to a float.
It is well known that the repr of a float is written in scientific notation if the exponent is greater than 15, or less than -4:
>>> n = 0.000000054321654321
>>> n
5.4321654321e-08 # scientific notation
If str is used, the resulting string is again in scientific notation:
>>> str(n)
'5.4321654321e-08'
It has been suggested that I can use format with the f flag and sufficient precision to get rid of the scientific notation:
>>> format(0.00000005, '.20f')
'0.00000005000000000000'
It works for that number, though it has some extra trailing zeroes. But then the same format fails for .1, which gives decimal digits beyond the actual machine precision of float:
>>> format(0.1, '.20f')
'0.10000000000000000555'
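The extra digits are not noise added by format: they are the leading digits of the exact binary value stored for 0.1. A minimal sketch using the standard decimal module makes this visible (the variable names are only illustrative):

```python
from decimal import Decimal

x = 0.1

# Decimal(float) converts the stored binary double exactly, with no rounding.
exact = Decimal(x)
print(exact)

# repr() instead picks the shortest string that parses back to the same float.
shortest = repr(x)
print(shortest)

# Both strings denote the same double once parsed back.
assert float(str(exact)) == x
assert float(shortest) == x
```

So a fixed '.20f' simply exposes more of the exact expansion than is needed to identify the float.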
And if my number is 4.5678e-20, using .20f would still lose relative precision:
>>> format(4.5678e-20, '.20f')
'0.00000000000000000005'
Thus these approaches do not match my requirements.
This leads to the question: what is the easiest and also well-performing way to print an arbitrary floating point number in decimal format, having the same digits as in repr(n) (or str(n) on Python 3), but always using the decimal format, not the scientific notation?
That is, a function or operation that, for example, converts the float value 0.00000005 to the string '0.00000005'; 0.1 to '0.1'; 420000000000000000.0 to '420000000000000000.0' or 420000000000000000; and formats the float value -4.5678e-5 as '-0.000045678'.
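The requirement can be turned into a small check that any candidate formatter has to pass. This is only a sketch: the helper name is_acceptable is made up here, it assumes finite inputs, and it tests positional form and round-tripping but not that the string is the shortest possible.

```python
def is_acceptable(text, value):
    """Return True if text is a positional rendering that round-trips to value.

    Hypothetical helper for illustration; it does not check minimal length.
    """
    # No exponent marker means the string is in positional notation.
    positional = 'e' not in text.lower()
    # Parsing the string back must give exactly the original float.
    round_trips = float(text) == value
    return positional and round_trips


print(is_acceptable('0.00000005', 0.00000005))                 # positional and exact
print(is_acceptable(repr(0.00000005), 0.00000005))             # repr uses an exponent here
print(is_acceptable(format(4.5678e-20, '.20f'), 4.5678e-20))   # digits were lost
```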
After the bounty period: It seems that there are at least 2 viable approaches, as Karin demonstrated that using string manipulation one can achieve a significant speed boost compared to my initial algorithm on Python 2.
Thus,
- If performance is important and Python 2 compatibility is required, or if the decimal module cannot be used for some reason, then Karin's approach using string manipulation is the way to do it.
- On Python 3, my somewhat shorter code will also be faster.
Since I am primarily developing on Python 3, I will accept my own answer, and shall award Karin the bounty.
Accepted answer by Antti Haapala
Unfortunately it seems that not even the new-style formatting with float.__format__ supports this. The default formatting of floats is the same as with repr; and with the f flag there are 6 fractional digits by default:
>>> format(0.0000000005, 'f')
'0.000000'
However there is a hack to get the desired result - not the fastest one, but relatively simple:
- First the float is converted to a string using str() or repr().
- Then a new Decimal instance is created from that string. Decimal.__format__ supports the f flag, which gives the desired result and, unlike floats, prints the actual precision instead of a default precision.
Thus we can make a simple utility function float_to_str:
import decimal

# create a new context for this task
ctx = decimal.Context()

# 20 digits should be enough for everyone :D
ctx.prec = 20

def float_to_str(f):
    """
    Convert the given float to a string,
    without resorting to scientific notation
    """
    d1 = ctx.create_decimal(repr(f))
    return format(d1, 'f')
Care must be taken not to use the global decimal context, so a new context is constructed for this function. This is the fastest way; another way would be to use decimal.localcontext, but it would be slower, creating a new thread-local context and a context manager for each conversion.
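For comparison, the slower alternative could look roughly like the sketch below. The function name float_to_str_localcontext and the prec parameter are assumptions made for illustration; decimal.localcontext() is the standard-library context manager that installs a temporary copy of the current context.

```python
import decimal


def float_to_str_localcontext(f, prec=20):
    """Convert a float to positional notation using a temporary context."""
    # localcontext() copies the current context and restores the original
    # on exit, so the global context is left untouched.
    with decimal.localcontext() as ctx:
        ctx.prec = prec
        return format(ctx.create_decimal(repr(f)), 'f')


print(float_to_str_localcontext(0.1))
print(float_to_str_localcontext(0.00000005))
```

The context manager is entered and left on every call, which is where the extra cost comes from.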
float_to_str now returns the string with all possible digits from the mantissa, rounded to the shortest equivalent representation:
>>> float_to_str(0.1)
'0.1'
>>> float_to_str(0.00000005)
'0.00000005'
>>> float_to_str(420000000000000000.0)
'420000000000000000'
>>> float_to_str(0.000000000123123123123123123123)
'0.00000000012312312312312313'
The last result is rounded at the last digit.
As @Karin noted, float_to_str(420000000000000000.0) does not strictly match the expected format; it returns 420000000000000000 without the trailing .0.
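If the trailing .0 is wanted, one possible adjustment is to append it whenever the formatted string has no decimal point. A minimal sketch, assuming finite inputs; the names float_to_str_with_point and _ctx are made up for illustration:

```python
import decimal

# Private context so the global decimal context is never touched.
_ctx = decimal.Context(prec=20)


def float_to_str_with_point(f):
    """Positional formatting that always contains a decimal point."""
    text = format(_ctx.create_decimal(repr(f)), 'f')
    # Whole numbers come back without a fractional part; add one.
    if '.' not in text:
        text += '.0'
    return text


print(float_to_str_with_point(420000000000000000.0))
print(float_to_str_with_point(0.1))
```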
Answer by Karin
If you are satisfied with the precision in scientific notation, then could we just take a simple string manipulation approach? Maybe it's not terribly clever, but it seems to work (passes all of the use cases you've presented), and I think it's fairly understandable:
def float_to_str(f):
    float_string = repr(f)
    if 'e' in float_string:  # detect scientific notation
        digits, exp = float_string.split('e')
        digits = digits.replace('.', '').replace('-', '')
        exp = int(exp)
        sign = '-' if f < 0 else ''
        if exp > 0:
            # pad so that exp + 1 digits precede the decimal point
            # (the fixed "exp - 1" padding was only right for two-digit mantissas)
            zero_padding = '0' * (exp - len(digits) + 1)
            float_string = '{}{}{}.0'.format(sign, digits, zero_padding)
        else:
            # minus 1 for the digit before the decimal point in the sci notation
            zero_padding = '0' * (abs(exp) - 1)
            float_string = '{}0.{}{}'.format(sign, zero_padding, digits)
    return float_string

n = 0.000000054321654321
assert(float_to_str(n) == '0.000000054321654321')

n = 0.00000005
assert(float_to_str(n) == '0.00000005')

n = 420000000000000000.0
assert(float_to_str(n) == '420000000000000000.0')

n = 4.5678e-5
assert(float_to_str(n) == '0.000045678')

n = 1.1
assert(float_to_str(n) == '1.1')

n = -4.5678e-5
assert(float_to_str(n) == '-0.000045678')
Performance:
I was worried this approach may be too slow, so I ran timeit and compared with the OP's solution of decimal contexts. It appears the string manipulation is actually quite a bit faster. Edit: It appears to only be much faster in Python 2. In Python 3, the results were similar, but with the decimal approach slightly faster.
Result (seconds, as reported by timeit):
- Python 2: using ctx.create_decimal(): 2.43655490875
- Python 2: using string manipulation: 0.305557966232
- Python 3: using ctx.create_decimal(): 0.19519368198234588
- Python 3: using string manipulation: 0.2661344590014778
Here is the timing code:
from timeit import timeit

CODE_TO_TIME = '''
float_to_str(0.000000054321654321)
float_to_str(0.00000005)
float_to_str(420000000000000000.0)
float_to_str(4.5678e-5)
float_to_str(1.1)
float_to_str(-0.000045678)
'''

SETUP_1 = '''
import decimal

# create a new context for this task
ctx = decimal.Context()

# 20 digits should be enough for everyone :D
ctx.prec = 20

def float_to_str(f):
    """
    Convert the given float to a string,
    without resorting to scientific notation
    """
    d1 = ctx.create_decimal(repr(f))
    return format(d1, 'f')
'''

SETUP_2 = '''
def float_to_str(f):
    float_string = repr(f)
    if 'e' in float_string:  # detect scientific notation
        digits, exp = float_string.split('e')
        digits = digits.replace('.', '').replace('-', '')
        exp = int(exp)
        sign = '-' if f < 0 else ''
        if exp > 0:
            # pad so that exp + 1 digits precede the decimal point
            zero_padding = '0' * (exp - len(digits) + 1)
            float_string = '{}{}{}.0'.format(sign, digits, zero_padding)
        else:
            # minus 1 for the digit before the decimal point in the sci notation
            zero_padding = '0' * (abs(exp) - 1)
            float_string = '{}0.{}{}'.format(sign, zero_padding, digits)
    return float_string
'''

print(timeit(CODE_TO_TIME, setup=SETUP_1, number=10000))
print(timeit(CODE_TO_TIME, setup=SETUP_2, number=10000))
Answer by user2357112 supports Monica
As of NumPy 1.14.0, you can just use numpy.format_float_positional. For example, running against the inputs from your question:
>>> numpy.format_float_positional(0.000000054321654321)
'0.000000054321654321'
>>> numpy.format_float_positional(0.00000005)
'0.00000005'
>>> numpy.format_float_positional(0.1)
'0.1'
>>> numpy.format_float_positional(4.5678e-20)
'0.000000000000000000045678'
numpy.format_float_positional uses the Dragon4 algorithm to produce the shortest decimal representation in positional format that round-trips back to the original float input. There's also numpy.format_float_scientific for scientific notation, and both functions offer optional arguments to customize things like rounding and trimming of zeros.
Answer by gukoff
If you are ready to arbitrarily lose some precision by calling str() on the float number, then this is the way to go:
import decimal

def float_to_string(number, precision=20):
    return '{0:.{prec}f}'.format(
        decimal.Context(prec=100).create_decimal(str(number)),
        prec=precision,
    ).rstrip('0').rstrip('.') or '0'
It doesn't include global variables and allows you to choose the precision yourself. Decimal precision 100 is chosen as an upper bound for the str(float) length. The actual supremum is much lower. The or '0' part is for the situation with small numbers and zero precision.
Note that it still has its consequences (the output below is from Python 2, where str() keeps only 12 significant digits):
>>> float_to_string(0.10101010101010101010101010101)
'0.10101010101'
Otherwise, if the precision is important, format is just fine:
def float_to_string(number, precision=20):
    return '{0:.{prec}f}'.format(
        number, prec=precision,
    ).rstrip('0').rstrip('.') or '0'
It avoids the precision loss caused by calling str(f):
>>> float_to_string(0.1, precision=10)
'0.1'
>>> float_to_string(0.1)
'0.10000000000000000555'
>>> float_to_string(0.1, precision=40)
'0.1000000000000000055511151231257827021182'
>>> float_to_string(4.5678e-5)
'0.000045678'
>>> float_to_string(4.5678e-5, precision=1)
'0'
Anyway, the maximum number of decimal places is limited, since the float type itself has its limits and cannot express really long floats:
>>> float_to_string(0.1, precision=10000)
'0.1000000000000000055511151231257827021181583404541015625'
Also, whole numbers are formatted as-is.
>>> float_to_string(100)
'100'
Answer by BPL
Interesting question. To add a little more content to the question, here's a little test comparing the outputs of @Antti Haapala's and @Harold's solutions:
import decimal
import math

ctx = decimal.Context()

def f1(number, prec=20):
    ctx.prec = prec
    return format(ctx.create_decimal(str(number)), 'f')

def f2(number, prec=20):
    return '{0:.{prec}f}'.format(
        number, prec=prec,
    ).rstrip('0').rstrip('.')

k = 2*8

for i in range(-2**8, 2**8):
    if i < 0:
        value = -k*math.sqrt(math.sqrt(-i))
    else:
        value = k*math.sqrt(math.sqrt(i))

    value_s = '{0:.{prec}E}'.format(value, prec=10)

    n = 10

    print(' | '.join([str(value), value_s]))
    for f in [f1, f2]:
        # written for Python 2; Python 3's decimal rejects prec=0,
        # so start the range at 1 there
        test = [f(value, prec=p) for p in range(n)]
        print('\t{0}'.format(test))
Neither of them gives "consistent" results for all cases.
- With Antti's you'll see strings like '-000' or '000'
- With Harold's you'll see strings like ''
I'd prefer consistency even if I'm sacrificing a little bit of speed. It depends on which tradeoffs you want to accept for your use case.
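One possible way to get more predictable output from the format-based variant is to strip zeros only when a decimal point is present and to normalize negative zero. This is a sketch under the assumption that "consistent" means never returning an empty string, never returning '-0', and never eating zeros from the integer part; the name format_fixed is made up for illustration.

```python
def format_fixed(number, prec=20):
    """Fixed-point formatting with trailing zeros removed safely."""
    text = '{0:.{prec}f}'.format(number, prec=prec)
    # Strip trailing zeros only from a fractional part, so '100' stays '100'.
    if '.' in text:
        text = text.rstrip('0').rstrip('.')
    # Values that round to zero can come out as '-0'; report plain '0'.
    if text == '-0':
        return '0'
    return text


print(format_fixed(100.0, prec=0))
print(format_fixed(-0.3, prec=0))
print(format_fixed(0.1, prec=10))
```

Note that this still prints the binary expansion for large prec values, so it does not give the shortest round-tripping string.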
Answer by silgon
I think rstrip can get the job done.
a=5.4321654321e-08
'{0:.40f}'.format(a).rstrip("0") # float number and delete the zeros on the right
# '0.0000000543216543210000004442039220863003' # there's roundoff error though
Let me know if that works for you.