Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question on Stack Overflow: http://stackoverflow.com/questions/12882611/

Date: 2020-09-09 22:49:57  Source: igfitidea

How to get bc to handle numbers in scientific (aka exponential) notation?

Tags: bash, numeric, floating-accuracy, bc

Asked by Ferdinando Randisi

bc doesn't like numbers expressed in scientific notation (aka exponential notation).

$ echo "3.1e1*2" | bc -l
(standard_in) 1: parse error

But I need to use it to handle a few records that are expressed in this notation. Is there a way to get bc to understand exponential notation? If not, what can I do to translate them into a format that bc will understand?

Accepted answer by Ferdinando Randisi

Unfortunately, bc doesn't support scientific notation.

However, it can be translated into a format that bc can handle, using extended regex as per POSIX in sed:

sed -E 's/([+-]?[0-9.]+)[eE]\+?(-?)([0-9]+)/(\1*10^\2\3)/g' <<<"$value"

You can replace the "e" (or "e+", if the exponent is positive) with "*10^", which bc will promptly understand. This works even if the exponent is negative or if the number is subsequently multiplied by another power, and allows keeping track of significant digits.
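
As a minimal sketch of how the translation can be wired to bc, the substitution can be wrapped in a small shell function. The function name sci2bc is hypothetical, and the sketch assumes a sed that accepts -E and that the three capture groups are reinserted as (mantissa*10^exponent):

```shell
# sci2bc: hypothetical helper that rewrites e-notation in an expression
# into the (mantissa*10^exponent) form that bc can parse.
sci2bc() {
    printf '%s\n' "$1" |
        sed -E 's/([+-]?[0-9.]+)[eE]\+?(-?)([0-9]+)/(\1*10^\2\3)/g'
}

sci2bc '3.1e1*2'              # -> (3.1*10^1)*2
sci2bc '7.11e-2 + 323e+34'    # -> (7.11*10^-2) + (323*10^34)

# Typical use (requires bc to be installed):
#   sci2bc '3.1e1*2' | bc -l
```

The parentheses around each rewritten number keep the power bound to its own mantissa when the number is part of a larger expression.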

If you need to stick to basic regex (BRE), then this should be used:

sed 's/\([+-]\{0,1\}[0-9]*\.\{0,1\}[0-9]\{1,\}\)[eE]+\{0,1\}\(-\{0,1\}\)\([0-9]\{1,\}\)/(\1*10^\2\3)/g' <<<"$value"


From Comments:

  • A simple bash pattern match could not work (thanks @mklement0), as there is no way to match an e+ and keep the - from an e- at the same time.

  • A correctly working perl solution (thanks @mklement0):

    $ perl -pe 's/([-\d.]+)e(?:\+|(-))?(\d+)/($1*10^$2$3)/gi' <<<"$value"

  • Thanks to @jwpat7 and @Paul Tomblin for clarifying aspects of sed's syntax, as well as @isaac and @mklement0 for improving the answer.

Edit:

The answer changed quite a bit over the years. The answer above is the latest iteration as of 17th May 2018. Previous attempts reported here were a solution in pure bash (by @ormaaj) and one in sed (by @me), that fail in at least some cases. I'll keep them here just to make sense of the comments, which contain much nicer explanations of the intricacies of all this than this answer does.

value=${value/[eE]+*/*10^}  ------> Can not work.
value=`echo ${value} | sed -e 's/[eE]+*/\*10\^/'` ------> Fail in some conditions

Answer by mklement0

Let me try to summarize the existing answers, with comments on each below:

  • (a) If you indeed need to use bc for arbitrary-precision calculations - as the OP does - use the OP's own clever approach, which textually reformats the scientific notation to an equivalent expression that bc understands.

  • If potentially losing precision is not a concern,

    • (b) consider using awk or perl as bc alternatives; both natively understand scientific notation, as demonstrated in jwpat7's answer for awk.
    • (c) consider using printf '%.<precision>f' to simply textually convert to regular floating point representation (decimal fractions, without the e/E) (a solution proposed in a since-deleted post by ormaaj).

(a) Reformatting scientific notation to an equivalent bc expression

The advantage of this solution is that precision is preserved: the textual representation is transformed into an equivalent textual representation that bc can understand, and bc itself is capable of arbitrary-precision calculations.

See the OP's own answer, whose updated form is now capable of transforming an entire expression containing multiple numbers in exponential notation into an equivalent bc expression.



(b) Using awk or perl instead of bc as the calculator

Note: The following approaches assume use of the built-in support for double-precision floating-point values in awk and perl. As is inherent in floating-point arithmetic,
"given any fixed number of bits, most calculations with real numbers will produce quantities that cannot be exactly represented using that many bits. Therefore the result of a floating-point calculation must often be rounded in order to fit back into its finite representation. This rounding error is the characteristic feature of floating-point computation." (http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)

That said,

awk

awk natively understands decimal exponential (scientific) notation.
(You should generally only use decimal representation, because awk implementations differ with respect to whether they support number literals with other bases.)

awk 'BEGIN { print 3.1e1 * 2 }'  # -> 62

If you use the default print function, the OFMT variable controls the output format by way of a printf format string; the (POSIX-mandated) default is %.6g, meaning 6 significant digits, which notably includes the digits in the integer part.
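
To make the effect of OFMT concrete, here is a small sketch. The function names are hypothetical, and the outputs assume a locale that uses . as the radix character:

```shell
# Default OFMT ("%.6g"): six significant digits, integer part included.
print_default() {
    awk 'BEGIN { print 1234.56789 }'
}

# The same number with a wider OFMT, set inside the awk program.
print_wide() {
    awk 'BEGIN { OFMT = "%.10g"; print 1234.56789 }'
}

print_default   # -> 1234.57
print_wide      # -> 1234.56789
```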

Note that if the number in scientific notation is supplied as input (as opposed to being a literal part of the awk program), you must add +0 to force it to the default output format, if it is used by itself with print:

Depending on your locale and the awk implementation you use, you may have to replace the decimal point (.) with the locale-appropriate radix character, such as , in a German locale; this applies to BSD awk, mawk, and to GNU awk with the --posix option.

awk '{ print $1+0 }' <<<'3.1e1' # -> 31; without `+0`, output would be the same as input
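
The same conversion happens whenever an input field is used in arithmetic, which is handy for working through whole records. As a sketch, assuming each input line holds one number in its first field (the function name sum_sci is made up for illustration), a column of values in scientific notation can be summed in double precision like this:

```shell
# sum_sci: add up the first field of every input line; awk converts
# strings such as "3.1e1" to numbers when they are used in arithmetic.
sum_sci() {
    awk '{ total += $1 } END { printf "%.4f\n", total }'
}

printf '%s\n' 3.1e1 1.2e-2 | sum_sci   # -> 31.0120
```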

Modifying the variable OFMT changes the default output format (for numbers with fractional parts; (effective) integers are always output as such).
Alternatively, use the printf function with an explicit output format:

awk 'BEGIN { printf "%.4f", 3.1e1 * 2.1234 }' # -> 65.8254

Perl

perl too natively understands decimal exponential (scientific) notation.

Note: Perl, unlike awk, isn't available on all POSIX-like platforms by default; furthermore, it's not as lightweight as awk.
However, it offers more features than awk, such as natively understanding hexadecimal and octal integers.

perl -le 'print 3.1e1 * 2'  # -> 62

I'm unclear on what Perl's default output format is, but it appears to be %.15g. As with awk, you can use printf to choose the desired output format:

perl -e 'printf "%.4f\n", 3.1e1 * 2.1234' # -> 65.8254


(c) Using printf to convert scientific notation to decimal fractions

If you simply want to convert scientific notation (e.g., 1.2e-2) into a decimal fraction (e.g., 0.012), printf '%f' can do that for you. Note that you'll convert one textual representation into another via floating-point arithmetic, which is subject to the same rounding errors as the awk and perl approaches.

printf '%.4f' '1.2e-2' # -> '0.0120'; `.4` specifies 4 decimal digits.
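
Combining this with bc is a matter of converting each operand first. The following is a sketch only: the helper name sci2dec and the default of 12 decimal digits are arbitrary choices, and any digits beyond the chosen precision are lost:

```shell
# sci2dec NUMBER [DIGITS]: print NUMBER as a plain decimal fraction
# with DIGITS decimal places (12 if DIGITS is omitted).
sci2dec() {
    printf "%.${2:-12}f\n" "$1"
}

sci2dec 1.2e-2 4    # -> 0.0120
sci2dec 3.1e1 2     # -> 31.00

# Typical use with bc (requires bc to be installed):
#   echo "$(sci2dec 3.1e1) * 2" | bc -l
```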

Answer by James Waldby - jwpat7

One can use awk for this; for example,

awk '{ print +$1, +$2, +$3 }' <<< '12345678e-6 0.0314159e2 54321e+13'

produces (via awk's default format %.6g) output like
12.3457 3.14159 543210000000000000
while commands like the following two produce the output shown after each, given that file edata contains data as shown later.

$ awk '{for(i=1;i<=NF;++i)printf"%.13g ",+$i; printf"\n"}' < edata
31 0.0312 314.15 0 
123000 3.1415965 7 0.04343 0 0.1 
1234567890000 -56.789 -30 

$ awk '{for(i=1;i<=NF;++i)printf"%9.13g ",+$i; printf"\n"}' < edata
       31    0.0312    314.15         0 
   123000 3.1415965         7   0.04343         0       0.1 
1234567890000   -56.789       -30 


$ cat edata 
3.1e1 3.12e-2 3.1415e+2 xyz
123e3 0.031415965e2 7 .4343e-1 0e+0 1e-1
.123456789e13 -56789e-3 -30

Also, regarding solutions using sed, it is probably better to delete the plus sign in forms like 45e+3 at the same time as the e, via regex [eE]+*, rather than in a separate sed expression. For example, on my Linux machine with GNU sed version 4.2.1 and bash version 4.2.24, commands
sed 's/[eE]+*/*10^/g' <<< '7.11e-2 + 323e+34'
sed 's/[eE]+*/*10^/g' <<< '7.11e-2 + 323e+34' | bc -l
produce output
7.11*10^-2 + 323*10^34
3230000000000000000000000000000000000.07110000000000000000

Answer by Jo H

You can also define a bash function which calls awk (a good name would be the equal sign "="):

= ()
{
    local in="$(echo "$@" | sed -e 's/\[/(/g' -e 's/\]/)/g')";
    awk 'BEGIN {print '"$in"'}' < /dev/null
}

Then you can use all types of floating point math in the shell. Note that square brackets are used here instead of round brackets, since the latter would have to be protected from bash by quotes.

> = 1+sin[3.14159] + log[1.5] - atan2[1,2] - 1e5 + 3e-10
-99999.1

Or, in a script, to assign the result:

a=$(= 1+sin[4])
echo $a   # 0.243198
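
If a function named = feels too unusual, the same idea works under an ordinary name. This is a sketch with a hypothetical name, calc; like the original, it pastes its arguments into the awk program text, so it should only be given trusted input:

```shell
# calc: evaluate an arithmetic expression with awk. Square brackets in
# the expression stand in for parentheses and are translated first.
calc() {
    local expr_text
    expr_text=$(printf '%s\n' "$*" | sed -e 's/\[/(/g' -e 's/\]/)/g')
    awk "BEGIN { print $expr_text }" < /dev/null
}

calc '1+sin[4]'    # -> 0.243198
calc '3.1e1 * 2'   # -> 62
```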

Answer by Fridtjof Stein

Luckily there is printf, which does the formatting job:

The above example:

printf "%.12f * 2\n" 3.1e1 | bc -l

Or a float comparison:

n=8.1457413437133669e-02
m=8.1456839223809765e-02

n2=`printf "%.12f" $n`
m2=`printf "%.12f" $m`

if [ $(echo "$n2 > $m2" | bc -l) == 1  ]; then 
   echo "n is bigger"
else
   echo "m is bigger"
fi
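
Where double precision is enough, the comparison can also be done without the intermediate printf step, since awk reads scientific notation directly. A sketch, with float_gt as a hypothetical helper name:

```shell
# float_gt A B: succeed (exit status 0) if A > B, compared as doubles.
float_gt() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'
}

n=8.1457413437133669e-02
m=8.1456839223809765e-02

if float_gt "$n" "$m"; then
    echo "n is bigger"
else
    echo "m is bigger"
fi
```

Adding 0 to each variable forces a numeric comparison rather than a string comparison.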

Answer by Anton

A piping version of the OP's accepted answer:

$ echo 3.82955e-5 | sed 's/[eE]+*/\*10\^/'
3.82955*10^-5

Piping the input to the OP's accepted sed command gave extra backslashes, like:

$ echo 3.82955e-5 | sed 's/[eE]+*/\\*10\\^/'
3.82955\*10\^-5

Answer by markroxor

I managed to do it with a little hack. You can do something like this -

scientific='4.8844221e+002'
base=$(echo $scientific | cut -d 'e' -f1)
exp=$(($(echo $scientific | cut -d 'e' -f2)*1))
converted=$(bc -l <<< "$base*(10^$exp)")
echo $converted 
>> 488.4422100
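
The same split can be done with shell parameter expansion alone, which avoids spawning cut and sidesteps shell arithmetic on the exponent (a zero-padded exponent such as 008 would be rejected there as an invalid octal number). This is a sketch; to_bc_expr is a hypothetical name, and the input is assumed to be a single number that contains an e or E:

```shell
# to_bc_expr NUMBER: turn e.g. 4.8844221e+002 into 4.8844221*10^002.
to_bc_expr() {
    local base exp
    base=${1%%[eE]*}   # everything before the e/E
    exp=${1##*[eE]}    # everything after the e/E
    exp=${exp#+}       # drop a leading plus sign, if there is one
    printf '%s*10^%s\n' "$base" "$exp"
}

to_bc_expr 4.8844221e+002   # -> 4.8844221*10^002
to_bc_expr 1.5E-3           # -> 1.5*10^-3

# Typical use (requires bc to be installed):
#   to_bc_expr 4.8844221e+002 | bc -l
```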

Answer by Ma-tri-x

Try this (found in an example of CFD input data to be processed with m4):

T0=4e-5
deltaT=2e-6
m4 <<< "esyscmd(perl -e 'printf (${T0} + ${deltaT})')"

Answer by cpu

Try this: (using bash)

printf "scale=20\n0.17879D-13\n" | sed -e 's/D/*10^/' | bc

or this:

 num="0.17879D-13"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D/*10^/' | bc`" ; echo $convert
.00000000000001787900
num="1230.17879"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D/*10^/' | bc`" ; echo $convert
1230.17879

If you have positive exponents you should use this:

num="0.17879D+13"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D+/*10^/' -e 's/D/*10^/' | bc`" ; echo $convert
1787900000000.00000

That last one handles every number thrown at it. You can adapt the sed expression if your numbers use 'e' or 'E' to introduce the exponent.

You get to choose the scale you want.
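
One way to adapt the sed expression, sketched under the assumption that the exponent may be introduced by any of D, d, E or e and that a plus sign after the marker should be dropped (the function name exp2bc is hypothetical):

```shell
# exp2bc: read numbers on stdin and rewrite the exponent marker
# (D, d, E or e, with an optional plus sign) as *10^ for bc.
exp2bc() {
    sed -e 's/[DdEe]+\{0,1\}/*10^/'
}

echo 0.17879D-13 | exp2bc   # -> 0.17879*10^-13
echo 0.17879e+13 | exp2bc   # -> 0.17879*10^13
echo 1230.17879  | exp2bc   # -> 1230.17879

# Typical use (requires bc to be installed):
#   { echo "scale=20"; echo 0.17879D-13 | exp2bc; } | bc
```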