Original question: http://stackoverflow.com/questions/9374868/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Number formatting in BASH with thousand separator
Asked by Shiplu Mokaddim
I have a number, 12343423455.23353. I want to format it with a thousands separator, so the output would be 12,343,423,455.23353.
Answered by Ignacio Vazquez-Abrams
$ printf "%'.3f\n" 12345678.901
12,345,678.901
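As a minimal sketch of how much this output depends on the active locale, the same format string can be run under two locales. The helper name formatIn is hypothetical, and the sketch assumes that the en_US.UTF-8 locale is installed; where it is missing, the shell may print a warning and fall back to ungrouped output.

```bash
# formatIn LOCALE NUMBER
# Hypothetical helper: formats NUMBER with 3 decimal places under LOCALE.
# The function body is a subshell, so the caller's locale is left untouched.
formatIn() (
  LC_ALL=$1
  printf "%'.3f\n" "$2"
)

formatIn C 12345678.901            # the C locale defines no grouping: 12345678.901
formatIn en_US.UTF-8 12345678.901  # grouped only if this locale is available
```

On a typical Linux system with en_US.UTF-8 installed, the second call should print the grouped form shown in the answer above.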
Answered by mklement0
tl;dr
- Use numfmt, if GNU utilities are available, such as on Linux by default:
  numfmt --grouping 12343423455.23353 # -> 12,343,423,455.23353 in locale en_US
- Otherwise, use printf with the ' field flag, wrapped in a shell function that preserves the number of input decimal places (does not hard-code the number of output decimal places):
  groupDigits 12343423455.23353 # -> 12,343,423,455.23353 in locale en_US
  - See the bottom of this answer for the definition of groupDigits(), which also supports multiple input numbers.
  - Ad-hoc alternatives involving subshells that also preserve the number of input decimal places (they assume that the input decimal mark is either . or ,):
    - A modular, but somewhat inefficient variant that accepts the input number via stdin (and can therefore also be used with pipeline input):
      (n=$(</dev/stdin); f=${n#*[.,]}; printf "%'.${#f}f\n" "$n") <<<12343423455.23353
    - A significantly faster, but less modular alternative that uses intermediate variable $n:
      n=12343423455.23353; (f=${n#*[.,]}; printf "%'.${#f}f\n" "$n")
- Alternatively, consider use of my Linux/macOS grp CLI (installable with npm install -g grp-cli):
  grp -n 12343423455.23353
In all cases there are caveats; see below.
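One such caveat applies to the ad-hoc subshell variants: for an input without a decimal mark, ${n#*[.,]} leaves the whole number in $f, so an integer such as 1000 would be printed with four decimal places. The following is a minimal sketch of a guard against that; countDecimals is a hypothetical helper name, and like the variants above it assumes that the decimal mark is either . or , and that the input is not already grouped.

```bash
# countDecimals NUMBER
# Hypothetical helper: prints how many digits follow the decimal mark,
# or 0 if NUMBER contains no decimal mark at all.
countDecimals() {
  local frac
  case $1 in
    *[.,]*) frac=${1##*[.,]} ;;  # keep only what follows the last mark
    *)      frac='' ;;           # integer input: no fractional part
  esac
  printf '%s\n' "${#frac}"
}

n=1000
printf "%'.$(countDecimals "$n")f\n" "$n"   # integer input gets 0 decimal places
```

The command substitution costs an extra subshell per number, so this trades some speed for correctness on integer input.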
Ignacio Vazquez-Abrams's answer contains the crucial pointer for use with printf: the ' field flag (following the %) formats a number with the active locale's thousands separator:
- Note that man printf (man 1 printf) does not contain this information itself: the utility / shell builtin printf ultimately calls the library function printf(), and only man 3 printf gives the full picture with respect to supported formats.
- Environment variables LC_NUMERIC and, indirectly, LANG or LC_ALL control the active locale with respect to number formatting.
- Both numfmt and printf respect the active locale, both with respect to the thousands separator and the decimal mark ("decimal point").
- Using just printf by itself, as in Ignacio's answer, requires that you hard-code the number of output decimal places, rather than preserving however many decimal places the input has; it is this limitation that groupDigits() below overcomes.
- printf "%'.<numDecPlaces>f" does have one advantage over numfmt --grouping, however: numfmt only accepts decimal numbers, whereas printf's %f also accepts hexadecimal integers (e.g., 0x3e8) and numbers in decimal scientific notation (e.g., 1e3).
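The last point can be tried out with a short sketch. The wrapper name groupInt is hypothetical and only exists to keep the format string in one place; the sketch assumes Bash's builtin printf, which hands the number parsing to the C library.

```bash
# groupInt NUMBER
# Hypothetical wrapper: prints NUMBER with 0 decimal places, grouped
# according to the active locale.
groupInt() {
  printf "%'.0f\n" "$1"
}

groupInt 0x3e8   # hexadecimal integer, value 1000
groupInt 1e3     # decimal scientific notation, value 1000
```

Both calls print the same value; whether it appears with a grouping character depends on the active locale.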
Caveats
- Locales without grouping: Some locales, notably C and POSIX, by definition do NOT apply grouping, so use of ' has no effect in that event.
- Real-world locale inconsistencies across platforms:
  (LC_ALL='de_DE.UTF-8'; printf "%'.1f\n" 1000) # SHOULD yield: 1.000,0
  - Linux: yields 1.000,0, as expected.
  - macOS/BSD: Unexpectedly yields 1000,0 - NO grouping(!).
- Input number format: When you pass a number to numfmt or printf, it:
  - mustn't already contain digit grouping
  - must already use the active locale's decimal mark
  - For example:
    (LC_ALL='lt_LT.UTF-8'; printf "%'.1f\n" 1000,1) # -> '1 000,1'
    - OK: input number is not grouped and uses Lithuanian decimal mark (comma).
- Portability: POSIX doesn't require the printf utility (as opposed to the C printf() library function) to support floating-point format characters such as %f, given that POSIX[-like] shells are integer-only; in practice, however, I'm not aware of any shells/platforms that do not.
- Rounding errors and overflow:
  - When using numfmt and printf as described, round-trip conversion occurs (string -> number -> string), which is subject to rounding errors; in other words: reformatting with digit grouping can lead to a different number.
  - Using format character f to employ IEEE-754 double-precision floating-point values, only up to 15 significant digits (digits irrespective of the location of the decimal mark) are guaranteed to be accurately preserved (though for specific numbers it may work with more digits). In practice, numfmt and GNU printf can accurately handle more than that; see below. If anyone knows how and why, let me know.
  - With too many significant digits or too-large a value present, the behavior differs between numfmt and printf in general, and between printf implementations across platforms; for example:
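Regarding the input-format caveat, input that is already grouped has to be stripped of its grouping characters before it is passed to numfmt or printf. A minimal sketch follows; ungroup is a hypothetical helper, and it assumes en_US-style input in which , is the grouping character and . is the decimal mark. In locales where the roles are swapped (such as de_DE), it would corrupt the number.

```bash
# ungroup NUMBER
# Hypothetical helper: removes every "," from NUMBER.
# ASSUMPTION: "," is the grouping character, never the decimal mark.
ungroup() {
  printf '%s\n' "$1" | tr -d ','
}

ungroup 1,000,000.5   # prints 1000000.5
```

The result can then be passed on, for example as numfmt --grouping "$(ungroup 1,000,000.5)".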
numfmt:
[Fixed in coreutils 8.24, according to @pixelbeat] Starting with 20 significant digits, the value overflows quietly(!) - presumably a bug (as of GNU coreutils 8.23):
# 20 significant digits cause quiet overflow:
$ (fractPart=0000000000567890; num="1000.${fractPart}"; numfmt --grouping "$num")
-92.23372036854775807 # QUIET OVERFLOW
By contrast, a number that is too large does generate an error by default.
printf:
Linux printf handles up to 20 significant digits accurately, whereas the BSD/macOS implementation is limited to 17:
# Linux: 21 significant digits cause rounding error:
$ (fractPart=00000000005678901; num="1000.${fractPart}"; printf "%'.${#fractPart}f\n" "$num")
1,000.00000000005678902 # ROUNDING ERROR
# BSD/macOS: 18 significant digits cause rounding error:
$ (fractPart=00000000005678; num="1000.${fractPart}"; printf "%'.${#fractPart}f\n" "$num")
1,000.00000000005673 # ROUNDING ERROR
The Linux version never seems to overflow, whereas the BSD/macOS version reports an error with numbers that are too large.
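Because the digit limits differ between platforms, it can be useful to test whether a given number survives the string -> number -> string round trip on the current system before trusting the grouped output. A minimal sketch, with the hypothetical helper name survivesRoundTrip; it assumes ungrouped input in canonical spelling (no leading zeros or sign) and . as the decimal mark of the active locale.

```bash
# survivesRoundTrip NUMBER
# Hypothetical helper: succeeds if printf reproduces NUMBER exactly when asked
# for the same number of decimal places (grouping is deliberately left out).
survivesRoundTrip() {
  local num=$1 frac out
  case $num in
    *.*) frac=${num##*.} ;;   # digits after the decimal mark
    *)   frac='' ;;           # integer input
  esac
  out=$(printf "%.${#frac}f" "$num") || return
  [ "$out" = "$num" ]
}

survivesRoundTrip 1000.5 && echo "1000.5 is preserved"
survivesRoundTrip 1000.00000000000000000000005678901 || echo "too many digits"
```

The check is a plain string comparison, so it reports a failure for any input whose spelling printf would normalize, even if the numeric value is unchanged.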
Bash shell function groupDigits():
# SYNOPSIS
# groupDigits num ...
# DESCRIPTION
# Formats the specified number(s) according to the rules of the
# current locale in terms of digit grouping (thousands separators).
# Note that input numbers
# - must not already be digit-grouped themselves,
# - must use the *current* locale's decimal mark.
# Numbers can be integers or floats.
# Processing stops at the first number that can't be formatted, and a
# non-zero exit code is returned.
# CAVEATS
# - No input validation is performed.
# - printf(1) is not guaranteed to support non-integer formats by POSIX,
# though not doing so is rare these days.
# - Round-trip number conversion is involved (string > double > string)
# so rounding errors can occur.
# EXAMPLES
# groupDigits 1000 # -> '1,000'
# groupDigits 1000.5 # -> '1,000.5'
# (LC_ALL=lt_LT.UTF-8; groupDigits 1000,5) # -> '1 000,5'
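groupDigits() {
  local decimalMark fractPart num
  # Determine the active locale's decimal mark from a formatted sample ("0.0" or "0,0").
  decimalMark=$(printf "%.1f" 0); decimalMark=${decimalMark:1:1}
  for num; do
    # Everything after the decimal mark; empty if the input has no decimal mark.
    fractPart=${num##*${decimalMark}}; [[ "$num" == "$fractPart" ]] && fractPart=''
    printf "%'.${#fractPart}f\n" "$num" || return
  done
}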

