Number formatting in BASH with thousand separator

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/9374868/

Date: 2020-09-18 01:36:46  Source: igfitidea

Tags: bash, unix, localization, number-formatting

Asked by Shiplu Mokaddim

I have a number, 12343423455.23353, and I want to format it with a thousand separator, so that the output is 12,343,423,455.23353

Answer by Ignacio Vazquez-Abrams

$ printf "%'.3f\n" 12345678.901
12,345,678.901

Answer by mklement0

tl;dr

  • Use numfmt, if GNU utilities are available, as they are on Linux by default:

    • numfmt --grouping 12343423455.23353 # -> 12,343,423,455.23353 in locale en_US
  • Otherwise, use printf with the ' field flag, wrapped in a shell function that preserves the number of input decimal places (rather than hard-coding the number of output decimal places):

    • groupDigits 12343423455.23353 # -> 12,343,423,455.23353 in locale en_US
    • See the bottom of this answer for the definition of groupDigits(), which also supports multiple input numbers.
  • Ad-hoc alternatives involving subshells that also preserve the number of input decimal places (they assume that the input decimal mark is either . or ,):

    • A modular, but somewhat inefficient variant that accepts the input number via stdin (and can therefore also be used with pipeline input):
      (n=$(</dev/stdin); f=${n#*[.,]}; printf "%'.${#f}f\n" "$n") <<<12343423455.23353
    • A significantly faster, but less modular alternative that uses the intermediate variable $n:
      n=12343423455.23353; (f=${n#*[.,]}; printf "%'.${#f}f\n" "$n")
  • Alternatively, consider using my Linux/macOS grp CLI (installable with npm install -g grp-cli):

    • grp -n 12343423455.23353
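
The stdin-based variant above handles a single number per invocation. As a rough sketch of how the same idea might be extended to a whole stream, the hypothetical helper below (groupStdin is a made-up name, not part of the answer) reads one plain decimal number per line. It assumes the decimal mark is either . or , and, unlike the one-liners, requests zero decimal places for integers.

```shell
# Hypothetical helper (not part of the answer): group every number read from stdin.
# Assumes plain decimal input, one number per line, decimal mark "." or ",".
groupStdin() {
  local n f
  while IFS= read -r n; do
    f=${n#*[.,]}                # text after the decimal mark, if there is one
    [[ $f == "$n" ]] && f=''    # no decimal mark: request zero decimal places
    printf "%'.${#f}f\n" "$n" || return
  done
}

printf '%s\n' 1234567.25 1000 | groupStdin
```

Whether separators actually appear depends on the active locale; the same caveats as for the one-liners apply.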

In all cases there are caveats; see below.



Ignacio Vazquez-Abrams's answer contains the crucial pointer for use with printf: the ' field flag (following the %) formats a number with the active locale's thousand separator:

  • Note that man printf (man 1 printf) does not contain this information itself: the utility / shell builtin printf ultimately calls the library function printf(), and only man 3 printf gives the full picture with respect to supported formats.
  • Environment variables LC_NUMERIC and, indirectly, LANG or LC_ALL control the active locale with respect to number formatting.
  • Both numfmt and printf respect the active locale, both with respect to the thousands separator and the decimal mark ("decimal point").
  • Using just printf by itself, as in Ignacio's answer, requires that you hard-code the number of output decimal places, rather than preserving however many decimal places the input has; it is this limitation that groupDigits() below overcomes.
  • printf "%'.<numDecPlaces>f" does have one advantage over numfmt --grouping, however:
    • numfmt only accepts decimal numbers, whereas printf's %f also accepts hexadecimal integers (e.g., 0x3e8) and numbers in decimal scientific notation (e.g., 1e3).
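
To make the last two points more concrete, here is a small sketch. It pins the locale to C for one call so that the output is predictable, and then probes whether the active locale groups digits at all. The function name localeGroupsDigits is invented for illustration and is not part of the answer.

```shell
# %f also parses hexadecimal integers and scientific notation.
# LC_ALL=C pins the locale for this one call, so no grouping is applied.
LC_ALL=C printf "%'.0f\n" 0x3e8 1e3    # prints 1000 on each of two lines

# Hypothetical probe (the name is made up): succeeds only if the active
# locale actually groups digits when the ' flag is used.
localeGroupsDigits() {
  [[ $(printf "%'d" 1000) != 1000 ]]
}

if localeGroupsDigits; then
  echo "active locale groups digits"
else
  echo "active locale does not group digits"
fi
```

A probe like this may be useful before relying on the ' flag in scripts that run under unknown locales.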

Caveats

  • Locales without grouping: Some locales, notably C and POSIX, by definition do NOT apply grouping, so use of ' has no effect in that event.

  • Real-world locale inconsistencies across platforms:

    • (LC_ALL='de_DE.UTF-8'; printf "%'.1f\n" 1000) # SHOULD yield: 1.000,0
    • Linux: yields 1.000,0, as expected.
    • macOS/BSD: Unexpectedly yields 1000,0 - NO grouping (!).
  • Input number format: When you pass a number to numfmt or printf, it:
    • mustn't already contain digit grouping
    • must already use the active locale's decimal mark
    • For example:
      • (LC_ALL='lt_LT.UTF-8'; printf "%'.1f\n" 1000,1) # -> '1 000,1'
      • OK: input number is not grouped and uses Lithuanian decimal mark (comma).
  • Portability: POSIX doesn't require the printf utility (as opposed to the C printf() library function) to support floating-point format characters such as %f, given that POSIX[-like] shells are integer-only; in practice, however, I'm not aware of any shells/platforms that do not.

  • Rounding errors and overflow:

    • When using numfmt and printf as described, round-trip conversion occurs (string -> number -> string), which is subject to rounding errors; in other words: reformatting with digit grouping can lead to a different number.
    • With format character f, which employs IEEE-754 double-precision floating-point values, only up to 15 significant digits (digits irrespective of the location of the decimal mark) are guaranteed to be accurately preserved (though for specific numbers it may work with more digits). In practice, numfmt and GNU printf can accurately handle more than that; see below. If anyone knows how and why, let me know.
    • With too many significant digits or too large a value present, the behavior differs between numfmt and printf in general, and between printf implementations across platforms; for example:

numfmt:

[Fixed in coreutils 8.24, according to @pixelbeat] Starting with 20 significant digits, the value overflows quietly (!) - presumably a bug (as of GNU coreutils 8.23):

# 20 significant digits cause quiet overflow:
$ (fractPart=0000000000567890; num="1000.${fractPart}"; numfmt --grouping "$num")
-92.23372036854775807    # QUIET OVERFLOW

By contrast, a number that is too large does generate an error by default.

printf:

Linux printf handles up to 20 significant digits accurately, whereas the BSD/macOS implementation is limited to 17:

# Linux: 21 significant digits cause rounding error:
$  (fractPart=00000000005678901; num="1000.${fractPart}"; printf "%'.${#fractPart}f\n" "$num")
1,000.00000000005678902  # ROUNDING ERROR

# BSD/macOS: 18 significant digits cause rounding error:
$  (fractPart=00000000005678; num="1000.${fractPart}"; printf "%'.${#fractPart}f\n" "$num")
1,000.00000000005673  # ROUNDING ERROR

The Linux version never seems to overflow, whereas the BSD/macOS version reports an error with numbers that are too large.
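
Because the precision limits differ by tool and platform, one defensive option is to verify after formatting that no digits changed. The following is only a sketch under stated assumptions: groupDigitsChecked is a hypothetical name, it handles a single plain decimal number (no hex, no scientific notation, no redundant leading zeros), and it compares only the digit sequences of input and output.

```shell
# Hypothetical wrapper (not from the answer): format one plain decimal number,
# then confirm that its digits survived the string -> number -> string round trip.
groupDigitsChecked() {
  local num=$1 fract out
  fract=${num#*[.,]}; [[ $fract == "$num" ]] && fract=''
  out=$(printf "%'.${#fract}f" "$num") || return
  # Compare digit sequences only, ignoring separators and the decimal mark.
  if [[ ${out//[!0-9]/} != "${num//[!0-9]/}" ]]; then
    echo "groupDigitsChecked: digits changed for $num: $out" >&2
    return 1
  fi
  printf '%s\n' "$out"
}

groupDigitsChecked 1000.5
# Dozens of decimal places: far more precision than common floating-point types hold.
groupDigitsChecked 1000.0000000000000000000000000000000000000000005678 ||
  echo "refusing to print an altered number"
```

The check costs one extra string comparison per number and turns a silent rounding error into a non-zero exit status.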



Bash shell function groupDigits():

# SYNOPSIS
#   groupDigits num ...
# DESCRIPTION
#   Formats the specified number(s) according to the rules of the
#   current locale in terms of digit grouping (thousands separators).
#   Note that input numbers
#     - must not already be digit-grouped themselves,
#     - must use the *current* locale's decimal mark.
#   Numbers can be integers or floats.
#   Processing stops at the first number that can't be formatted, and a
#   non-zero exit code is returned.
# CAVEATS
#   - No input validation is performed.
#   - printf(1) is not guaranteed to support non-integer formats by POSIX,
#     though not doing so is rare these days.
#   - Round-trip number conversion is involved (string > double > string)
#     so rounding errors can occur.
# EXAMPLES
#   groupDigits 1000 # -> '1,000'
#   groupDigits 1000.5 # -> '1,000.5'
#   (LC_ALL=lt_LT.UTF-8; groupDigits 1000,5) # -> '1 000,5'
groupDigits() {
  local decimalMark fractPart num
  decimalMark=$(printf "%.1f" 0); decimalMark=${decimalMark:1:1}
  for num; do
    fractPart=${num##*${decimalMark}}; [[ "$num" == "$fractPart" ]] && fractPart=''
    printf "%'.${#fractPart}f\n" "$num" || return
  done
}
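
groupDigits() requires input that already uses the current locale's decimal mark. If the numbers come from a source that always writes a dot, a small conversion step could be placed in front of it. The helper below is a hypothetical sketch (toLocaleDecimal is not part of the answer) and assumes ungrouped input with at most one dot.

```shell
# Hypothetical helper (not part of groupDigits): rewrite a plain "1234.5"
# (dot as decimal mark, no grouping) to use the active locale's decimal mark.
toLocaleDecimal() {
  local mark
  mark=$(printf '%.1f' 0); mark=${mark:1:1}   # same probe that groupDigits() uses
  printf '%s\n' "${1/./$mark}"                # replace the first "." only
}

toLocaleDecimal 1234.5
```

The result can then be handed to the function above, for example groupDigits "$(toLocaleDecimal 1234.5)".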