What's the difference between str.isdigit, isnumeric and isdecimal in Python?
Source: http://stackoverflow.com/questions/44891070/ (Stack Overflow, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.)
Asked by user8225026
When I run these methods
s.isdigit()
s.isnumeric()
s.isdecimal()
I always get either all True or all False for each value of s (which is of course a string). What's the difference between the three? Can you provide an example that gives two True results and one False (or vice versa)?
Accepted answer by wim
It's mostly about Unicode classifications. Here are some examples that show the discrepancies:
>>> def spam(s):
...     for attr in 'isnumeric', 'isdecimal', 'isdigit':
...         print(attr, getattr(s, attr)())
...
>>> spam('½')
isnumeric True
isdecimal False
isdigit False
>>> spam('³')
isnumeric True
isdecimal False
isdigit True
The specific behaviour is described in the official Python documentation for str.isdigit, str.isnumeric and str.isdecimal.
Script to find all of them:
import sys
import unicodedata
from collections import defaultdict

# Maps each (isnumeric, isdecimal, isdigit) result tuple to the characters producing it.
d = defaultdict(list)
for i in range(sys.maxunicode + 1):
    s = chr(i)
    t = s.isnumeric(), s.isdecimal(), s.isdigit()
    # Keep only characters on which the three methods disagree.
    if len(set(t)) == 2:
        try:
            name = unicodedata.name(s)
        except ValueError:
            name = f'codepoint{i}'
        print(s, name)
        d[t].append(s)
Answer by Christian Dean
The Python documentation notes the difference between the three methods.
str.isdigit
Return true if all characters in the string are digits and there is at least one character, false otherwise. Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits. This covers digits which cannot be used to form numbers in base 10, like the Kharosthi numbers. Formally, a digit is a character that has the property value Numeric_Type=Digit or Numeric_Type=Decimal.
str.isnumeric
Return true if all characters in the string are numeric characters, and there is at least one character, false otherwise. Numeric characters include digit characters, and all characters that have the Unicode numeric value property, e.g. U+2155, VULGAR FRACTION ONE FIFTH. Formally, numeric characters are those with the property value Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric.
str.isdecimal
Return true if all characters in the string are decimal characters and there is at least one character, false otherwise. Decimal characters are those that can be used to form numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT ZERO. Formally a decimal character is a character in the Unicode General Category “Nd”.
As @wim said, the main difference between the three methods is the way they handle specific Unicode characters.
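The formal definitions quoted above can be inspected with the standard unicodedata module. This is only a minimal sketch: it assumes the three str methods follow the decimal, digit and numeric values that unicodedata exposes, and the helper name numeric_fields is made up for illustration.

```python
import unicodedata

def numeric_fields(ch):
    """Return the (decimal, digit, numeric) values of one character, None where absent."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

# ASCII five, ARABIC-INDIC DIGIT ZERO, SUPERSCRIPT TWO, VULGAR FRACTION ONE HALF, a letter.
for ch in "5", "\u0660", "\u00b2", "\u00bd", "a":
    print(
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        numeric_fields(ch),
        ch.isdecimal(), ch.isdigit(), ch.isnumeric(),
    )
```

A character with a decimal value passes all three methods, one with only digit and numeric values fails isdecimal(), and one with only a numeric value passes isnumeric() alone.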
Answer by AnnieFromTaiwan
By definition, isdecimal() ⊆ isdigit() ⊆ isnumeric(). That is, if a string is decimal, then it is also digit and numeric.
Therefore, testing a string s with these three methods can give only 4 kinds of results.
+-------------+-----------+-------------+----------------------------------+
| isdecimal() | isdigit() | isnumeric() | Example                          |
+-------------+-----------+-------------+----------------------------------+
| True        | True      | True        | "038", "੦੩੮", "０３８"           |
| False       | True      | True        | "⁰³⁸", "⒊⒏", "⓪③⑧"            |
| False       | False     | True        | "↉⅛⅘", "ⅠⅢⅧ", "⑩⑬㊿", "壹貳參"  |
| False       | False     | False       | "abc", "38.0", "-38"             |
+-------------+-----------+-------------+----------------------------------+
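The four rows can be reproduced with a few lines of Python. The sketch below is illustrative only: the helper name classify and the sample strings are assumptions, and the samples are written as escapes so they survive encoding problems. The final loop checks the nesting for every single code point on the interpreter that runs it.

```python
def classify(s):
    """Return (isdecimal, isdigit, isnumeric) for the string s."""
    return s.isdecimal(), s.isdigit(), s.isnumeric()

samples = {
    "038": "ASCII digits",
    "\u00b3\u2078": "superscript three and eight",
    "\u2160\u2162": "Roman numerals one and three",
    "38.0": "contains a full stop",
}
for s, note in samples.items():
    print(repr(s), classify(s), note)

# decimal implies digit, and digit implies numeric, for each code point.
for cp in range(0x110000):
    dec, dig, num = classify(chr(cp))
    assert (not dec or dig) and (not dig or num), hex(cp)
```

Note that the empty string returns False for all three methods, because each requires at least one character.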
1. Some examples of characters where isdecimal()==True (thus isdigit()==True and isnumeric()==True):
"0123456789" DIGIT ZERO~NINE
"٠١٢٣٤٥٦٧٨٩" ARABIC-INDIC DIGIT ZERO~NINE
"०१२३४५६७८९" DEVANAGARI DIGIT ZERO~NINE
"০১২৩৪৫৬৭৮৯" BENGALI DIGIT ZERO~NINE
"੦੧੨੩੪੫੬੭੮੯" GURMUKHI DIGIT ZERO~NINE
"૦૧૨૩૪૫૬૭૮૯" GUJARATI DIGIT ZERO~NINE
"୦୧୨୩୪୫୬୭୮୯" ORIYA DIGIT ZERO~NINE
"௦௧௨௩௪௫௬௭௮௯" TAMIL DIGIT ZERO~NINE
"౦౧౨౩౪౫౬౭౮౯" TELUGU DIGIT ZERO~NINE
"೦೧೨೩೪೫೬೭೮೯" KANNADA DIGIT ZERO~NINE
"൦൧൨൩൪൫൬൭൮൯" MALAYALAM DIGIT ZERO~NINE
"๐๑๒๓๔๕๖๗๘๙" THAI DIGIT ZERO~NINE
"໐໑໒໓໔໕໖໗໘໙" LAO DIGIT ZERO~NINE
"༠༡༢༣༤༥༦༧༨༩" TIBETAN DIGIT ZERO~NINE
"၀၁၂၃၄၅၆၇၈၉" MYANMAR DIGIT ZERO~NINE
"០១២៣៤៥៦៧៨៩" KHMER DIGIT ZERO~NINE
"０１２３４５６７８９" FULLWIDTH DIGIT ZERO~NINE
"𝟎𝟏𝟐𝟑𝟒𝟓𝟔𝟕𝟖𝟗" MATHEMATICAL BOLD DIGIT ZERO~NINE
"𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡" MATHEMATICAL DOUBLE-STRUCK DIGIT ZERO~NINE
"𝟢𝟣𝟤𝟥𝟦𝟧𝟨𝟩𝟪𝟫" MATHEMATICAL SANS-SERIF DIGIT ZERO~NINE
"𝟬𝟭𝟮𝟯𝟰𝟱𝟲𝟳𝟴𝟵" MATHEMATICAL SANS-SERIF BOLD DIGIT ZERO~NINE
"𝟶𝟷𝟸𝟹𝟺𝟻𝟼𝟽𝟾𝟿" MATHEMATICAL MONOSPACE DIGIT ZERO~NINE
2. Some examples of characters where isdecimal()==False but isdigit()==True (thus isnumeric()==True):
"⁰¹²³⁴⁵⁶⁷⁸⁹" SUPERSCRIPT ZERO~NINE
"₀₁₂₃₄₅₆₇₈₉" SUBSCRIPT ZERO~NINE
"🄀⒈⒉⒊⒋⒌⒍⒎⒏⒐" DIGIT ZERO~NINE FULL STOP
"🄁🄂🄃🄄🄅🄆🄇🄈🄉🄊" DIGIT ZERO~NINE COMMA
"⓪①②③④⑤⑥⑦⑧⑨" CIRCLED DIGIT ZERO~NINE
"⓿❶❷❸❹❺❻❼❽❾" NEGATIVE CIRCLED DIGIT ZERO~NINE
"⑴⑵⑶⑷⑸⑹⑺⑻⑼" PARENTHESIZED DIGIT ONE~NINE
"➀➁➂➃➄➅➆➇➈" DINGBAT CIRCLED SANS-SERIF DIGIT ONE~NINE
"⓵⓶⓷⓸⓹⓺⓻⓼⓽" DOUBLE CIRCLED DIGIT ONE~NINE
"➊➋➌➍➎➏➐➑➒" DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ONE~NINE
"፩፪፫፬፭፮፯፰፱" ETHIOPIC DIGIT ONE~NINE
3. Some examples of characters where isdecimal()==False and isdigit()==False but isnumeric()==True:
"¼½¾⅐⅑⅒⅓⅔⅕⅖⅗⅘⅙⅚⅛⅜⅝⅞⅟↉" VULGAR FRACTION
"৴৵৶৷৸৹" BENGALI CURRENCY NUMERATOR
"௰௱௲" TAMIL NUMBER TEN, ONE HUNDRED, ONE THOUSAND
"౸౹౺౻౼౽౾" TELUGU FRACTION DIGIT
"൰൱൲൳൴൵" MALAYALAM NUMBER, MALAYALAM FRACTION
"༪༫༬༭༮༯༰༱༲༳" TIBETAN DIGIT HALF ZERO~NINE
"፲፳፴፵፶፷፸፹፺፻፼" ETHIOPIC NUMBER TEN~NINETY, HUNDRED, TEN THOUSAND
"៰៱៲៳៴៵៶៷៸៹" KHMER SYMBOL LEK ATTAK
"ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩⅪⅫⅬⅭⅮⅯ" ROMAN NUMERAL
"ⅰⅱⅲⅳⅴⅵⅶⅷⅸⅹⅺⅻⅼⅽⅾⅿ" SMALL ROMAN NUMERAL
"ↀↁↂↇↈ" ROMAN NUMERAL
"⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵㊶㊷㊸㊹㊺㊻㊼㊽㊾㊿" CIRCLED NUMBER TEN~FIFTY
"㉈㉉㉊㉋㉌㉍㉎㉏" CIRCLED NUMBER TEN~EIGHTY ON BLACK SQUARE
"⑽⑾⑿⒀⒁⒂⒃⒄⒅⒆⒇" PARENTHESIZED NUMBER TEN~TWENTY
"⒑⒒⒓⒔⒕⒖⒗⒘⒙⒚⒛" NUMBER TEN~TWENTY FULL STOP
"⓫⓬⓭⓮⓯⓰⓱⓲⓳⓴" NEGATIVE CIRCLED NUMBER ELEVEN~TWENTY
"⓾❿➉➓" various styles of CIRCLED NUMBER TEN
"🄌" DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO
"〇" IDEOGRAPHIC NUMBER ZERO
"〡〢〣〤〥〦〧〨〩〸〹〺" HANGZHOU NUMERAL ONE~TEN, TWENTY, THIRTY
"㆒㆓㆔㆕" IDEOGRAPHIC ANNOTATION ONE~FOUR MARK
"㈠㈡㈢㈣㈤㈥㈦㈧㈨㈩" PARENTHESIZED IDEOGRAPH ONE~TEN
"㊀㊁㊂㊃㊄㊅㊆㊇㊈㊉" CIRCLED IDEOGRAPH ONE~TEN
"一二三四五六七八九十壹貳參肆伍陸柒捌玖拾零百千萬億兆弐貮贰??漆什?陌阡佰仟万亿幺兩?亖卄卅卌廾廿" CJK UNIFIED IDEOGRAPH
"參拾兩零六陸什" CJK COMPATIBILITY IDEOGRAPH
"𐄇𐄈𐄉𐄊𐄋𐄌𐄍𐄎𐄏𐄐𐄑𐄒𐄓𐄔𐄕𐄖𐄗𐄘" AEGEAN NUMBER ONE~NINE, TEN~NINETY
"𐄙𐄚𐄛𐄜𐄝𐄞𐄟𐄠𐄡𐄢𐄣𐄤𐄥𐄦𐄧𐄨𐄩𐄪" AEGEAN NUMBER ONE~NINE HUNDRED, ONE~NINE THOUSAND
"𐄫𐄬𐄭𐄮𐄯𐄰𐄱𐄲𐄳" AEGEAN NUMBER TEN~NINETY THOUSAND
"𐅀𐅁𐅂𐅃𐅄𐅅𐅆𐅇" GREEK ACROPHONIC ATTIC
"𝍠𝍡𝍢𝍣𝍤𝍥𝍦𝍧𝍨" COUNTING ROD UNIT DIGIT ONE~NINE
"𝍩𝍪𝍫𝍬𝍭𝍮𝍯𝍰𝍱" COUNTING ROD TENS DIGIT ONE~NINE
Answer by Sree
A negative number such as "-10" is False for all three methods:
a = "-10"
a.isdecimal(), a.isdigit(), a.isnumeric()
The results are False, False, False.
isdecimal() accepts only the digits 0 to 9, in any script, and no minus sign.
isdigit() accepts the digits 0 to 9 in any script, and also digits in superscript ("to the power of") positions, for example the exponent in 2⁵.
isnumeric() is broader still. Besides digits in any position, it accepts characters that stand for tens, hundreds, thousands and so on in any script. For example, the Roman numeral ten character "Ⅹ" (U+2169) is valid for isnumeric(), although the ASCII letter "X" is not.
All three are False for negative numbers (for example -10) and for floating point numbers (for example 10.1).
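Because none of the three methods accepts a sign or a decimal point, a common alternative for validating numeric input is simply to attempt the conversion. The sketch below assumes that goal, and the helper name parses_as is hypothetical. It also shows that passing isdigit() does not guarantee that int() will accept the string.

```python
def parses_as(s, converter):
    """Return True if converter(s) succeeds, e.g. converter=int or converter=float."""
    try:
        converter(s)
    except ValueError:
        return False
    return True

# "\u0663" is ARABIC-INDIC DIGIT THREE, "\u00b2" is SUPERSCRIPT TWO.
for s in "-10", "10.1", "\u0663", "\u00b2":
    print(
        repr(s),
        s.isdecimal(), s.isdigit(), s.isnumeric(),
        parses_as(s, int), parses_as(s, float),
    )
```

The superscript two passes isdigit() yet is rejected by int(), so isdecimal() is the closer match when the question is whether an unsigned string can be converted to an integer.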
Answer by Douglas
Related question: which one is equivalent to "\d" in a regular expression?
"\d": For Unicode (str) patterns: Matches any Unicode decimal digit (that is, any character in Unicode character category [Nd]). This includes [0-9], and also many other digit characters. If the ASCII flag is used only [0-9] is matched.
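Since the quoted documentation defines "\d" by the same category [Nd] that defines isdecimal(), the two can be compared directly. A small sketch, with the helper name matches_digits invented for the example:

```python
import re

def matches_digits(s, flags=0):
    """Return True if the whole string is one or more characters matched by the digit class."""
    return re.fullmatch(r"\d+", s, flags) is not None

# ASCII digits, Arabic-Indic digits, superscript digits, a Roman numeral character.
for s in "38", "\u0663\u0668", "\u00b3\u2078", "\u2160":
    print(repr(s), matches_digits(s), matches_digits(s, re.ASCII), s.isdecimal())
```

On these samples the unflagged pattern agrees with isdecimal(), while re.ASCII narrows the match to [0-9] only.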