Java: Is there an invisible character that is not regarded as whitespace?
Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/19936374/
Is there an invisible character that is not regarded as whitespace?
Asked by CodeBlue
I am working with an existing framework where I have to set a certain attribute to blank if some conditions are satisfied. Unfortunately, the framework doesn't allow the attribute value to consist only of whitespace. Specifically, it performs this check on the value:
!(org.apache.commons.lang.StringUtils.isBlank(value))
Is it possible to somehow bypass this and set a value that looks blank/invisible to the eye but is not regarded as whitespace?
I am using a dash "-" right now, but I think it would be interesting to know if it's possible.
Accepted answer by Michael Konietzka
Try the Unicode character 'ZERO WIDTH SPACE' (U+200B). It is not whitespace according to Wikipedia: Whitespace#Unicode
The code of StringUtils.isBlank will not reject it:
public static boolean isBlank(String str) {
    int strLen;
    if (str == null || (strLen = str.length()) == 0) {
        return true;
    }
    for (int i = 0; i < strLen; i++) {
        if ((Character.isWhitespace(str.charAt(i)) == false)) {
            return false;
        }
    }
    return true;
}
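To see the effect in isolation, here is a minimal sketch that copies the loop above into a local helper so it runs without commons-lang on the classpath. The class name ZeroWidthSpaceCheck and the local isBlank copy are illustrative assumptions, not part of the framework or the library. The result depends on the Unicode tables of the JVM in use, so it is worth running on your own runtime.

```java
final class ZeroWidthSpaceCheck {
    // Local copy of the isBlank logic quoted above (an assumption made so the
    // sketch is self-contained; the real method lives in commons-lang).
    static boolean isBlank(String str) {
        if (str == null || str.isEmpty()) {
            return true;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isWhitespace(str.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String zeroWidthSpace = "\u200B";

        // An ordinary space is blank, so the framework's !isBlank check rejects it.
        System.out.println("space blank?            " + isBlank(" "));

        // The zero width space is judged by Character.isWhitespace on this JVM.
        System.out.println("zero width space blank? " + isBlank(zeroWidthSpace));
        System.out.println("isWhitespace(U+200B)?   " + Character.isWhitespace('\u200B'));
    }
}
```

If the last two lines print false, a value of "\u200B" would pass the framework's check while still rendering as nothing in most fonts.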
Answer by Bugs Bunny
The Unicode character 'ZERO WIDTH SPACE' (U+200B) that Michael Konietzka shared didn't work for me, but I found a different one that did (written here as escapes, since the characters themselves are invisible):
\u200F\u200F\u200E \u200E
It is actually a combination of:
U+200F : RIGHT-TO-LEFT MARK [RLM]
U+200F : RIGHT-TO-LEFT MARK [RLM]
U+200E : LEFT-TO-RIGHT MARK [LRM]
U+0020 : SPACE [SP]
U+200E : LEFT-TO-RIGHT MARK [LRM]
and the character code of its first character is 8207 (0x200F, which is a Unicode code point, not an ASCII value):
'\u200F\u200F\u200E \u200E'.charCodeAt(0) // 8207
Source: http://emptycharacter.com/
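The snippet above is JavaScript; a rough Java equivalent is sketched below, assuming the sequence is exactly the five characters listed (RLM, RLM, LRM, SPACE, LRM). The class name DirectionalMarkCheck and the helper hasNonWhitespace are hypothetical names chosen for illustration. The helper mirrors the condition under which the quoted isBlank returns false: at least one character that Character.isWhitespace does not accept.

```java
final class DirectionalMarkCheck {
    // The sequence listed above: RLM, RLM, LRM, SPACE, LRM.
    static final String INVISIBLE = "\u200F\u200F\u200E \u200E";

    // True if at least one character is not whitespace according to
    // Character.isWhitespace, which is when the quoted isBlank returns false.
    static boolean hasNonWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Java counterpart of charCodeAt(0): the UTF-16 code unit as an int.
        System.out.println((int) INVISIBLE.charAt(0)); // 8207, i.e. 0x200F

        // Show how each character in the sequence is classified on this JVM.
        for (int i = 0; i < INVISIBLE.length(); i++) {
            char c = INVISIBLE.charAt(i);
            System.out.printf("U+%04X whitespace=%b%n", (int) c, Character.isWhitespace(c));
        }
        System.out.println("passes !isBlank-style check: " + hasNonWhitespace(INVISIBLE));
    }
}
```

Note that the embedded U+0020 is ordinary whitespace; the string only needs one non-whitespace character to pass the check, so a single directional mark would behave the same way.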
Answer by M. I. Wright
There's also U+2800 BRAILLE PATTERN BLANK (written '\u2800' in Java), which is a blank Braille block rather than a space character.
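A quick way to check how Java classifies U+2800 is sketched below. The class name BrailleBlankCheck and the describe helper are illustrative assumptions; the calls themselves (Character.isWhitespace, Character.isSpaceChar, Character.getType) are standard java.lang.Character methods. Whether the glyph actually renders as empty depends on the font, which this sketch cannot verify.

```java
final class BrailleBlankCheck {
    static final char BRAILLE_BLANK = '\u2800';

    // Summarize how the running JVM classifies a character.
    static String describe(char c) {
        return String.format("U+%04X isWhitespace=%b isSpaceChar=%b symbol=%b",
                (int) c,
                Character.isWhitespace(c),
                Character.isSpaceChar(c),
                Character.getType(c) == Character.OTHER_SYMBOL);
    }

    public static void main(String[] args) {
        // The Braille blank is classified as a symbol, not as a space.
        System.out.println(describe(BRAILLE_BLANK));
        // An ordinary space for comparison.
        System.out.println(describe(' '));
    }
}
```

Because it is a symbol rather than a format character, some text pipelines that strip invisible format characters may leave it in place, though that behavior should be confirmed for the framework in question.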