Java - removing \u0000 from a String

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/28989970/

Date: 2020-08-11 07:08:59  Source: igfitidea

java string character-encoding

Asked by FeanDoe

I'm using the Twitter API, and the following string is giving me trouble: Proyecto de ingeniera comercial, actual Profesora de matemáticas \u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000Ense?a Chile
I want to store it in PostgreSQL, but \u0000 is not accepted, so I want to remove it.
I tried string = string.replaceAll("\\u0000", ""); but it doesn't work. This is the code I'm running:

String json = TwitterObjectFactory.getRawJSON(user);
System.out.println(json);
json = json.replaceAll("\\u0000", "");
System.out.println(json);

The output (only the part that matters):

Proyecto de ingeniera comercial, actual Profesora de matemáticas \u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000Ense?a Chile
Proyecto de ingeniera comercial, actual Profesora de matemáticas \u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000Ense?a Chile

If I put that part in a String literal in Java, the replacement works, but if I read it from a text file or directly from Twitter, it doesn't work.
So my question is: how do I remove \u0000 from a string?
By the way, the full string is this:

{"utc_offset":null,"friends_count":83,"profile_image_url_https":"https://pbs.twimg.com/profile_images/2636139584/3a8455cd94045fa6980402add14796a9_normal.jpeg","listed_count":1,"profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","default_profile_image":false,"favourites_count":0,"description":"Proyecto de ingeniera comercial, actual Profesora de matemáticas \u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000Ense?a Chile","created_at":"Sat May 28 14:24:06 +0000 2011","is_translator":false,"profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","protected":false,"screen_name":"Fsquadritto","id_str":"306825274","profile_link_color":"0084B4","is_translation_enabled":false,"id":306825274,"geo_enabled":false,"profile_background_color":"C0DEED","lang":"es","profile_sidebar_border_color":"C0DEED","profile_location":null,"profile_text_color":"333333","verified":false,"profile_image_url":"http://pbs.twimg.com/profile_images/2636139584/3a8455cd94045fa6980402add14796a9_normal.jpeg","time_zone":null,"url":null,"contributors_enabled":false,"profile_background_tile":false,"entities":{"description":{"urls":[]}},"statuses_count":2,"follow_request_sent":false,"followers_count":36,"profile_use_background_image":true,"default_profile":true,"following":false,"name":"Fiorella Squadritto","location":"","profile_sidebar_fill_color":"DDEEF6","notifications":false,"status":{"in_reply_to_status_id_str":null,"in_reply_to_status_id":null,"possibly_sensitive":false,"coordinates":null,"created_at":"Fri Oct 12 17:40:35 +0000 2012","truncated":false,"in_reply_to_user_id_str":null,"source":"<a href=\"http://instagram.com\" 
rel=\"nofollow\">Instagram<\/a>","retweet_count":1,"retweeted":false,"geo":null,"in_reply_to_screen_name":null,"entities":{"urls":[{"display_url":"instagr.am/p/QsOQxTNfvQ/","indices":[49,69],"expanded_url":"http://instagr.am/p/QsOQxTNfvQ/","url":"http://t.co/GKziME7N"}],"hashtags":[{"indices":[24,34],"text":"eduinnova"}],"user_mentions":[{"indices":[35,47],"screen_name":"ensenachile","id_str":"57099132","name":"Ense?a Chile","id":57099132}],"symbols":[]},"id_str":"256811615171792896","in_reply_to_user_id":null,"favorite_count":1,"id":256811615171792896,"text":"Amando las matemáticas! #eduinnova @ensenachile  http://t.co/GKziME7N","place":null,"contributors":null,"lang":"es","favorited":false}}

Accepted answer by Joop Eggen

string = string.replace("\u0000", ""); // removes NUL chars
string = string.replace("\\u0000", ""); // removes backslash+u0000

Unicode escapes (a backslash, then u and four hex digits) are processed at the Java source level, before the compiler tokenizes the code. For instance, the keyword "class" can be written as:

public \u0063lass C {
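To make the source-level point concrete, here is a minimal sketch (the class and field names are hypothetical, chosen only for illustration). It contrasts a literal holding one real NUL character with a literal holding the six-character text backslash-u0000, which is what raw JSON from an API typically contains.

```java
class EscapeLevels {
    // One character: the compiler translates the escape to U+0000
    // before the string literal is even parsed.
    public static final String NUL_CHAR = "\u0000";

    // Six characters: a backslash followed by 'u', '0', '0', '0', '0'.
    public static final String ESCAPE_TEXT = "\\u0000";

    public static void main(String[] args) {
        System.out.println(NUL_CHAR.length());     // 1
        System.out.println(ESCAPE_TEXT.length());  // 6
        System.out.println("\u0063".equals("c"));  // true
    }
}
```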

Also, you do not need a regex.
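The two replacements from this answer can be sketched as small helpers. The class name, method names, and sample strings here are hypothetical; the sketch assumes the input is either raw JSON text (where the NUL appears as an escape sequence) or an already decoded string (where it is a real character).

```java
class NulCleaner {
    /** Removes real NUL characters (U+0000) from an already decoded string. */
    public static String stripNulChars(String s) {
        return s.replace("\u0000", "");
    }

    /** Removes the six-character text backslash-u0000, as found in raw JSON. */
    public static String stripEscapedNul(String s) {
        return s.replace("\\u0000", "");
    }

    public static void main(String[] args) {
        String raw = "Profesora \\u0000\\u0000Chile";     // escape text, as in raw JSON
        String decoded = "Profesora \u0000\u0000Chile";   // real NUL characters
        System.out.println(stripEscapedNul(raw));         // Profesora Chile
        System.out.println(stripNulChars(decoded));       // Profesora Chile
    }
}
```

Which helper applies depends on where the cleanup happens: before JSON parsing the escape text is present, after parsing the real character is. Note that a plain text replacement on raw JSON would also alter an escaped backslash that happens to be followed by u0000, so cleaning the decoded field values may be the safer choice.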

Answer by Ian Roberts

The first argument to replaceAll is a regular expression, and the Java regex engine understands \uNNNN escapes, so

json.replaceAll("\\u0000", "")

will search for the regular expression \u0000, which matches instances of the Unicode NUL character (U+0000), not instances of the actual string \u0000. If you want to match the string \u0000, then you need to use the regular expression \\u0000, which in turn means the Java string literal "\\\\u0000":

json.replaceAll("\\\\u0000", "")

Or, more simply, use replace (whose first argument is a literal string rather than a regex) instead of replaceAll:

json.replace("\\u0000", "")
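The variants discussed in this answer can be compared side by side. This is a minimal sketch with hypothetical class and method names. The Pattern.quote variant is an addition not mentioned in the answer; it is shown as one more way to make replaceAll treat its argument literally.

```java
import java.util.regex.Pattern;

class ReplaceVariants {
    // Regex that the engine reads as "the NUL character": matches U+0000 only.
    public static String viaNulRegex(String s) {
        return s.replaceAll("\\u0000", "");
    }

    // Regex with an escaped backslash: matches the literal six-character text.
    public static String viaEscapedRegex(String s) {
        return s.replaceAll("\\\\u0000", "");
    }

    // Pattern.quote turns the literal text into a regex that matches it verbatim.
    public static String viaQuotedRegex(String s) {
        return s.replaceAll(Pattern.quote("\\u0000"), "");
    }

    // String.replace takes a literal target, so no regex escaping is needed.
    public static String viaLiteral(String s) {
        return s.replace("\\u0000", "");
    }

    public static void main(String[] args) {
        String raw = "a\\u0000b";     // 8 chars: contains the escape text
        String decoded = "a\u0000b";  // 3 chars: contains one real NUL

        System.out.println(viaNulRegex(raw).length());      // 8 (nothing removed)
        System.out.println(viaNulRegex(decoded).length());  // 2
        System.out.println(viaEscapedRegex(raw).length());  // 2
        System.out.println(viaQuotedRegex(raw).length());   // 2
        System.out.println(viaLiteral(raw).length());       // 2
    }
}
```

The first line of output reproduces the symptom from the question: the regex for the NUL character leaves the escape text in raw JSON untouched.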