How to convert HTML text to plain text in Android?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/22573319/
Asked by Hiren Patel
I need to convert HTML text to plain text as a String.
String mHtmlString = "<p class=\"MsoNormal\" style=\"margin-bottom:10.5pt;text-align:justify;line-height: 10.5pt\"><b><span style=\"font-size: 8.5pt; font-family: Arial, sans-serif;\">Lorem Ipsum</span></b><span style=\"font-size: 8.5pt; font-family: Arial, sans-serif;\">&nbsp;is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.<o:p></o:p></span></p> <p class=\"MsoNormal\" style=\"margin-bottom: 0.0001pt;\"><span style=\"font-size: 8.5pt; font-family: Arial, sans-serif;\">&nbsp;</span><span style=\"font-family: Arial, sans-serif; font-size: 8.5pt; line-height: 10.5pt; text-align: justify;\">Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of \"de Finibus Bonorum et Malorum\" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, \"Lorem ipsum dolor sit amet..\", comes from a line in section 1.10.32.</span></p>";
What I did so far:
TextView textView = (TextView) findViewById(R.id.textView);
String plainText = Html.fromHtml(mHtmlString).toString();
textView.setText(plainText);
Unfortunately, this does not work for nested HTML.
Any help would be appreciated.
Answered by Hiren Patel
Here is my own answer.
String mHtmlString = "<p class=\"MsoNormal\" style=\"margin-bottom:10.5pt;text-align:justify;line-height: 10.5pt\"><b><span style=\"font-size: 8.5pt; font-family: Arial, sans-serif;\">Lorem Ipsum</span></b><span style=\"font-size: 8.5pt; font-family: Arial, sans-serif;\">&nbsp;is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.<o:p></o:p></span></p> <p class=\"MsoNormal\" style=\"margin-bottom: 0.0001pt;\"><span style=\"font-size: 8.5pt; font-family: Arial, sans-serif;\">&nbsp;</span><span style=\"font-family: Arial, sans-serif; font-size: 8.5pt; line-height: 10.5pt; text-align: justify;\">Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of \"de Finibus Bonorum et Malorum\" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, \"Lorem ipsum dolor sit amet..\", comes from a line in section 1.10.32.</span></p>";
Set the HTML text on the TextView:
TextView textView = (TextView) findViewById(R.id.textView);
textView.setText(Html.fromHtml(Html.fromHtml(mHtmlString).toString()));
Hope this will help you.
Answered by Libin
If you want to strip the HTML tags from an HTML string, use Jsoup (http://jsoup.org):
String textFromHtml = Jsoup.parse(MY_HTML_STRING_HERE).text();
TextView desc = (TextView) dialog.findViewById(R.id.description);
desc.setText(textFromHtml);
Answered by feheren.fekete
This works for me:
Spanned spanned = Html.fromHtml(textWithMarkup);
char[] chars = new char[spanned.length()];
TextUtils.getChars(spanned, 0, spanned.length(), chars, 0);
String plainText = new String(chars);
I use it with simple tags like <b> and <i>. I have not tested it with more complex HTML.
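For code that runs outside Android and cannot pull in Jsoup, the tag-stripping idea from the answers above can be roughly approximated with the Java standard library alone. This is only a sketch under stated assumptions, not what any of the answers use: the class name HtmlToText and the method toPlainText are hypothetical, the regex is not a real HTML parser (comments, script blocks, or a ">" inside an attribute value would confuse it), and only a handful of entities are decoded.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Android or Jsoup: a rough, dependency-free
// HTML-to-text pass for simple markup such as the Word-style HTML in the question.
final class HtmlToText {
    // Paragraph ends and <br> tags become line breaks.
    private static final Pattern BLOCK_END = Pattern.compile("(?i)<br\\s*/?>|</p\\s*>");
    // Any remaining tag, however deeply nested, is dropped.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    static String toPlainText(String html) {
        String text = BLOCK_END.matcher(html).replaceAll("\n");
        text = TAG.matcher(text).replaceAll("");
        // Decode a few common entities. &nbsp; is mapped to a regular space here
        // (an assumption; a parser would keep U+00A0). &amp; is decoded last so
        // that "&amp;lt;" ends up as the literal text "&lt;" rather than "<".
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
        // Collapse runs of spaces and tabs, then tidy whitespace around newlines.
        text = text.replaceAll("[ \\t]+", " ");
        text = text.replaceAll(" ?\\n ?", "\n");
        return text.trim();
    }

    public static void main(String[] args) {
        String html = "<p class=\"MsoNormal\"><b><span>Lorem Ipsum</span></b>"
                + "<span>&nbsp;is simply dummy text.<o:p></o:p></span></p>";
        System.out.println(toPlainText(html));
    }
}
```

For anything beyond simple markup, a real parser such as Jsoup, or Html.fromHtml on Android, remains the safer choice.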