Java: Can I force JAXB not to convert " into &quot;, for example, when marshalling to XML?

Note: This page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/1506663/

Time: 2020-08-12 13:35:26  Source: igfitidea

Can I force JAXB not to convert " into &quot;, for example, when marshalling to XML?

Tags: java, jaxb, xml-serialization, marshalling, html-entities

Asked by Elliot

I have an Object that is being marshalled to XML using JAXB. One element contains a String that includes quotes ("). The resulting XML has &quot; where the " existed.

Even though this is normally preferred, I need my output to match a legacy system. How do I force JAXB to NOT convert the HTML entities?

--

Thank you for the replies. However, I never see the handler escape() called. Can you take a look and see what I'm doing wrong? Thanks!

package org.dc.model;

import java.io.IOException;
import java.io.Writer;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;

import org.dc.generated.Shiporder;

import com.sun.xml.internal.bind.marshaller.CharacterEscapeHandler;

public class PleaseWork {
    public void prettyPlease() throws JAXBException {
        Shiporder shipOrder = new Shiporder();
        shipOrder.setOrderid("Order's ID");
        shipOrder.setOrderperson("The woman said, \"How ya doin & stuff?\"");

        JAXBContext context = JAXBContext.newInstance("org.dc.generated");
        Marshaller marshaller = context.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
        marshaller.setProperty(CharacterEscapeHandler.class.getName(),
                new CharacterEscapeHandler() {
                    @Override
                    public void escape(char[] ch, int start, int length,
                            boolean isAttVal, Writer out) throws IOException {
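                        // Build the String from the given range; char[].toString() would not print the characters.
                        out.write("Called escape for characters = " + new String(ch, start, length));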
                    }
                });
        marshaller.marshal(shipOrder, System.out);
    }

    public static void main(String[] args) throws Exception {
        new PleaseWork().prettyPlease();
    }
}

--

The output is this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<shiporder orderid="Order's ID">
    <orderperson>The woman said, &quot;How ya doin &amp; stuff?&quot;</orderperson>
</shiporder>

and as you can see, the callback is never displayed. (Once I get the callback being called, I'll worry about having it actually do what I want.)

--

Accepted answer by Elliot

Solution my teammate found:

PrintWriter printWriter = new PrintWriter(new FileWriter(xmlFile));
DataWriter dataWriter = new DataWriter(printWriter, "UTF-8", DumbEscapeHandler.theInstance);
marshaller.marshal(request, dataWriter);

Instead of passing the xmlFile to marshal(), pass the DataWriter which knows both the encoding and an appropriate escape handler, if any.

Note: Since DataWriter and DumbEscapeHandler are both within the com.sun.xml.internal.bind.marshaller package, you must bootstrap javac.

Answer by laz

Seems like it is possible with Sun's JAXB implementation, although I've not done it myself.

Answer by Grzegorz Oledzki

I've been playing with your example a bit and debugging the JAXB code, and it seems to be something specific to the UTF-8 encoding being used. The escapeHandler property of MarshallerImpl seems to be set properly; however, it is not used in every context. When I searched for calls of MarshallerImpl.createEscapeHandler(), I found:

public XmlOutput createWriter( OutputStream os, String encoding ) throws JAXBException {
    // UTF8XmlOutput does buffering on its own, and
    // otherwise createWriter(Writer) inserts a buffering,
    // so no point in doing a buffering here.

    if(encoding.equals("UTF-8")) {
        Encoded[] table = context.getUTF8NameTable();
        final UTF8XmlOutput out;
        if(isFormattedOutput())
            out = new IndentingUTF8XmlOutput(os,indent,table);
        else {
            if(c14nSupport)
                out = new C14nXmlOutput(os,table,context.c14nSupport);
            else
                out = new UTF8XmlOutput(os,table);
        }
        if(header!=null)
            out.setHeader(header);
        return out;
    }

    try {
        return createWriter(
            new OutputStreamWriter(os,getJavaEncoding(encoding)),
            encoding );
    } catch( UnsupportedEncodingException e ) {
        throw new MarshalException(
            Messages.UNSUPPORTED_ENCODING.format(encoding),
            e );
    }
}

Note that in your setup the top section (...equals("UTF-8")...) is the one taken, and this branch does not use the escapeHandler. If you set the encoding to anything else, the bottom part of this method (createWriter(OutputStream, String)) is called, and that one does use the escapeHandler, so the handler plays its role. So, adding...

    marshaller.setProperty(Marshaller.JAXB_ENCODING, "ASCII");

makes your custom CharacterEscapeHandler be called. I'm not really sure, but I would guess this is a kind of bug in JAXB.

Answer by Thorbjørn Ravn Andersen

I checked the XML specification. http://www.w3.org/TR/REC-xml/#sec-references says "well-formed documents need not declare any of the following entities: amp, lt, gt, apos, quot." So it appears that the XML parser used by the legacy system is not conformant.

(I know that it does not solve your problem, but it is at least nice to be able to say which component is broken).
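
To illustrate the conformance point, here is a minimal sketch that uses the JDK's built-in DOM parser (not the legacy system's parser, which is unknown here). It parses the same element once with literal quotes and once with &quot; and compares the resulting text. The class and method names are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names are illustrative only.
class QuotEntityDemo {

    // Parse an XML string and return the text content of its root element.
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String literal = rootText("<orderperson>\"How ya doin\"</orderperson>");
        String entity = rootText("<orderperson>&quot;How ya doin&quot;</orderperson>");
        // A conformant parser hands the application the same text either way.
        System.out.println(literal.equals(entity));
    }
}
```

If the legacy system's parser treats the two inputs differently, that supports the suspicion that the parser, rather than JAXB, is the non-conformant component.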

Answer by jurisz

Interesting, but with strings you can try this:

Marshaller marshaller = jaxbContext.createMarshaller();
StringWriter sw = new StringWriter();
marshaller.marshal(data, sw);
sw.toString();

At least for me, this does not escape quotes.

Answer by fred

The simplest way, when using Sun's Marshaller implementation, is to provide your own implementation of CharacterEscapeHandler that does not escape anything.

Marshaller m = jcb.createMarshaller();
m.setProperty(
    "com.sun.xml.bind.marshaller.CharacterEscapeHandler",
    new NullCharacterEscapeHandler());

With

import java.io.IOException;
import java.io.Writer;

import com.sun.xml.bind.marshaller.CharacterEscapeHandler;

public class NullCharacterEscapeHandler implements CharacterEscapeHandler {

    // Writes the characters through unchanged; nothing is escaped.
    @Override
    public void escape(char[] ch, int start, int length, boolean isAttVal, Writer writer) throws IOException {
        writer.write(ch, start, length);
    }
}
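
A handler that escapes nothing also lets & and < through, which produces malformed XML whenever the data contains them. As a hedged middle ground, the sketch below keeps quotes literal in element text but still escapes &, < and >. To stay runnable without JAXB on the classpath it uses EscapeHandlerSketch, a hypothetical stand-in interface with the same method shape as CharacterEscapeHandler; the class names and the apply helper are likewise assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in with the same method shape as
// com.sun.xml.bind.marshaller.CharacterEscapeHandler.
interface EscapeHandlerSketch {
    void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException;
}

// Escapes &, < and > everywhere; escapes double quotes only in attribute values.
class QuotePreservingEscapeHandler implements EscapeHandlerSketch {

    @Override
    public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException {
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            switch (c) {
                case '&': out.write("&amp;"); break;
                case '<': out.write("&lt;"); break;
                case '>': out.write("&gt;"); break;
                case '"':
                    if (isAttVal) out.write("&quot;"); else out.write(c);
                    break;
                default: out.write(c);
            }
        }
    }

    // Convenience helper (illustrative) for trying the handler on a whole string.
    static String apply(String text, boolean isAttVal) {
        StringWriter out = new StringWriter();
        try {
            char[] chars = text.toCharArray();
            new QuotePreservingEscapeHandler().escape(chars, 0, chars.length, isAttVal, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

With the real library on the classpath, the class would implement CharacterEscapeHandler instead of the stand-in and be registered through setProperty as shown above. Whether the marshaller actually consults it still depends on the encoding issue described in Grzegorz Oledzki's answer.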

Answer by Javatar

@Elliot, you can use this to make the marshaller call your CharacterEscapeHandler. It is weird, but it works if you set "Unicode" instead of "UTF-8". Add this just before or after you set the CharacterEscapeHandler property.

marshaller.setProperty(Marshaller.JAXB_ENCODING, "Unicode");

However, don't rely only on checking the console within your IDE, because what it shows depends on the workspace encoding. It is better to also check the output in a file, like this:

marshaller.marshal(shipOrder, new File("C:\\shipOrder.txt"));
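
As a rough illustration of why a console can mislead, the sketch below encodes text in one charset and decodes the bytes in another, which is roughly what happens when the console or workspace encoding differs from the marshaller's encoding. It uses only the JDK; the class and method names are made up for this example, and UTF-8 versus ISO-8859-1 is just one assumed mismatch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class ConsoleEncodingDemo {

    // Encode text as UTF-8, then decode the bytes as ISO-8859-1,
    // the way a mismatched console or viewer would.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "\u00e9" is e-acute; its two UTF-8 bytes are shown as two characters.
        System.out.println(misread("caf\u00e9").length()); // prints 5
    }
}
```

Plain ASCII text survives such a mismatch unchanged, so the difference only shows up with non-ASCII characters or with encodings such as "Unicode" that use more than one byte per character.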

Answer by Laura Liparulo

I have just made my custom handler as a class like this:

import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

import com.sun.xml.bind.marshaller.CharacterEscapeHandler;

public class XmlCharacterHandler implements CharacterEscapeHandler {

    public void escape(char[] buf, int start, int len, boolean isAttValue,
            Writer out) throws IOException {
        StringWriter buffer = new StringWriter();

        for (int i = start; i < start + len; i++) {
            buffer.write(buf[i]);
        }

        String st = buffer.toString();

        if (!st.contains("CDATA")) {
            st = buffer.toString().replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("'", "&apos;")
                .replace("\"", "&quot;");

        }
        out.write(st);
        System.out.println(st);
    }

}

Then, in the marshalling method, simply call:

marshaller.setProperty(CharacterEscapeHandler.class.getName(),
                new XmlCharacterHandler());

It works fine.

Answer by user3240843

This works for me after reading other posts:

javax.xml.bind.JAXBContext jc = javax.xml.bind.JAXBContext.newInstance(object);
marshaller = jc.createMarshaller();
marshaller.setProperty(javax.xml.bind.Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.setProperty(javax.xml.bind.Marshaller.JAXB_ENCODING, "UTF-8");
marshaller.setProperty(CharacterEscapeHandler.class.getName(), new CustomCharacterEscapeHandler());


public static class CustomCharacterEscapeHandler implements CharacterEscapeHandler {
        /**
         * Escape characters inside the buffer and send the output to the Writer.
         * (prevent <b> to be converted &lt;b&gt; but still ok for a<5.)
         */
        public void escape(char[] buf, int start, int len, boolean isAttValue, Writer out) throws IOException {
            if (buf != null){
                StringBuilder sb = new StringBuilder();
                for (int i = start; i < start + len; i++) {
                    char ch = buf[i];

                    //by adding these, it prevent the problem happened when unmarshalling
                    if (ch == '&') {
                        sb.append("&amp;");
                        continue;
                    }

                    if (ch == '"' && isAttValue) {
                        sb.append("&quot;");
                        continue;
                    }

                    if (ch == '\'' && isAttValue) {
                        sb.append("&apos;");
                        continue;
                    }


                    // otherwise print normally
                    sb.append(ch);
                }

                //Make corrections of unintended changes
                String st = sb.toString();

                st = st.replace("&amp;quot;", "&quot;")
                       .replace("&amp;lt;", "&lt;")
                       .replace("&amp;gt;", "&gt;")
                       .replace("&amp;apos;", "&apos;")
                       .replace("&amp;amp;", "&amp;");

                out.write(st);
            }
        }
    }

Answer by mamuso

For some reason that I have had no time to find out, it worked for me when setting

marshaller.setProperty(Marshaller.JAXB_ENCODING, "utf-8");

As opposed to using "UTF-8" or "Unicode".

I suggest you try them and, as @Javatar said, check the result by dumping it to a file using:

marshaller.marshal(shipOrder, new File("<test_file_path>"));

and opening it with a decent text editor such as Notepad++.
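
One plausible explanation, going by the createWriter source quoted in Grzegorz Oledzki's answer: the fast path is selected with a case-sensitive encoding.equals("UTF-8"), so "utf-8" does not match and falls through to the branch that uses the escape handler, while the JDK still resolves the name to the same charset. The sketch below checks both halves of that reasoning with the JDK alone. The class and method names are made up for this example, and actual marshaller behaviour may differ between JAXB versions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class EncodingNameDemo {

    // Mirrors the case-sensitive test in the quoted createWriter method.
    static boolean takesUtf8FastPath(String encoding) {
        return encoding.equals("UTF-8");
    }

    // Charset lookup by name is case-insensitive in the JDK.
    static boolean resolvesToUtf8(String encoding) {
        return Charset.forName(encoding).equals(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "utf-8"}) {
            System.out.println(name + ": fastPath=" + takesUtf8FastPath(name)
                    + ", resolvesToUtf8=" + resolvesToUtf8(name));
        }
    }
}
```

If this reading is right, "utf-8" keeps the output bytes in UTF-8 while still routing through the handler-aware writer, which would avoid changing the document's encoding the way "ASCII" or "Unicode" does.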
