HTML XSL character escape problem

Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original: http://stackoverflow.com/questions/646194/

Date: 2020-08-28 23:27:02  Source: igfitidea

XSL character escape problem

Tags: html, xslt, escaping

Asked by Marcos Buarque

I am writing this because I have really hit the wall and cannot go ahead. In my database I have escaped HTML like this: "&lt;p&gt;My name is Freddy and I was".

I want to show it as HTML OR strip the HTML tags in my XSL template. Both solutions will work for me and I will choose the quicker solution.

I have read several posts online but cannot find a solution. I have also tried disable-output-escaping with no success. Basically, the problem seems to be that somewhere in the XSL execution the engine is changing this &lt;p&gt; into this: &amp;lt;p&amp;gt;.

It is converting the & into &amp;. If it helps, here is my XSL code. I have tried several combinations with and without the output tag on the top.

Any help will be appreciated. Thanks in advance.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" omit-xml-declaration="yes"/>

  <xsl:template match="DocumentElement">
    <div>
      <xsl:attribute name="id">mySlides</xsl:attribute>
      <xsl:apply-templates>
        <xsl:with-param name="templatenumber" select="0"/>
      </xsl:apply-templates>
    </div>

    <div>
      <xsl:attribute name="id">myController</xsl:attribute>
      <xsl:apply-templates>
        <xsl:with-param name="templatenumber" select="1"/>
      </xsl:apply-templates>
    </div>
  </xsl:template>

  <xsl:template match="DocumentElement/QueryResults">
    <xsl:param name="templatenumber">tobereplace</xsl:param>

    <xsl:if test="$templatenumber=0">
      <div>
        <xsl:attribute name="id">myController</xsl:attribute>
        <div>
          <xsl:attribute name="class">article</xsl:attribute>
          <h2>
            <a>
              <xsl:attribute name="class">title</xsl:attribute>
              <xsl:attribute name="title"><xsl:value-of select="Title"/></xsl:attribute>
              <xsl:attribute name="href">/stories/stories-details/articletype/articleview/articleid/<xsl:value-of select="ArticleId"/>/<xsl:value-of select="SEOTitle"/>.aspx</xsl:attribute>
              <xsl:value-of select="Title"/>
            </a>
          </h2>
          <div>
            <xsl:attribute name="style">text-indent: 25px;</xsl:attribute>
            <xsl:attribute name="class">articlesummary</xsl:attribute>
            <xsl:call-template name="removeHtmlTags">
              <xsl:with-param name="html" select="Summary" />
            </xsl:call-template>
          </div>
        </div>
      </div>
    </xsl:if>
    <xsl:if test="$templatenumber=1">
      <div>
        <xsl:attribute name="id">myController</xsl:attribute>
        <span>
          <xsl:attribute name="class">jFlowControl</xsl:attribute>
          aa
        </span>
      </div>
    </xsl:if>
  </xsl:template>

  <xsl:template name="removeHtmlTags">
    <xsl:param name="html"/>
    <xsl:choose>
      <xsl:when test="contains($html, '&lt;')">
        <xsl:value-of select="substring-before($html, '&lt;')"/>
        <!-- Recurse through HTML -->
        <xsl:call-template name="removeHtmlTags">
          <xsl:with-param name="html" select="substring-after($html, '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$html"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>

Answer by Tomalak

Based on the assumption that you have this HTML string,

<p>My name is Freddy &amp; I was

then if you escape it and store it in a database it would become this:

&lt;p&gt;My name is Freddy &amp;amp; I was

Consequently, if you retrieve it as XML (without unescaping it beforehand), the result would be this:

&amp;lt;p&amp;gt;My name is Freddy &amp;amp;amp; I was

and <xsl:value-of select="." disable-output-escaping="yes" /> would produce:

&lt;p&gt;My name is Freddy &amp;amp; I was

You are getting exactly the same thing you have in your database, but of course you see the HTML tags in the output. So what you need is a mechanism that does the following string replacements:

  • "&amp;lt;" with "&lt;" (effectively changing &lt; to < in unescaped output)
  • "&amp;gt;" with "&gt;" (effectively changing &gt; to > in unescaped output)
  • "&amp;quot;" with "&quot;" (effectively changing &quot; to " in unescaped output)
  • "&amp;amp;" with "&amp;" (effectively changing &amp; to & in unescaped output)
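
The escaping chain described above can be reproduced outside XSLT. The following is only a minimal Python sketch, assuming that the standard library's html.escape and html.unescape behave like one level of XML escaping for these characters; the variable names are made up for illustration and are not part of the original answer.

```python
from html import escape, unescape

# The HTML string assumed at the start of the answer.
original = "<p>My name is Freddy &amp; I was"

# Escaping it once gives what is stored in the database.
stored = escape(original, quote=False)

# Serializing that value as XML text escapes it a second time.
serialized = escape(stored, quote=False)

# The XML parser undoes one level, and disable-output-escaping="yes"
# writes the parsed value as-is, so the output equals the stored string.
emitted = unescape(serialized)

# One more level of unescaping is what brings the real markup back.
restored = unescape(emitted)

print(stored)
print(serialized)
print(emitted == stored, restored == original)
```

This is why disable-output-escaping alone is not enough here: it removes one level of escaping, while the data carries two.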

From your XSL I have inferred the following test input XML:

<DocumentElement>
  <QueryResults>
    <Title>Article 1</Title>
    <ArticleId>1</ArticleId>
    <SEOTitle>Article_1</SEOTitle>
    <Summary>&amp;lt;p&amp;gt;Article 1 summary &amp;amp;amp; description.&amp;lt;/p&amp;gt;</Summary>
  </QueryResults>
  <QueryResults>
    <Title>Article 2</Title>
    <ArticleId>2</ArticleId>
    <SEOTitle>Article_2</SEOTitle>
    <Summary>&amp;lt;p&amp;gt;Article 2 summary &amp;amp;amp; description.&amp;lt;/p&amp;gt;</Summary>
  </QueryResults>
</DocumentElement>

I have changed the stylesheet you supplied and implemented such a replacement mechanism. If you apply the following XSLT 1.0 template to it:

<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="my:namespace"
  exclude-result-prefixes="my"
>

  <xsl:output method="html" omit-xml-declaration="yes"/>

  <my:unescape>
    <my:char literal="&lt;" escaped="&amp;lt;" />
    <my:char literal="&gt;" escaped="&amp;gt;" />
    <my:char literal="&quot;" escaped="&amp;quot;" />
    <my:char literal="&amp;" escaped="&amp;amp;" />
  </my:unescape>

  <xsl:template match="DocumentElement">
    <div id="mySlides">
      <xsl:apply-templates mode="slides" />
    </div>
    <div id="myController">
      <xsl:apply-templates mode="controller" />
    </div>
  </xsl:template>

  <xsl:template match="DocumentElement/QueryResults" mode="slides">
    <div class="article">
      <h2>
        <a class="title" title="{Title}" href="{concat('/stories/stories-details/articletype/articleview/articleid/', ArticleId, '/', SEOTitle, '.aspx')}">
          <xsl:value-of select="Title"/>
        </a>
      </h2>
      <div class="articlesummary" style="text-indent: 25px;">
        <xsl:apply-templates select="document('')/*/my:unescape/my:char[1]">
          <xsl:with-param name="html" select="Summary" />
        </xsl:apply-templates>
      </div>
    </div>
  </xsl:template>

  <xsl:template match="DocumentElement/QueryResults" mode="controller">
    <span class="jFlowControl">
      <xsl:text>aa </xsl:text>
      <xsl:value-of select="Title" />
    </span>
  </xsl:template>

  <xsl:template match="my:char">
    <xsl:param name="html" />
    <xsl:variable name="intermediate">
      <xsl:choose>
        <xsl:when test="following-sibling::my:char">
          <xsl:apply-templates select="following-sibling::my:char[1]">
            <xsl:with-param name="html" select="$html" />
          </xsl:apply-templates>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$html" disable-output-escaping="yes" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <xsl:call-template name="unescape">
      <xsl:with-param name="html" select="$intermediate" />
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="unescape">
    <xsl:param name="html" />
    <xsl:choose>
      <xsl:when test="contains($html, @escaped)">
        <xsl:value-of select="substring-before($html, @escaped)" disable-output-escaping="yes"/>
        <xsl:value-of select="@literal" disable-output-escaping="yes" />
        <xsl:call-template name="unescape">
          <xsl:with-param name="html" select="substring-after($html, @escaped)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$html" disable-output-escaping="yes"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

Then this output HTML is produced:

<div id="mySlides">
  <div class="article">
    <h2>
      <a class="title" title="Article 1" href="/stories/stories-details/articletype/articleview/articleid/1/Article_1.aspx">Article 1</a>
    </h2>
    <div class="articlesummary" style="text-indent: 25px;">
      <p>Article 1 summary &amp; description.</p>
    </div>
  </div>
  <div class="article">
    <h2>
      <a class="title" title="Article 2" href="/stories/stories-details/articletype/articleview/articleid/2/Article_2.aspx">Article 2</a>
    </h2>
    <div class="articlesummary" style="text-indent: 25px;">
      <p>Article 2 summary &amp; description.</p>
    </div>
  </div>
</div>
<div id="myController">
  <span class="jFlowControl">aa Article 1</span>
  <span class="jFlowControl">aa Article 2</span>
</div>

Note

  • the use of a temporary namespace and embedded elements (<my:unescape>) to create a list of characters to replace
  • the use of recursion to emulate an iterative replacement of all affected characters in the input
  • the use of the implicit context within the unescape template to carry the information about which character is being replaced at the moment
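
For readers less familiar with XSLT, the table-plus-recursion idea from these notes can be sketched in Python. This is a rough illustration, not a port of the stylesheet: the names UNESCAPE_TABLE, replace_all and unescape_once are hypothetical, the table holds the values as an XSLT processor would see them after parsing the attributes, and the ampersand entry is deliberately applied last here so that earlier passes cannot unescape the same text twice.

```python
# (escaped, literal) pairs, as the parsed attribute values of <my:char>.
UNESCAPE_TABLE = [
    ("&lt;", "<"),
    ("&gt;", ">"),
    ("&quot;", '"'),
    ("&amp;", "&"),  # applied last, see the note above
]


def replace_all(text, escaped, literal):
    """Recursively replace every occurrence of `escaped`, in the style of
    substring-before / substring-after in the named template."""
    if escaped not in text:
        return text
    before, _, after = text.partition(escaped)
    return before + literal + replace_all(after, escaped, literal)


def unescape_once(text, table=UNESCAPE_TABLE):
    """Apply each table entry in turn, removing one level of escaping."""
    for escaped, literal in table:
        text = replace_all(text, escaped, literal)
    return text


# The Summary value as seen after the XML parser has read the test input.
parsed_summary = "&lt;p&gt;Article 1 summary &amp;amp; description.&lt;/p&gt;"
print(unescape_once(parsed_summary))
```

Because replace_all never rescans the text it has already emitted, "&amp;amp;" becomes "&amp;" and stops there, which is the behaviour the expected output relies on.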

Furthermore note:

  • the use of template modes to get different output for the same input (this replaces your templatenumber parameter)
  • most of the time there is no need for <xsl:attribute> elements. They can safely be replaced by inline notation (attributename="{attributevalue}")
  • the use of the concat() function to create the URL

Generally speaking, it is a bad idea to store escaped HTML in a database (more generally speaking: It is a bad idea to store HTML in a database.). You set yourself up to get all kinds of problems, this being one of them. If you can't change this setup, I hope that the solution helps you.

I cannot guarantee that it does the right thing in all situations, and it may open up security holes (think XSS), but dealing with this was not part of the question. In any case, consider yourself warned.

I need a break now. ;-)

Answer by David

You shouldn't store escaped HTML in your database. If your database contained the actual "<" character, then the "disable-output-escaping" command would do what you wanted.

If you can't change the data then you'll have to unescape the data before you perform the transform.
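
One possible way to do that pre-processing step, sketched in Python under some assumptions: the input looks like the test XML inferred in the earlier answer, the escaped HTML lives in Summary elements, and html.unescape is acceptable even though it decodes every HTML entity rather than only the four discussed here.

```python
import xml.etree.ElementTree as ET
from html import unescape

# Shortened version of the inferred test input (an assumption, not real data).
source = """<DocumentElement>
  <QueryResults>
    <Title>Article 1</Title>
    <Summary>&amp;lt;p&amp;gt;Article 1 summary &amp;amp;amp; description.&amp;lt;/p&amp;gt;</Summary>
  </QueryResults>
</DocumentElement>"""

root = ET.fromstring(source)

# Remove one level of escaping from every Summary before the transform runs.
for summary in root.iter("Summary"):
    summary.text = unescape(summary.text or "")

# Re-serialize; this string is what would be handed to the XSLT processor.
cleaned = ET.tostring(root, encoding="unicode")
print(cleaned)
```

After this step the Summary text holds the actual "<" character, so a plain value-of with disable-output-escaping="yes" should be enough in the stylesheet.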

Answer by Tim Zhou

Add this line to your stylesheet

<xsl:output method="html" indent="yes" version="4.0"/>

Answer by Shiggity

It is a bad idea to store HTML in a database

What? How are you supposed to store it then? In an XML doc, so you have to use XSLT anyway? As web developers, we've always used SQL databases to store user-defined HTML data. There's nothing wrong with that method as long as it is sanitized properly for your purposes.