Scala - Modifying nested elements in XML

Question

Asked by ed.

I'm learning Scala, and I'm looking to update a nested node in some XML. I've got something working, but I'm wondering if it's the most elegant way.

I have some xml:

val InputXml : Node =
<root>
    <subnode>
        <version>1</version>
    </subnode>
    <contents>
        <version>1</version>
    </contents>
</root>

And I want to update the version node in subnode, but not the one in contents.

Here is my function:

def updateVersion( node : Node ) : Node = 
 {
   def updateElements( seq : Seq[Node]) : Seq[Node] = 
   {
        var subElements = for( subNode <- seq ) yield
        {
            updateVersion( subNode )
        }   
        subElements
   }

   node match
   {
     case <root>{ ch @ _* }</root> =>
     {
        <root>{ updateElements( ch ) }</root>
     }
     case <subnode>{ ch @ _* }</subnode> =>
     {
         <subnode>{ updateElements( ch ) }</subnode> 
     }
     case <version>{ contents }</version> =>
     {
        <version>2</version>
     }
     case other @ _ => 
     {
         other
     }
   }
 }

Is there a more succinct way of writing this function?

Answer 1

Accepted answer by GClaramunt

I think the original logic is good. This is the same code with (shall I dare to say?) a more Scala-ish flavor:

def updateVersion( node : Node ) : Node = {
   def updateElements( seq : Seq[Node]) : Seq[Node] = 
     for( subNode <- seq ) yield updateVersion( subNode )  

   node match {
     case <root>{ ch @ _* }</root> => <root>{ updateElements( ch ) }</root>
     case <subnode>{ ch @ _* }</subnode> => <subnode>{ updateElements( ch ) }</subnode>
     case <version>{ contents }</version> => <version>2</version>
     case other @ _ => other
   }
 }

It looks more compact (but is actually the same :) )

I got rid of all the unnecessary brackets
If a bracket is needed, it starts in the same line
updateElements just defines a var and returns it, so I got rid of that and returned the result directly

If you want, you can get rid of updateElements too. You want to apply updateVersion to all the elements of the sequence. That's what the map method does. With that, you can rewrite the line

case <subnode>{ ch @ _* }</subnode> => <subnode>{ updateElements( ch ) }</subnode>

with

case <subnode>{ ch @ _* }</subnode> => <subnode>{ ch.map(updateVersion (_)) }</subnode>

As updateVersion takes only one parameter, I'm 99% sure you can omit the placeholder and write:

case <subnode>{ ch @ _* }</subnode> => <subnode>{ ch.map(updateVersion) }</subnode>

And end with:

def updateVersion( node : Node ) : Node = node match {
         case <root>{ ch @ _* }</root> => <root>{ ch.map(updateVersion )}</root>
         case <subnode>{ ch @ _* }</subnode> => <subnode>{ ch.map(updateVersion ) }</subnode>
         case <version>{ contents }</version> => <version>2</version>
         case other @ _ => other
       }
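
To make the three spellings discussed above concrete, here is a small sketch showing that for/yield, map with a placeholder, and map with a bare method name produce the same result. The helper bump and the sample children are made up for illustration, and the scala-cli dependency line (including the version number) is an assumption about how you would run it.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.{Elem, Node, Text}

// Hypothetical helper: replaces the content of any <version> element with "2".
def bump(n: Node): Node = n match {
  case e: Elem if e.label == "version" => e.copy(child = Seq(Text("2")))
  case other                           => other
}

// Sample children, standing in for the `ch` bound by the patterns above.
val children: Seq[Node] = Seq(<version>1</version>, <name>x</name>)

// Three equivalent ways of applying bump to every element.
val viaFor         = for (c <- children) yield bump(c)
val viaPlaceholder = children.map(bump(_))
val viaMethod      = children.map(bump)
```

The last form works because the compiler turns the method name into a function value when a function is expected.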

What do you think?

Answer 2

Answer by Daniel C. Sobral

All this time, and no one actually gave the most appropriate answer! Now that I have learned of it, though, here's my new take on it:

import scala.xml._
import scala.xml.transform._

object t1 extends RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case Elem(prefix, "version", attribs, scope, _*)  =>
      Elem(prefix, "version", attribs, scope, Text("2"))
    case other => other
  }
}

object rt1 extends RuleTransformer(t1)

object t2 extends RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case sn @ Elem(_, "subnode", _, _, _*) => rt1(sn)
    case other => other
  }
}

object rt2 extends RuleTransformer(t2)

rt2(InputXml)

Now, for a few explanations. The class RewriteRule is abstract. It defines two methods, both called transform. One of them takes a single Node, the other a Sequence of Node. It's an abstract class, so we can't instantiate it directly. By adding a definition, in this case overriding one of the transform methods, we create a concrete subclass of it (here, a singleton object). Each RewriteRule need only concern itself with a single task, though it can do many.

Next, class RuleTransformer takes as parameters a variable number of RewriteRule instances. Its transform method takes a Node and returns a Sequence of Node, by applying each and every RewriteRule used to instantiate it.

Both classes derive from BasicTransformer, which defines a few methods with which one need not concern oneself at a higher level. Its apply method calls transform, though, so both RuleTransformer and RewriteRule can use the syntactic sugar associated with it. In the example, the former does and the latter does not.

Here we use two levels of RuleTransformer, as the first applies a filter to higher-level nodes, and the second applies the change to whatever passes the filter.

The extractor Elem is also used, so that there is no need to concern oneself with details such as namespace or whether there are attributes or not. Note that the content of the element version is completely discarded and replaced with 2. It can be matched against too, if needed.

Note also that the last parameter of the extractor is _*, and not _. That means these elements can have multiple children. If you forget the *, the match may fail. In the example, the match would not fail if there were no whitespace. Because whitespace is translated into Text elements, a single whitespace under subnode would cause the match to fail.
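
A minimal sketch of the _ versus _* difference described above. The whitespace is spelled out as explicit Text nodes so that the child count is unambiguous; the value and function names are made up for illustration, and the scala-cli dependency line is an assumption.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.{Elem, Node, Text}

// One child: just the <version> element.
val tight: Node = <subnode><version>1</version></subnode>

// Three children: whitespace Text, <version>, whitespace Text.
val spaced: Node = <subnode>{Text(" ")}<version>1</version>{Text(" ")}</subnode>

// Trailing `_` matches only when there is exactly one child.
def matchesSingleChild(n: Node): Boolean = n match {
  case Elem(_, "subnode", _, _, _) => true
  case _                           => false
}

// Trailing `_*` matches any number of children.
def matchesAnyChildren(n: Node): Boolean = n match {
  case Elem(_, "subnode", _, _, _*) => true
  case _                            => false
}
```

Here matchesSingleChild accepts tight but rejects spaced, while matchesAnyChildren accepts both.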

This code is bigger than the other suggestions presented, but it has the advantage of needing much less knowledge of the structure of the XML than the others. It changes any element called version that is below an element called subnode, no matter how many levels down, and regardless of namespaces, attributes, etc.

Furthermore... well, if you have many transformations to do, recursive pattern matching quickly becomes unwieldy. Using RewriteRule and RuleTransformer, you can effectively replace xslt files with Scala code.
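
For reference, here is a sketch of the same two-level idea written with Elem.copy instead of the Elem factory, which avoids depending on a particular factory signature. This is a variant under stated assumptions, not the answer's exact code: the names setVersion, inSubnode, transformer and input are invented here, and the scala-cli dependency line (including the version number) is an assumption.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.{Elem, Node, Text}
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Inner rule: replace the content of every <version> element with "2".
val setVersion: RewriteRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem if e.label == "version" => e.copy(child = Seq(Text("2")))
    case other                           => other
  }
}

// Outer rule: only hand <subnode> elements over to the inner transformer.
val inSubnode: RewriteRule = new RewriteRule {
  private val inner = new RuleTransformer(setVersion)
  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem if e.label == "subnode" => inner(e)
    case other                           => other
  }
}

val transformer = new RuleTransformer(inSubnode)

val input: Node =
  <root>
    <subnode>
      <version>1</version>
    </subnode>
    <contents>
      <version>1</version>
    </contents>
  </root>

val result: Node = transformer(input)
```

Only the version under subnode should change; the one under contents is never passed to the inner transformer.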

Answer 3

Answer by David Pollak

You can use Lift's CSS Selector Transforms and write:

"subnode" #> ("version *" #> 2)

See http://stable.simply.liftweb.net/#sec:CSS-Selector-Transforms

Answer 4

Answer by Daniel C. Sobral

I have since learned more and presented what I deem to be a superior solution in another answer. I have also fixed this one, as I noticed I was failing to account for the subnode restriction.

Thanks for the question! I just learned some cool stuff when dealing with XML. Here is what you want:

def updateVersion(node: Node): Node = {
  def updateNodes(ns: Seq[Node], mayChange: Boolean): Seq[Node] =
    for(subnode <- ns) yield subnode match {
      case <version>{ _ }</version> if mayChange => <version>2</version>
      case Elem(prefix, "subnode", attribs, scope, children @ _*) =>
        Elem(prefix, "subnode", attribs, scope, updateNodes(children, true) : _*)
      case Elem(prefix, label, attribs, scope, children @ _*) =>
        Elem(prefix, label, attribs, scope, updateNodes(children, mayChange) : _*)
      case other => other  // preserve text
    }

  updateNodes(node.theSeq, false)(0)
}

Now, explanation. The first and last case statements should be obvious. The last one exists to catch those parts of an XML document which are not elements. Or, in other words, text. Note in the first statement, though, the test against the flag to indicate whether version may be changed or not.

The second and third case statements use a pattern match against the object Elem. This breaks an element into all its component parts. The last parameter, "children @ _*", will match children to a list of anything. Or, more specifically, a Seq[Node]. Then we reconstruct the element with the parts we extracted, but pass the Seq[Node] to updateNodes, doing the recursion step. If we are matching against the element subnode, then we change the flag mayChange to true, enabling the change of the version.

In the last line, we use node.theSeq to generate a Seq[Node] from Node, and (0) to get the first element of the Seq[Node] returned as result. Since updateNodes is essentially a map function (for ... yield is translated into map), we know the result will only have one element. We pass a false flag to ensure that no version will be changed unless a subnode element is an ancestor.

There is a slightly different way of doing it, that's more powerful but a bit more verbose and obscure:

def updateVersion(node: Node): Node = {
  def updateNodes(ns: Seq[Node], mayChange: Boolean): Seq[Node] =
    for(subnode <- ns) yield subnode match {
      case Elem(prefix, "version", attribs, scope, Text(_)) if mayChange => 
        Elem(prefix, "version", attribs, scope, Text("2"))
      case Elem(prefix, "subnode", attribs, scope, children @ _*) =>
        Elem(prefix, "subnode", attribs, scope, updateNodes(children, true) : _*)
      case Elem(prefix, label, attribs, scope, children @ _*) =>
        Elem(prefix, label, attribs, scope, updateNodes(children, mayChange) : _*)
      case other => other  // preserve text
    }

  updateNodes(node.theSeq, false)(0)
}

This version allows you to change any "version" tag, whatever its prefix, attribs and scope.
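
The flag-threading recursion above can also be sketched with Elem.copy, which keeps the prefix, attributes and scope without naming them. This is a variant, not the answer's code: unlike the Text(_) pattern above, it replaces the content of version whatever its children are. The names bump, updateVersionDeep, sample and out are invented for illustration, and the scala-cli dependency line is an assumption.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.{Elem, Node, Text}

// mayChange becomes true once a <subnode> ancestor has been seen.
def bump(ns: collection.Seq[Node], mayChange: Boolean): Seq[Node] =
  ns.map {
    case e: Elem if e.label == "version" && mayChange =>
      e.copy(child = Seq(Text("2")))
    case e: Elem =>
      e.copy(child = bump(e.child, mayChange || e.label == "subnode"))
    case other => other // preserve text and other non-element nodes
  }.toSeq

// theSeq wraps the single node in a sequence; map keeps its length at one.
def updateVersionDeep(node: Node): Node =
  bump(node.theSeq, mayChange = false).head

// The extra <inner> level shows that depth below <subnode> does not matter.
val sample: Node =
  <root><subnode><inner><version>1</version></inner></subnode><contents><version>1</version></contents></root>

val out: Node = updateVersionDeep(sample)
```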

Answer 5

Answer by Chris

Scales Xml provides tools for "in place" edits. Of course, it's all immutable, but here's the solution in Scales:

val subnodes = top(xml).\*("subnode"l).\*("version"l)
val folded = foldPositions( subnodes )( p => 
  Replace( p.tree ~> "2"))

The XPath-like syntax is a Scales signature feature; the l after the string specifies that it should have no namespace (local name only).

foldPositions iterates over the resulting elements and transforms them, joining the results back together.

Answer 6

Answer by nafg

One approach would be lenses (e.g. scalaz's). See http://arosien.github.io/scalaz-base-talk-201208/#slide35 for a very clear presentation.

Answer 7

Answer by Germán

I really don't know how this could be done elegantly. FWIW, I would go for a different approach: use a custom model class for the info you're handling, and have conversion to and from XML for it. You're probably going to find it's a better way to handle the data, and it's even more succinct.
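
A rough sketch of what such a model class might look like for the sample document. Everything here is hypothetical: the class name Versions, its fields, and the fromXml/toXml helpers are invented for illustration, and the scala-cli dependency line is an assumption.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.2.0"
import scala.xml.Node

// Hypothetical model holding just the two version numbers.
final case class Versions(subnode: Int, contents: Int) {
  // Serialize back to the document shape used in the question.
  def toXml: Node =
    <root><subnode><version>{subnode}</version></subnode><contents><version>{contents}</version></contents></root>
}

object Versions {
  // Read the two version numbers out of a document of that shape.
  def fromXml(root: Node): Versions =
    Versions(
      (root \ "subnode" \ "version").text.trim.toInt,
      (root \ "contents" \ "version").text.trim.toInt
    )
}

val original: Node =
  <root><subnode><version>1</version></subnode><contents><version>1</version></contents></root>

// The update itself is an ordinary case-class copy.
val updated: Node = Versions.fromXml(original).copy(subnode = 2).toXml
```

The trade-off is that the model must know the document's full shape, so anything not captured in the class is lost on the round trip.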

However, if there is a nice way to do it with XML directly, I'd like to see it.
