Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute the content to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/25630633/

Date: 2020-09-18 11:17:02  Source: igfitidea

Merging yaml config files recursively with bash

Tags: bash, recursion, yaml

Asked by twicejr

Is it possible, using some smart piping and coding, to merge YAML files recursively? In PHP, I build an array from them (each module can add or update config nodes in the system).

The goal is an export shell script that merges the config files from all the separate module folders into big merged files. That is faster and more efficient, and the customer does not need the modularity when we deploy new versions via FTP, for example.

It should behave like the PHP function: array_merge_recursive

The filesystem structure is like this:

mod/a/config/sys.yml
mod/a/config/another.yml
mod/b/config/sys.yml
mod/b/config/another.yml
mod/c/config/totally-new.yml
sys/config/sys.yml

Config looks like:

date:
   format:
      date_regular: "%d-%m-%Y"   # quoted: a plain YAML scalar cannot start with %

And a module may, say, do this:

date:
   format:
      date_regular: regular dates are boring
      date_special: "!!!%d-%m-%Y!!!"   # quoted: a leading ! would be read as a tag

So far, I have:

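#!/bin/bash
#........
# Quote the variables so paths containing spaces survive word splitting.
cp -R "$dir_project/" "$dir_to/"
for i in "$dir_project"/mod/*/
do
    # Copies each module's tree over sys/, overwriting same-named files.
    cp -R "${i}/." "$dir_to/sys/"
done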

This of course destroys all existing config files in the loop.. (rest of the system files are uniquely named)
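
One way to avoid the overwriting is to collect same-named config files first and hand each group to a merge tool instead of copying them over each other. The following is only a minimal sketch under stated assumptions: collect_configs is a hypothetical helper, it needs bash 4+ for associative arrays, it assumes the directory layout shown above, and it assumes paths without whitespace. merge_tool in the usage comment is a placeholder for whichever YAML merger you end up choosing.

```shell
#!/usr/bin/env bash
# Sketch only: group same-named config files so a merge tool can process
# each group. Assumes bash 4+ and file paths without spaces or newlines.

# collect_configs PROJECT_DIR   (hypothetical helper)
# Prints one line per config name: "<name> <sys file> <module files...>"
collect_configs() {
    local project=$1 file name
    local -A groups=()
    # Base config first, so module files listed later can override it.
    for file in "$project"/sys/config/*.yml "$project"/mod/*/config/*.yml; do
        [ -e "$file" ] || continue        # skip glob patterns that matched nothing
        name=${file##*/}                  # strip the directory part
        groups[$name]+="${groups[$name]:+ }$file"
    done
    for name in "${!groups[@]}"; do
        printf '%s %s\n' "$name" "${groups[$name]}"
    done | sort
}

# Hypothetical usage; "merge_tool" stands for any YAML merger:
# collect_configs "$dir_project" | while read -r name files; do
#     merge_tool $files > "$dir_to/sys/config/$name"
# done
```

Listing sys/config first and the modules afterwards matters only if the chosen merger lets later files win, which is an assumption to verify for your tool.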

Basically, I need a YAML parser for the command line and an array_merge_recursive-like function, then a YAML writer to output the merged result. I fear I have to start learning Python because bash won't cut it on this one.

Answered by jm666

You can use Perl, for example. The following one-liner:

perl -MYAML::Merge::Simple=merge_files -MYAML -E 'say Dump merge_files(@ARGV)' f1.yaml f2.yaml

for the following input files: f1.yaml

date:
  epoch: 2342342343
  format:
    date_regular: "%d-%m-%Y"

f2.yaml

date:
  format:
    date_regular: regular dates are boring
    date_special: "!!!%d-%m-%Y!!!"

prints the merged result...

---
date:
  epoch: 2342342343
  format:
    date_regular: regular dates are boring
    date_special: '!!!%d-%m-%Y!!!'

Because @Caleb pointed out that the module is now developer-only, here is a replacement. It is a bit longer and uses two (but commonly available) modules:

perl -MYAML=LoadFile,Dump -MHash::Merge::Simple=merge -E 'say Dump(merge(map{LoadFile($_)}@ARGV))' f1.yaml f2.yaml

This produces the same output as above.

Answered by Charles Duffy

No.

Bash has no support for nested data structures (its maps are integer->string or string->string only), and thus cannot represent arbitrary YAML documents in-memory.

Use a more powerful language for this task.
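
To illustrate the limitation described above: a bash associative array can only imitate nesting by flattening every leaf into a key path. The sketch below assumes bash 4+ and a hypothetical dotted-key convention (date.format.date_regular). It only covers scalar leaves, it cannot represent sequences, and producing the flattened form from real YAML would still require a proper parser, so it supports rather than contradicts the advice to use another language.

```shell
#!/usr/bin/env bash
# Sketch: bash 4+ associative arrays map string -> string only, so nesting
# has to be faked with flattened key paths (a hypothetical convention).

declare -A base=(
    [date.epoch]=2342342343
    [date.format.date_regular]='%d-%m-%Y'
)
declare -A override=(
    [date.format.date_regular]='regular dates are boring'
    [date.format.date_special]='!!!%d-%m-%Y!!!'
)

# Merging flattened maps is a plain key-by-key overwrite: later values win.
declare -A merged=()
for key in "${!base[@]}"; do merged[$key]=${base[$key]}; done
for key in "${!override[@]}"; do merged[$key]=${override[$key]}; done

# Print the merged map in a stable order.
for key in "${!merged[@]}"; do
    printf '%s=%s\n' "$key" "${merged[$key]}"
done | sort
```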

Answered by stuart

Late to the party, but I also wrote a tool for this:

https://github.com/benprofessionaledition/yamlmerge

It's almost identical to Ondra's JVM tool (they're even both called "yaml merge"), the key difference being that it's written in Go so it compiles to a ~3MB binary with no external dependencies. We use it in Gitlab-CI containers.

Answered by Stefan Frye

I recommend yq -m. yq is a Swiss Army knife for YAML, very similar to jq (for JSON).

Answered by Caleb

Bash is a bit of a stretch for this (it could be done, but it would be error-prone). If all you want to do is call a few things from a bash shell (as opposed to actually scripting the merge using bash functions), then you have a few options.

I noticed there is a Java-based yaml-merge tool, but that didn't suit my fancy very much, so I kept looking. In the end I cobbled something together using two tools: yaml2json and jq.

Warning: Since JSON's capabilities are only a subset of YAML's, this is not a lossless process for complex YAML structures. It will work for a lot of simple key/value/sequence scenarios but will muck things up if your input YAML is too fancy. Test it on your data types to see if it does what you expect.

  1. Use yaml2json to convert your inputs to JSON:

    yaml2json input1.yml > input1.json
    yaml2json input2.yml > input2.json

  2. Use jq to iterate over the objects and merge them recursively (see this question and answers for details). jq's * operator merges objects recursively, whereas + would only merge the top-level keys. List files in reverse order of importance, as values in later ones will clobber earlier ones:

    jq -s 'reduce .[] as $item ({}; . * $item)' input1.json input2.json > merged.json
    
  3. Take it back to YAML:

    json2yaml merged.json > merged.yml
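
The three steps above can be chained in one bash function. This is a hedged sketch rather than a definitive implementation: merge_yaml_via_json is a hypothetical name, and it assumes that yaml2json, jq and json2yaml are on your PATH and behave as in the commands above (each takes a file argument and writes to stdout). It uses jq's * operator, which merges objects recursively, so later files win on conflicting keys.

```shell
#!/usr/bin/env bash
# merge_yaml_via_json OUTPUT INPUT...   (hypothetical helper)
# Sketch: convert each input to JSON, deep-merge with jq, convert back.
# Assumes yaml2json, jq and json2yaml are installed and on PATH.
merge_yaml_via_json() {
    local output=$1 input json tmpdir status=0
    shift
    [ "$#" -ge 1 ] || { echo "usage: merge_yaml_via_json OUTPUT INPUT..." >&2; return 1; }
    tmpdir=$(mktemp -d) || return 1
    local -a json_files=()
    for input in "$@"; do
        # Number the temp files so same-named inputs from different
        # module folders do not collide.
        json="$tmpdir/${#json_files[@]}.json"
        yaml2json "$input" > "$json" || { status=1; break; }
        json_files+=("$json")
    done
    if [ "$status" -eq 0 ]; then
        # "*" merges objects recursively ("+" would only merge top-level keys).
        jq -s 'reduce .[] as $item ({}; . * $item)' "${json_files[@]}" > "$tmpdir/merged.json" &&
            json2yaml "$tmpdir/merged.json" > "$output" || status=1
    fi
    rm -rf "$tmpdir"
    return "$status"
}

# Example call, least important file first:
# merge_yaml_via_json merged.yml input1.yml input2.yml
```

The same lossiness warning applies: anything YAML can express that JSON cannot is dropped on the way through.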
    

If you want to script this, of course the usual bash mechanisms are your friend. And if you happen to be in GNU-Make like I was, something like this will do the trick:

# Process substitution <(...) needs bash rather than the default /bin/sh.
SHELL := /bin/bash
.SECONDEXPANSION:
merged.yml: input1.yml input2.yml
	json2yaml <(jq -s 'reduce .[] as $$item ({}; . * $$item)' $(foreach YAML,$^,<(yaml2json $(YAML)))) > $@

Answered by Ondra Žižka

There is a tool that merges YAML files: merge-yaml. It supports the full YAML syntax and is capable of expanding environment variable references.

I forked it and released it as an executable .jar:
https://github.com/OndraZizka/yaml-merge

Usage:

./bin/yaml-merge.sh ./*.yml > result.yml

It is written in Java so you need Java (I think 8 and newer) installed.
(Btw, if someone wants to contribute, that would be great.)

In general, merging YAML is not a trivial thing, in the sense that the tool doesn't always know what you really want to do. You can merge structures in multiple ways. Consider this example:

foo:
   bar: bar2
   baz: 
      - baz1
---
foo:
   bar: bar1
   baz: 
      - baz2
   goo: gaz1

A few questions / unknowns arise:

  • Should the 2nd foo tree replace the first one?
  • Should the 2nd bar replace the first one, or should the two be merged into an array?
  • Should the 2nd baz array replace the 1st, or be merged with it?
    • If merged, then how: should duplicates be allowed, or should the tool keep the values unique? Should the order be managed in some way?

Etc. One may object that there can be some default, but often, the real world requirements need different operations.
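
To make the choice concrete, here is a small bash sketch of three possible policies for combining two sequences such as the baz arrays above. Everything in it is an assumption for illustration: merge_lists is a hypothetical helper, the policy names replace, append and unique are made up, the extra item "shared" is added only to show de-duplication, and it needs bash 4.3+ for namerefs. It is not how any of the tools mentioned here actually behave.

```shell
#!/usr/bin/env bash
# merge_lists POLICY FIRST_ARRAY_NAME SECOND_ARRAY_NAME   (hypothetical helper)
# Prints the combined items, one per line. Items must be non-empty strings.
merge_lists() {
    local policy=$1
    local -n _first=$2 _second=$3     # namerefs need bash 4.3+
    local item
    local -A seen=()
    case $policy in
        replace)                      # second list wins outright
            printf '%s\n' "${_second[@]}" ;;
        append)                       # keep everything, duplicates included
            printf '%s\n' "${_first[@]}" "${_second[@]}" ;;
        unique)                       # keep first occurrence, preserve order
            for item in "${_first[@]}" "${_second[@]}"; do
                [ -n "${seen[$item]:-}" ] && continue
                seen[$item]=1
                printf '%s\n' "$item"
            done ;;
        *)
            echo "unknown policy: $policy" >&2
            return 1 ;;
    esac
}

baz_first=(baz1 shared)
baz_second=(baz2 shared)
merge_lists unique baz_first baz_second
```

Each policy gives a different result for the same inputs, which is why a general tool cannot pick one default that suits every case.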

Other tools and libraries that deal with data structures handle this by defining a schema with metadata; for instance, JAXB or Jackson use Java annotations.
For this general tool, that is not an option, so the user would have to control this through a) the input data, or b) parameters. a) is impractical and sometimes impossible; b) is tedious and needs a fancy syntax like jq has.

That said, Caleb's answer might be what you need. However, that solution reduces your data to what JSON is capable of, so you will lose comments, the various ways to represent long strings, usage of JSON within YAML, etc., which is not too user-friendly.
