Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/24090177/

Time: 2020-09-11 02:09:46  Source: igfitidea

How to merge YAML arrays?

Tags: list, data-structures, yaml

Asked by lfender6445

I would like to merge arrays in YAML and load them via Ruby:

some_stuff: &some_stuff
 - a
 - b
 - c

combined_stuff:
  <<: *some_stuff
  - d
  - e
  - f

I'd like to have the combined array as [a,b,c,d,e,f]

I receive the error: did not find expected key while parsing a block mapping

How do I merge arrays in YAML?

Answered by Jorge Leitao

If the aim is to run a sequence of shell commands, you may be able to achieve this as follows:

# note: no dash before commands
some_stuff: &some_stuff |-
    a
    b
    c

combined_stuff:
  - *some_stuff
  - d
  - e
  - f

This is equivalent to:

some_stuff: "a\nb\nc"

combined_stuff:
  - "a\nb\nc"
  - d
  - e
  - f

I have been using this in my gitlab-ci.yml (to answer @rink.attendant.6's comment on the question).

A working example that we use to support a requirements.txt that references private repos on GitLab:

.pip_git: &pip_git
- git config --global url."https://gitlab-ci-token:${CI_JOB_TOKEN}@gitlab.com".insteadOf "ssh://git@gitlab.com"
- mkdir -p ~/.ssh
- chmod 700 ~/.ssh
- echo "$SSH_KNOWN_HOSTS" > ~/.ssh/known_hosts
- chmod 644 ~/.ssh/known_hosts

test:
    image: python:3.7.3
    stage: test
    script:
        - *pip_git
        - pip install -q -r requirements_test.txt
        - python -m unittest discover tests

The same `*pip_git` alias can be reused elsewhere, e.g. in a job that builds an image.

where requirements_test.txt contains e.g.

-e git+ssh://[email protected]/example/[email protected]#egg=example

Answered by dreftymac

Update: 2019-07-01 14:06:12

  • Note: another answer to this question was substantially edited with an update on alternative approaches.
    • That updated answer mentions an alternative to the workaround in this answer. It has been added to the "See also" section below.

Context

This post assumes the following context:

  • python 2.7
  • python YAML parser

Problem

lfender6445 wishes to merge two or more lists within a YAML file, and have those merged lists appear as one single list when parsed.

Solution (Workaround)

This can be achieved by assigning YAML anchors to the lists, each of which appears as the value of a key in a mapping, and then referencing them through aliases. There are caveats, however (see "Pitfalls" below).

In the example below we have three mappings (list_one, list_two, list_three) and three anchors and aliases that refer to these mappings where appropriate.

When the YAML file is loaded in the program we get the list we want, but it may require a little modification after load (see pitfalls below).

Example

Original YAML file

  list_one: &id001
   - a
   - b
   - c

  list_two: &id002
   - e
   - f
   - g

  list_three: &id003
   - h
   - i
   - j

  list_combined:
      - *id001
      - *id002
      - *id003

Result after YAML.safe_load

## list_combined
  [
    [
      "a",
      "b",
      "c"
    ],
    [
      "e",
      "f",
      "g"
    ],
    [
      "h",
      "i",
      "j"
    ]
  ]

Pitfalls

  • this approach produces a nested list of lists, which may not be the exact desired output, but this can be post-processed using the flatten method
  • the usual caveats to YAML anchors and aliases apply for uniqueness and declaration order
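
The post-processing step mentioned in the pitfalls could look like the following in Python. This is only a minimal sketch: it assumes the loaded value of list_combined is a list whose items are all lists, and the helper name flatten_once is hypothetical, not part of any YAML library.

```python
from itertools import chain

# Hypothetical stand-in for the value of list_combined after loading the YAML above.
list_combined = [["a", "b", "c"], ["e", "f", "g"], ["h", "i", "j"]]


def flatten_once(nested):
    """Flatten exactly one level of nesting; every item is assumed to be a list."""
    return list(chain.from_iterable(nested))


flat = flatten_once(list_combined)
print(flat)
```

Flattening only one level is a deliberate choice here, so that lists nested deeper inside the items are left untouched.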

Conclusion

This approach allows creation of merged lists by use of the alias and anchor feature of YAML.

Although the output result is a nested list of lists, this can be easily transformed using the flatten method.

See also

Updated alternative approach by @Anthon

Examples of the flatten method

Answered by Anthon

This is not going to work:

  1. merge is only supported by the YAML specifications for mappings and not for sequences

  2. you are completely mixing things by having a merge key << followed by the key/value separator : and a value that is a reference, and then continuing with a list at the same indentation level

This is not correct YAML:

combine_stuff:
  x: 1
  - a
  - b

So your example syntax would not even make sense as a YAML extension proposal.

If you want to do something like merging multiple arrays you might want to consider a syntax like:

combined_stuff:
  - <<: *s1, *s2
  - <<: *s3
  - d
  - e
  - f

where s1, s2, s3 are anchors on sequences (not shown) that you want to merge into a new sequence and then have d, e and f appended to that. But YAML resolves these kinds of structures depth first, so there is no real context available during the processing of the merge key. There is no array/list available to which you could attach the processed value (the anchored sequence).

You can take the approach as proposed by @dreftymac, but this has the huge disadvantage that you somehow need to know which nested sequences to flatten (i.e. by knowing the "path" from the root of the loaded data structure to the parent sequence), or that you recursively walk the loaded data structure searching for nested arrays/lists and indiscriminately flatten all of them.

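
To make the first of those two options concrete, here is a rough Python sketch of flattening only the sequence found at a known "path". The helper flatten_at, the loaded dictionary and its keys are all hypothetical names chosen for illustration, and the path is assumed to be non-empty and to point at a list.

```python
def flatten_at(data, path):
    """Flatten one level of nesting in the list located at `path` inside `data`."""
    parent = data
    for key in path[:-1]:
        parent = parent[key]
    flat = []
    for item in parent[path[-1]]:
        if isinstance(item, list):
            flat.extend(item)   # splice nested lists into the parent sequence
        else:
            flat.append(item)   # keep scalars as they are
    parent[path[-1]] = flat
    return data


# Hypothetical loaded data: 'combined' should be flattened, 'matrix' must stay nested.
loaded = {
    "config": {
        "combined": [["a", "b", "c"], "d", "e", "f"],
        "matrix": [[1, 2], [3, 4]],
    }
}

flatten_at(loaded, ["config", "combined"])
print(loaded["config"]["combined"])
print(loaded["config"]["matrix"])
```

The disadvantage described above is visible in the call: the caller has to know and maintain the path, whereas an indiscriminate recursive walk would also have flattened matrix.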

A better solution IMO would be to use tags to load data structures that do the flattening for you. This allows for clearly denoting what needs to be flattened and what not, and gives you full control over whether this flattening is done during loading or during access. Which one to choose is a matter of ease of implementation and efficiency in time and storage space. This is the same trade-off that needs to be made for implementing the merge key feature, and there is no single solution that is always the best.

E.g. my ruamel.yaml library uses the brute force merge-dicts during loading when using its safe-loader, which results in merged dictionaries that are normal Python dicts. This merging has to be done up-front, and duplicates data (space inefficient) but is fast in value lookup. When using the round-trip-loader, you want to be able to dump the merges unmerged, so they need to be kept separate. The dict-like data structure loaded as a result of round-trip-loading is space efficient but slower in access, as it needs to try and look up a key not found in the dict itself in the merges (and this is not cached, so it needs to be done every time). Of course such considerations are not very important for relatively small configuration files.
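
The trade-off can be illustrated with plain Python containers. The sketch below is only an analogy and is not how ruamel.yaml is implemented: a dict built up-front stands in for the eager merge, and collections.ChainMap stands in for a structure that keeps the merged mapping separate and consults it on each lookup. The keys and values are made up for the example.

```python
from collections import ChainMap

base = {"host": "localhost", "port": 5432}  # plays the role of the anchored mapping
own = {"port": 6543, "user": "app"}         # keys written next to the merge key

# Eager strategy: copy everything into one plain dict at load time.
eager = {**base, **own}

# Lazy strategy: keep both mappings separate and look in 'own' first on every access.
lazy = ChainMap(own, base)

print(eager["host"], lazy["host"])
print(eager["port"], lazy["port"])

# Only the lazy view still knows which keys were written locally.
print(lazy.maps[0])
```

Both views answer lookups identically, but only the second one retains enough information to write the mapping back out with the merge left unexpanded.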



The following implements a merge-like scheme for lists in Python, using objects with tag flatten which on the fly recurse into items that are lists and tagged toflatten. Using these two tags you can have a YAML file like:

l1: &x1 !toflatten
  - 1 
  - 2
l2: &x2
  - 3 
  - 4
m1: !flatten
  - *x1
  - *x2
  - [5, 6]
  - !toflatten [7, 8]

(the use of flow vs block style sequences is completely arbitrary and has no influence on the loaded result).

When iterating over the items that are the value for key m1, this "recurses" into the sequences tagged with toflatten, but displays other lists (aliased or not) as a single item.

One possible way with Python code to achieve that is:

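from pathlib import Path
import ruamel.yaml

yaml = ruamel.yaml.YAML()


@yaml.register_class
class Flatten(list):
    """Sequence tagged !flatten: iterating it expands any ToFlatten items in place."""
    yaml_tag = u'!flatten'

    def __init__(self, *args):
        # the items are kept on an attribute; the underlying list itself stays empty
        self.items = args

    @classmethod
    def from_yaml(cls, constructor, node):
        # deep=True constructs the child nodes right away instead of deferring them
        x = cls(*constructor.construct_sequence(node, deep=True))
        return x

    def __iter__(self):
        for item in self.items:
            if isinstance(item, ToFlatten):
                # splice the elements of a !toflatten sequence into the output
                for nested_item in item:
                    yield nested_item
            else:
                # any other item, including an untagged list, is yielded as is
                yield item


@yaml.register_class
class ToFlatten(list):
    """Sequence tagged !toflatten: marks a list to be expanded inside a !flatten."""
    yaml_tag = u'!toflatten'

    @classmethod
    def from_yaml(cls, constructor, node):
        x = cls(constructor.construct_sequence(node, deep=True))
        return x


data = yaml.load(Path('input.yaml'))
for item in data['m1']:
    print(item)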

which outputs:

1
2
[3, 4]
[5, 6]
7
8

As you can see, in the sequence that needs flattening, you can either use an alias to a tagged sequence or you can use a tagged sequence directly. YAML doesn't allow you to do:

- !flatten *x2

i.e. tag an alias to an anchored sequence, as this would essentially make it into a different data structure.

Using explicit tags is IMO better than having some magic going on, as with YAML merge keys <<. If nothing else, you now have to jump through hoops if you happen to have a YAML file with a mapping that has a key << that you don't want to act like a merge key, e.g. when you make a mapping of C operators to their descriptions in English (or some other natural language).

Answered by Tamlyn

If you only need to merge one item into a list you can do

fruit:
  - &banana
    name: banana
    colour: yellow

food:
  - *banana
  - name: carrot
    colour: orange

which yields

fruit:
  - name: banana
    colour: yellow

food:
  - name: banana
    colour: yellow
  - name: carrot
    colour: orange

Answered by sm4rk0

You can merge mappings then convert their keys into a list, under these conditions:

  • if you are using jinja2 templating and
  • if item order is not important

some_stuff: &some_stuff
 a:
 b:
 c:

combined_stuff:
  <<: *some_stuff
  d:
  e:
  f:

{{ combined_stuff | list }}
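
For readers not using templates, roughly the same idea can be sketched in plain Python. The dictionaries below are hand-written stand-ins for what a loader would return after resolving the merge key (keys with empty values load as None); no YAML parser is involved in this sketch, and the resulting key order should not be relied upon.

```python
# Stand-in for the anchored mapping; the empty YAML values load as None.
some_stuff = {"a": None, "b": None, "c": None}

# Stand-in for combined_stuff after a loader has resolved the merge key.
combined_stuff = {**some_stuff, "d": None, "e": None, "f": None}

# Iterating a dict yields its keys, which is what the list conversion relies on.
combined_list = list(combined_stuff)
print(sorted(combined_list))
```

Because mapping keys are unique, this trick also silently drops duplicates, which a real list would keep.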