Notice: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/11348506/

Split string with PowerShell and do something with each token

Tags: string, powershell, tokenize

Asked by Pieter Müller

I want to split each line of a pipe on spaces, and then print each token on its own line.

I realise that I can get this result using:

(cat someFileInsteadOfAPipe).split(" ")

But I want more flexibility. I want to be able to do just about anything with each token. (I used to use AWK on Unix, and I'm trying to get the same functionality.)

I currently have:

echo "Once upon a time there were three little pigs" | %{$data = $_.split(" "); Write-Output "$($data[0]) and whatever I want to output with it"}

Which, obviously, only prints the first token. Is there a way for me to for-each over the tokens, printing each in turn?

Also, the %{$data = $_.split(" "); Write-Output "$($data[0])"} part I got from a blog, and I really don't understand what I'm doing or how the syntax works.

I want to google for it, but I don't know what to call it. Please help me out with a word or two to Google, or a link explaining to me what the % and all the $ symbols do, as well as the significance of the opening and closing brackets.

I realise I can't actually use (cat someFileInsteadOfAPipe).split(" "), since the file (or preferably incoming pipe) contains more than one line.

Regarding some of the answers:

If you are using Select-String to filter the output before tokenizing, you need to keep in mind that the output of the Select-String command is not a collection of strings, but a collection of MatchInfo objects. To get to the string you want to split, you need to access the Line property of the MatchInfo object, like so:

cat someFile | Select-String "keywordFoo" | %{$_.Line.Split(" ")}
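
A self-contained sketch of that point, assuming a throwaway temporary file; the file name, its contents, and the keyword `pigs` are made up for illustration:

```powershell
# Hypothetical sample data written to a temporary file (illustration only).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'split-demo.txt'
Set-Content -Path $path -Value 'three little pigs', 'one big wolf', 'two more pigs'

# Select-String emits MatchInfo objects; .Line holds the text of the matching line.
$tokens = Get-Content $path |
    Select-String 'pigs' |
    ForEach-Object { $_.Line.Split(' ') }

$tokens            # prints one token per line
Remove-Item $path  # clean up the temporary file
```

Calling .Split() directly on $_ would fail here, because a MatchInfo object is not a string.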

Answered by Justus Thane

"Once upon a time there were three little pigs".Split(" ") | ForEach {
    "$_ is a token"
}

The key is $_, which stands for the current object in the pipeline.

About the code you found online:

% is an alias for ForEach-Object. Anything enclosed in the curly braces is run once for each object it receives. In this case, it runs only once, because you're sending it a single string.

$_.Split(" ") takes the current object and splits it on spaces. The current object is whatever ForEach is currently looping over.
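
To visit every token of every incoming line, one option is to nest a second loop inside the script block. A minimal sketch, with made-up sample lines standing in for the pipe:

```powershell
# Made-up input; each string plays the role of one line arriving on the pipe.
$lines = 'Once upon a time', 'there were three little pigs'

$result = $lines | ForEach-Object {
    # $_ is the current line here.
    foreach ($token in $_.Split(' ')) {
        # Any per-token logic can go here; this just decorates the token.
        "token: $token"
    }
}

$result
```

The outer script block runs once per line, and the inner foreach runs once per token, which is roughly the awk-style per-field processing the question asks for.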

Answered by mklement0

To complement Justus Thane's helpful answer:

  • As Joey notes in a comment, PowerShell has a powerful, regex-based -split operator.

    • In its unary form (-split '...'), -split behaves like awk's default field splitting, which means that:
      • Leading and trailing whitespace is ignored.
      • Any run of whitespace (e.g., multiple adjacent spaces) is treated as a single separator.
  • In PowerShell v4+, an expression-based - and therefore faster - alternative to the ForEach-Object cmdlet became available: the .ForEach() array (collection) method, as described in this blog post (alongside the .Where() method, a more powerful, expression-based alternative to Where-Object).

Here's a solution based on these features:

PS> (-split '   One      for the money   ').ForEach({ "token: [$_]" })
token: [One]
token: [for]
token: [the]
token: [money]

Note that the leading and trailing whitespace was ignored, and that the multiple spaces between One and for were treated as a single separator.
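
For contrast, a small sketch of how the .NET .Split() method, the unary -split operator, and the binary regex form treat the same input. The sample string and the .Where() filter are only illustrative:

```powershell
$text = '  One   for the money  '

# .Split(' ') splits on every single space, so empty strings appear in the result.
$dotNet = $text.Split(' ')

# Unary -split ignores the ends and collapses runs of whitespace.
$unary = -split $text

# Binary -split takes a regex; trimming first avoids empty tokens at the edges.
$binary = $text.Trim() -split '\s+'

# .Where() filters the collection before .ForEach() processes it (v4+).
$long = $unary.Where({ $_.Length -gt 3 }).ForEach({ $_.ToUpper() })

"$($dotNet.Count) vs $($unary.Count) vs $($binary.Count)"
$long
```

The first count is larger than the other two because of the empty strings that .Split(' ') produces between adjacent spaces.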

Answered by s31064

Another way to accomplish this is a combination of Justus Thane's and mklement0's answers. It doesn't make sense to do it this way when you look at a one-liner example, but when you're trying to mass-edit a file or a bunch of filenames it comes in pretty handy:

$test = '   One      for the money   '
$option = [System.StringSplitOptions]::RemoveEmptyEntries
$($test.split(' ',$option)).foreach{$_}

This will come out as:

One
for
the
money
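
A sketch of the batch case mentioned above, assuming a made-up list of file names with irregular spacing that should be normalised to underscores. The names and the underscore convention are hypothetical, and no files are touched:

```powershell
# Hypothetical file names; only the strings are processed, nothing on disk.
$names = '  my   holiday photo.jpg', 'tax  return   2019.pdf'
$option = [System.StringSplitOptions]::RemoveEmptyEntries

$clean = foreach ($name in $names) {
    # Split on spaces, dropping the empty entries, then rejoin with '_'.
    $name.Split(' ', $option) -join '_'
}

$clean
```

RemoveEmptyEntries is what discards the empty strings that would otherwise come from leading, trailing, and repeated spaces.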

Answered by js2010

-split outputs an array, and you can save it to a variable like this:

$a = -split 'Once  upon    a     time'
$a[0]

Once

Another cute thing: you can have arrays on both sides of an assignment statement:

$a,$b,$c = -split 'Once  upon    a'
$c

a
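
One related detail: when there are more tokens than variables, the last variable receives all remaining tokens as an array, which gives a rough "first field plus the rest" pattern. A small sketch with a made-up sentence:

```powershell
# The last variable on the left collects whatever tokens are left over.
$first, $rest = -split 'Once  upon    a     time'

$first           # the first token
$rest.Count      # number of remaining tokens
$rest -join ' '  # remaining tokens rejoined with single spaces
```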