Linux grep: group capturing

Question

Asked by lstipakov

I have the following string:

{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}

and I need to get the value of "scheme_version", which is 1234 in this example.

I have tried

grep -Eo "\"scheme_version\":(\w*)"

however, it returns

"scheme_version":1234

How can I do this? I know I can add a sed call, but I would prefer to do it with a single grep.

Answer 1

Accepted answer by potong

This might work for you:

echo '{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}' |
sed -n 's/.*"scheme_version":\([^}]*\)}/\1/p'
1234

Sorry, it's not grep, so disregard this solution if you like.

Or stick with grep and pipe the result through cut:

grep -Eo "\"scheme_version\":(\w*)" | cut -d: -f2

Answer 2

Answered by SiegeX

You'll need to use a lookbehind assertion so that the prefix isn't included in the match:

grep -Po '(?<=scheme_version":)[0-9]+'
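To try the lookbehind on the sample string from the question, here is a minimal sketch. It assumes GNU grep built with PCRE support (the -P flag). The function name extract_version and the variable sample are illustrative and not part of the original answer.

```shell
# Sample document from the question (the variable name is illustrative).
sample='{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}'

# Hypothetical helper: reads JSON text on stdin and prints only the digits
# that directly follow the key. Requires GNU grep with -P (PCRE) support.
extract_version() {
    # (?<=...) is zero-width, so -o prints just the [0-9]+ part.
    grep -Po '(?<="scheme_version":)[0-9]+'
}

printf '%s\n' "$sample" | extract_version   # prints 1234
```

The "_id" value also contains the text scheme_version, but it is followed by a comma rather than a colon, so only the key matches.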

Answer 3

Answered by Marc O'Morain

I would recommend that you use jq for the job. jq is a command-line JSON processor.

$ cat tmp
{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}

$ cat tmp | jq .scheme_version
1234

Answer 4

Answered by kris.zhang

You can do this:

$ echo '{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}' | awk -F ':' '{print $4}' | tr -d '}'

Answer 5

Answered by ClarkZinzow

As an alternative to the positive lookbehind method suggested by SiegeX, you can reset the match starting point to directly after scheme_version": with the \K escape sequence. E.g.,

$ grep -Po 'scheme_version":\K[0-9]+'

This restarts the matching process after having matched scheme_version":, and tends to perform far better than the positive lookbehind. Comparing the two on regex101 shows that the reset-match-start method takes 37 steps and 1 ms, while the positive lookbehind method takes 194 steps and 21 ms.

You can compare the performance yourself on regex101, and you can read more about resetting the match starting point in the PCRE documentation.
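Beyond step counts, \K also allows a prefix of variable length, which PCRE lookbehinds generally do not accept when the prefix contains an unbounded quantifier such as \s*. The sketch below illustrates this. It assumes GNU grep with -P support; the whitespace-padded input and the names spaced and extract_version_k are made up for illustration.

```shell
# Input with whitespace around the colon (an assumed variation of the sample).
spaced='{"_id":"scheme_version","scheme_version" : 1234}'

# Hypothetical helper using \K: everything matched before \K is dropped
# from the reported match, and that prefix may have variable length.
extract_version_k() {
    grep -Po '"scheme_version"\s*:\s*\K[0-9]+'
}

printf '%s\n' "$spaced" | extract_version_k   # prints 1234
```

The same function also handles the compact form without spaces, since \s* may match nothing.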

Answer 6

Answered by kenorb

To avoid relying on grep's PCRE feature, which is available in GNU grep but not in the BSD version, another method is to use ripgrep, e.g.

$ rg -o 'scheme_version.?:(\d+)' -r '$1' <file.json
1234

The -r (--replace) option supports capture group indices (e.g., $5) and names (e.g., $foo) in the replacement text.

Another example uses Python's json.tool module, which can validate and pretty-print the input:

$ python -m json.tool file.json | rg -o 'scheme_version[^\d]+(\d+)' -r '$1'
1234

Related: Can grep output only specified groupings that match?

Answer 7

Answered by Alexandre Hamon

Building on @potong's answer, which only extracts "scheme_version", you can use this expression to extract any of the fields:

$ echo '{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}' | sed -n 's/.*"_id":["]*\([^(",})]*\)[",}].*/\1/p'
scheme_version

$ echo '{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}' | sed -n 's/.*"_rev":["]*\([^(",})]*\)[",}].*/\1/p'
4-cad1842a7646b4497066e09c3788e724

$ echo '{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}' | sed -n 's/.*"scheme_version":["]*\([^(",})]*\)[",}].*/\1/p'
1234
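Since the three commands differ only in the key name, the expression can be wrapped in a small function. This is a sketch under assumptions, not part of the original answer: the name json_field is hypothetical, the key must not contain a slash or characters special to sed regular expressions, and the input must be flat JSON whose values contain no quotes, commas, or braces.

```shell
# Sample document from the question (the variable name is illustrative).
sample='{"_id":"scheme_version","_rev":"4-cad1842a7646b4497066e09c3788e724","scheme_version":1234}'

# Hypothetical helper: json_field KEY reads flat JSON on stdin and prints
# the value of KEY, with any surrounding double quotes removed.
json_field() {
    # "$1" is spliced between two single-quoted pieces of the sed script;
    # \( ... \) captures the value and \1 writes it back as the whole line.
    sed -n 's/.*"'"$1"'":["]*\([^(",})]*\)[",}].*/\1/p'
}

printf '%s\n' "$sample" | json_field _rev   # prints 4-cad1842a7646b4497066e09c3788e724
```

For anything beyond flat, well-behaved input like this, a real JSON parser such as jq is the safer choice.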
