Using sed / regex to split a line in bash based on a delimiter

Question

Asked by horatio1701d

I'm a regex rookie hoping to change that. I have a seemingly very simple problem, but I cannot figure out the correct regex to parse it properly. Basically, I have a file whose lines look like this:

time:3:35PM

I am just trying to cut out all characters up to and including ONLY the FIRST ':' delimiter and keep the rest intact with sed, so that I can process many files with the same format. What I am trying to get is this:

3:35PM

The command below is the closest I got, but it cuts at the last occurrence of the delimiter instead of the first:

sed 's/.*://'

I have also tried Python, but I had trouble applying a Python function to iterate through all lines in many files as opposed to just one file.

Any help would be greatly appreciated.

Answer 1

Answered by kojiro

You can do this in just about every text processing tool (many without using regular expressions at all).

ed

If in-place editing is really important, the canonical correct way is not sed (the stream editor) but ed (the file editor).

ed "$file" << EOF
,s/^[^:]*://g
w
EOF

sed

(Pretty much the same commands as ed, formatted a little differently)

sed 's/^[^:]*://' < "$file" > "$file".new
mv "$file".new "$file"
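
To see why the pattern from the question strips too much, here is a small sketch comparing the two substitutions on the sample line. The variable names are only for illustration. `.*:` is greedy and runs up to the last colon, while `^[^:]*:` cannot cross a colon and so stops at the first one.

```shell
line='time:3:35PM'

# Greedy: ".*" swallows as much as possible, so the match ends at the LAST ':'
greedy=$(printf '%s\n' "$line" | sed 's/.*://')

# Negated class: "[^:]*" cannot contain ':', so the match ends at the FIRST ':'
first_only=$(printf '%s\n' "$line" | sed 's/^[^:]*://')

printf 'greedy:     %s\n' "$greedy"      # greedy:     35PM
printf 'first only: %s\n' "$first_only"  # first only: 3:35PM
```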

BASH

This one doesn't cause any new processes to be spawned. (For whatever that's worth.)

# -r keeps backslashes literal; "time" receives everything after the first ':'
while IFS=: read -r _ time; do
    printf '%s\n' "$time"
done < "$file" > "$file".new
mv "$file".new "$file"
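
As a variation that is not part of the answer above, the shell's prefix-removal parameter expansion can do the same job. This is a minimal sketch; the function name strip_first_field is hypothetical. One difference worth knowing: a line with no ':' at all is printed unchanged here, whereas the read loop would print an empty line for it.

```shell
# Hypothetical helper: reads lines on stdin, drops everything up to and
# including the first ':' on each line, and writes the result to stdout.
strip_first_field() {
    # "|| [ -n "$line" ]" also handles a final line with no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        # ${line#*:} removes the SHORTEST prefix matching "*:"
        printf '%s\n' "${line#*:}"
    done
}

printf 'time:3:35PM\nnocolon\n' | strip_first_field
# 3:35PM
# nocolon
```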

awk

awk -F: 'BEGIN{ OFS=":" } { print $2, $3 }' < "$file" > "$file".new
mv "$file".new "$file"
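
An awk command that prints a fixed number of fields only works when every line has exactly that many colon-separated fields. If that is not guaranteed, one possible alternative (a sketch, not the answer's method) is to delete the first field with sub(), which leaves however many fields follow untouched. The function name strip_awk is made up for this example.

```shell
# Hypothetical helper: remove the first ':'-terminated field from each line.
strip_awk() {
    # sub() edits $0 in place; it replaces only the first match on the line
    awk '{ sub(/^[^:]*:/, ""); print }'
}

three=$(printf 'time:3:35PM\n' | strip_awk)
four=$(printf 'time:3:35:07PM\n' | strip_awk)

printf '%s\n' "$three"  # 3:35PM
printf '%s\n' "$four"   # 3:35:07PM
```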

cut

cut -d: -f2- < "$file" > "$file".new
mv "$file".new "$file"

Answer 2

Answered by Johnsyweb

Since you don't need a regular expression to match a single, known character, consider using cut instead of sed.

This simple expression sets : as the delimiter (-d) and emits fields (-f) 2 onwards (the trailing -):

cut -d: -f2-

Example:

% echo 'time:3:35PM' | cut -d: -f2-
3:35PM
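
One behavior to be aware of if some lines might not contain the delimiter: by default cut passes such lines through unchanged, and the -s option suppresses them instead. The sample input below is invented for illustration.

```shell
# Two sample lines: one with the delimiter, one without (made-up data)
input=$(printf '%s\n' 'time:3:35PM' 'no delimiter here')

# Default: a line without ':' is printed as-is
default_out=$(printf '%s\n' "$input" | cut -d: -f2-)

# -s: lines without ':' are dropped
strict_out=$(printf '%s\n' "$input" | cut -s -d: -f2-)

printf 'default:\n%s\n' "$default_out"
printf 'with -s:\n%s\n' "$strict_out"
```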

Answer 3

Answered by hwnd

To remove everything up to and including the first : you could do:

sed -i.bak 's/^[^:]*://' file.txt

On multiple .txt files:

sed -i.bak 's/^[^:]*://' *.txt

The -i option specifies that files are to be edited in place: sed writes its output to a temporary file rather than to standard output, then replaces the original with it. With the .bak suffix, a backup copy of each original is kept (for example file.txt.bak).
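
To try the multi-file form safely, the following sketch runs it in a scratch directory. The file names and contents are invented, and it assumes a sed that supports -i with an attached suffix.

```shell
# Scratch directory with two sample files (hypothetical names and contents)
workdir=$(mktemp -d)
printf 'time:3:35PM\n'  > "$workdir/a.txt"
printf 'time:11:05AM\n' > "$workdir/b.txt"

# Edit every .txt file in place, keeping the originals as *.txt.bak
sed -i.bak 's/^[^:]*://' "$workdir"/*.txt

after=$(cat "$workdir/a.txt" "$workdir/b.txt")
backup=$(cat "$workdir/a.txt.bak")

printf 'edited:\n%s\n' "$after"          # 3:35PM and 11:05AM
printf 'backup of a.txt: %s\n' "$backup" # time:3:35PM

rm -rf "$workdir"
```

Once the results look right, the .bak files can be deleted.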

Answer 4

Answered by Aleks-Daniel Jakimenko-A.

kojiro's answer has plenty of great alternatives, but you asked how to do this with regex. Here are some pure regex solutions:

grep -oP '[^:]*:\K.*' file.txt

\K makes it forget everything matched before the \K. But if you know the exact prefix length, you can use the lookaround feature:

grep -oP '(?<=^time:).*' file.txt

Note that most regex implementations do not support these features. You can use them in grep with the -P flag and in perl itself. I wonder if any other utility supports these.

Answer 5

Answered by geekdenz

Please consider my answer here:

How to use regex with cut at the command line?

You could for example just write:

echo 'time:3:35PM' | cutr -d : -f 2- -r :

In your particular case, you could simply use cut though:

echo 'time:3:35PM' | cut -d : -f 2-

Any feedback is welcome. cutr isn't perfect yet, but before I invest too much time in it, I wanted to get some feedback.
