原文地址: http://stackoverflow.com/questions/18806186/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Splitting a line in bash based on delimiter with Sed / Regex
Asked by horatio1701d
Regex rookie here, hoping to change that. I have the following seemingly very simple problem, but I cannot figure out the correct regex to parse it properly. Basically, I have a file with lines that look like this:
time:3:35PM
I am just trying to cut out all characters up to and including ONLY the FIRST ':' delimiter and keep the rest intact with sed, so that I can process many files with the same format. What I am trying to get is this:
3:35PM
The below is the closest I got, but it uses the last occurrence of the delimiter instead of the first:
sed 's/.*://'
I have also tried with Python, but had trouble applying a Python function to iterate through all lines in many files as opposed to just one file.
Any help would be greatly appreciated.
Answered by kojiro
You can do this in just about every text processing tool (many without using regular expressions at all).
ed
If in-place editing is really important, the canonical correct way is not sed (the stream editor) but ed (the file editor).
ed "$file" << EOF
,s/^[^:]*://g
w
EOF
sed
(Pretty much the same commands as ed, formatted a little differently)
sed 's/^[^:]*://' < "$file" > "$file".new
mv "$file".new "$file"
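The difference between the question's attempt and the pattern above comes down to greediness: .* matches as much as it can, so .*: runs to the last colon, while [^:]* cannot cross a colon and therefore stops at the first one. A small sketch of the contrast, using a sample line in a variable (an assumption for illustration) rather than a real file:

```shell
# Sample line with more than one colon (taken from the question).
line='time:3:35PM'

# Greedy: .* swallows everything up to the LAST colon.
greedy=$(printf '%s\n' "$line" | sed 's/.*://')

# Negated class: [^:]* cannot cross a colon, so the match ends at the FIRST one.
first=$(printf '%s\n' "$line" | sed 's/^[^:]*://')

printf 'greedy: %s\nfirst:  %s\n' "$greedy" "$first"
```

The first result is 35PM and the second is 3:35PM, which is what the question asks for.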
BASH
This one doesn't cause any new processes to be spawned. (For whatever that's worth.)
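# -r stops read from treating backslashes as escapes; the last variable
# (time) receives the rest of the line, colons included.
while IFS=: read -r _ time; do
    printf '%s\n' "$time"
done < "$file" > "$file".new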
mv "$file".new "$file"
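Another way to stay inside the shell, not mentioned in the answer, is prefix-removal parameter expansion. A rough sketch, assuming a POSIX-style shell; strip_first_field is a hypothetical helper name chosen for illustration:

```shell
# strip_first_field is a hypothetical helper, not part of the original answer.
# It reads lines on stdin and writes them without their first field.
strip_first_field() {
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # ${line#*:} removes the shortest prefix matching "*:",
        # i.e. everything up to and including the first colon.
        printf '%s\n' "${line#*:}"
    done
}

printf '%s\n' 'time:3:35PM' 'date:2013-09-14' | strip_first_field
```

Using ## instead of # would remove the longest matching prefix, which reproduces the last-colon behaviour from the question. Lines without a colon pass through unchanged.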
awk
awk -F: 'BEGIN{ OFS=":" } { print $2, $3 }' < "$file" > "$file".new
mv "$file".new "$file"
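An awk field list names a fixed number of fields, so a line with more colons than expected would lose its tail. A sketch that avoids counting fields, assuming the standard awk sub() function and made-up sample lines:

```shell
# Delete everything up to and including the first colon, whatever follows it.
printf '%s\n' 'time:3:35PM' 'stamp:2013:09:14:15:35' |
    awk '{ sub(/^[^:]*:/, ""); print }'
```

This applies the same ^[^:]*: pattern as the ed and sed versions to the whole line instead of splitting it into fields.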
cut
cut -d: -f2- < "$file" > "$file".new
mv "$file".new "$file"
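Since the question mentions many files with the same format, the write-then-rename pattern used above can be wrapped in a loop. This is only a sketch: strip_prefix_in_files is a hypothetical function name, and it assumes the directory is writable so the .new file can be created next to each original.

```shell
# strip_prefix_in_files is a hypothetical helper; pass it the files to rewrite.
strip_prefix_in_files() {
    for file in "$@"; do
        # Replace the original only if cut succeeded; stop on the first failure.
        cut -d: -f2- < "$file" > "$file".new && mv "$file".new "$file" || return 1
    done
}
```

It could then be called as strip_prefix_in_files *.txt, with any of the other tools substituted for cut.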
Answered by Johnsyweb
Since you don't need a regular expression to match a single, known character, consider using cut instead of sed.
This simple expression sets : as the d-elimiter and emits f-ields 2 onwards (-):
cut -d: -f2-
Example:
% echo 'time:3:35PM' | cut -d: -f2-
3:35PM
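The trailing - in the field list is what keeps the later colons. A quick comparison, as a sketch using the same sample line held in a variable:

```shell
line='time:3:35PM'

only_second=$(printf '%s\n' "$line" | cut -d: -f2)    # field 2 only
from_second=$(printf '%s\n' "$line" | cut -d: -f2-)   # field 2 through the end

printf '%s\n%s\n' "$only_second" "$from_second"
```

The first command yields just 3, while the second yields 3:35PM because cut rejoins the selected fields with the delimiter.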
Answered by hwnd
To remove everything up to and including the first : on each line, you could do:
sed -i.bak 's/^[^:]*://' file.txt
On multiple .txt files:
sed -i.bak 's/^[^:]*://' *.txt
The -i option specifies that files are to be edited in place. sed does this by creating a temporary file, sending output to that file rather than to standard output, and then replacing the original with it. The .bak suffix makes sed keep a backup copy of each original file.
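To see what -i.bak leaves behind without touching real data, here is a sketch that runs the command in a throwaway directory. The file names and contents are made up for illustration, and it assumes a sed that supports -i with an attached suffix, as GNU and BSD sed do (the option is not in POSIX).

```shell
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)
printf 'time:3:35PM\n' > "$dir/a.txt"   # hypothetical sample files
printf 'time:4:10AM\n' > "$dir/b.txt"

# Edit both files in place, keeping the originals as a.txt.bak and b.txt.bak.
sed -i.bak 's/^[^:]*://' "$dir"/*.txt

edited=$(cat "$dir/a.txt")
backup=$(cat "$dir/a.txt.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

rm -r "$dir"
```

The edited file holds 3:35PM and the .bak copy still holds time:3:35PM.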
Answered by Aleks-Daniel Jakimenko-A.
kojiro's answer has plenty of great alternatives, but you asked how to do that with regex. Here are some pure regex solutions:
grep -oP '[^:]*:\K.*' file.txt
\K makes it forget everything matched before the occurrence of \K, so only the text after it is reported as the match.
But if you know the exact prefix length, then you can use the lookaround feature:
grep -oP '(?<=^time:).*' file.txt
Note that most regex implementations do not support these features. You can use them in grep with the -P flag and in perl itself. I wonder if any other utility supports these.
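Since perl itself is mentioned, here is a sketch of the same \K pattern run through perl instead of grep -P. It assumes perl 5.10 or later, where \K is available; the one-liner is an illustration, not part of the original answer.

```shell
# \K discards what has matched so far, so $& holds only the text
# after the first colon. -n loops over input lines; -l handles newlines.
printf 'time:3:35PM\n' | perl -nle 'print $& if /^[^:]*:\K.*/'
```

Lines that contain no colon do not match and are not printed, which mirrors what grep -o does.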
Answered by geekdenz
Please consider my answer here:
How to use regex with cut at the command line?
You could for example just write:
echo 'time:3:35PM' | cutr -d : -f 2- -r :
In your particular case, though, you could simply use cut:
echo 'time:3:35PM' | cut -d : -f 2-
Any feedback welcome. cutr isn't perfect yet, but before I invest too much time into it, I wanted to get some feedback.