Source: Stack Overflow, http://stackoverflow.com/questions/14348432/. Licensed under CC BY-SA 4.0: you are free to use and share it, but you must attribute it to the original authors.

How to find/replace and increment a matched number with sed/awk?

Tags: bash, sed, awk

Asked by Ian

Straight to the point, I'm wondering how to use grep/find/sed/awk to match a certain string (that ends with a number) and increment that number by 1. The closest I've come is to concatenate a 1 to the end (which works well enough) because the main point is to simply change the value. Here's what I'm currently doing:

find . -type f | xargs sed -i 's/\(\?cache_version\=[0-9]\+\)/\11/g'

Since I couldn't figure out how to increment the number, I captured the whole thing and just appended a "1". Before, I had something like this:

find . -type f | xargs sed -i 's/\?cache_version\=\([0-9]\+\)/?cache_version=\11/g'

So at least I understand how to capture what I need.

Instead of explaining what this is for, I'll just explain what I want it to do. It should find text in any file, recursively, based on the current directory (isn't important, it could be any directory, so I'd configure that later), that matches "?cache_version=" with a number. It will then increment that number and replace it in the file.

Currently the stuff I have above works, it's just that I can't increment that found number at the end. It would be nicer to be able to increment instead of appending a "1" so that the future values wouldn't be "11", "111", "1111", "11111", and so on.

I've gone through dozens of articles/explanations, and often enough, the suggestion is to use awk, but I cannot for the life of me mix them. The closest I came to using awk, which doesn't actually replace anything, is:

grep -Pro '(?<=\?cache_version=)[0-9]+' . | awk -F: '{ print "match is", $2+1 }'

I'm wondering if there's some way to pipe a sed at the end and pass the original file name, so that sed can have the file name and the incremented number (from the awk), or whatever it needs that xargs has.

Technically, this number has no importance; this replacement is mainly to make sure there is a new number there, 100% for sure different than the last. So as I was writing this question, I realized I might as well use the system time - seconds since epoch (the technique often used by AJAX to eliminate caching for subsequent "identical" requests). I ended up with this, and it seems perfect:

CXREPLACETIME=`date +%s`; find . -type f | xargs sed -i "s/\(\?cache_version\=\)[0-9]\+/\1$CXREPLACETIME/g"

(I store the value first so all files get the same value, in case it spans multiple seconds for whatever reason)

But I would still love to know the answer to the original question about incrementing a matched number. I'm guessing an easy solution would be to make it a bash script, but still, I thought there would be an easier way than looping through every file recursively, checking its contents for a match and then replacing, since it's simply incrementing a matched number, without much other logic. I just don't want to write to any other files or anything like that; it should do it in place, like sed does with the "i" option.

Answered by Kent

I don't think finding the files is the difficult part for you, so I'll go straight to the point: doing the +1 calculation. If you have GNU sed, it can be done this way:

sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' file

Let's take an example:

kent$  cat test 
ello
barbaz?cache_version=3fooooo
bye

kent$  sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' test
ello
barbaz?cache_version=4fooooo
bye

You could add the -i option if you like.
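
Combined with the file search from the question, the command above might be applied recursively along the following lines. This is only a sketch: it assumes GNU grep, xargs and sed, and the function name bump_cache_version_tree is made up for illustration. Because the e flag hands each rewritten line to a shell, it should only be run on files whose matching lines contain no quotes, backticks or dollar signs.

```shell
# Hypothetical helper (not from the original answer): increment every
# "?cache_version=N" found in files below a directory, in place.
# Assumes GNU grep, xargs and sed.
bump_cache_version_tree() {
  dir=${1:-.}
  # grep -rlZ prints only the names of files that contain a match,
  # NUL-separated, so sed -i never rewrites unrelated files.
  # [?] matches a literal question mark in extended regex syntax.
  grep -rlZE '[?]cache_version=[0-9]+' "$dir" |
    xargs -0 -r sed -i -r \
      's/(.*)([?]cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/e'
}
```

Calling bump_cache_version_tree . processes the current directory. Since the leading (.*) is greedy, only the last occurrence on each line is incremented.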

Edit:

The /e flag executes whatever the pattern space contains after the substitution as a shell command and replaces the pattern space with the command's output. This is how the matched parts are handed to an external command. GNU sed only.

See this example, which uses the external commands echo and bc:

kent$  echo "result:3*3"|sed -r 's/(result:)(.*)/echo \1$(echo "\2"\|bc)/ge'

gives output:

result:9

You could use other powerful external commands, such as cut, sed (again), awk...
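
Because /e runs the rewritten line through a shell, lines that contain quotes or $(...) can break or trigger unintended commands. A sketch of the same increment done in awk, which performs the arithmetic itself, is given below. The function names increment_cache_version and bump_file are hypothetical, and the temporary-file step is just one assumed way of getting an in-place effect with plain awk.

```shell
# Hypothetical filter: increment every "?cache_version=N" in the input.
# Reads the named files, or standard input when no file is given.
increment_cache_version() {
  awk '
    BEGIN { key_len = length("?cache_version=") }
    {
      out = ""
      rest = $0
      # Walk through the line, handling each occurrence left to right.
      while (match(rest, /[?]cache_version=[0-9]+/)) {
        prefix = substr(rest, 1, RSTART - 1 + key_len)
        number = substr(rest, RSTART + key_len, RLENGTH - key_len)
        out = out prefix (number + 1)
        rest = substr(rest, RSTART + RLENGTH)
      }
      print out rest
    }
  ' "$@"
}

# Hypothetical wrapper: apply the filter to one file in place.
# The result is written back with cat so the file keeps its permissions.
bump_file() {
  tmp=$(mktemp) || return 1
  increment_cache_version "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}
```

Unlike the sed command, this increments every occurrence on a line, and no part of the file is ever interpreted as a command. A number written with leading zeros loses them.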

Answered by Martijn

Pure sed version:

This version has no dependencies on other commands or environment variables. It uses explicit carrying. For the carry I use the @ symbol, but any other character can be used as long as it does not occur in your input file. First it finds SEARCHSTRING<number> and appends an @ to it. It then repeatedly increments digits that have a pending carry (that is, digits followed by the carry symbol: [0-9]@). If a 9 is incremented, the increment itself yields a carry (9@ becomes @0), and the process repeats until there are no more pending carries. Finally, any carry that was yielded but not yet added to a digit is replaced by 1.

sed "s/SEARCHSTRING[0-9]*[0-9]/&@/g;:a {s/0@/1/g;s/1@/2/g;s/2@/3/g;s/3@/4/g;s/4@/5/g;s/5@/6/g;s/6@/7/g;s/7@/8/g;s/8@/9/g;s/9@/@0/g;t a};s/@/1/g" numbers.txt
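
To make the carry steps easier to follow, here is the same idea split into one sed expression per stage. It is a sketch rather than a replacement for the one-liner: the function name sed_increment and its prefix parameter are hypothetical, and the prefix is assumed to contain no regex metacharacters and no slash.

```shell
# Hypothetical wrapper around the carry technique described above.
# $1 is the literal text that precedes the number; input comes from stdin.
# "@" is the carry marker and must not occur anywhere in the input.
sed_increment() {
  prefix=$1
  sed -e "s/${prefix}[0-9]*[0-9]/&@/g" \
      -e ':a' \
      -e 's/0@/1/g;s/1@/2/g;s/2@/3/g;s/3@/4/g;s/4@/5/g' \
      -e 's/5@/6/g;s/6@/7/g;s/7@/8/g;s/8@/9/g' \
      -e 's/9@/@0/g' \
      -e 'ta' \
      -e 's/@/1/g'
}
# Stage 1 marks the number with a pending carry:  v=199   -> v=199@
# The loop resolves one digit per pass:           v=199@  -> v=19@0 -> v=1@00 -> v=200
# The last stage handles a carry with no digit:   v=99    -> v=@00  -> v=100
```

For example, printf 'v=199\n' | sed_increment 'v=' prints v=200.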

Answered by Birei

This perl command will search all files in the current directory (without traversing subdirectories; you will need the File::Find module or similar for that more complex task) and will increment the number on any line that matches cache_version=. It uses the /e flag of the regular expression, which evaluates the replacement part as code.

perl -i.bak -lpe 'BEGIN { sub inc { my ($num) = @_; ++$num } } s/(cache_version=)(\d+)/$1 . (inc($2))/eg' *

I tested it with a file named file in the current directory, containing the following data:

hello
cache_version=3
bye

It backs up the original file (ls -1):

file
file.bak

And file now contains:

hello
cache_version=4
bye

I hope it can be useful for what you are looking for.

UPDATE to use File::Find for traversing directories. It accepts * as an argument but discards it, replacing the argument list with the files found by File::Find. The search starts in the directory from which the script is run; this is hard-coded in the line find( \&wanted, "." ).

perl -MFile::Find -i.bak -lpe '

    BEGIN { 
        sub inc { 
            my ($num) = @_; 
            ++$num 
        }

        sub wanted {
            if ( -f && ! -l ) {  
                push @ARGV, $File::Find::name;
            }
        }

        @ARGV = ();
        find( \&wanted, "." );
    }

    s/(cache_version=)(\d+)/$1 . (inc($2))/eg

' *

Answered by David Ravetti

This is ugly (I'm a little rusty), but here's a start using sed:

orig="something1" ;
text=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\1/"` ;
num=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\2/"` ;
echo $text$(($num + 1))

With an original filename ($orig) of "something1", sed splits off the text and numeric portions into $text and $num, then these are combined in the final line with an incremented number, resulting in something2.

Just a start since it doesn't consider cases with numbers within the file name or names with no number at the end, but hopefully helps with your original goal of using sed.
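
The two cases mentioned above (digits inside the name, or no trailing number) can be handled by splitting the string with shell parameter expansion instead of sed. This swaps in a different technique from the one in this answer; the function name increment_trailing_number is hypothetical, and the number is assumed to have no leading zeros.

```shell
# Hypothetical helper: increment the trailing number of a string, if any.
increment_trailing_number() {
  orig=$1
  # Strip the longest prefix that ends in a non-digit; what is left is
  # the run of digits at the very end (empty if there is none).
  num=${orig##*[!0-9]}
  # Remove that run of digits from the end to get the text part.
  text=${orig%"$num"}
  if [ -z "$num" ]; then
    # No trailing number: return the input unchanged.
    printf '%s\n' "$orig"
    return 0
  fi
  # Assumes no leading zeros, which shell arithmetic would read as octal.
  printf '%s%s\n' "$text" "$((num + 1))"
}
```

For example, increment_trailing_number something1 prints something2, and a name such as a1b22 becomes a1b23 because only the final run of digits is touched.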

I believe this can actually be simplified within sed by using buffers (sed can operate recursively), but I'm really rusty with that aspect of it.