Bash: what is an efficient way to replace a list of strings with another list in a file on Unix?

Question

Suppose I have two lists of strings (list A and list B), each with exactly N entries, and I want to replace every occurrence of the nth element of A with the nth element of B in a file on Unix (ideally using Bash scripting).

What's the most efficient way to do this?

An inefficient way would be to make N calls to "sed s/stringA/stringB/g".
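
For reference, the N-call baseline described above might look like the following minimal sketch. The function name replace_naive is hypothetical, and the sketch assumes the list entries contain no characters that are special to sed ('/', '&', backslash, or regular-expression metacharacters).

```shell
# Hypothetical helper: apply each A -> B pair with its own sed call,
# so the target file is rewritten once per pair (N passes in total).
# Assumes entries contain no '/', '&', '\' or regex metacharacters.
replace_naive() {
    local list_a=$1 list_b=$2 target=$3 a b tmp
    tmp=$(mktemp) || return 1
    # Read the two lists in lockstep on separate file descriptors.
    while IFS= read -r a <&3 && IFS= read -r b <&4; do
        sed "s/$a/$b/g" "$target" > "$tmp" && cat "$tmp" > "$target"
    done 3< "$list_a" 4< "$list_b"
    rm -f "$tmp"
}
```

Called as `replace_naive listA listB filename`, this reads and rewrites the whole file once for every entry in the lists, which is the cost the answers below try to avoid.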

Answer 1

Accepted answer by glenn jackman

This will do it in one pass. It reads listA and listB into awk arrays; then, for each line of the input, it examines each word, and if the word is found in listA, it is replaced by the corresponding word in listB.

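awk '
    # First file: map each string in listA to its line number.
    FILENAME == ARGV[1] { listA[$0] = FNR; next }
    # Second file: store each replacement under its line number.
    FILENAME == ARGV[2] { listB[FNR] = $0; next }
    # Remaining input: swap any field found in listA for its counterpart.
    {
        for (i = 1; i <= NF; i++) {
            if ($i in listA) {
                $i = listB[listA[$i]]
            }
        }
        print
    }
' listA listB filename > filename.new
mv filename.new filename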

I'm assuming the strings in listA do not contain whitespace (awk's default field separator).
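
As a small illustration of the one-pass, whole-word behaviour, here is a sketch that wraps the same awk technique in a function. The function name replace_words and the array names idx and repl are hypothetical, and the sketch keeps the assumption that entries contain no whitespace.

```shell
# Hypothetical wrapper around the one-pass awk approach.
# Usage: replace_words listA listB inputfile   (result goes to stdout)
replace_words() {
    awk '
        FILENAME == ARGV[1] { idx[$0] = FNR; next }    # string -> line number
        FILENAME == ARGV[2] { repl[FNR] = $0; next }   # line number -> replacement
        {
            for (i = 1; i <= NF; i++)
                if ($i in idx) $i = repl[idx[$i]]
            print
        }
    ' "$1" "$2" "$3"
}
```

With listA containing the lines foo and bar, and listB containing bar and foo, the input line "foo bar foobar" comes out as "bar foo foobar": each field is looked up once, so the two words are swapped rather than chained, and foobar is left alone because only whole fields are compared. Note that assigning to a field makes awk rebuild that line with single spaces between fields.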

Answer 2

Answer by Jonathan Leffler

Make one call to sed that writes the sed script, and another to use it? If your lists are in files listA and listB, then:

paste -d : listA listB | sed 's/\([^:]*\):\([^:]*\)/s%\1%\2%g/' > sed.script
sed -f sed.script files.to.be.mapped.*

I'm making some sweeping assumptions about 'words' not containing either colon or percent symbols, but you can adapt around that. Some versions of sed have upper bounds on the number of commands that can be specified; if that's a problem because your word lists are big enough, then you may have to split the generated sed script into separate files which are applied in turn, or change to use something without the limit (Perl, for example).
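
One way to adapt around the colon and percent assumption is to escape each entry before writing it into the generated script. The following is only a sketch: the function name make_sed_script is hypothetical, it treats every entry as a literal string, and it assumes the entries are single lines.

```shell
# Hypothetical helper: print one "s/A/B/g" command per pair of lines,
# escaping characters that sed would otherwise treat specially.
# Usage: make_sed_script listA listB > sed.script
make_sed_script() {
    local a b
    while IFS= read -r a <&3 && IFS= read -r b <&4; do
        # Pattern side: escape basic-regex metacharacters and the '/' delimiter.
        a=$(printf '%s\n' "$a" | sed 's|[][\.*^$/]|\\&|g')
        # Replacement side: escape backslash, '&' and the '/' delimiter.
        b=$(printf '%s\n' "$b" | sed 's|[\&/]|\\&|g')
        printf 's/%s/%s/g\n' "$a" "$b"
    done 3< "$1" 4< "$2"
}
```

The output can be saved to a file and used with `sed -f` exactly as above. It runs two small sed processes per entry while building the script, which is acceptable for modest lists but is not the fastest way to generate a very large script.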

Another item to be aware of is sequence of changes. If you want to swap two words, you need to craft your word lists carefully. In general, if you map (1) wordA to wordB and (2) wordB to wordC, it matters whether the sed script does mapping (1) before or after mapping (2).
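
The effect of ordering is easy to see on a one-line input. In this sketch the two function names are hypothetical; each applies the same two mappings in a different order to text read from standard input.

```shell
# Hypothetical demo functions; both read text on stdin.
map_1_then_2() { sed -e 's/wordA/wordB/g' -e 's/wordB/wordC/g'; }
map_2_then_1() { sed -e 's/wordB/wordC/g' -e 's/wordA/wordB/g'; }

printf 'wordA wordB\n' | map_1_then_2   # prints: wordC wordC
printf 'wordA wordB\n' | map_2_then_1   # prints: wordB wordC
```

In the first order, the wordB produced by mapping (1) is picked up again by mapping (2); in the second order it is not.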

The script shown is not careful about word boundaries; you can make it careful about them in various ways, depending on the version of sed you are using and your criteria for what constitutes a word.
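
As one possible variant, assuming GNU sed (whose `\<` and `\>` match the start and end of a word), the generator can wrap each pattern in word-boundary anchors. The function name make_word_script is hypothetical, and the same colon and percent assumptions apply.

```shell
# Hypothetical helper: emit "s%\<A\>%B%g" for each pair of lines.
# \< and \> are GNU sed extensions; other sed versions may not support them.
# Usage: make_word_script listA listB > sed.script
make_word_script() {
    paste -d : "$1" "$2" | sed 's/\([^:]*\):\([^:]*\)/s%\\<\1\\>%\2%g/'
}
```

With cat in listA and dog in listB, the generated command changes "cat" but leaves "concat" and "cats" untouched.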

Answer 3

Answer by AXE Labs

I needed to do something similar, and I wound up generating sed commands based on a map file:

$ cat file.map
abc => 123
def => 456
ghi => 789

$ cat stuff.txt
abc jdy kdt
kdb def gbk
qng pbf ghi
non non non
try one abc

$ sed $(awk '{print "-e s/"$1"/"$3"/"}' file.map) stuff.txt
123 jdy kdt
kdb 456 gbk
qng pbf 789
non non non
try one 123

Make sure your shell supports as many parameters to sed as you have in your map.
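
If the map is large enough for the argument count to become a concern, one alternative is to write the generated commands to a script file and pass it with `sed -f`. This is a sketch only: the function name apply_map is hypothetical, and it assumes the same "from => to" layout as file.map with no whitespace or sed-special characters in either string.

```shell
# Hypothetical helper: apply a "from => to" map file to a target file via sed -f.
# Usage: apply_map file.map stuff.txt   (result goes to stdout)
apply_map() {
    local map=$1 target=$2 script
    script=$(mktemp) || return 1
    # Fields 1 and 3 are the strings on either side of "=>".
    awk '{ print "s/" $1 "/" $3 "/g" }' "$map" > "$script"
    sed -f "$script" "$target"
    rm -f "$script"
}
```

Run on the file.map and stuff.txt shown above, it produces the same five output lines.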

Answer 4

Answer by glenn jackman

This is fairly straightforward with Tcl:

set fA [open listA r]
set fB [open listB r]
set fin [open input.file r]
set fout [open output.file w]

# read listA and listB and create the mapping of corresponding lines
while {[gets $fA strA] != -1} {
    set strB [gets $fB]
    lappend map $strA $strB
}

# apply the mapping to the input file
puts $fout [string map $map [read $fin]]

# if the file is large, do it line by line instead
#while {[gets $fin line] != -1} {
#    puts $fout [string map $map $line]
#}

close $fA
close $fB
close $fin
close $fout

# -force is needed because input.file already exists
file rename -force output.file input.file

Answer 5

Answer by ghostdog74

You can do this in bash. Get your lists into arrays.

listA=(a b c)
listB=(d e f)
data=$(<file)
echo "${data//${listA[2]}/${listB[2]}}" #change the 3rd element. Redirect to file where necessary
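
To cover every pair rather than a single element, the same parameter expansion can be applied in a loop. The following is a minimal sketch: the function name replace_all_pairs is hypothetical, and it relies on the global arrays listA and listB being set as above.

```shell
# Hypothetical helper: apply every listA[i] -> listB[i] pair to a file's contents.
# Relies on the global arrays listA and listB.
# Usage: replace_all_pairs file   (result goes to stdout)
replace_all_pairs() {
    local data i
    data=$(<"$1")
    for i in "${!listA[@]}"; do
        # Quoting the pattern makes bash match it literally rather than as a glob.
        data=${data//"${listA[i]}"/"${listB[i]}"}
    done
    printf '%s\n' "$data"
}
```

The pairs are applied one after another to the whole text, so a replacement produced by an earlier pair can be matched again by a later one.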

Answer 6

Answer by Fritz G. Mehner

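Use tr(1) (translate or delete characters). Note that tr maps individual characters, so this only applies when every entry in both lists is a single character: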

tr 'abc' 'XYZ' < file > file_new
mv file_new file
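
A one-line check shows how tr treats its arguments: as sets of characters matched by position, not as whole strings.

```shell
# Each character of the first set maps to the character at the same position in the second.
printf 'cab\n' | tr 'abc' 'XYZ'   # prints: ZXY
```

The input "cab" is not the string "abc", yet every character is still translated.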
