Python equivalent of sed

Question

Asked by user1601716

Is there a way, without a double loop, to accomplish what the following sed command does?

Input:

Time
Banana
spinach
turkey

sed -i "/Banana/ s/$/Toothpaste/" file

Output:

Time
BananaToothpaste
spinach
turkey

What I have so far is a pair of lists, and looping through both of them would take a long time.

List A has a bunch of numbers; list B has the same bunch of numbers, but in a different order.

For each entry in A, I want to find the line in B with that same number and add value C to the end of it.

Hope this makes sense, even if my example doesn't.

I was doing the following in Bash, and it worked, but it was super slow:

for line in $(cat DATSRCLN.txt.utf8); do
        srch=$(echo $line | awk -F'^' '{print $1}');
        rep=$(echo $line | awk -F'^' '{print $2}');
        sed -i "/$(echo $srch)/ s/$/^$(echo $rep)/" tmp.1;
done

Thanks!

Answer 1

Answered by heltonbiker

Using re.sub():

newstring = re.sub('(Banana)', r'\1Toothpaste', oldstring)

This captures one group (between the first pair of parentheses) and replaces it with itself (the \number part, here \1) followed by the desired suffix. The replacement must be a raw string (r'') so that the backslash escape is interpreted correctly.
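
To mimic the line-oriented behaviour of the sed command in the question on a multi-line string, one option is the re.MULTILINE flag. This is only a sketch that goes beyond the answer above; the sample text and variable names are made up for illustration.

```python
import re

# Sample input from the question, held in one string.
oldstring = "Time\nBanana\nspinach\nturkey\n"

# With re.MULTILINE, ^ and $ match at every line boundary, and "." never
# crosses a newline, so the group captures exactly one line containing
# "Banana". \1 puts that line back, followed by the suffix.
newstring = re.sub(r'^(.*Banana.*)$', r'\1Toothpaste', oldstring,
                   flags=re.MULTILINE)

print(newstring, end="")
```

Unlike the one-liner above, this appends the suffix at the end of the matching line rather than directly after the word, which is what s/$/Toothpaste/ does.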

Answer 2

Answered by Vyktor

It's possible to do this using a temporary file, with low system requirements and only one pass, without loading the whole file into memory:

#!/usr/bin/python
import os
import shutil
import tempfile

oldfile = 'stack.txt'
# mkstemp() creates a temporary *file* and returns (descriptor, path);
# mkdtemp() would create a directory, which cannot be opened for writing.
fd, newfile = tempfile.mkstemp()

with open(oldfile) as f, os.fdopen(fd, 'w') as n:
    for line in f:
        if 'Banana' not in line:
            n.write(line)
            continue

        # The last row may have no trailing newline
        if line.endswith('\n'):
            line = line.rstrip('\n') + 'Toothpaste\n'
        else:
            line += 'Toothpaste'

        n.write(line)

os.remove(oldfile)
shutil.move(newfile, oldfile)
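
The same single-pass idea can be stretched to the asker's real problem, where many search/append pairs must be applied to one file. The following is a rough sketch rather than part of the original answer: the function name append_by_key and the sample pairs are hypothetical, and it assumes each line should receive at most one appended value (the Bash loop in the question would append once per matching key).

```python
import re


def append_by_key(lines, pairs, sep="^"):
    """Append sep + value to each line that contains one of the keys.

    lines: iterable of strings without trailing newlines
    pairs: dict mapping a literal search key to the value to append
    """
    if not pairs:
        return list(lines)
    # Longest keys first, so that "123" wins over "12" at the same position.
    keys = sorted(pairs, key=len, reverse=True)
    # One combined pattern replaces the inner loop over all keys.
    matcher = re.compile("|".join(re.escape(k) for k in keys))
    out = []
    for line in lines:
        m = matcher.search(line)
        out.append(line + sep + pairs[m.group(0)] if m else line)
    return out


pairs = {"Banana": "Toothpaste", "turkey": "gravy"}
print(append_by_key(["Time", "Banana", "spinach", "turkey"], pairs))
```

Each line is scanned once against a single compiled pattern, and the dictionary lookup on the matched key takes the place of the second loop.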

Answer 3

Answered by M. Adel

If you are using Python3 the following module will help you: https://github.com/mahmoudadel2/pysed

wget https://raw.githubusercontent.com/mahmoudadel2/pysed/master/pysed.py

Place the module file into your Python3 modules path, then:

import pysed
pysed.replace(<Old string>, <Replacement String>, <Text File>)
pysed.rmlinematch(<Unwanted string>, <Text File>)
pysed.rmlinenumber(<Unwanted Line Number>, <Text File>)

Answer 4

Answered by shrewmouse

You can actually call sed from Python. There are many ways to do this, but I like to use the sh module (yum -y install python-sh).

The output of my example program is as follows.

[me@localhost sh]$ cat input 
Time
Banana
spinich
turkey
[me@localhost sh]$ python test_sh.py 
[me@localhost sh]$ cat input 
Time
Toothpaste
spinich
turkey
[me@localhost sh]$

Here is test_sh.py:

import sh

sh.sed('-i', 's/Banana/Toothpaste/', 'input')

This will probably only work under Linux.
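
Another of the many ways to call sed is the standard library's subprocess module, which needs no extra package. A minimal sketch, assuming a sed binary is available on PATH; the helper name sed_filter is made up here. Text is passed through standard input instead of using -i, since the -i option takes different arguments in GNU and BSD sed.

```python
import subprocess


def sed_filter(script, text):
    """Run `sed script` with text on stdin and return what sed prints."""
    result = subprocess.run(
        ["sed", script],      # argument list, so no shell quoting is involved
        input=text,
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError if sed fails
    )
    return result.stdout


# The command from the question: append to every line matching /Banana/.
print(sed_filter("/Banana/ s/$/Toothpaste/", "Time\nBanana\nspinach\nturkey\n"), end="")
```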

Answer 5

Answered by Oz123

A latecomer to the race; here is my implementation of sed in Python:

import os
import re
import shutil
from tempfile import mkstemp


def sed(pattern, replace, source, dest=None, count=0):
    """Reads a source file and writes the destination file.

    In each line, replaces pattern with replace.

    Args:
        pattern (str): pattern to match (can be a compiled regex)
        replace (str): replacement str
        source  (str): input filename
        dest (str):   destination filename; if not given, source will be overwritten.
        count (int): maximum number of lines to change; 0 means no limit
    """
    if dest:
        fout = open(dest, 'w')
    else:
        fd, name = mkstemp()
        fout = os.fdopen(fd, 'w')  # reuse the descriptor so it is not leaked

    num_replaced = 0  # lines changed so far
    with open(source, 'r') as fin, fout:
        for line in fin:
            out = re.sub(pattern, replace, line)
            fout.write(out)

            if out != line:
                num_replaced += 1
            if count and num_replaced >= count:
                break
        # Copy whatever is left after the limit was reached, unchanged.
        fout.writelines(fin)

    if not dest:
        shutil.move(name, source)

Examples:

sed('foo', 'bar', "foo.txt")

will replace all 'foo' with 'bar' in foo.txt.

sed('foo', 'bar', "foo.txt", "foo.updated.txt")

will replace all 'foo' with 'bar' in 'foo.txt' and save the result in 'foo.updated.txt'.

sed('foo', 'bar', "foo.txt", count=1)

will replace 'foo' with 'bar' only on the first line that contains it and save the result in the original file 'foo.txt'.
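
For comparison, the standard library's fileinput module has an in-place mode that handles the temporary copy itself. This is a minimal sketch and not part of the answer above; the function name sed_inplace and the throwaway demo file are assumptions made for illustration.

```python
import fileinput
import os
import re
import tempfile


def sed_inplace(pattern, replace, path):
    """Rewrite path in place, applying re.sub to every line."""
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            # With inplace=True, standard output is redirected into the file.
            print(re.sub(pattern, replace, line), end="")


# Demonstration on a throwaway file.
fd, demo_path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as handle:
    handle.write("Time\nBanana\nspinach\nturkey\n")

sed_inplace(r"(Banana)", r"\1Toothpaste", demo_path)

with open(demo_path) as handle:
    result = handle.read()
os.remove(demo_path)
print(result, end="")
```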

Answer 6

Answered by leafonsword

massedit

You can use it as a command-line tool:

# Will change all test*.py in subdirectories of tests.
massedit.py -e "re.sub('failIf', 'assertFalse', line)" -s tests test*.py

You can also use it as a library:

import massedit
filenames = ['massedit.py']
massedit.edit_files(filenames, ["re.sub('Jerome', 'J.', line)"])

Answer 7

Answered by Brad Parks

I found the answer supplied by Oz123 to be great, but it didn't seem to work 100%. I'm new to Python, but I modified it and wrapped it up to run as a script from bash. This works on OS X, using Python 2.7.

# Replace 1 occurrence in file /tmp/1
$ search_replace "Banana" "BananaToothpaste" /tmp/1

# Replace 5 occurrences and save in /tmp/2
$ search_replace "Banana" "BananaToothpaste" /tmp/1 /tmp/2 5

search_replace

#!/usr/bin/env python
import os
import re
import shutil
import sys
from tempfile import mkstemp

total = len(sys.argv) - 1
if total < 3:
    print("Usage: SEARCH_FOR REPLACE_WITH IN_FILE {OUT_FILE} {COUNT}")
    print("by default, the input file is replaced")
    print("and the number of times to replace is 1")
    sys.exit(1)

# Parsing args one by one
search_for = sys.argv[1]
replace_with = sys.argv[2]
file_name = sys.argv[3]
# Without OUT_FILE, the input file itself is replaced
file_name_dest = sys.argv[4] if total >= 4 else file_name
count = int(sys.argv[5]) if total >= 5 else 1


def sed(pattern, replace, source, dest=None, count=0):
    """Reads a source file and writes the destination file.

    In each line, replaces pattern with replace.

    Args:
        pattern (str): pattern to match (can be a compiled regex)
        replace (str): replacement str
        source  (str): input filename
        dest (str):   destination filename; if not given, source will be overwritten.
        count (int): maximum number of lines to change; 0 means no limit
    """
    num_replaced = 0
    fd, name = mkstemp()

    with open(source, 'r') as fin, os.fdopen(fd, 'w') as fout:
        for line in fin:
            if not count or num_replaced < count:
                out = re.sub(pattern, replace, line)
                if out != line:
                    num_replaced += 1
            else:
                out = line
            fout.write(out)

    # Use the function's own arguments rather than the module-level globals
    shutil.move(name, dest or source)


sed(search_for, replace_with, file_name, file_name_dest, count)
