Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/12714415/

Time: 2020-08-18 11:40:32  Source: igfitidea

Python equivalent to sed

python

Asked by user1601716

Is there a way, without a double loop, to accomplish what the following sed command does?

Input:

Time
Banana
spinach
turkey

sed -i "/Banana/ s/$/Toothpaste/" file

Output:

Time
BananaToothpaste
spinach
turkey

What I have so far uses two lists, and looping over both of them would take a long time.

List A has a bunch of numbers; list B has the same bunch of numbers, but in a different order.

For each entry in A, I want to find the line in B with that same number and add value C to the end of it.

Hope this makes sense, even if my example doesn't.

I was doing the following in Bash, and it worked, but it was super slow...

for line in $(cat DATSRCLN.txt.utf8); do
        srch=$(echo $line | awk -F'^' '{print $1}');
        rep=$(echo $line | awk -F'^' '{print $2}');
        sed -i "/$(echo $srch)/ s/$/^$(echo $rep)/" tmp.1;
done

Thanks!

Answered by heltonbiker

Using re.sub():

newstring = re.sub('(Banana)', r'\1Toothpaste', oldstring)

This captures one group (the text matched inside the first pair of parentheses) and replaces it with itself (the \1 backreference) followed by the desired suffix. The replacement must be a raw string (r'...') so that the backslash escape is interpreted correctly.
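To show how this relates to the original sed command, which appends at the end of every matching line rather than right after the match, here is a small sketch that works on a multi-line string. The function name `append_suffix` and the sample text are made up for this example. It assumes the search text is a literal string, not a regex, so it is passed through `re.escape`. `re.MULTILINE` makes `^` and `$` match at every line boundary.

```python
import re


def append_suffix(text, needle, suffix):
    """Append suffix to every line of text that contains needle.

    Roughly what `sed "/needle/ s/$/suffix/"` does, assuming needle is
    a literal string (it is escaped before being used in the regex).
    """
    # ^(.*needle.*)$ captures each whole matching line as group 1
    pattern = re.compile(r'^(.*' + re.escape(needle) + r'.*)$', re.MULTILINE)
    # A function replacement sidesteps backslash handling in suffix
    return pattern.sub(lambda m: m.group(1) + suffix, text)


text = "Time\nBanana\nspinach\nturkey\n"
print(append_suffix(text, "Banana", "Toothpaste"))
```

Unlike the one-liner above, the suffix lands at the end of the line even when "Banana" is in the middle of it.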

Answered by Vyktor

This can be done with a temporary file, with low system requirements and in a single pass, without reading the whole file into memory:

#!/usr/bin/python
import os
import shutil
import tempfile

oldfile = 'stack.txt'

# mkstemp() creates a temporary *file*; mkdtemp() would create a
# directory, which cannot be opened for writing
fd, newfile = tempfile.mkstemp()

with open(oldfile) as f, os.fdopen(fd, 'w') as n:
    for line in f:
        if 'Banana' not in line:
            n.write(line)
            continue

        # The last row may have no trailing newline
        if line.endswith('\n'):
            n.write(line.rstrip('\n') + 'ToothPaste\n')
        else:
            n.write(line + 'ToothPaste')

# Replace the original file with the edited copy
shutil.move(newfile, oldfile)
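The same single-pass idea can be stretched to the question's real task (many search/append pairs), which is where the Bash loop became slow. This is only a sketch under assumptions the question does not spell out: each line of the mapping file looks like `key^value`, and each line of the target file starts with its key followed by `^`. The function names `load_pairs` and `append_values` and the sample data are invented for the example. Loading the pairs into a dict makes each lookup constant-time, so neither input is scanned more than once.

```python
def load_pairs(lines, sep='^'):
    """Build {key: value} from lines shaped like 'key^value' (assumed format)."""
    pairs = {}
    for line in lines:
        key, found, value = line.rstrip('\n').partition(sep)
        if found:  # skip lines that have no separator
            pairs[key] = value
    return pairs


def append_values(lines, pairs, sep='^'):
    """Yield each line, with sep + value appended when its first field is a known key."""
    for line in lines:
        body = line.rstrip('\n')
        newline = line[len(body):]  # '' or '\n', so line endings are kept
        key = body.partition(sep)[0]
        if key in pairs:
            body += sep + pairs[key]
        yield body + newline


mapping = ["1001^red\n", "1002^blue\n"]
target = ["1002^x\n", "1003^y\n", "1001^z"]
print(''.join(append_values(target, load_pairs(mapping))))
```

With real files, the `lines` arguments can simply be open file objects, and the generator's output can be written to a temporary file before replacing the original.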

Answered by M. Adel

If you are using Python 3, the following module will help you: https://github.com/mahmoudadel2/pysed

wget https://raw.githubusercontent.com/mahmoudadel2/pysed/master/pysed.py

Place the module file in your Python 3 module path, then:

import pysed
pysed.replace(<Old string>, <Replacement String>, <Text File>)
pysed.rmlinematch(<Unwanted string>, <Text File>)
pysed.rmlinenumber(<Unwanted Line Number>, <Text File>)

Answered by shrewmouse

You can actually call sed from Python. There are many ways to do this, but I like to use the sh module (yum -y install python-sh).

The output of my example program is as follows.

[me@localhost sh]$ cat input 
Time
Banana
spinich
turkey
[me@localhost sh]$ python test_sh.py 
[me@localhost sh]$ cat input 
Time
Toothpaste
spinich
turkey
[me@localhost sh]$ 

Here is test_sh.py:

import sh

sh.sed('-i', 's/Banana/Toothpaste/', 'input')

This will probably only work under Linux.
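If the sh module is not installed, the standard library's subprocess module is one of those other ways. A minimal sketch, assuming Python 3.5+ and GNU sed on the PATH (BSD/macOS sed expects an extra argument after -i). The helper names `sed_argv` and `run_sed` are made up for this example.

```python
import subprocess


def sed_argv(expression, path):
    """Build the argument list for an in-place sed edit (GNU sed syntax assumed)."""
    # Passing a list instead of a shell string avoids shell quoting problems
    return ['sed', '-i', expression, path]


def run_sed(expression, path):
    """Run sed in place on path; raises CalledProcessError if sed fails."""
    subprocess.run(sed_argv(expression, path), check=True)


if __name__ == '__main__':
    # The command from the question: append to lines containing Banana
    print(sed_argv('/Banana/ s/$/Toothpaste/', 'input'))
```

Calling `run_sed('/Banana/ s/$/Toothpaste/', 'input')` would then perform the edit from the question on the file `input`.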

Answered by Oz123

A latecomer to the race; here is my implementation of sed in Python:

import os
import re
import shutil
from tempfile import mkstemp


def sed(pattern, replace, source, dest=None, count=0):
    """Reads a source file and writes the destination file.

    In each line, replaces pattern with replace.

    Args:
        pattern (str): pattern to match (can be re.pattern)
        replace (str): replacement str
        source  (str): input filename
        count (int): stop replacing after this many lines have changed; 0 means no limit
        dest (str):   destination filename, if not given, source will be over written.
    """
    # Start counting from zero (starting from count stopped after one line)
    num_replaced = 0

    if dest:
        fout = open(dest, 'w')
    else:
        # Write to a temporary file, then move it over the source
        fd, name = mkstemp()
        fout = os.fdopen(fd, 'w')

    with open(source, 'r') as fin, fout:
        for line in fin:
            out = re.sub(pattern, replace, line)
            fout.write(out)

            if out != line:
                num_replaced += 1
            if count and num_replaced >= count:
                break
        # Copy any remaining lines unchanged
        fout.writelines(fin)

    if not dest:
        shutil.move(name, source)

Examples:

sed('foo', 'bar', "foo.txt") 

will replace all 'foo' with 'bar' in foo.txt.

sed('foo', 'bar', "foo.txt", "foo.updated.txt")

will replace all 'foo' with 'bar' in 'foo.txt' and save the result in 'foo.updated.txt'.

sed('foo', 'bar', "foo.txt", count=1)

will replace 'foo' with 'bar' only on the first line where it occurs, and save the result in the original file 'foo.txt'.
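For comparison, the standard library's fileinput module takes care of the write-elsewhere-then-replace step itself when opened with inplace=True: while the file is being iterated, anything printed ends up in the file. This is a hedged sketch, not a drop-in replacement for the function above. `sed_inplace` is a name invented for this example, it has no `count` or `dest` options, and it assumes Python 3.

```python
import fileinput
import re


def sed_inplace(pattern, replace, path):
    """Rewrite path in place, applying re.sub to every line."""
    regex = re.compile(pattern)
    # With inplace=True, standard output is redirected into the file
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            # line still ends with its newline, so suppress print's own
            print(regex.sub(replace, line), end='')
```

A call such as `sed_inplace('foo', 'bar', 'foo.txt')` would correspond to the first example above.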

Answered by leafonsword

massedit

You can use it as a command-line tool:

# Will change all test*.py in subdirectories of tests.
massedit.py -e "re.sub('failIf', 'assertFalse', line)" -s tests test*.py

You can also use it as a library:

import massedit
filenames = ['massedit.py']
massedit.edit_files(filenames, ["re.sub('Jerome', 'J.', line)"])

Answered by Brad Parks

I found the answer supplied by Oz123 to be great, but it didn't seem to work 100%. I'm new to Python, but I modified it and wrapped it up so it can be run from a bash script. This works on OS X, using Python 2.7.

# Replace 1 occurrence in file /tmp/1
$ search_replace "Banana" "BananaToothpaste" /tmp/1

# Replace 5 occurrences and save in /tmp/2
$ search_replace "Banana" "BananaToothpaste" /tmp/1 /tmp/2 5

search_replace

#!/usr/bin/env python
import os
import re
import shutil
import sys
from tempfile import mkstemp


def sed(pattern, replace, source, dest=None, count=0):
    """Reads a source file and writes the destination file.

    In each line, replaces pattern with replace.

    Args:
        pattern (str): pattern to match (can be re.pattern)
        replace (str): replacement str
        source  (str): input filename
        dest (str):   destination filename, if not given, source will be over written.
        count (int): number of lines to change; 0 means no limit
    """
    num_replaced = 0

    # Write to a temporary file first, then move it into place
    fd, name = mkstemp()
    with open(source, 'r') as fin, os.fdopen(fd, 'w') as fout:
        for line in fin:
            if not count or num_replaced < count:
                out = re.sub(pattern, replace, line)
                if out != line:
                    num_replaced += 1
            else:
                # Limit reached: copy the line unchanged
                out = line
            fout.write(out)

    # Use the function's own arguments rather than the script globals
    shutil.move(name, dest or source)


total = len(sys.argv) - 1
if total < 3:
    print("Usage: SEARCH_FOR REPLACE_WITH IN_FILE {OUT_FILE} {COUNT}")
    print("by default, the input file is replaced")
    print("and the number of times to replace is 1")
    sys.exit(1)

# Parsing args one by one
search_for = sys.argv[1]
replace_with = sys.argv[2]
file_name = sys.argv[3]
file_name_dest = sys.argv[4] if total >= 4 else file_name
count = int(sys.argv[5]) if total >= 5 else 1

sed(search_for, replace_with, file_name, file_name_dest, count)