Linux: How to do sed-like text replacement with Python?

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4427542/

Date: 2020-08-03 23:59:07  Source: igfitidea

How to do sed like text replace with python?

Tags: python, regex, linux

Asked by Maxim Veksler

I would like to enable all apt repositories in this file

cat /etc/apt/sources.list
## Note, this file is written by cloud-init on first boot of an instance                                                                                                            
## modifications made here will not survive a re-bundle.                                                                                                                            
## if you wish to make changes you can:                                                                                                                                             
## a.) add 'apt_preserve_sources_list: true' to /etc/cloud/cloud.cfg                                                                                                                
##     or do the same in user-data
## b.) add sources in /etc/apt/sources.list.d                                                                                                                                       
#                                                                                                                                                                                   

# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to                                                                                                           
# newer versions of the distribution.                                                                                                                                               
deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick main                                                                                                                   
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick main                                                                                                               

## Major bug fix updates produced after the final release of the                                                                                                                    
## distribution.                                                                                                                                                                    
deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates main                                                                                                           
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates main                                                                                                       

## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu                                                                                                         
## team. Also, please note that software in universe WILL NOT receive any                                                                                                           
## review or updates from the Ubuntu security team.                                                                                                                                 
deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick universe                                                                                                               
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick universe                                                                                                           
deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates universe
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates universe

## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu 
## team, and may not be under a free licence. Please satisfy yourself as to
## your rights to use the software. Also, please note that software in 
## multiverse WILL NOT receive any review or updates from the Ubuntu
## security team.
# deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick multiverse
# deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick multiverse
# deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates multiverse
# deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates multiverse

## Uncomment the following two lines to add software from the 'backports'
## repository.
## N.B. software from this repository may not have been tested as
## extensively as that contained in the main release, although it includes
## newer versions of some applications which may provide useful features.
## Also, please note that software in backports WILL NOT receive any review
## or updates from the Ubuntu security team.
# deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-backports main restricted universe multiverse
# deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-backports main restricted universe multiverse

## Uncomment the following two lines to add software from Canonical's
## 'partner' repository.
## This software is not part of Ubuntu, but is offered by Canonical and the
## respective vendors as a service to Ubuntu users.
# deb http://archive.canonical.com/ubuntu maverick partner
# deb-src http://archive.canonical.com/ubuntu maverick partner

deb http://security.ubuntu.com/ubuntu maverick-security main
deb-src http://security.ubuntu.com/ubuntu maverick-security main
deb http://security.ubuntu.com/ubuntu maverick-security universe
deb-src http://security.ubuntu.com/ubuntu maverick-security universe
# deb http://security.ubuntu.com/ubuntu maverick-security multiverse
# deb-src http://security.ubuntu.com/ubuntu maverick-security multiverse

With sed this is a simple sed -i 's/^# deb/deb/' /etc/apt/sources.list. What's the most elegant ("pythonic") way to do this in Python?

Accepted answer by elmotec

massedit.py (http://github.com/elmotec/massedit) does the scaffolding for you leaving just the regex to write. It's still in beta but we are looking for feedback.

python -m massedit -e "re.sub(r'^# deb', 'deb', line)" /etc/apt/sources.list

This shows the differences (before/after) in diff format without modifying the file.

Add the -w option to write the changes to the original file:

python -m massedit -e "re.sub(r'^# deb', 'deb', line)" -w /etc/apt/sources.list

Alternatively, you can now use the API:

>>> import massedit
>>> filenames = ['/etc/apt/sources.list']
>>> massedit.edit_files(filenames, ["re.sub(r'^# deb', 'deb', line)"], dry_run=True)

Answer by plundra

Not sure about elegant, but this ought to be pretty readable at least. For a sources.list it's fine to read all the lines beforehand; for something larger you might want to change the file "in place" while looping through it.

#!/usr/bin/env python
# Open file for reading and writing
with open("sources.list", "r+") as sources_file:
    # Read all the lines
    lines = sources_file.readlines()

    # Rewind and truncate
    sources_file.seek(0)
    sources_file.truncate()

    # Loop through the lines, adding them back to the file.
    for line in lines:
        if line.startswith("# deb"):
            sources_file.write(line[2:])
        else:
            sources_file.write(line)

EDIT: Use a with statement for better file handling. I had also forgotten to rewind before truncating.
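
For larger files, one way to edit "in place" while looping is the standard library's fileinput module. The following is only a minimal sketch, not part of the original answer; the function name uncomment_deb_lines is made up for illustration.

```python
import fileinput
import re


def uncomment_deb_lines(path):
    """Strip the leading '# ' from lines starting with '# deb', editing in place.

    Hypothetical helper for illustration; it is not from the original answer.
    """
    # With inplace=True, fileinput moves the original aside and redirects
    # standard output into the file, so print() writes each line back.
    with fileinput.input(files=[path], inplace=True) as stream:
        for line in stream:
            # Each line keeps its own newline, so suppress print's newline.
            print(re.sub(r'^# deb', 'deb', line), end='')
```

Calling uncomment_deb_lines("sources.list") rewrites the file one line at a time, so the whole file never has to be held in memory.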

Answer by David Miller

You can do that like this:

import re

# Read everything first, then reopen in "w" mode (which truncates) and write back.
with open("/etc/apt/sources.list", "r") as sources:
    lines = sources.readlines()
with open("/etc/apt/sources.list", "w") as sources:
    for line in lines:
        sources.write(re.sub(r'^# deb', 'deb', line))

The with statement ensures that the file is closed correctly, and re-opening the file in "w" mode empties the file before you write to it. re.sub(pattern, replace, string) is the equivalent of s/pattern/replace/ in sed/perl.
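
One detail worth illustrating: by default re.sub replaces every match in the string, which corresponds to sed's s/pattern/replace/g, while sed without the g flag replaces only the first match on each line. The sketch below is an illustration rather than part of the original answer; the helper name sed_s and the sample line are assumptions.

```python
import re


def sed_s(pattern, repl, text, global_flag=False):
    """Mimic sed's s/pattern/repl/ (first match) or s/pattern/repl/g (all)."""
    # count=0 means "replace all matches"; count=1 stops after the first.
    count = 0 if global_flag else 1
    return re.sub(pattern, repl, text, count=count)


# Made-up sample line containing the pattern twice.
line = "# deb http://example.invalid/ubuntu # deb-src\n"
first_only = sed_s(r'# deb', 'deb', line)
all_matches = sed_s(r'# deb', 'deb', line, global_flag=True)
```

For the anchored pattern ^# deb used in this question the distinction does not matter, because the pattern can match at most once per line.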

Edit: fixed syntax in the example.

Answer by barti_ddu

You could do something like:

import re

# re.MULTILINE makes ^ match at the start of every line, not just the string.
p = re.compile(r"^# *deb", re.MULTILINE)
with open("sources.list", "r") as f:
    text = f.read()
with open("sources.list", "w") as f:
    f.write(p.sub("deb", text))
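
The re.MULTILINE flag matters here because the whole file is read into a single string. As a small illustration (the sample text is made up and not from the original answer), compare the result with and without the flag:

```python
import re

# Made-up stand-in for the contents of a sources.list file.
text = "# deb one\nplain\n# deb two\n"

# Without the flag, ^ only matches at the very start of the string.
without_flag = re.sub(r"^# *deb", "deb", text)

# With re.MULTILINE, ^ also matches right after every newline.
with_flag = re.sub(r"^# *deb", "deb", text, flags=re.MULTILINE)
```

Here without_flag only uncomments the first line, while with_flag uncomments both commented lines.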

Alternatively (imho, this is better from an organizational standpoint) you could split your sources.list into pieces (one entry per repository) and place them under /etc/apt/sources.list.d/

Answer by plundra

This is such a different approach that I don't want to edit my other answer. The with statements are nested since I don't use Python 3.1 (where with A() as a, B() as b: works).

Might be a bit overkill to change sources.list, but I want to put it out there for future searches.

#!/usr/bin/env python
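from shutil import move
from tempfile import NamedTemporaryFile

# Text mode: NamedTemporaryFile defaults to binary ("w+b"), which rejects str on Python 3.
with NamedTemporaryFile(mode="w", delete=False) as tmp_sources: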
    with open("sources.list") as sources_file:
        for line in sources_file:
            if line.startswith("# deb"):
                tmp_sources.write(line[2:])
            else:
                tmp_sources.write(line)

move(tmp_sources.name, sources_file.name)

This should avoid race conditions with other processes reading the file while it is being rewritten. Oh, and I prefer str.startswith(...) when you can do without a regexp.
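
A variant of the same idea is sketched below as an illustration, not as the answer author's code. It assumes the temporary file should live in the same directory as the target, so that the final os.replace() is a rename within one filesystem, which is atomic on POSIX. The function name and the utf-8 default are assumptions.

```python
import os
import re
import tempfile


def replace_in_file_atomically(path, pattern, repl, encoding="utf-8"):
    """Rewrite `path` line by line, then swap the result in with os.replace().

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so both are on the same
    # filesystem; a move across filesystems would be a copy, not a rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp, \
                open(path, encoding=encoding) as src:
            for line in src:
                tmp.write(re.sub(pattern, repl, line))
        # Readers see either the old file or the new one, never a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        os.unlink(tmp_path)
        raise
```

Note that this sketch does not copy permissions or other metadata from the original file; the temporary file keeps the restrictive mode that mkstemp gives it.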

Answer by Matt McClure

Here's a one-module Python replacement for perl -p:

# Provide compatibility with `perl -p`

# Usage:
#
#     python -mloop_over_stdin_lines '<program>'

# In `<program>`, use the variable `line` to read and change the current line.

# Example:
#
#         python -mloop_over_stdin_lines 'line = re.sub("pattern", "replacement", line)'

# From the perlrun documentation:
#
#        -p   causes Perl to assume the following loop around your
#             program, which makes it iterate over filename arguments
#             somewhat like sed:
# 
#               LINE:
#                 while (<>) {
#                     ...             # your program goes here
#                 } continue {
#                     print or die "-p destination: $!\n";
#                 }
# 
#             If a file named by an argument cannot be opened for some
#             reason, Perl warns you about it, and moves on to the next
#             file. Note that the lines are printed automatically. An
#             error occurring during printing is treated as fatal. To
#             suppress printing use the -n switch. A -p overrides a -n
#             switch.
# 
#             "BEGIN" and "END" blocks may be used to capture control
#             before or after the implicit loop, just as in awk.
# 

import re
import sys

for line in sys.stdin:
    exec(sys.argv[1], globals(), locals())
    try:
        # Python 3 form of `print line,`: the line keeps its own newline.
        print(line, end='')
    except OSError as err:
        sys.exit('-p destination: %s' % err)
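
The module above relies on running at module level, where exec() can rebind the global name line. Inside a function on Python 3, assignments made by exec() do not reach the function's local variables, so a function-based variant needs an explicit namespace. The following is a rough sketch under that assumption; run_per_line is a hypothetical name, not part of the original module.

```python
import re


def run_per_line(program, lines):
    """Apply `program` to each line, perl -p style, and return the results.

    Hypothetical helper for illustration; `program` reads and rebinds `line`.
    """
    code = compile(program, '<program>', 'exec')
    results = []
    for line in lines:
        # Give the program an explicit namespace so the rebound `line`
        # can be read back after exec() returns.
        namespace = {'re': re, 'line': line}
        exec(code, namespace)
        results.append(namespace['line'])
    return results


uncommented = run_per_line(
    'line = re.sub(r"^# deb", "deb", line)',
    ["# deb one\n", "plain\n"],
)
```

As with the original module, the program string is executed as arbitrary Python code, so it should only ever come from a trusted source.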

Answer by M. Adel

If you are using Python3 the following module will help you: https://github.com/mahmoudadel2/pysed

wget https://raw.githubusercontent.com/mahmoudadel2/pysed/master/pysed.py

Place the module file into your Python3 modules path, then:

import pysed
pysed.replace(<Old string>, <Replacement String>, <Text File>)
pysed.rmlinematch(<Unwanted string>, <Text File>)
pysed.rmlinenumber(<Unwanted Line Number>, <Text File>)

Answer by dslackw

Try https://pypi.python.org/pypi/pysed

pysed -r '# deb' 'deb' /etc/apt/sources.list

Answer by Brad Jasperson

If you really want to use a sed command without installing a new Python module, you could simply do the following:

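import subprocess
# Pass the command as an argument list; a single command string would need shell=True.
subprocess.call(["sed", "-i", "s/^# deb/deb/", "/etc/apt/sources.list"])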

Answer by Cecil Curry

Authoring a homegrown sed replacement in pure Python with no external commands or additional dependencies is a noble task laden with noble landmines. Who would have thought?

Nonetheless, it is feasible. It's also desirable. We've all been there, people: "I need to munge some plaintext files, but I only have Python, two plastic shoelaces, and a moldy can of bunker-grade Maraschino cherries. Help."

In this answer, we offer a best-of-breed solution cobbling together the awesomeness of prior answers without all of that unpleasant not-awesomeness. As plundra notes, David Miller's otherwise top-notch answer writes the desired file non-atomically and hence invites race conditions (e.g., from other threads and/or processes attempting to concurrently read that file). That's bad. Plundra's otherwise excellent answer solves that issue while introducing yet more, including numerous fatal encoding errors, a critical security vulnerability (failing to preserve the permissions and other metadata of the original file), and premature optimization replacing regular expressions with low-level character indexing. That's also bad.

Awesomeness, unite!

import re, shutil, tempfile

def sed_inplace(filename, pattern, repl):
    '''
    Perform the pure-Python equivalent of in-place `sed` substitution: e.g.,
    `sed -i -e 's/'${pattern}'/'${repl}'/' "${filename}"`.
    '''
    # For efficiency, precompile the passed regular expression.
    pattern_compiled = re.compile(pattern)

    # For portability, NamedTemporaryFile() defaults to mode "w+b" (i.e., binary
    # writing with updating). This is usually a good thing. In this case,
    # however, binary writing imposes non-trivial encoding constraints trivially
    # resolved by switching to text writing. Let's do that.
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp_file:
        with open(filename) as src_file:
            for line in src_file:
                tmp_file.write(pattern_compiled.sub(repl, line))

    # Overwrite the original file with the munged temporary file in a
    # manner preserving file attributes (e.g., permissions).
    shutil.copystat(filename, tmp_file.name)
    shutil.move(tmp_file.name, filename)

# Do it for Johnny.
sed_inplace('/etc/apt/sources.list', r'^\# deb', 'deb')