Python search and replace with "whole word only" option

Question

Asked by Renan Cidale

I have a script that runs over my text and searches for and replaces all the sentences I have listed in a database.

The script:

with open('C:/Users/User/Desktop/Portuguesetranslator.txt') as f:
    for l in f:
        s = l.split('*')
        editor.replace(s[0],s[1])

And an example of the database:

Event*Evento*
result*resultado*

And so on...

Now I need a "whole word only" option in that script, because I'm running into problems.

For example, with Result and Event: once they have been replaced with Resultado and Evento and I run the script on the text a second time, the script replaces Resultado and Evento again.

After that second run the text ends up as Resultadoado and Eventoo.

Just so you know, this is not only about Event and Result; I have already set up more than 1000 sentences for the search and replace.

I don't need a simple search and replace for two words, because I'm going to be editing the database over and over for different sentences.
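
The doubled suffixes described in the question can be reproduced with plain substring replacement. The sketch below assumes the text is an ordinary Python string and uses `str.replace` as a stand-in for the `editor.replace` call in the script; the `pairs` list and the `run_once` helper are illustrative names, not part of the original script.

```python
# Illustrative stand-in for the database rows: (search term, replacement).
pairs = [("Event", "Evento"), ("Result", "Resultado")]


def run_once(text):
    """Apply every pair with plain substring replacement."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


first = run_once("Result and Event")
second = run_once(first)

print(first)   # Resultado and Evento
print(second)  # Resultadoado and Eventoo
```

The second run fails because "Result" is still a substring of "Resultado", so a substring search finds it again.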

Answer 1

Answered by DhruvPathak

Use re.sub instead of the normal string replace to replace only whole words. That way your script will not replace the already-replaced words, even if it runs again.

>>> import re
>>> editor = "This is result of the match"
>>> new_editor = re.sub(r"\bresult\b","resultado",editor)
>>> new_editor
'This is resultado of the match'
>>> newest_editor = re.sub(r"\bresult\b","resultado",new_editor)
>>> newest_editor
'This is resultado of the match'

Answer 2

Answered by kindall

You want a regular expression. You can use the token \b to match a word boundary: for example, \bresult\b matches only the exact word "result".

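import re

# editor is assumed to hold the text to translate as a string
with open('C:/Users/User/Desktop/Portuguesetranslator.txt') as f:
    for l in f:
        s = l.split('*')
        # re.escape keeps punctuation in the search term from being read as regex syntax
        editor = re.sub(r"\b%s\b" % re.escape(s[0]), s[1], editor)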
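
To see what the `\b` token does and does not match, the following sketch runs one whole-word pattern against a few variants of the word. The sample sentence is made up for illustration.

```python
import re

sample = "result, results, resultado and Result."

# \b matches between a word character (letter, digit, underscore)
# and a non-word character or the start/end of the string.
whole = re.findall(r"\bresult\b", sample)
print(whole)  # ['result']

# A substring search, for comparison, also hits 'results' and 'resultado'.
substring_hits = sample.count("result")
print(substring_hits)  # 3

# re.IGNORECASE additionally matches the capitalised 'Result'.
any_case = re.findall(r"\bresult\b", sample, flags=re.IGNORECASE)
print(any_case)  # ['result', 'Result']
```

Matching is case-sensitive by default, so a database that lists only `result` will not touch `Result` unless the flag is passed or both spellings are listed.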

Answer 3

Answered by Steven Rumbalski

Use re.sub:

import re

replacements = {'the': 'a',
                'this': 'that'}

def replace(match):
    # look up the replacement for whichever word was matched
    return replacements[match.group(0)]

# notice that the 'this' in 'thistle' is not matched
print(re.sub('|'.join(r'\b%s\b' % re.escape(s) for s in replacements),
             replace, 'the cat has this thistle.'))

Prints:

a cat has that thistle.

Notes:

All the strings to be replaced are joined into a single pattern, so the text needs to be scanned just once.
The source strings are passed to re.escape so that they are not interpreted as regular expressions.
The words are surrounded by r'\b' to make sure matches are for whole words only.
A replacement function is used so that any match can be mapped to its own replacement.
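
The notes above can be combined with the database layout from the question. This is a minimal sketch, not the answerer's code: `io.StringIO` stands in for the real file, the `translate` helper is a hypothetical name, and sorting the keys longest-first is an added assumption so that a multi-word entry is tried before a shorter entry that starts the same way.

```python
import io
import re

# Stand-in for the database file; each row is "search*replacement*".
database = io.StringIO("Event*Evento*\nresult*resultado*\n")

replacements = {}
for line in database:
    parts = line.rstrip("\n").split("*")
    if len(parts) >= 2 and parts[0]:
        replacements[parts[0]] = parts[1]

# Longest keys first (an added assumption, see the lead-in).
keys = sorted(replacements, key=len, reverse=True)
pattern = re.compile("|".join(r"\b%s\b" % re.escape(k) for k in keys))


def translate(text):
    """Replace every whole-word match in a single pass over the text."""
    return pattern.sub(lambda m: replacements[m.group(0)], text)


once = translate("The Event has a result; Eventos are ignored.")
twice = translate(once)
print(once)   # The Evento has a resultado; Eventos are ignored.
print(twice)  # same as the line above
```

Running `translate` a second time leaves the text unchanged, which is the behaviour the question asks for.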

Answer 4

Answered by Sudharsan

It is very simple: use re.sub, not replace.

import re
replacements = {r'\bthe\b': 'a',
                r'\bthis\b': 'that'}

def replace_all(text, dic):
    # apply each whole-word pattern in turn
    for i, j in dic.items():
        text = re.sub(i, j, text)
    return text

print(replace_all("the cat has this thistle.", replacements))

It will print:

a cat has that thistle.
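
One thing to watch for with a loop of separate `re.sub` calls: a later pattern can match the output of an earlier one. The sketch below compares the loop with a single combined pattern. The `a` to `one` entry is invented here to trigger the effect, and both helper names are hypothetical.

```python
import re

# Hypothetical table in which one replacement ('a') is also a search term.
replacements = {"the": "a", "a": "one"}


def replace_sequential(text):
    """One re.sub call per entry; later entries see earlier results."""
    for word, new in replacements.items():
        text = re.sub(r"\b%s\b" % re.escape(word), new, text)
    return text


def replace_single_pass(text):
    """One combined pattern; each position in the text is visited once."""
    pattern = "|".join(r"\b%s\b" % re.escape(w) for w in replacements)
    return re.sub(pattern, lambda m: replacements[m.group(0)], text)


sequential = replace_sequential("the cat")
single = replace_single_pass("the cat")
print(sequential)  # one cat
print(single)      # a cat
```

With a database of 1000+ entries, overlaps like this are plausible, so the single-pass form may be the safer default.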

Answer 5

Answered by Chris Zhu

import re

match = {}  # create a dictionary of words-to-replace and words-to-replace-with

# read the whole file into one string; the with block closes the file afterwards
with open("filename", "r") as f:
    data = f.read()


def replace_all(text, dic):
    for i, j in dic.items():
        # r"\b%s\b" restricts replacement to whole-word matches;
        # re.escape keeps punctuation in i from being read as regex syntax
        text = re.sub(r"\b%s\b" % re.escape(i), j, text)
    return text


data = replace_all(data, match)
print(data)  # you can copy and paste the result to whatever file you like
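
A caveat that applies to all the `\b`-based answers: `\b` only matches next to a word character, so it behaves unexpectedly when a search term begins or ends with punctuation. The sketch below shows one possible workaround using lookarounds. The `whole_word_pattern` helper and the sample text are assumptions for illustration and do not come from any of the answers.

```python
import re


def whole_word_pattern(term):
    """Match `term` only when it is not glued to a word character."""
    return r"(?<!\w)%s(?!\w)" % re.escape(term)


text = "I use C++ daily."

# After the trailing '+' comes a space: two non-word characters in a row,
# so there is no word boundary and the \b version finds nothing.
with_b = re.sub(r"\b%s\b" % re.escape("C++"), "Cpp", text)
print(with_b)  # I use C++ daily.

# The lookaround version only requires that no word character is adjacent.
with_lookaround = re.sub(whole_word_pattern("C++"), "Cpp", text)
print(with_lookaround)  # I use Cpp daily.
```

For entries made only of letters and digits, the two patterns should behave the same, so this matters mainly if the database contains terms with punctuation at the edges.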
