How to remove punctuation marks from a string in Python 3.x using .translate()?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/34293875/

Date: 2020-08-19 14:45:17  Source: igfitidea

python python-3.x

Asked by cybujan

I want to remove all punctuation marks from a text file using the .translate() method. It works well under Python 2.x, but under Python 3.4 it doesn't seem to do anything.

My code is as follows, and the output is the same as the input text.

import string
fhand = open("Hemingway.txt")
for fline in fhand:
    fline = fline.rstrip()
    print(fline.translate(string.punctuation))

Answered by elzell

The call signature of str.translate has changed, and the deletechars parameter has apparently been removed. You could use

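import re
import string
# re.escape keeps characters such as ']' and '\' literal inside the character class
fline = re.sub('[' + re.escape(string.punctuation) + ']', '', fline)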

instead, or create a table as shown in the other answer.
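
A minimal sketch of why the call in the question appears to do nothing, assuming only the documented behavior of str.translate in Python 3: the argument is treated as a lookup table indexed by each character's code point, and a failed lookup leaves the character unchanged. string.punctuation is a 32-character string, so indexing it with the code point of any printable character raises IndexError. The sample text is made up for illustration.

```python
import string

text = 'Hello, world! (test)'  # made-up sample text

# Python 3 looks up table[ord(ch)] for every character. The punctuation
# string has only 32 entries, so any code point >= 32 raises IndexError
# (a LookupError), which translate() treats as "keep this character".
unchanged = text.translate(string.punctuation)

# A real translation table maps code points to None to delete them.
table = str.maketrans('', '', string.punctuation)
cleaned = text.translate(table)

print(unchanged)
print(cleaned)
```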

Answered by wkl

You have to create a translation table using maketrans and pass it to the str.translate method.

In Python 3.1 and newer, maketrans is a static method on the str type, so you can use it to create a translation table that maps each punctuation character you want removed to None.

import string

# Thanks to Martijn Pieters for this improved version

# This uses the 3-argument version of str.maketrans
# with arguments (x, y, z) where 'x' and 'y'
# must be equal-length strings and characters in 'x'
# are replaced by characters in 'y'. 'z'
# is a string (string.punctuation here)
# where each character in the string is mapped
# to None
translator = str.maketrans('', '', string.punctuation)

# This is an alternative that creates a dictionary mapping
# of every character from string.punctuation to None (this will
# also work)
#translator = str.maketrans(dict.fromkeys(string.punctuation))

s = 'string with "punctuation" inside of it! Does this work? I hope so.'

# pass the translator to the string's translate method.
print(s.translate(translator))

This should output:

string with punctuation inside of it Does this work I hope so
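
Applied to the loop from the question, the same table can be built once and reused for every line. This is only a rough sketch: strip_punctuation is a hypothetical helper name, and io.StringIO stands in for the Hemingway.txt file so that the snippet runs on its own.

```python
import io
import string

# Build the table once; it is a dict mapping code points to None.
TRANSLATOR = str.maketrans('', '', string.punctuation)

def strip_punctuation(lines):
    """Yield each line without its trailing whitespace and punctuation."""
    for line in lines:
        yield line.rstrip().translate(TRANSLATOR)

# io.StringIO is a stand-in for open("Hemingway.txt"); the text is made up.
fhand = io.StringIO('He said: "Wait!"\nWell, then...\n')
result = list(strip_punctuation(fhand))
for fline in result:
    print(fline)
```

With a real file, opening it in a with block (with open("Hemingway.txt") as fhand:) also closes the handle when the loop finishes.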

Answered by imbolc

I just compared the three methods by speed. translate is about 10 times slower than re.sub (with precompilation), and str.replace is about 3 times faster than re.sub. By str.replace I mean:

import string
# remove each punctuation character in turn by replacing it with the empty string
for ch in string.punctuation:
    s = s.replace(ch, "")
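
The figures above are the answerer's own measurements and will vary with the Python version and the input. Here is a hedged sketch for repeating the comparison with timeit. The function names, the sample text, and the repetition count are arbitrary choices for illustration, and no particular ranking is assumed.

```python
import re
import string
import timeit

TABLE = str.maketrans('', '', string.punctuation)
PATTERN = re.compile('[' + re.escape(string.punctuation) + ']')  # precompiled

def with_translate(s):
    return s.translate(TABLE)

def with_re_sub(s):
    return PATTERN.sub('', s)

def with_replace(s):
    for ch in string.punctuation:
        s = s.replace(ch, '')
    return s

sample = 'string with "punctuation" inside of it! Does this work? I hope so.'
candidates = (with_translate, with_re_sub, with_replace)

# All three approaches should agree before their timings are compared.
results = {func.__name__: func(sample) for func in candidates}

for func in candidates:
    seconds = timeit.timeit(lambda: func(sample), number=10_000)
    print(f'{func.__name__}: {seconds:.4f} s')
```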

Answered by Mayank Kumar

In Python 3.x, it can be done using:

import string
# make the translator object
translator = str.maketrans('', '', string.punctuation)
string_name = string_name.translate(translator)

Answered by CONvid19

Late answer, but to remove all punctuation on Python >= 3.6, you can also use:

import re, string

clean_string = re.sub(rf"[{string.punctuation}]", "", dirty_string)
