str.translate gives TypeError - translate() takes exactly one argument (2 given), worked in Python 2

Question

Asked by carebear

I have the following code

import nltk, os, json, csv, string, cPickle
from scipy.stats import scoreatpercentile

lmtzr = nltk.stem.wordnet.WordNetLemmatizer()

def sanitize(wordList):
    answer = [word.translate(None, string.punctuation) for word in wordList]
    answer = [lmtzr.lemmatize(word.lower()) for word in answer]
    return answer

words = []
for filename in json_list:
    words.extend([sanitize(nltk.word_tokenize(' '.join([tweet['text'] 
                   for tweet in json.load(open(filename,READ))])))])

I tested lines 2-4 in a separate testing.py file, where I wrote

import nltk, os, json, csv, string, cPickle
from scipy.stats import scoreatpercentile

wordList= ['\'the', 'the', '"the']
print wordList
wordList2 = [word.translate(None, string.punctuation) for word in wordList]
print wordList2
answer = [lmtzr.lemmatize(word.lower()) for word in wordList2]
print answer

freq = nltk.FreqDist(wordList2)
print freq

and the command prompt returns ['the','the','the'], which is what I wanted (removing punctuation).

However, when I put the exact same code in a different file, Python raises a TypeError:

File "foo.py", line 8, in <module>
  for tweet in json.load(open(filename, READ))])))])
File "foo.py", line 2, in sanitize
  answer = [word.translate(None, string.punctuation) for word in wordList]
TypeError: translate() takes exactly one argument (2 given)

json_list is a list of all the file paths (I printed it and checked that the list is valid). I'm confused by this TypeError because everything works perfectly fine when I test it in a different file.

Answer 1

Answered by Blckknght

I suspect your issue has to do with the differences between str.translate and unicode.translate (these are also the differences between str.translate on Python 2 versus Python 3). I suspect your original code is being sent unicode instances while your test code is using regular 8-bit str instances.

I don't suggest converting Unicode strings back to regular str instances, since unicode is a much better type for handling text data (and it is the future!). Instead, you should just adapt to the new unicode.translate syntax. With regular str.translate (on Python 2), you can pass an optional deletechars argument and the characters in it will be removed from the string. For unicode.translate (and str.translate on Python 3), the extra argument is no longer allowed, but translation table entries with None as their value will be deleted from the output.
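The difference described above can still be observed on Python 3, where bytes plays roughly the role of Python 2's 8-bit str. The following is only a minimal sketch, assuming Python 3; the variable names are made up for illustration.

```python
import string

text = '"the'    # Python 3 str (Unicode text)
raw = b'"the'    # bytes, comparable to Python 2's 8-bit str

# bytes.translate still accepts a second "delete" argument.
cleaned_bytes = raw.translate(None, string.punctuation.encode("ascii"))

# str.translate takes only a table, so the two-argument call fails.
try:
    text.translate(None, string.punctuation)
    raised_type_error = False
except TypeError:
    raised_type_error = True

# The one-argument form deletes every character whose table entry is None.
cleaned_text = text.translate({ord('"'): None})
```

Here the bytes call succeeds, the two-argument str call raises TypeError, and the table with a None entry removes the quote character.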

To solve the problem you'll need to create an appropriate translation table. A translation table is a dictionary mapping from Unicode ordinals (that is, ints) to ordinals, strings or None. A helper function for making them exists in Python 2 as string.maketrans (and in Python 3 as a method of the str type), but the Python 2 version of it doesn't handle the case we care about (putting None values into the table). You can build an appropriate dictionary yourself with something like {ord(c): None for c in string.punctuation}.
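As a rough sketch of how that dictionary could replace the two-argument call in the question's list comprehension, written for Python 3 (on Python 2.7 the same table should work with unicode strings). The names PUNCT_TABLE and strip_punctuation are hypothetical, and the lemmatizing step from the question is left out to keep the example self-contained.

```python
import string

# Map each punctuation character's code point to None so translate() drops it.
PUNCT_TABLE = {ord(c): None for c in string.punctuation}

def strip_punctuation(words):
    """Return the words with all ASCII punctuation removed."""
    return [word.translate(PUNCT_TABLE) for word in words]

cleaned = strip_punctuation(["'the", "the", '"the'])
print(cleaned)
```

Building the table once at module level avoids recreating it for every word.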

Answer 2

Answered by drchuck

If all you want is to do in Python 3 the same thing you were doing in Python 2, here is what I was doing in Python 2 to throw away punctuation and digits:

text = text.translate(None, string.punctuation)
text = text.translate(None, '1234567890')

Here is my Python 3 equivalent:

text = text.translate(str.maketrans('','',string.punctuation))
text = text.translate(str.maketrans('','','1234567890'))

Basically it says 'translate nothing to nothing' (the first two parameters) and translates any punctuation or digits to None (i.e. removes them).
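The two calls can also be folded into one pass by putting both character sets in the third argument. This is just a sketch for Python 3; REMOVE_TABLE and the sample string are invented for illustration.

```python
import string

# One table that deletes both punctuation and digits.
REMOVE_TABLE = str.maketrans('', '', string.punctuation + string.digits)

text = "abc123, def!"
cleaned = text.translate(REMOVE_TABLE)
print(cleaned)
```

A single table means the string is scanned once instead of twice.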

Answer 3

Answered by ChuQuan

Python 3.0:

text = text.translate(str.maketrans('','','1234567890'))

static str.maketrans(x[, y[, z]])
This static method returns a translation table usable for str.translate().

If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers) or characters (strings of length 1) to Unicode ordinals, strings (of arbitrary lengths) or None. Character keys will then be converted to ordinals.

If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y. If there is a third argument, it must be a string, whose characters will be mapped to None in the result.
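To make the argument forms quoted above concrete, here is a small sketch for Python 3 that builds a comparable table both ways. The variable names and the sample string are illustrative only.

```python
# One-argument form: a dict whose character keys are converted to ordinals.
table_from_dict = str.maketrans({'a': '1', 'b': '2', 'p': None})

# Three-argument form: 'a'->'1', 'b'->'2', and every 'p' mapped to None.
table_from_strings = str.maketrans('ab', '12', 'p')

sample = "apple pab"
result_dict = sample.translate(table_from_dict)
result_strings = sample.translate(table_from_strings)
print(table_from_strings)
print(result_dict, result_strings)
```

The two tables store their values differently (strings in the first, ordinals in the second), but both translate the sample string to the same result.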

https://docs.python.org/3/library/stdtypes.html?highlight=maketrans#str.maketrans

Answer 4

Answered by Preeti Duhan

This is how translate works:

yourstring.translate(str.maketrans(fromstr, tostr, deletestr))

Replace the characters in fromstr with the character in the same position in tostr and delete all characters that are in deletestr. The fromstr and tostr can be empty strings and the deletestr parameter can be omitted.

example:

>>> s = "preetideepak12345aeiou"  # avoid naming the variable str, which shadows the built-in type
>>> s.translate(str.maketrans('abcde', '12345', 'p'))

output:

'r55ti4551k1234515iou'

here:

a is translated to 1
b is translated to 2
c is translated to 3 and so on
and p is deleted from the string.
