pandas df['X'].unique() and TypeError: unhashable type: 'numpy.ndarray'
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/51675151/
Asked by SBad
Hi all,
I have a column in a dataframe that looks like this:
allHoldingsFund['BrokerMixed']
Out[419]:
78 ML
81 CITI
92 ML
173 CITI
235 ML
262 ML
264 ML
25617 GS
25621 CITI
25644 CITI
25723 GS
25778 CITI
25786 CITI
25793 GS
25797 CITI
Name: BrokerMixed, Length: 2554, dtype: object
Although the column's dtype is object, I cannot group by that column or even extract its unique values. For example, when I do:
allHoldingsFund['BrokerMixed'].unique()
I get an error:
uniques = table.unique(values)
File "pandas/_libs/hashtable_class_helper.pxi", line 1340, in pandas._libs.hashtable.PyObjectHashTable.unique
TypeError: unhashable type: 'numpy.ndarray'
I also get an error when I group by this column.
Any help is welcome. Thank you.
Accepted answer by Harry_pb
First, I would suggest checking the type of your column. You can do so as follows:
print(type(allHoldingsFund['BrokerMixed']))
If this is a pandas Series, you can try:
allHoldingsFund['BrokerMixed'].reset_index()['BrokerMixed'].unique()
and check whether this works for you.
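Checking the type of the column itself will usually just report a Series, so it can also help to look at the type of each element. The following is a minimal sketch, not part of the original answer; the Series `s` is a made-up stand-in for allHoldingsFund['BrokerMixed'], assumed to contain one stray NumPy array among strings.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for allHoldingsFund['BrokerMixed'] (assumption: one cell holds an array)
s = pd.Series(["ML", "CITI", np.array(["GS", "ML"]), "GS"], name="BrokerMixed")

# Count how many elements there are of each Python type, by type name
type_counts = s.map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Select the rows whose value is not a plain string
offenders = s[~s.map(lambda v: isinstance(v, str))]
print(offenders)
```

If `offenders` is non-empty, those rows are the likely cause of the "unhashable type" error.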
EDIT 2020: Your original way of getting the unique values and the answers mentioned here give the same results using Python 3.
Answer by jpp
It looks like you have a NumPy array in your series. NumPy arrays cannot be hashed, and pd.Series.unique, like set, relies on hashing.
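To illustrate the hashing point, here is a small sketch; the helper name `is_hashable` is made up for this example and is not a pandas or NumPy function.

```python
import numpy as np

def is_hashable(value):
    """Return True if hash(value) succeeds, False if it raises TypeError."""
    try:
        hash(value)
    except TypeError:
        return False
    return True

arr = np.array([1, 2, 3])
print(is_hashable("ML"))        # strings are hashable
print(is_hashable(arr))         # NumPy arrays are not
print(is_hashable(tuple(arr)))  # a tuple of the array's elements is
```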
If you can't ensure your series data only consists of strings, you can convert NumPy arrays to tuples before calling pd.Series.unique:
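import numpy as np
import pandas as pd
s = pd.Series([np.array([1, 2, 3]), 1, 'hello', 'test', 1, 'test'])
def tuplizer(x):
    # Lists and arrays become tuples, which are hashable; other values pass through
    return tuple(x) if isinstance(x, (np.ndarray, list)) else x
res = s.apply(tuplizer).unique()
print(res)
# Result as reported in the original answer:
# array([(1, 2, 3), 1, 'hello', 'test'], dtype=object)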
Of course, this means your data type information is lost in the result, but at least you get to see your "unique" NumPy arrays, provided they are 1-dimensional.
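For arrays with more than one dimension, tuple(x) yields a tuple of arrays, which is still unhashable. One possible workaround, sketched here as an assumption and not taken from the original answer, is to convert recursively into nested tuples. The helper name `to_hashable` and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd

def to_hashable(x):
    """Recursively turn arrays/lists into nested tuples; leave other values alone."""
    if isinstance(x, np.ndarray):
        x = x.tolist()
    if isinstance(x, list):
        return tuple(to_hashable(item) for item in x)
    return x

# Made-up cells: two equal 2-D arrays mixed with strings
cells = [np.array([[1, 2], [3, 4]]), "test", np.array([[1, 2], [3, 4]]), "test"]
values = np.empty(len(cells), dtype=object)
for i, cell in enumerate(cells):
    values[i] = cell  # assign one cell at a time so each array is stored as a single object
s = pd.Series(values)

res = s.apply(to_hashable).unique()
print(len(res))
```

As with the 1-dimensional case, the array type is lost: the result holds nested tuples, not arrays.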
Answer by Sahil Puri
You have an array in your data column. You could try the following:
allHoldingsFund['BrokerMixed'].apply(lambda x: str(x)).unique()
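The same string conversion can serve as a group key, since the question also mentions a failing group by. This is only a sketch under assumptions: the DataFrame `df`, its 'Value' column, and the 'BrokerKey' column name are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for allHoldingsFund; the 'Value' column is made up
df = pd.DataFrame({
    "BrokerMixed": ["ML", np.array(["CITI"]), "ML", "GS"],
    "Value": [10, 20, 30, 40],
})

# Convert every cell to its string form so it can be hashed and grouped
df["BrokerKey"] = df["BrokerMixed"].apply(str)
totals = df.groupby("BrokerKey")["Value"].sum()
print(totals)
```

Note that str() of an array gives its printed form, so the array cell above becomes the key "['CITI']", which stays a separate group from a plain "CITI" string.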


