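Pandas - AttributeError: 'DataFrame' object has no attribute 'map'
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/54607989/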
Asked by praveen
I am trying to create a new column in a dataframe by creating a dictionary based on an existing column and calling the 'map' function on that column. It worked for quite some time. However, the notebook started throwing:
AttributeError: 'DataFrame' object has no attribute 'map'
I haven't changed the kernel or the Python version. Here's the code I am using.
dict = {1: 'A',
        2: 'B',
        3: 'C',
        4: 'D',
        5: 'E'}
# Creating an interval-type
data['new'] = data['old'].map(dict)
How can I fix this?
Accepted answer by jezrael
The main problem is that selecting the old column returns a DataFrame instead of a Series, so map, which is implemented only for Series, fails.
The column name old is probably duplicated, so selecting that one name returns all of the old columns as a DataFrame:
import pandas as pd

df = pd.DataFrame([[1,3,8],[4,5,3]], columns=['old','old','col'])
print (df)
   old  old  col
0    1    3    8
1    4    5    3
print(df['old'])
   old  old
0    1    3
1    4    5
# don't name the variable dict, because that shadows the Python builtin
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
# df['old'] is a DataFrame here, so this line raises the error
df['new'] = df['old'].map(d)
AttributeError: 'DataFrame' object has no attribute 'map'
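To confirm that duplicated column names are the cause, the column index can be inspected directly. The following is a minimal sketch that reuses the example frame from this answer; the variable names labels_unique, repeated and selection_type are illustrative only.

```python
import pandas as pd

# Same shape as the example above: two columns share the name 'old'.
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])

# False means at least one column name is repeated.
labels_unique = df.columns.is_unique

# Names that occur more than once (keep=False marks every occurrence).
repeated = df.columns[df.columns.duplicated(keep=False)].unique().tolist()

# With a repeated name, a single-label selection is a DataFrame, not a Series.
selection_type = type(df['old']).__name__

print(labels_unique, repeated, selection_type)
```

If the index is not unique and the selection type is DataFrame, the duplicated name is the likely culprit.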
A possible solution is to deduplicate these column names:
s = df.columns.to_series()
new = s.groupby(s).cumcount().astype(str).radd('_').replace('_0','')
df.columns += new
print (df)
   old  old_1  col
0    1      3    8
1    4      5    3
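If renaming is not wanted, two other options may work, depending on which of the duplicated columns actually holds the data to map. This is a sketch under that assumption, not part of the original answer: the first option keeps only the first column for each repeated name, the second selects one duplicate by position. The names first_only, second_old and mapped_second are illustrative.

```python
import pandas as pd

d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])

# Option 1: keep only the first column for each repeated name.
first_only = df.loc[:, ~df.columns.duplicated()].copy()
first_only['new'] = first_only['old'].map(d)

# Option 2: pick one of the duplicates by position; iloc with a single
# integer column position returns a Series, so map is available.
second_old = df.iloc[:, 1]
mapped_second = second_old.map(d)

print(first_only)
print(mapped_second)
```

Option 1 discards the later duplicates, so it only fits when they are redundant.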
Another possible cause is a MultiIndex in the columns. Test for it with:
mux = pd.MultiIndex.from_arrays([['old','old','col'],['a','b','c']])
df = pd.DataFrame([[1,3,8],[4,5,3]], columns=mux)
print (df)
  old    col
    a  b   c
0   1  3   8
1   4  5   3
print (df.columns)
MultiIndex(levels=[['col', 'old'], ['a', 'b', 'c']],
codes=[[1, 1, 0], [0, 1, 2]])
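Besides printing the columns, the check can be made programmatically. A minimal sketch, reusing the frame built above; has_multiindex, n_levels, selection_type and sub_columns are illustrative names.

```python
import pandas as pd

mux = pd.MultiIndex.from_arrays([['old', 'old', 'col'], ['a', 'b', 'c']])
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=mux)

# True when the columns have more than one level.
has_multiindex = isinstance(df.columns, pd.MultiIndex)
n_levels = df.columns.nlevels

# Selecting only the top-level label returns every sub-column under it,
# which is again a DataFrame rather than a Series.
selection_type = type(df['old']).__name__
sub_columns = df['old'].columns.tolist()

print(has_multiindex, n_levels, selection_type, sub_columns)
```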
One solution is to flatten the MultiIndex:
# Python 3.6+
df.columns = [f'{a}_{b}' for a, b in df.columns]
# older Python versions
#df.columns = ['{}_{}'.format(a,b) for a, b in df.columns]
print (df)
   old_a  old_b  col_c
0      1      3      8
1      4      5      3
Another solution is to select the column by its full MultiIndex tuple, call map on it, and assign the result to a new tuple key:
df[('new', 'd')] = df[('old', 'a')].map(d)
print (df)
  old    col new
    a  b   c   d
0   1  3   8   A
1   4  5   3   D
print (df.columns)
MultiIndex(levels=[['col', 'old', 'new'], ['a', 'b', 'c', 'd']],
codes=[[1, 1, 0, 2], [0, 1, 2, 3]])
Answer by Arran Duff
map is a method that you can call on a pandas.Series object. This method doesn't exist on pandas.DataFrame objects.
df['new'] = df['old'].map(d)
In your code ^^^ df['old'] is returning a pandas.DataFrame object for some reason.
- As @jezrael points out, this could be due to having more than one old column in the dataframe.
Or perhaps your code isn't quite the same as the example you have given.
Either way, the error occurs because you are calling map() on a pandas.DataFrame object.
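For contrast, here is a minimal sketch of the same mapping when the column name is unique, with an explicit type check added before the call. The check is an illustrative guard, not something the original answers propose, and the frame contents are made up for the example.

```python
import pandas as pd

d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
df = pd.DataFrame({'old': [1, 4], 'col': [8, 3]})

column = df['old']
# Fail early with a clearer message if the selection is not a Series.
if not isinstance(column, pd.Series):
    raise TypeError(f"expected a Series, got {type(column).__name__}")

df['new'] = column.map(d)
print(df)
```

With a unique name, df['old'] is a Series and the mapping goes through.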