Pandas - AttributeError: 'DataFrame' object has no attribute 'map'

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/54607989/

Date: 2020-09-14 06:19:00  Source: igfitidea

Pandas - AttributeError: 'DataFrame' object has no attribute 'map'

Tags: python-3.x, pandas

Asked by praveen

I am trying to create a new column in a DataFrame by building a dictionary based on an existing column and calling the 'map' function on that column. It worked for quite some time. However, the notebook started throwing

AttributeError: 'DataFrame' object has no attribute 'map'

I haven't changed the kernel or the Python version. Here's the code I am using.

dict = {1: 'A',
        2: 'B',
        3: 'C',
        4: 'D',
        5: 'E'}

# Creating an interval-type 
data['new'] = data['old'].map(dict)

How can I fix this?

Accepted answer by jezrael

The main problem is that selecting the old column returns a DataFrame instead of a Series, so map, which is implemented for Series, fails.

The likely cause is duplicated columns named old, so selecting that one name returns all of the old columns as a DataFrame:

import pandas as pd

df = pd.DataFrame([[1,3,8],[4,5,3]], columns=['old','old','col'])
print (df)
   old  old  col
0    1    3    8
1    4    5    3

print(df['old'])
   old  old
0    1    3
1    4    5

# don't name the variable dict, because that shadows the Python built-in
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
df['new'] = df['old'].map(d)
print (df)

AttributeError: 'DataFrame' object has no attribute 'map'
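
To confirm that duplicated names are the cause before changing anything, the column index can be inspected directly. The following is a minimal sketch that rebuilds the example frame above; the variable names dup_names and selection_type are only illustrative.

```python
import pandas as pd

# Same example frame as above, with the name 'old' used twice.
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])

# Index.duplicated() marks every repeat of an earlier column name.
dup_names = df.columns[df.columns.duplicated()].tolist()

# With a duplicated name, the selection is a DataFrame rather than a Series.
selection_type = type(df['old']).__name__

print(dup_names)
print(selection_type)
```

If the list of duplicated names is empty and the selection is still a DataFrame, the cause is probably something else, such as a MultiIndex in the columns.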

A possible solution is to deduplicate these column names:

s = df.columns.to_series()
new = s.groupby(s).cumcount().astype(str).radd('_').replace('_0','')
df.columns += new
print (df)
   old  old_1  col
0    1      3    8
1    4      5    3
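
A different option, not part of the original answer, is to drop the repeated columns instead of renaming them. This sketch assumes the later duplicates are redundant and can be discarded; if they hold different data, as in this toy frame, renaming is the safer choice.

```python
import pandas as pd

df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}

# Keep only the first occurrence of each column name.
deduped = df.loc[:, ~df.columns.duplicated()].copy()

# 'old' is now unique, so the selection is a Series and map works.
deduped['new'] = deduped['old'].map(d)
print(deduped)
```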

Another possible cause is a MultiIndex in the columns; test for it with:

mux = pd.MultiIndex.from_arrays([['old','old','col'],['a','b','c']])
df = pd.DataFrame([[1,3,8],[4,5,3]], columns=mux)
print (df)
  old    col
    a  b   c
0   1  3   8
1   4  5   3

print (df.columns)
MultiIndex(levels=[['col', 'old'], ['a', 'b', 'c']],
           codes=[[1, 1, 0], [0, 1, 2]])
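
Besides printing the columns, the check can be done programmatically. The sketch below rebuilds the same frame and shows why map fails here: selecting only the top-level name returns every sub-column under it as a DataFrame, while a full tuple key returns a Series. The variable names are illustrative.

```python
import pandas as pd

mux = pd.MultiIndex.from_arrays([['old', 'old', 'col'], ['a', 'b', 'c']])
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=mux)

# True when the columns have more than one level.
has_multiindex = isinstance(df.columns, pd.MultiIndex)

# Selecting by the first level only returns the sub-columns a and b.
top_level_type = type(df['old']).__name__

# Selecting by the full tuple returns a single column.
full_key_type = type(df[('old', 'a')]).__name__

print(has_multiindex, top_level_type, full_key_type)
```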

The solution is to flatten the MultiIndex:

# Python 3.6+
df.columns = [f'{a}_{b}' for a, b in df.columns]
# Python below 3.6
#df.columns = ['{}_{}'.format(a,b) for a, b in df.columns]
print (df)
   old_a  old_b  col_c
0      1      3      8
1      4      5      3
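
The comprehension above unpacks exactly two levels. For columns with a different number of levels, a variant that joins every level of each column tuple may be more convenient. This is a sketch on the same example data; flat is an illustrative name, and it assumes an underscore is an acceptable separator.

```python
import pandas as pd

mux = pd.MultiIndex.from_arrays([['old', 'old', 'col'], ['a', 'b', 'c']])
df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=mux)
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}

# Join all levels of each column tuple, however many levels there are.
flat = df.copy()
flat.columns = ['_'.join(str(level) for level in col) for col in flat.columns]

# Every column name is now a plain string, so the selection is a Series.
flat['new'] = flat['old_a'].map(d)
print(flat)
```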

Another solution is to keep the MultiIndex, select the column with a tuple, map it, and assign the result to a new tuple key:

df[('new', 'd')] = df[('old', 'a')].map(d)
print (df)
  old    col new
    a  b   c   d
0   1  3   8   A
1   4  5   3   D

print (df.columns)
MultiIndex(levels=[['col', 'old', 'new'], ['a', 'b', 'c', 'd']],
           codes=[[1, 1, 0, 2], [0, 1, 2, 3]])

Answer by Arran Duff

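map is a method that you can call on a pandas.Series object. In pandas versions before 2.1 this method doesn't exist on pandas.DataFrame objects (pandas 2.1 added an elementwise DataFrame.map, but it expects a function rather than a dictionary).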

df['new'] = df['old'].map(d)

In your code ^^^ df['old'] is returning a pandas.DataFrame object for some reason.

  • As @jezrael points out, this could be due to having more than one old column in the DataFrame.
  • Or perhaps your code isn't quite the same as the example you have given.
  • Either way, the error is there because you are calling map() on a pandas.DataFrame object.
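
One way to surface this problem earlier is to check the type of the selection before mapping. The helper below is hypothetical (map_column is not a pandas function), and is only a sketch of the idea: it raises a more descriptive error when the selection turns out to be a DataFrame.

```python
import pandas as pd

def map_column(frame, column, mapping):
    """Map one column through a dictionary, failing loudly if the
    selection is not a single column (hypothetical helper)."""
    selected = frame[column]
    if isinstance(selected, pd.DataFrame):
        raise TypeError(
            f"Selecting {column!r} returned a DataFrame with columns "
            f"{list(selected.columns)}; check for duplicated names or a MultiIndex."
        )
    return selected.map(mapping)

d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
frame_ok = pd.DataFrame({'old': [1, 4], 'col': [8, 3]})
result = map_column(frame_ok, 'old', d)
print(result.tolist())
```

With the duplicated-column frame from the accepted answer, the same call raises the TypeError instead of the less specific AttributeError.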
