Pandas - AttributeError: 'DataFrame' object has no attribute 'map'

Question

Asked by praveen

I am trying to create a new column in a dataframe by building a dictionary based on an existing column and calling the 'map' function on that column. It worked for quite some time. However, the notebook has started throwing

AttributeError: 'DataFrame' object has no attribute 'map'

I haven't changed the kernel or the Python version. Here's the code I am using.

dict= {1:'A',
       2:'B',
       3:'C',
       4:'D',
       5:'E'}

# Creating an interval-type
data['new'] = data['old'].map(dict)

How can I fix this?

Answer 1

Accepted answer by jezrael

The main problem is that selecting the old column returns a DataFrame instead of a Series, and map, which is implemented for Series, is not available on that DataFrame, so the call fails.

Most likely the column name old is duplicated, so selecting it returns all columns named old as a DataFrame:

import pandas as pd

df = pd.DataFrame([[1,3,8],[4,5,3]], columns=['old','old','col'])
print (df)
   old  old  col
0    1    3    8
1    4    5    3

print(df['old'])
   old  old
0    1    3
1    4    5

# Don't name the variable dict: that shadows the Python built-in type
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
df['new'] = df['old'].map(d)
print (df)

AttributeError: 'DataFrame' object has no attribute 'map'

A possible solution is to deduplicate these column names:

s = df.columns.to_series()
new = s.groupby(s).cumcount().astype(str).radd('_').replace('_0','')
df.columns += new
print (df)
   old  old_1  col
0    1      3    8
1    4      5    3
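If the one-liner above feels too dense, the same renaming can be sketched with a plain loop. This is only an illustration: the helper name dedupe_columns is made up here and is not part of pandas, and the suffix scheme (_1, _2, ...) simply mirrors the output shown above.

```python
import pandas as pd


def dedupe_columns(columns):
    """Hypothetical helper: append _1, _2, ... to repeated column names."""
    seen = {}
    result = []
    for name in columns:
        count = seen.get(name, 0)
        # The first occurrence keeps its name; later ones get a numeric suffix.
        result.append(name if count == 0 else f"{name}_{count}")
        seen[name] = count + 1
    return result


df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])
df.columns = dedupe_columns(df.columns)

# With unique names, df['old'] is a Series again and map is available.
d = {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
df['new'] = df['old'].map(d)
print(df)
```

After the rename, the second old column is reachable as old_1, so no data is dropped.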

Another possible cause is a MultiIndex in the columns; check for it with:

mux = pd.MultiIndex.from_arrays([['old','old','col'],['a','b','c']])
df = pd.DataFrame([[1,3,8],[4,5,3]], columns=mux)
print (df)
  old    col
    a  b   c
0   1  3   8
1   4  5   3

print (df.columns)
MultiIndex(levels=[['col', 'old'], ['a', 'b', 'c']],
           codes=[[1, 1, 0], [0, 1, 2]])

The solution is to flatten the MultiIndex:

# Python 3.6+
df.columns = [f'{a}_{b}' for a, b in df.columns]
# older Python versions
#df.columns = ['{}_{}'.format(a,b) for a, b in df.columns]
print (df)
   old_a  old_b  col_c
0      1      3      8
1      4      5      3

Another solution is to keep the MultiIndex, select the column with a tuple, map it, and assign the result to a new tuple key:

df[('new', 'd')] = df[('old', 'a')].map(d)
print (df)
  old    col new
    a  b   c   d
0   1  3   8   A
1   4  5   3   D

print (df.columns)
MultiIndex(levels=[['col', 'old', 'new'], ['a', 'b', 'c', 'd']],
           codes=[[1, 1, 0, 2], [0, 1, 2, 3]])

Answer 2

Answer by Arran Duff

map is a method that you can call on a pandas.Series object. This method doesn't exist on pandas.DataFrame objects.

df['new'] = df['old'].map(d)

In the code above, df['old'] is returning a pandas.DataFrame object for some reason.

As @jezrael points out, this could be due to having more than one old column in the dataframe.
Or perhaps your code isn't quite the same as the example you have given.
Either way, the error occurs because you are calling map() on a pandas.DataFrame object.
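To find out which case applies, a quick check along these lines may help. It is a sketch on a toy frame with a duplicated old column; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 3, 8], [4, 5, 3]], columns=['old', 'old', 'col'])

selected = df['old']

# True when the selection returned several columns instead of one Series.
is_frame = isinstance(selected, pd.DataFrame)

# Names that occur more than once among the columns.
duplicated_names = df.columns[df.columns.duplicated()].unique().tolist()

# True when the columns have more than one level.
has_multiindex = isinstance(df.columns, pd.MultiIndex)

print(type(selected).__name__, duplicated_names, has_multiindex)

# If only the first of the duplicated columns is wanted,
# positional selection gives back a Series that supports map.
first_old = selected.iloc[:, 0]
```

If is_frame is True, either duplicated_names is non-empty or has_multiindex is True, which points to one of the two fixes described in the accepted answer.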
