Unable to execute Python Pandas set_value

Question

Asked by Windtalker

I have a problem with Pandas in Python 3.5.

I read a local CSV file with Pandas. The CSV contains pure data with no header row, so I assigned the column names using:

df = pd.read_csv(filePath, header=None)
df.columns = ['XXX', 'XXX']  # abbreviated here; there are 11 columns in total

The CSV has 11 columns. One of them holds strings; the others hold integers.

Then I tried to replace the string column with integer values in a loop, cell by cell:

for i, row in df.iterrows():
    print(i, row['Name'])
    df.set_value(i, 'Name', 123)

The integer 123 is just an example; not every cell in this column is set to 123. The print call works fine if I remove set_value, but with

df.set_value(i, 'Name', 123)

I get this error:

Traceback (most recent call last):
  File "D:/xxx/test.py", line 20, in <module>
    df.set_value(i, 'Name', 233)
  File "E:\Users\XXX\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1862, in set_value
    series = self._get_item_cache(col)
  File "E:\Users\XXX\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1351, in _get_item_cache
    res = self._box_item_values(item, values)
  File "E:\Users\XXX\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2334, in _box_item_values
    return self._constructor(values.T, columns=items, index=self.index)
AttributeError: 'BlockManager' object has no attribute 'T'

But if I create a DataFrame manually in code:

df = pd.DataFrame(index=[0, 1, 2], columns=['x', 'y'])
df['x'] = 2
df['y'] = 'BBB'
print(df)
for i, row in df.iterrows():
    df.set_value(i, 'y', 233)


print('\n')
print(df)

it works. Is there something I am missing?

Thanks!

Answer 1

Answered by TheRoman

The cause of the original error:

The Pandas DataFrame.set_value(index, col, value) method raises the obscure AttributeError: 'BlockManager' object has no attribute 'T' shown in the question when the DataFrame being modified has duplicate column names.

The error can be reproduced with @Windtalker's code above, where the only change is that both column names are now 'x' rather than 'x' and 'y'.

import pandas as pd
df = pd.DataFrame(index=[0, 1, 2], columns=['x', 'x'])
df['x'] = 2
df['y'] = 'BBB'
print(df)
for i, row in df.iterrows():
    df.set_value(i, 'y', 233)

print('\n')
print(df)

Hopefully this helps someone else diagnose the same issue.
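
To check whether a loaded DataFrame has this problem before writing to it, the column index can be inspected directly. This is a minimal sketch; the small frame below is a made-up stand-in for the asker's CSV, and the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical frame with a repeated column label, mirroring the reproduction above.
df = pd.DataFrame([[1, 2, "a"], [3, 4, "b"]], columns=["x", "x", "y"])

# True only when every column label is distinct.
has_unique_columns = df.columns.is_unique

# Labels that occur more than once, each listed a single time.
duplicated_labels = df.columns[df.columns.duplicated()].unique().tolist()

print(has_unique_columns)
print(duplicated_labels)
```

If duplicated_labels is not empty, giving every column a distinct name in the df.columns assignment should avoid the error described above.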

Answer 2

Answered by MaxU

Well, now that you have made it a lot clearer, it's easier to answer your question...

Assuming your DF looks like this:

In [164]: df
Out[164]:
    a   b   c   d   e          city
0   6  55   3  48  11          Kiev
1   5  29  42  95  69        Munich
2  53  79  60  80  89        Berlin
3   6  70  87   6  85      New York
4  97  23  94  43  31         Paris
5  15  17  56  34  77  Zaporizhzhia
6  28  35  58  82  33        Warsaw
7  41  93  60  54  21      Hurghada
8  68  23  80  39  66          Bern
9  15  17  30  26  98          Lviv

and you have another DF with city IDs:

In [165]: cities
Out[165]:
              id
city
Warsaw         6
Kiev           0
New York       3
Hurghada       7
Munich         1
Paris          4
Berlin         2
Zaporizhzhia   5
Lviv           9
Bern           8

you can map each city to its city ID like this:

In [168]: df['city_id'] = df['city'].map(cities['id'])

In [169]: df
Out[169]:
    a   b   c   d   e          city  city_id
0   6  55   3  48  11          Kiev        0
1   5  29  42  95  69        Munich        1
2  53  79  60  80  89        Berlin        2
3   6  70  87   6  85      New York        3
4  97  23  94  43  31         Paris        4
5  15  17  56  34  77  Zaporizhzhia        5
6  28  35  58  82  33        Warsaw        6
7  41  93  60  54  21      Hurghada        7
8  68  23  80  39  66          Bern        8
9  15  17  30  26  98          Lviv        9
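
For anyone who wants to run the mapping outside an IPython session, here is a self-contained sketch of the same idea. The three-row frame and the city "Oslo" are invented for illustration; "Oslo" is deliberately left out of the lookup to show what happens to a value with no match.

```python
import pandas as pd

# Hypothetical, shortened version of the frames shown above.
df = pd.DataFrame({"a": [6, 5, 53], "city": ["Kiev", "Munich", "Oslo"]})
cities = pd.DataFrame(
    {"id": [0, 1]},
    index=pd.Index(["Kiev", "Munich"], name="city"),
)

# Series.map looks each city up in the index of cities["id"].
# "Oslo" has no entry, so its city_id becomes NaN and the column is float.
df["city_id"] = df["city"].map(cities["id"])

print(df)
```

Checking the result with df["city_id"].isna().any() is a quick way to spot values that were missing from the lookup table.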

PS: in about 95% of cases when working with Pandas, you don't need to loop through your DFs to achieve your goal.
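
As a rough illustration of that point, the sketch below fills an integer column twice: once cell by cell and once with a single map call. The frame, the name_to_id lookup, and the name_id column are all assumptions made for the example. Note also that DataFrame.set_value was deprecated in pandas 0.21 and removed in 1.0, so the loop uses the .at accessor instead.

```python
import pandas as pd

# Hypothetical data: a string column that should be turned into integer ids.
df = pd.DataFrame({"Name": ["foo", "bar", "foo"], "value": [1, 2, 3]})
name_to_id = {"foo": 10, "bar": 20}  # assumed lookup table

# Cell-by-cell version, using .at in place of the removed set_value.
looped = df.copy()
looped["name_id"] = 0  # integer column to write into
for i in looped.index:
    looped.at[i, "name_id"] = name_to_id[looped.at[i, "Name"]]

# Vectorized version: one call, no explicit loop.
vectorized = df.copy()
vectorized["name_id"] = vectorized["Name"].map(name_to_id)

print(looped["name_id"].tolist())
print(vectorized["name_id"].tolist())
```

Both versions produce the same ids; the map version is shorter and is usually much faster on large frames.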
