Python: editing a pandas DataFrame row by row

Question

Asked by Jonas Lindeløv

pandas for Python is neat. I'm trying to replace a list of dictionaries with a pandas DataFrame. However, I'm wondering if there's a way to change values row by row in a for loop just as easily?

Here's the non-pandas dict-version:

trialList = [
    {'no':1, 'condition':2, 'response':''},
    {'no':2, 'condition':1, 'response':''},
    {'no':3, 'condition':1, 'response':''}
]  # ... and so on

for trial in trialList:
    # Do something and collect response
    trial['response'] = 'the answer!'

... and now trialList contains the updated values because trial refers back to the dict inside the list. Very handy! But the list of dicts is very unhandy, especially because I'd like to be able to compute things column-wise, which pandas excels at.

So given trialList from above, I thought I could make it even better by doing something pandas-like:

import pandas as pd    
dfTrials = pd.DataFrame(trialList)  # makes a nice 3-column dataframe with 3 rows

for trial in dfTrials.iterrows():
    # trial is an (index, row) tuple; do something and collect response
    trial[1]['response'] = 'the answer!'

... but dfTrials remains unchanged here. Is there an easy way to update values row by row, perhaps equivalent to the dict version? It is important that it's row by row, as this is for an experiment where participants are presented with a lot of trials and various data are collected on each single trial.

Answer 1

Accepted answer by DSM

If you really want row-by-row ops, you could use iterrows and loc:

>>> for i, trial in dfTrials.iterrows():
...     dfTrials.loc[i, "response"] = "answer {}".format(trial["no"])
...     
>>> dfTrials
   condition  no  response
0          2   1  answer 1
1          1   2  answer 2
2          1   3  answer 3

[3 rows x 3 columns]
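
The loop above works because it writes through dfTrials itself; the row that iterrows hands you is a separate Series, so assigning into it is not guaranteed to reach the frame. As a minimal sketch of the experiment use case, assuming a hypothetical collect_response function standing in for whatever presents the trial, you could also loop over the index and set single cells with .at:

```python
import pandas as pd

trialList = [
    {'no': 1, 'condition': 2, 'response': ''},
    {'no': 2, 'condition': 1, 'response': ''},
    {'no': 3, 'condition': 1, 'response': ''},
]
dfTrials = pd.DataFrame(trialList)


def collect_response(no, condition):
    # Hypothetical placeholder: a real experiment would present the
    # trial here and return whatever the participant answered.
    return "answer {}".format(no)


for i in dfTrials.index:
    # .at reads and writes one cell, addressed by row label and column name.
    response = collect_response(dfTrials.at[i, 'no'], dfTrials.at[i, 'condition'])
    dfTrials.at[i, 'response'] = response

print(dfTrials)
```

.at is meant for single-cell access, which fits a loop that fills in one response per trial; .loc does the same job and also accepts slices and lists.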

Better though is when you can vectorize:

>>> dfTrials["response 2"] = dfTrials["condition"] + dfTrials["no"]
>>> dfTrials
   condition  no  response  response 2
0          2   1  answer 1           3
1          1   2  answer 2           3
2          1   3  answer 3           4

[3 rows x 4 columns]
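
The string column built in the iterrows loop earlier can be vectorized too, when the value depends only on other columns rather than on something collected live. A small sketch, rebuilding the frame so it runs on its own:

```python
import pandas as pd

dfTrials = pd.DataFrame({'no': [1, 2, 3], 'condition': [2, 1, 1]})

# Convert the integer column to strings, then prepend the prefix
# to every row in a single column-wise operation.
dfTrials['response'] = 'answer ' + dfTrials['no'].astype(str)

print(dfTrials)
```

This does not help when each response comes from a participant during the loop, but it is usually the better choice for derived columns computed afterwards.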

And there's always apply:

>>> def f(row):
...     return "c{}n{}".format(row["condition"], row["no"])
... 
>>> dfTrials["r3"] = dfTrials.apply(f, axis=1)
>>> dfTrials
   condition  no  response  response 2    r3
0          2   1  answer 1           3  c2n1
1          1   2  answer 2           3  c1n2
2          1   3  answer 3           4  c1n3

[3 rows x 5 columns]
