Original question: http://stackoverflow.com/questions/17091769/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Python pandas: fill a dataframe row by row
Asked by xApple
The simple task of adding a row to a pandas.DataFrame object seems to be hard to accomplish. There are 3 Stack Overflow questions relating to this, none of which gives a working answer.
Here is what I'm trying to do. I have a DataFrame of which I already know the shape as well as the names of the rows and columns.
>>> df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'])
>>> df
     a    b    c    d
x  NaN  NaN  NaN  NaN
y  NaN  NaN  NaN  NaN
z  NaN  NaN  NaN  NaN
Now, I have a function to compute the values of the rows iteratively. How can I fill in one of the rows with either a dictionary or a pandas.Series? Here are various attempts that have failed:
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df['y'] = y
AssertionError: Length of values does not match length of index
Apparently it tried to add a column instead of a row.
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.join(y)
AttributeError: 'builtin_function_or_method' object has no attribute 'is_unique'
Very uninformative error message.
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.set_value(index='y', value=y)
TypeError: set_value() takes exactly 4 arguments (3 given)
Apparently that is only for setting individual values in the dataframe.
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.append(y)
Exception: Can only append a Series if ignore_index=True
Well, I don't want to ignore the index, otherwise here is the result:
>>> df.append(y, ignore_index=True)
     a    b    c    d
0  NaN  NaN  NaN  NaN
1  NaN  NaN  NaN  NaN
2  NaN  NaN  NaN  NaN
3    1    5    2    3
It did align the column names with the values, but lost the row labels.
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.ix['y'] = y
>>> df
                                  a                                 b  \
x                               NaN                               NaN
y  {'a': 1, 'c': 2, 'b': 5, 'd': 3}  {'a': 1, 'c': 2, 'b': 5, 'd': 3}
z                               NaN                               NaN
                                  c                                 d
x                               NaN                               NaN
y  {'a': 1, 'c': 2, 'b': 5, 'd': 3}  {'a': 1, 'c': 2, 'b': 5, 'd': 3}
z                               NaN                               NaN
That also failed miserably.
So how do you do it?
Accepted answer by Jeff
df['y'] will set a column.
Since you want to set a row, use .loc.
Note that .ix is equivalent here. Yours failed because you tried to assign a dictionary to each element of the row y, which is probably not what you want. Converting to a Series tells pandas that you want to align the input (for example, you then don't have to specify all of the elements).
In [7]: df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'])
In [8]: df.loc['y'] = pandas.Series({'a':1, 'b':5, 'c':2, 'd':3})
In [9]: df
Out[9]:
     a    b    c    d
x  NaN  NaN  NaN  NaN
y    1    5    2    3
z  NaN  NaN  NaN  NaN
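As a small sketch of the alignment behaviour described in this answer (the values used for row 'z' are made up for illustration): a Series is matched to the columns by label, so the order of the keys does not matter, and labels that are missing from the Series simply stay NaN. Since .ix was removed in pandas 1.0, the sketch uses .loc only.

```python
import pandas as pd

df = pd.DataFrame(columns=['a', 'b', 'c', 'd'], index=['x', 'y', 'z'])

# A Series is aligned on its index labels, so the key order does not matter
df.loc['y'] = pd.Series({'d': 3, 'a': 1, 'c': 2, 'b': 5})

# Columns that are absent from the Series are left as NaN in that row
# (hypothetical values, not from the original question)
df.loc['z'] = pd.Series({'a': 7, 'c': 9})

print(df)
```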
Answer by Satheesh
This is a simpler version:
import pandas as pd
df = pd.DataFrame(columns=('col1', 'col2', 'col3'))
for i in range(5):
    # Assigning to a label that does not exist yet adds a new row
    df.loc[i] = ['<some value for first>', '<some value for second>', '<some value for third>']
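The same idea should also work with string row labels, which is closer to what the question asks for. The following is only a sketch: the labels `row0`, `row1`, `row2` and the computed values are hypothetical. Growing a frame one row at a time tends to be slow for large frames, so this is probably best kept for small ones.

```python
import pandas as pd

df = pd.DataFrame(columns=['col1', 'col2', 'col3'])

for i in range(3):
    # .loc with a label that is not in the index yet enlarges the frame;
    # the list must have one value per column
    df.loc[f'row{i}'] = [i, i * 2, i * 3]

print(df)
```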
Answer by flow
My approach was the following, but I can't guarantee that it is the fastest solution:
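import pandas as pd
df = pd.DataFrame(columns=["firstname", "lastname"])
# Note: DataFrame.append was removed in pandas 2.0, so this needs an older version
df = df.append({
    "firstname": "John",
    "lastname": "Johny"
}, ignore_index=True)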
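DataFrame.append was deprecated in pandas 1.4 and removed in 2.0. A rough equivalent on current versions is pd.concat with a one-row frame. This is a sketch rather than the answerer's own method, and the starting row ("Jane", "Doe") is a made-up placeholder so that the frame is not empty.

```python
import pandas as pd

# Existing frame with one hypothetical starting row
df = pd.DataFrame([{"firstname": "Jane", "lastname": "Doe"}])

# Wrap the new row in a one-row DataFrame, then concatenate
new_row = pd.DataFrame([{"firstname": "John", "lastname": "Johny"}])
df = pd.concat([df, new_row], ignore_index=True)

print(df)
```

As with append and ignore_index=True, the row labels are renumbered from 0, so this does not preserve custom row labels.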
Answer by stackoverflowuser2010
If your input rows are lists rather than dictionaries, then the following is a simple solution:
import pandas as pd
list_of_lists = []
list_of_lists.append([1,2,3])
list_of_lists.append([4,5,6])
pd.DataFrame(list_of_lists, columns=['A', 'B', 'C'])
#    A  B  C
# 0  1  2  3
# 1  4  5  6
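If the rows are dictionaries with known row labels, as in the question, a similar collect-first approach might look like the sketch below. It gathers the rows in a plain dict and builds the frame once with DataFrame.from_dict. The values for 'x' and 'z' are placeholders; only the 'y' row comes from the question.

```python
import pandas as pd

rows = {}  # maps row label -> dict of column values
rows['x'] = {'a': 0, 'b': 0, 'c': 0, 'd': 0}  # placeholder values
rows['y'] = {'a': 1, 'b': 5, 'c': 2, 'd': 3}  # the row from the question
rows['z'] = {'a': 9, 'b': 9, 'c': 9, 'd': 9}  # placeholder values

# orient='index' turns the outer keys into row labels
# and the inner keys into column names
df = pd.DataFrame.from_dict(rows, orient='index')

print(df)
```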

