Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/51499385/

Time: 2020-09-14 05:50:12  Source: igfitidea

How to add values to a new column in pandas dataframe?

python pandas dataframe

Asked by barciewicz

I want to create a new named column in a pandas DataFrame, insert a first value into it, and then add further values to the same column:

Something like:

import pandas

df = pandas.DataFrame()
df['New column'].append('a')
df['New column'].append('b')
df['New column'].append('c')

etc.

How do I do that?

Accepted answer by jezrael

Don't do this, because it is slow:

6) updating an empty frame a-single-row-at-a-time. I have seen this method used WAY too much. It is by far the slowest. It is probably common place (and reasonably fast for some python structures), but a DataFrame does a fair number of checks on indexing, so this will always be very slow to update a row at a time. Much better to create new structures and concat.
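
The "create new structures and concat" advice in the quote could look roughly like the sketch below. The `chunks` list is a made-up stand-in for data that arrives in several pieces; it is not part of the original answer.

```python
import pandas

# Hypothetical pieces of data, e.g. produced in separate steps.
chunks = [['a', 'b'], ['c']]

# Build one small DataFrame per piece ...
frames = [pandas.DataFrame({'New column': chunk}) for chunk in chunks]

# ... and join them with a single concat call instead of growing
# one frame row by row. ignore_index=True renumbers the rows from 0.
df = pandas.concat(frames, ignore_index=True)
```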

It is better to build a list of the data and create the DataFrame with the constructor:

import pandas

# Collect the values in a list, then build the DataFrame in one step.
vals = ['a', 'b', 'c']
df = pandas.DataFrame({'New column': vals})
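
If the values only become available one at a time, as in the question, one possible pattern is to append them to a plain Python list and call the constructor once at the end. This is a minimal sketch; the loop over `'abc'` is only a placeholder for whatever actually produces the values.

```python
import pandas

vals = []                  # plain Python list, cheap to append to
for letter in 'abc':       # placeholder for the real source of values
    vals.append(letter)

# Build the DataFrame once, after all values have been collected.
df = pandas.DataFrame({'New column': vals})
```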

Answer by amo3tasem

If I understand correctly, you want to append values to an existing column in a pandas DataFrame. The thing is that a DataFrame has to keep a matrix-like shape, so the number of rows is the same for every column. What you can do is add a column with a default value and then update that value with:

for index, row in df.iterrows():
    df.at[index, 'new_column'] = new_value
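
A self-contained version of this idea might look like the sketch below. The starting frame, the `existing` column and the `new_values` list are assumptions made for illustration, and the loop walks `df.index` directly rather than using `iterrows()`.

```python
import pandas

# Hypothetical starting frame with three rows.
df = pandas.DataFrame({'existing': [1, 2, 3]})

# Add the column with a default value so every row has an entry.
df['new_column'] = ''

# Hypothetical replacement values, one per row.
new_values = ['a', 'b', 'c']

# Overwrite the default row by row using the label-based .at accessor.
for index, new_value in zip(df.index, new_values):
    df.at[index, 'new_column'] = new_value
```

When all values are already known and there is exactly one per row, `df['new_column'] = new_values` assigns them in a single step without a loop.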