Python: How to add an empty column to a dataframe?
Original question: http://stackoverflow.com/questions/16327055/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
How to add an empty column to a dataframe?
Asked by kjo
What's the easiest way to add an empty column to a pandas DataFrame object? The best I've stumbled upon is something like:
df['foo'] = df.apply(lambda _: '', axis=1)
Is there a less perverse method?
Accepted answer by DSM
If I understand correctly, assignment should fill:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
>>> df
   A  B
0  1  2
1  2  3
2  3  4
>>> df["C"] = ""
>>> df["D"] = np.nan
>>> df
   A  B C   D
0  1  2   NaN
1  2  3   NaN
2  3  4   NaN
Answer by emunsing
To add to DSM's answer and building on this associated question, I'd split the approach into two cases:
Adding a single column: Just assign empty values to the new column, e.g. df['C'] = np.nan
Adding multiple columns: I'd suggest using the .reindex(columns=[...]) method of pandas to add the new columns to the dataframe's column index. This also works for adding multiple new rows with .reindex(index=[...]). Note that newer versions of pandas (v>0.20) allow you to specify an axis keyword rather than explicitly assigning to columns or index.
Here is an example adding multiple columns:
mydf = mydf.reindex(columns = mydf.columns.tolist() + ['newcol1','newcol2'])
or
mydf = mydf.reindex(mydf.columns.tolist() + ['newcol1','newcol2'], axis=1) # version > 0.20.0
You can also always concatenate a new (empty) dataframe to the existing dataframe, but that doesn't feel as pythonic to me :)
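For completeness, the concatenation route mentioned above could look roughly like this. It's only a sketch: the sample data and the names `empty` and `result` are made up for illustration, and the new columns are built as an all-NaN frame that shares the existing index.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame standing in for mydf
mydf = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

# An all-NaN frame with the same index as mydf and the new column names
empty = pd.DataFrame(np.nan, index=mydf.index, columns=["newcol1", "newcol2"])

# Concatenate side by side (axis=1); rows are matched on the index
result = pd.concat([mydf, empty], axis=1)
```

Because concat aligns on the index, giving the empty frame the same index as mydf is what keeps the row count unchanged.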
Answer by Nickil Maveli
Starting with v0.16.0, DF.assign() can be used to assign new columns (single/multiple) to a DF. In older versions these columns get inserted in alphabetical order at the end of the DF; with Python 3.6+ and pandas 0.23+, the keyword order is preserved instead.
This becomes advantageous compared to simple assignment in cases wherein you want to perform a series of chained operations directly on the returned dataframe.
Consider the same DF sample demonstrated by @DSM:
df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
df
Out[18]:
   A  B
0  1  2
1  2  3
2  3  4
df.assign(C="",D=np.nan)
Out[21]:
   A  B C   D
0  1  2   NaN
1  2  3   NaN
2  3  4   NaN
Note that this returns a copy with all the previous columns along with the newly created ones. In order for the original DF to be modified accordingly, use it like df = df.assign(...), as it does not currently support an inplace operation.
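To illustrate the chaining point, here is a small sketch on the same sample data. The filter and rename steps are arbitrary choices of mine, just to show that assign() returns a frame you can keep operating on.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

# assign() returns a new frame, so further steps can be chained onto it
result = (
    df.assign(C="", D=np.nan)
      .loc[lambda d: d["A"] > 1]        # keep rows where A > 1
      .rename(columns={"A": "a"})       # rename a column on the result
)

# The original frame is untouched unless you reassign it
original_columns = list(df.columns)
```

With plain df['C'] = "" assignment, the same thing would need separate statements and would modify df along the way.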
Answer by edge-case
Answer by liana
An even simpler solution is:
df = df.reindex(columns = header_list)
where "header_list" is a list of the headers you want to appear.
Any header included in the list that is not already in the dataframe will be added as a column of blank (NaN) cells. Note that existing columns not included in the list are dropped.
So if
header_list = ['a','b','c', 'd']
then, for a dataframe that already has columns a and b, c and d will be added as columns with blank (NaN) cells.
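A quick sketch of what that looks like in practice. The sample frame, including the extra column z, is made up for illustration; it is there to show that reindex also drops anything not in the list.

```python
import pandas as pd

# Hypothetical frame with columns a, b and an extra column z
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "z": [5, 6]})

header_list = ["a", "b", "c", "d"]

# c and d are added and filled with NaN; z is not in the list, so it is dropped
out = df.reindex(columns=header_list)
```

If you want to keep every existing column, build the list from df.columns.tolist() plus the new names instead.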
Answer by Joy Mazumder
If you want to add column names from a list:
import numpy as np
import pandas as pd

df = pd.DataFrame()
a = ['col1', 'col2', 'col3', 'col4']
# Add one NaN-filled column per name in the list
for i in a:
    df[i] = np.nan
Answer by Carsten
I like:
df['new'] = pd.Series()
This makes sure that a df with zero rows stays at zero rows.
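A small sketch of that behaviour, on made-up frames. One assumption on my side: I pass an explicit dtype to pd.Series, since some pandas versions warn when an empty Series is created without one.

```python
import pandas as pd

# A frame with one column and zero rows (hypothetical example)
empty = pd.DataFrame({"A": pd.Series(dtype="int64")})
empty["new"] = pd.Series(dtype="float64")
rows_after = len(empty)  # still zero rows

# On a non-empty frame, the empty Series is aligned to the index,
# so every row of the new column is NaN
df = pd.DataFrame({"A": [1, 2, 3]})
df["new"] = pd.Series(dtype="float64")
```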
Answer by moys
The code below addresses the question "How do I add n number of empty columns to my existing dataframe". In the interest of keeping solutions to similar problems in one place, I am adding it here.
Approach 1 (to create 64 additional columns with column names from 1-64)
m = list(range(1,65,1))
dd=pd.DataFrame(columns=m)
df.join(dd).replace(np.nan,'') #df is the dataframe that already exists
Approach 2 (to create 64 additional columns with column names from 1-64)
df.reindex(df.columns.tolist() + list(range(1,65,1)), axis=1).replace(np.nan,'')
Answer by Bharath_Raja
You can do:
df['column'] = None  # This works: it creates a new column filled with None
df.column = None  # This works only when the column is already present in the dataframe
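To make the difference concrete, here is a small sketch. The frame and the name `extra` are made up; the warning filter is only there because pandas warns when you try to create a column via a new attribute name.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# Bracket assignment creates the column
df["column"] = None

# Attribute assignment with a new name only sets a Python attribute;
# no column is created (pandas emits a UserWarning about this)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.extra = None
created_by_attribute = "extra" in df.columns

# Attribute assignment to an existing column does overwrite it
df.column = 0
```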
Answer by Usman Ahmad
One can use df.insert(index_to_insert_at, column_header, init_value) to insert a new column at a specific index.
cost_tbl.insert(1, "col_name", "")
The above statement would insert an empty column after the first column.
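Here is a runnable version of that call. The contents of cost_tbl are made up, since the original table isn't shown.

```python
import pandas as pd

# Hypothetical stand-in for cost_tbl
cost_tbl = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

# insert() modifies the frame in place and returns None;
# position 1 puts the new column right after the first one
cost_tbl.insert(1, "col_name", "")

columns = list(cost_tbl.columns)
```

Unlike the assignment-based approaches, insert() raises a ValueError if the column name already exists, unless you pass allow_duplicates=True.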

