Python: How do I add an empty column to a DataFrame?

Question

Asked by kjo

What's the easiest way to add an empty column to a pandas DataFrame object? The best I've stumbled upon is something like

df['foo'] = df.apply(lambda _: '', axis=1)

Is there a less perverse method?

Answer 1

Accepted answer by DSM

If I understand correctly, plain assignment of a scalar should fill the whole column:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
>>> df
   A  B
0  1  2
1  2  3
2  3  4
>>> df["C"] = ""
>>> df["D"] = np.nan
>>> df
   A  B C   D
0  1  2   NaN
1  2  3   NaN
2  3  4   NaN
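
The two fill values above are not interchangeable: "" is a real (empty) string, while np.nan marks the value as missing. A minimal sketch of the difference, reusing the same sample frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})
df["C"] = ""       # scalar is broadcast to every row; the column holds text
df["D"] = np.nan   # float64 column in which every value is missing

print(df.dtypes)
print(df["C"].isna().any())  # empty strings do not count as missing
print(df["D"].isna().all())  # NaN does
```

Which one to pick depends on whether later code should treat the column as "no data yet" (np.nan) or as empty text ("").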

Answer 2

Answered by emunsing

To add to DSM's answer and building on this associated question, I'd split the approach into two cases:

Adding a single column: Just assign empty values to the new column, e.g. df['C'] = np.nan
Adding multiple columns: I'd suggest using the .reindex(columns=[...]) method of pandas to add the new columns to the dataframe's column index. This also works for adding multiple new rows with .reindex(index=[...]). Note that newer versions of pandas (v>0.20) allow you to pass the labels together with an axis keyword rather than explicitly assigning to columns or index.

Here is an example adding multiple columns:

mydf = mydf.reindex(columns = mydf.columns.tolist() + ['newcol1','newcol2'])

or

mydf = mydf.reindex(mydf.columns.tolist() + ['newcol1','newcol2'], axis=1)  # version > 0.20.0

You can also always concatenate a new (empty) dataframe to the existing dataframe, but that doesn't feel as pythonic to me :)
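
To make the comparison concrete, here is a small sketch of both routes. The frame contents and the names new_cols, by_reindex, empty and by_concat are made up for illustration:

```python
import pandas as pd

mydf = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})
new_cols = ["newcol1", "newcol2"]

# Route 1: reindex the column axis; labels that do not exist yet become NaN columns
by_reindex = mydf.reindex(columns=mydf.columns.tolist() + new_cols)

# Route 2: build an empty frame on the same row index and concatenate side by side
empty = pd.DataFrame(index=mydf.index, columns=new_cols)
by_concat = pd.concat([mydf, empty], axis=1)

print(by_reindex)
print(by_concat)
```

Both give the same column layout; the reindex version simply avoids building the intermediate frame.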

Answer 3

Answered by Nickil Maveli

Starting with v0.16.0, DF.assign() can be used to assign new columns (single/multiple) to a DF. These columns are appended at the end of the DF. In older setups (Python before 3.6 or pandas before 0.23) they are inserted in alphabetical order; newer versions keep the order of the keyword arguments.

This becomes advantageous compared to simple assignment in cases wherein you want to perform a series of chained operations directly on the returned dataframe.

Consider the same DF sample demonstrated by @DSM:

df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
df
Out[18]:
   A  B
0  1  2
1  2  3
2  3  4

df.assign(C="",D=np.nan)
Out[21]:
   A  B C   D
0  1  2   NaN
1  2  3   NaN
2  3  4   NaN

Note that this returns a copy with all the previous columns along with the newly created ones. In order for the original DF to be modified accordingly, use it like df = df.assign(...), as it does not currently support an inplace operation.
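
As a rough illustration of the chaining use case, the sketch below adds the empty columns and keeps working on the returned frame in one expression. The filter step is only an example of "some further operation":

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

result = (
    df.assign(C="", D=np.nan)            # returns a new frame with the extra columns
      .loc[lambda d: d["A"] > 1]         # keep chaining on the returned frame
      .reset_index(drop=True)
)

print(result)
print(list(df.columns))  # the original frame still has only A and B
```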

Answer 4

Answered by edge-case

@emunsing's answer is really cool for adding multiple columns, but I couldn't get it to work for me in Python 2.7. Instead, I found this works:

mydf = mydf.reindex(columns=np.append(mydf.columns.values, ['newcol1', 'newcol2']))

Answer 5

Answered by liana

An even simpler solution is:

df = df.reindex(columns = header_list)

where "header_list" is a list of the headers you want to appear.

Any header in the list that is not already found in the dataframe will be added as a new column filled with NaN (blank cells).

So if

header_list = ['a','b','c', 'd']

and the dataframe only has columns a and b, then c and d will be added as columns with blank (NaN) cells.
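
One thing worth checking before relying on this: reindex(columns=header_list) returns exactly the columns in the list, so an existing column that is missing from header_list is dropped. A small sketch, where the sample frame and its extra column x are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "x": [5, 6]})
header_list = ["a", "b", "c", "d"]

out = df.reindex(columns=header_list)

print(out)
# c and d are new NaN columns; x is gone because it is not in header_list
```

If every existing column should survive, build the list from df.columns.tolist() plus the new names instead.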

Answer 6

Answered by Joy Mazumder

If you want to add column names from a list:

df=pd.DataFrame()
a=['col1','col2','col3','col4']
for i in a:
    df[i]=np.nan

Answer 7

Answered by Carsten

I like:

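df['new'] = pd.Series(dtype='float64')  # an explicit dtype avoids the empty-Series dtype warning raised by some pandas versions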

This makes sure that a df with zero rows stays at zero rows.
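
A quick sketch of why this works: the empty Series is aligned to the frame's existing index on assignment, so no rows are added and every existing row gets NaN. The frame names below are made up, and dtype="float64" is passed explicitly as an assumption to keep the example quiet across pandas versions:

```python
import pandas as pd

empty_df = pd.DataFrame(columns=["A"])       # zero rows
full_df = pd.DataFrame({"A": [1, 2, 3]})     # three rows

empty_df["new"] = pd.Series(dtype="float64")
full_df["new"] = pd.Series(dtype="float64")

print(len(empty_df))                 # still zero rows
print(full_df["new"].isna().all())   # existing rows are filled with NaN
```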

Answer 8

Answered by moys

The code below addresses the question "How do I add n empty columns to my existing dataframe?". In the interest of keeping solutions to similar problems in one place, I am adding it here.

Approach 1 (to create 64 additional columns with column names from 1-64)

m = list(range(1, 65))
dd = pd.DataFrame(columns=m)
df.join(dd).replace(np.nan, '')  # df is the dataframe that already exists; this returns a new dataframe

Approach 2 (to create 64 additional columns with column names from 1-64)

df.reindex(df.columns.tolist() + list(range(1,65,1)), axis=1).replace(np.nan,'')
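
A possible variation on Approach 2, offered as a sketch rather than a replacement: reindex accepts a fill_value argument, so the new columns can be created with '' directly. Unlike .replace(np.nan, ''), this leaves any NaN already present in the original columns untouched. The sample frame and the names n, new_names and wide are illustrative:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

n = 64
new_names = list(range(1, n + 1))    # column labels 1..64

# New labels are filled with "" instead of NaN; existing columns are copied as-is
wide = df.reindex(columns=df.columns.tolist() + new_names, fill_value="")

print(wide.shape)  # (3, 66)
```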

Answer 9

Answered by Bharath_Raja

You can do

df['column'] = None  # This works: it creates a new column filled with None
df.column = None  # This works only when the column is already present in the dataframe
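
The difference between the two spellings is easy to miss, so here is a short sketch. The attribute name other is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

df["column"] = None     # bracket assignment creates the column
df.other = None         # attribute assignment on an unknown name sets a plain
                        # Python attribute; no column is created

print(list(df.columns))  # ['A', 'column']

df.column = 0           # the column exists now, so this overwrites its values
print(df["column"].tolist())  # [0, 0, 0]
```

Bracket assignment is the safer habit, since it behaves the same whether or not the column already exists.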

Answer 10

Answered by Usman Ahmad

One can use df.insert(index_to_insert_at, column_header, init_value) to insert a new column at a specific index.

cost_tbl.insert(1, "col_name", "")

The above statement would insert an empty Column after the first column.
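
A self-contained version of the same call, using a made-up sample frame in place of cost_tbl. Note that insert modifies the frame in place and returns None, and it raises ValueError if the column name already exists (unless allow_duplicates=True is passed):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

# Position 1 means "second column", i.e. right after the first one
df.insert(1, "C", "")

print(list(df.columns))  # ['A', 'C', 'B']
```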
