Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute the content to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/34682828/

Time: 2020-08-19 15:21:46  Source: igfitidea

Extracting specific selected columns to new DataFrame as a copy

Tags: python, pandas, chained-assignment

Asked by SpeedCoder5

I have a pandas DataFrame with 4 columns and I want to create a new DataFrame that has only three of the columns. This question is similar to "Extracting specific columns from a data frame", but for pandas rather than R. The following code does not work: it raises an error, and it is certainly not the pandasnic way to do it.

import pandas as pd
old = pd.DataFrame({'A' : [4,5], 'B' : [10,20], 'C' : [100,50], 'D' : [-30,-50]})
new = pd.DataFrame(zip(old.A, old.C, old.D)) # raises TypeError: data argument can't be an iterator 

What is the pandasnic way to do it?

Accepted answer by johnchase

There is a way of doing this, and it actually looks similar to R:

new = old[['A', 'C', 'D']].copy()

Here you are just selecting the columns you want from the original data frame and assigning the result to a new variable. If you intend to modify the new dataframe at all, you'll probably want to use .copy() to avoid a SettingWithCopyWarning.
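To make the effect of .copy() concrete, here is a minimal sketch using the frame from the question. It modifies a column of the new frame and then checks that the original is unchanged; the doubling step is only an illustration and not part of the original answer.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# Select three columns and take an explicit, independent copy.
new = old[['A', 'C', 'D']].copy()

# Modify the copy (illustrative change only).
new['A'] = new['A'] * 2

print(list(new.columns))   # column labels of the new frame
print(old['A'].tolist())   # the original column is left untouched
```

Because new owns its own data, later assignments to it cannot be confused with assignments to a slice of old.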

An alternative method is to use filter, which will create a copy by default:

new = old.filter(['A', 'C', 'D'], axis=1)

Finally, depending on the number of columns in your original dataframe, it might be more succinct to express this using drop (this will also create a copy by default):

new = old.drop('B', axis=1)
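As a rough comparison of the three approaches, the sketch below builds the same three-column frame with bracket selection, filter, and drop, and checks that the results match. It also illustrates one behavioural difference worth knowing: filter with items silently skips labels that do not exist, whereas bracket selection raises a KeyError for them. The variable names and the label 'Z' are illustrative only.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

by_brackets = old[['A', 'C', 'D']].copy()
by_filter = old.filter(items=['A', 'C', 'D'])
by_drop = old.drop(columns=['B'])

# All three selections hold the same data.
print(by_brackets.equals(by_filter), by_brackets.equals(by_drop))

# filter ignores the missing label 'Z' instead of raising.
partial = old.filter(items=['A', 'Z'])
print(list(partial.columns))

# Bracket selection raises KeyError for the same request.
try:
    old[['A', 'Z']]
    brackets_raised = False
except KeyError:
    brackets_raised = True
print(brackets_raised)
```

Which form to prefer is a judgment call: drop is shorter when you discard only a few columns, while listing the columns to keep is more explicit.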

Answer by Hit

Another simpler way seems to be:

new = pd.DataFrame([old.A, old.B, old.C]).transpose()

where old.column_name will give you a Series. Make a list of all the column Series you want to retain and pass it to the DataFrame constructor. Each Series becomes a row, so we need to transpose the result to get the columns back.

In [14]: pd.DataFrame([old.A, old.B, old.C]).transpose()
Out[14]: 
   A   B    C
0  4  10  100
1  5  20   50
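A possible alternative, which is not part of the answer above, is pd.concat with axis=1. It places each Series side by side as a column and takes the column labels from the Series names, so no transpose is needed. The sketch below assumes the same old frame as in the question.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# Concatenate the chosen Series column-wise; names become column labels.
new = pd.concat([old.A, old.B, old.C], axis=1)

print(new)
print(new.equals(old[['A', 'B', 'C']]))
```

This form may also be gentler on dtypes when the columns have different types, since each column is kept as its own Series rather than being passed through a row-wise construction and a transpose.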

Answer by Deslin Naidoo

Generic functional form

def select_columns(data_frame, column_names):
    new_frame = data_frame.loc[:, column_names]
    return new_frame

Applied to your problem above:

selected_columns = ['A', 'C', 'D']
new = select_columns(old, selected_columns)
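A hedged variant of the helper, under a hypothetical name select_columns_copy, adds an explicit .copy() so the caller can modify the result freely. The sketch also shows that, in recent pandas versions, .loc with a list of labels raises a KeyError when a requested column is missing; the label 'Z' is made up for the demonstration.

```python
import pandas as pd

def select_columns_copy(data_frame, column_names):
    """Hypothetical variant of select_columns that returns an explicit copy."""
    return data_frame.loc[:, list(column_names)].copy()

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

new = select_columns_copy(old, ['A', 'C', 'D'])
new.loc[0, 'A'] = 999          # changes the copy only
print(old.loc[0, 'A'])         # original value is preserved

# Requesting a column that does not exist raises KeyError.
try:
    select_columns_copy(old, ['A', 'Z'])
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)
```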

Answer by Ellen

As far as I can tell, you don't necessarily need to specify the axis when using the filter function.

new = old.filter(['A','B','D'])

returns the same dataframe as

new = old.filter(['A','B','D'], axis=1)
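This can be checked directly. For a DataFrame, filter operates on the columns when no axis is given, so the two calls should agree; the sketch below compares them on the frame from the question.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

without_axis = old.filter(['A', 'B', 'D'])
with_axis = old.filter(['A', 'B', 'D'], axis=1)

# Both calls select the same columns with the same values.
print(without_axis.equals(with_axis))
```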

Answer by stidmatt

The easiest way is

new = old[['A','C','D']]

Answer by sailfish009

Select columns by position (integer index):

# select the columns at positions 1, 6, and 7 (the frame must have at least 8 columns)
new = old.iloc[:, [1, 6, 7]].copy()
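For the four-column frame from the question, the positions would be different. As a small sketch, columns A, C, and D sit at positions 0, 2, and 3:

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# Positions are zero-based: A=0, B=1, C=2, D=3.
new = old.iloc[:, [0, 2, 3]].copy()

print(list(new.columns))
```

Positional selection is convenient when labels are unknown or duplicated, but it depends on the column order staying the same.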

Answer by Ali.E

If you want to have a new data frame then:

import pandas as pd
old = pd.DataFrame({'A' : [4,5], 'B' : [10,20], 'C' : [100,50], 'D' : [-30,-50]})
new = old[['A', 'C', 'D']]