Python: extracting specific selected columns to a new DataFrame as a copy
Original question: http://stackoverflow.com/questions/34682828/
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Extracting specific selected columns to new DataFrame as a copy
Asked by SpeedCoder5
I have a pandas DataFrame with 4 columns and I want to create a new DataFrame that only has three of the columns. This question is similar to "Extracting specific columns from a data frame", but for pandas, not R. The following code does not work, raises an error, and is certainly not the pandasnic way to do it.
import pandas as pd
old = pd.DataFrame({'A' : [4,5], 'B' : [10,20], 'C' : [100,50], 'D' : [-30,-50]})
new = pd.DataFrame(zip(old.A, old.C, old.D)) # raises TypeError: data argument can't be an iterator
What is the pandasnic way to do it?
Accepted answer by johnchase
There is a way of doing this, and it actually looks similar to R:
new = old[['A', 'C', 'D']].copy()
Here you are just selecting the columns you want from the original data frame and creating a variable for those. If you want to modify the new dataframe at all, you'll probably want to use .copy() to avoid a SettingWithCopyWarning.
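A minimal sketch of what the explicit .copy() gives you, reusing the old frame from the question. Doubling column A is only an illustrative modification, not something from the original answer.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# Select three columns and take an explicit copy.
new = old[['A', 'C', 'D']].copy()

# Modify the copy; the original frame is left untouched.
new['A'] = new['A'] * 2

print(new)
print(old)
```

Because new owns its own data, later edits to it should neither warn nor leak back into old.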
An alternative method is to use filter, which will create a copy by default:
new = old.filter(['A','B','D'], axis=1)
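One difference worth knowing when choosing between filter and bracket selection: filter skips labels it cannot find, while selecting with a list raises. The sketch below uses 'Z' as a hypothetical label that is not in old.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# filter keeps the labels it finds and silently skips the rest.
# 'Z' is a hypothetical label that does not exist in old.
filtered = old.filter(['A', 'Z', 'D'], axis=1)
print(list(filtered.columns))

# Bracket selection with the same list raises KeyError instead.
try:
    old[['A', 'Z', 'D']]
except KeyError as exc:
    print('KeyError:', exc)
```

If a missing column should be treated as an error, bracket selection is probably the safer choice.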
Finally, depending on the number of columns in your original dataframe, it might be more succinct to express this using drop (this will also create a copy by default):
new = old.drop('B', axis=1)
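The same drop can also be written with the columns keyword, which some readers find clearer than axis=1. A small sketch on the question's frame, also checking that old keeps its column B:

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# Equivalent to old.drop('B', axis=1); returns a new frame.
new = old.drop(columns=['B'])

print(list(new.columns))
print(list(old.columns))
```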
Answer by Hit
Another, simpler way seems to be:
new = pd.DataFrame([old.A, old.B, old.C]).transpose()
where old.column_name will give you a Series. Make a list of all the column Series you want to retain and pass it to the DataFrame constructor. We need to do a transpose to adjust the shape.
In [14]: pd.DataFrame([old.A, old.B, old.C]).transpose()
Out[14]:
   A   B    C
0  4  10  100
1  5  20   50
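A caveat with the transpose approach: it works cleanly here because every column is an integer. With mixed column types, stacking the Series as rows forces a common dtype. The sketch below uses a hypothetical frame named mixed with one integer and one string column to show the difference from plain selection.

```python
import pandas as pd

# Hypothetical frame with mixed column types (not from the question).
mixed = pd.DataFrame({'A': [4, 5], 'S': ['x', 'y']})

# Stacking Series as rows mixes ints and strings within each column,
# so the transposed result ends up with object dtype.
via_transpose = pd.DataFrame([mixed.A, mixed.S]).transpose()

# Plain column selection keeps each column's own dtype.
via_selection = mixed[['A', 'S']].copy()

print(via_transpose.dtypes)
print(via_selection.dtypes)
```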
Answer by Deslin Naidoo
Generic functional form:
def select_columns(data_frame, column_names):
    new_frame = data_frame.loc[:, column_names]
    return new_frame
Specific to your problem above:
selected_columns = ['A', 'C', 'D']
new = select_columns(old, selected_columns)
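Since the question asks for a copy, one possible variant of this helper adds an explicit .copy() and a clearer error for missing names. The name select_columns_copy and the error message are hypothetical, not part of the original answer.

```python
import pandas as pd

def select_columns_copy(data_frame, column_names):
    """Return an independent copy of data_frame limited to column_names.

    Hypothetical helper for illustration only.
    """
    missing = [name for name in column_names if name not in data_frame.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return data_frame.loc[:, column_names].copy()

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})
new = select_columns_copy(old, ['A', 'C', 'D'])

# Editing the result does not touch the source frame.
new.loc[0, 'A'] = 0
print(old.loc[0, 'A'])
```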
Answer by Ellen
As far as I can tell, you don't necessarily need to specify the axis when using the filter function.
new = old.filter(['A','B','D'])
returns the same dataframe as
new = old.filter(['A','B','D'], axis=1)
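This can be checked directly: for a DataFrame, filter defaults to the column axis. The sketch below compares the two calls and, for contrast, shows axis=0, which filters row labels instead.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

without_axis = old.filter(['A', 'B', 'D'])
with_axis = old.filter(['A', 'B', 'D'], axis=1)

# Both calls select the same columns.
print(without_axis.equals(with_axis))

# With axis=0 the items are matched against the row index instead.
first_row = old.filter([0], axis=0)
print(first_row.shape)
```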
Answer by stidmatt
The easiest way is:
new = old[['A','C','D']]
Answer by sailfish009
Select columns by index:
# selected column index: 1, 6, 7
new = old.iloc[:, [1, 6, 7]].copy()
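Note that positions 1, 6, and 7 assume a frame with at least eight columns, so they would not fit the four-column old from the question. A sketch adapted to that frame, using Index.get_indexer to look up the positions of A, C, and D rather than hard-coding them:

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# Look up the integer positions of the wanted labels.
positions = old.columns.get_indexer(['A', 'C', 'D'])
print(positions)

# Select by position and take an explicit copy.
new = old.iloc[:, positions].copy()
print(list(new.columns))
```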
Answer by Ali.E
If you want to have a new data frame then:
import pandas as pd
old = pd.DataFrame({'A' : [4,5], 'B' : [10,20], 'C' : [100,50], 'D' : [-30,-50]})
new = old[['A', 'C', 'D']]