Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/45035929/

Time: 2020-08-19 16:38:11  Source: igfitidea

Creating new pandas dataframe from certain columns of existing dataframe

python pandas dataframe

Asked by Sjoseph

I have loaded a CSV file into a pandas dataframe and want to do some simple manipulations on it. I cannot figure out how to create a new dataframe based on selected columns from my original dataframe. My attempt:

import pandas

names = ['A','B','C','D']
dataset = pandas.read_csv('file.csv', names=names)
new_dataset = dataset['A','D']  # fails with KeyError: ('A', 'D') is looked up as one column label

I would like to create a new dataframe with the columns A and D from the original dataframe.

Accepted answer by jezrael

This is called subsetting: pass a list of columns inside []:

dataset = pandas.read_csv('file.csv', names=names)

new_dataset = dataset[['A','D']]

which is the same as:

new_dataset = dataset.loc[:, ['A','D']]
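
To make the difference between the two bracket forms concrete, here is a minimal sketch. It assumes a small in-memory DataFrame in place of file.csv (the values are made up for illustration) and checks that both selections give the same result, while the single-bracket form from the question fails.

```python
import pandas as pd

# Hypothetical data standing in for the contents of file.csv
dataset = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

# Double brackets: the inner list names the columns to keep
by_brackets = dataset[['A', 'D']]

# Equivalent selection with .loc: all rows, columns A and D
by_loc = dataset.loc[:, ['A', 'D']]

# Single brackets treat 'A','D' as one tuple key, which is not a column label
try:
    dataset['A', 'D']
    single_bracket_failed = False
except KeyError:
    single_bracket_failed = True

print(list(by_brackets.columns))
print(by_brackets.equals(by_loc))
print(single_bracket_failed)
```

Both selections return a DataFrame containing only columns A and D.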

If you only need the filtered output, add the usecols parameter to read_csv:

new_dataset = pandas.read_csv('file.csv', names=names, usecols=['A','D'])
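
As a rough illustration of usecols, the sketch below reads from an in-memory string instead of file.csv. The CSV text is an assumption made up for this example; the names and usecols arguments match the call above.

```python
import io
import pandas as pd

# Hypothetical header-less CSV content standing in for file.csv
csv_text = "1,2,3,4\n5,6,7,8\n"

names = ['A', 'B', 'C', 'D']

# Only columns A and D are parsed; B and C are skipped while reading
new_dataset = pd.read_csv(io.StringIO(csv_text), names=names, usecols=['A', 'D'])

print(new_dataset)
```

This avoids loading the unwanted columns at all, which can matter for large files.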

EDIT:

If you use only:

new_dataset = dataset[['A','D']]

and then do some data manipulation on the result, you may get this warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

If you modify values in new_dataset later, you will find that the modifications do not propagate back to the original data (dataset), and that pandas issues a warning.

As EdChum pointed out, add copy to remove the warning:

new_dataset = dataset[['A','D']].copy()
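
A small sketch of what the explicit copy gives you, again assuming made-up in-memory data rather than file.csv: the new dataframe can be modified freely and the original stays unchanged.

```python
import pandas as pd

# Hypothetical data standing in for the contents of file.csv
dataset = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6], 'D': [7, 8]})

# .copy() makes new_dataset an independent DataFrame
new_dataset = dataset[['A', 'D']].copy()

# Modify the copy; the original dataset is not affected
new_dataset['A'] = new_dataset['A'] * 10

print(new_dataset['A'].tolist())
print(dataset['A'].tolist())
```

If the changes should end up in the original data instead, they would need to be assigned on dataset itself, for example with .loc as the warning text suggests.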