Python: create a new pandas DataFrame from certain columns of an existing DataFrame

Question

Asked by Sjoseph

I have loaded a csv file into a pandas dataframe and want to do some simple manipulations on it. I cannot figure out how to create a new dataframe based on selected columns from my original dataframe. My attempt:

names = ['A','B','C','D']
dataset = pandas.read_csv('file.csv', names=names)
new_dataset = dataset['A','D']

I would like to create a new dataframe with the columns A and D from the original dataframe.

Answer 1

Accepted answer by jezrael

This is called subsetting: pass a list of column names inside []:

dataset = pandas.read_csv('file.csv', names=names)

new_dataset = dataset[['A','D']]

which is the same as:

new_dataset = dataset.loc[:, ['A','D']]
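
As a minimal sketch of the two selections above, the snippet below builds a small dataframe from an in-memory CSV string and checks that both forms return the same columns. The sample values and the names csv_text and same_dataset are assumptions made for illustration; they stand in for the file.csv from the question.

```python
import io

import pandas

# Hypothetical stand-in for file.csv: three rows, four columns, no header row.
csv_text = "1,2,3,4\n5,6,7,8\n9,10,11,12\n"
names = ['A', 'B', 'C', 'D']
dataset = pandas.read_csv(io.StringIO(csv_text), names=names)

# A list of column labels inside [] returns a DataFrame with only those columns.
new_dataset = dataset[['A', 'D']]

# .loc with "all rows" (:) and the same list selects the same data.
same_dataset = dataset.loc[:, ['A', 'D']]

print(list(new_dataset.columns))
print(new_dataset.equals(same_dataset))
```

Note the double brackets: the outer pair is the indexing operator and the inner pair is the list of labels. With single brackets, dataset['A','D'] is interpreted as one tuple-valued label.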

If you only need the selected columns, pass the usecols parameter to read_csv so that only those columns are loaded:

new_dataset = pandas.read_csv('file.csv', names=names, usecols=['A','D'])
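
A small sketch of the usecols approach, again assuming a hypothetical in-memory CSV (csv_text) in place of file.csv. Because names assigns the labels A to D to the file's columns, usecols can refer to those labels directly.

```python
import io

import pandas

# Hypothetical stand-in for file.csv: two rows, four columns, no header row.
csv_text = "1,2,3,4\n5,6,7,8\n"
names = ['A', 'B', 'C', 'D']

# Only columns A and D are parsed; B and C are never loaded into memory.
new_dataset = pandas.read_csv(
    io.StringIO(csv_text), names=names, usecols=['A', 'D']
)

print(new_dataset)
```

This avoids creating the full dataframe first, which may matter for large files when the other columns are not needed at all.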

EDIT:

If you use only:

new_dataset = dataset[['A','D']]

and then modify the data in new_dataset, you may get this warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

If you modify values in new_dataset later, you will find that the modifications do not propagate back to the original data (dataset), and that pandas issues a warning.

As EdChum pointed out, add copy() to remove the warning:

new_dataset = dataset[['A','D']].copy()
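
The following sketch illustrates the effect of copy(), assuming a hypothetical in-memory CSV (csv_text) in place of file.csv. After the explicit copy, new_dataset is an independent dataframe, so changing it leaves dataset untouched.

```python
import io

import pandas

# Hypothetical stand-in for file.csv: two rows, four columns, no header row.
csv_text = "1,2,3,4\n5,6,7,8\n"
names = ['A', 'B', 'C', 'D']
dataset = pandas.read_csv(io.StringIO(csv_text), names=names)

# An explicit copy makes new_dataset independent of dataset.
new_dataset = dataset[['A', 'D']].copy()

# Modify the copy; the original dataframe keeps its values.
new_dataset['A'] = new_dataset['A'] * 10

print(new_dataset['A'].tolist())
print(dataset['A'].tolist())
```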
