Python: How to drop a specific column while reading a CSV file with pandas?

Question

Asked by Anon George

I need to remove the column labeled name at the time of loading a CSV with pandas. I am reading the CSV as follows and want to add a parameter to the call to do so. Thanks.

pd.read_csv("sample.csv")

I know how to do this after reading the CSV:

df.drop('name', axis=1)

Answer 1

Answered by Sociopath

If you know the column names in advance, you can do this by setting the usecols parameter.

When you know which columns to use

Suppose you have a CSV file with columns ['id','name','last_name'] and you want just ['name','last_name']. You can do it as below:

import pandas as pd
df = pd.read_csv("sample.csv", usecols = ['name','last_name'])
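To try this without a file on disk, here is a minimal sketch that assumes a small in-memory CSV (the csv_text contents are made up for illustration and stand in for sample.csv):

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for sample.csv
csv_text = "id,name,last_name\n1,Ann,Lee\n2,Bob,Ray\n"

# Only the listed columns are parsed; 'id' is never loaded
df = pd.read_csv(io.StringIO(csv_text), usecols=["name", "last_name"])
print(df.columns.tolist())
```

Note that usecols ignores the order of its elements: the resulting columns follow the order in the file. If a different order is needed, reindex afterwards, for example df[["last_name", "name"]].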

When you want the first N columns

If you don't know the column names but want the first N columns of the file, you can select them by position:

import pandas as pd
n = 3  # number of leading columns to keep (example value)
df = pd.read_csv("sample.csv", usecols=list(range(n)))
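A self-contained sketch of the positional variant, again assuming a made-up in-memory CSV (csv_text and the value of n are illustrative only):

```python
import io
import pandas as pd

# Hypothetical four-column CSV; the column names are not used for selection
csv_text = "id,name,last_name,age\n1,Ann,Lee,30\n2,Bob,Ray,41\n"

n = 2  # keep the first two columns
df_first_n = pd.read_csv(io.StringIO(csv_text), usecols=list(range(n)))
print(df_first_n.columns.tolist())
```

Integer entries in usecols are interpreted as zero-based column positions, so this works even when the header labels are unknown.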

Edit

When you know name of the column to be dropped

import pandas as pd

# Read column names from file (one data row is enough to get the header)
cols = list(pd.read_csv("sample_data.csv", nrows=1))
print(cols)

# Use a list comprehension to leave the unwanted column out of usecols
df = pd.read_csv("sample_data.csv", usecols=[i for i in cols if i != 'name'])
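As a possible alternative to reading the header first (not part of the original answer), usecols also accepts a callable that is evaluated against each column name; columns for which it returns True are kept. A minimal sketch, assuming a made-up in-memory CSV in place of sample_data.csv:

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for sample_data.csv
csv_text = "id,name,last_name\n1,Ann,Lee\n2,Bob,Ray\n"

# Keep every column whose label is not 'name', in a single read
df = pd.read_csv(io.StringIO(csv_text), usecols=lambda col: col != "name")
print(df.columns.tolist())
```

This avoids opening the file twice, at the cost of being slightly less explicit about which columns end up in the result.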

Answer 2

Answered by cs95

Get the column headers from your CSV using pd.read_csv with nrows=1, then do a second read with usecols to pull in everything except the column(s) you want to omit.

import pandas as pd
headers = [*pd.read_csv('sample.csv', nrows=1)]  # column labels only
df = pd.read_csv('sample.csv', usecols=[c for c in headers if c != 'name'])

Alternatively, you can do the same thing (read only the headers) very efficiently using the csv module:

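import csv
import pandas as pd

# Read only the header row with the csv module
with open("sample.csv", 'r') as f:
    header = next(csv.reader(f))
    # For Python 2, use
    # header = csv.reader(f).next()

# usecols ignores element order, so a set difference is fine here
df = pd.read_csv('sample.csv', usecols=list(set(header) - {'name'}))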

Answer 3

Answered by Ege

Using df = df.drop(['ID','prediction'], axis=1) worked for me. I dropped the 'ID' and 'prediction' columns. Make sure you put them in square brackets, like ['column1','column2']. There is no need for other, more complicated solutions.
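A small sketch of this read-then-drop approach, assuming a made-up in-memory CSV (csv_text and the feature column are illustrative, not from the answer):

```python
import io
import pandas as pd

# Hypothetical CSV containing the two columns mentioned in the answer
csv_text = "ID,feature,prediction\n1,0.5,yes\n2,0.7,no\n"

df = pd.read_csv(io.StringIO(csv_text))
# drop returns a new DataFrame, so the result must be assigned back
df = df.drop(columns=["ID", "prediction"])
print(df.columns.tolist())
```

Here columns=[...] is equivalent to passing the list with axis=1. Unlike usecols, this parses the unwanted columns first and discards them afterwards, which is simpler but uses more memory on wide files.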
