Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow post: http://stackoverflow.com/questions/35831496/

Date: 2020-08-19 17:02:28  Source: igfitidea

Key error when selecting columns in pandas dataframe after read_csv

Tags: python, csv, pandas

Asked by Harry M

I'm trying to read a CSV file into a pandas dataframe and select a column, but I keep getting a key error.

The file reads in successfully and I can view the dataframe in an IPython notebook, but when I try to select any column other than the first one, it throws a key error.

I am using this code:

import pandas as pd

transactions = pd.read_csv('transactions.csv',low_memory=False, delimiter=',', header=0, encoding='ascii')
transactions['quarter']

This is the file I'm working on: https://www.dropbox.com/s/imd7hq2iq23hf8o/transactions.csv?dl=0

Thank you!

Answered by MaxU

Use sep=r'\s*,\s*' so that the spaces around the commas, including those in the column names, are handled:

transactions = pd.read_csv('transactions.csv', sep=r'\s*,\s*',
                           header=0, encoding='ascii', engine='python')

Alternatively, you can make sure that there are no unquoted spaces in your CSV file and use your original command unchanged.

Proof:

print(transactions.columns.tolist())

Output:

['product_id', 'customer_id', 'store_id', 'promotion_id', 'month_of_year', 'quarter', 'the_year', 'store_sales', 'store_cost', 'unit_sales', 'fact_count']
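To make the effect visible without the original file, here is a minimal sketch using a made-up three-column sample (the data in `raw` is an assumption, not the contents of transactions.csv). It shows the default parse keeping the leading space in the column names, the regex separator from this answer, and `skipinitialspace=True` as another option that should cover the common "space after the comma" case.

```python
import io
import pandas as pd

# Hypothetical sample mimicking a CSV with a space after each comma.
raw = "product_id, quarter, the_year\n1, Q1, 1997\n2, Q2, 1997\n"

# Default parsing keeps the leading space as part of the column name.
plain = pd.read_csv(io.StringIO(raw))
print(plain.columns.tolist())

# Regex separator, as in the answer above (needs the python engine).
by_regex = pd.read_csv(io.StringIO(raw), sep=r"\s*,\s*", engine="python")
print(by_regex.columns.tolist())

# skipinitialspace=True drops whitespace that follows each delimiter.
by_flag = pd.read_csv(io.StringIO(raw), skipinitialspace=True)
print(by_flag["quarter"].tolist())
```

Note that `skipinitialspace` only handles spaces after the delimiter; spaces before a comma would still need the regex separator or a later `str.strip()` on the column names.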

Answered by Aswin Babu

If you need to select multiple columns from a dataframe, use two pairs of square brackets, e.g.:

df[["product_id","customer_id","store_id"]]
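A small sketch of why the double brackets matter, using a hypothetical two-row frame that borrows column names from the question: the inner brackets build a list of labels, whereas labels separated by a bare comma are passed as a single tuple key, which is another way to end up with a KeyError.

```python
import pandas as pd

# Hypothetical two-row frame using column names from the question.
df = pd.DataFrame({
    "product_id": [1, 2],
    "customer_id": [10, 20],
    "store_id": [100, 200],
})

one = df["store_id"]                   # single label -> Series
many = df[["product_id", "store_id"]]  # list of labels -> DataFrame
print(type(one).__name__, type(many).__name__)

# Without the inner brackets the two labels form ONE tuple key,
# which does not exist among the columns.
try:
    df["product_id", "store_id"]
except KeyError as err:
    print("KeyError:", err)
```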

Answered by beta

A key error generally occurs when the key does not exactly match any of the dataframe's column names.

You could also try:

import pandas as pd

filename = "transactions.csv"  # assumed: path to the CSV file from the question

with open(filename, "r") as file:
    df = pd.read_csv(file, delimiter=",")

# Strip leading and trailing whitespace from every column name
df.columns = df.columns.str.strip()
print(df.columns)
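When it is not obvious which name is off, one possible way to diagnose it is sketched below. The frame and the `wanted` key are hypothetical; `repr()` exposes hidden whitespace, and the standard library's `difflib.get_close_matches` suggests near misses before the names are normalised.

```python
import difflib
import pandas as pd

# Hypothetical frame whose column names carry stray spaces.
df = pd.DataFrame({"product_id": [1], " quarter": ["Q1"], "the_year ": [1997]})

# repr() makes leading/trailing whitespace visible.
print([repr(name) for name in df.columns])

# Suggest the closest existing names for a key that failed.
wanted = "quarter"
suggestions = difflib.get_close_matches(wanted, df.columns.tolist(), n=3, cutoff=0.6)
print(suggestions)

# Normalise the names once, then select as usual.
df.columns = df.columns.str.strip()
print(df[wanted].tolist())
```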

Answered by john collin

I finally got the answer to how to read a specific column:

import pandas as pd  

df=pd.read_csv('titanic.csv',sep='\t')

df['key']   

This worked because my file is tab-separated: read_csv splits on commas by default, so sep='\t' has to be passed explicitly. I hope that works for you.
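As a rough illustration of the separator mismatch, the sketch below parses a made-up tab-separated sample (the column names and values are assumptions, not the titanic.csv from this answer). With the default comma separator the whole header collapses into a single column, so any other key fails; `sep=None` with the python engine is shown as an option for letting pandas try to detect the delimiter.

```python
import io
import pandas as pd

# Hypothetical tab-separated sample; the column names are made up.
raw = "name\tage\tfare\nAlice\t29\t7.25\nBob\t35\t8.05\n"

# With the default comma separator the whole header becomes one column.
wrong = pd.read_csv(io.StringIO(raw))
print(wrong.columns.tolist())

# Passing the real delimiter splits the columns correctly.
right = pd.read_csv(io.StringIO(raw), sep="\t")
print(right["age"].tolist())

# sep=None asks the python engine to detect the delimiter itself.
sniffed = pd.read_csv(io.StringIO(raw), sep=None, engine="python")
print(sniffed.columns.tolist())
```

Delimiter detection is a heuristic, so passing the separator explicitly is the safer choice when it is known.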
