Python: Using Excel CSV file to read only certain columns and rows
Source: http://stackoverflow.com/questions/15286560/ (Stack Overflow, licensed under CC BY-SA 4.0; credit belongs to the original authors)
Asked by Thomas Jones
I can read a CSV file, but instead of reading the whole file, how can I print only certain rows and columns?
Imagine this is the file as it appears in Excel:
A            B                    C                   D                    E
State      | Heart Disease Rate | Stroke Death Rate | HIV Diagnosis Rate | Teen Birth Rate
Alabama      235.5                54.5                16.7                 18.01
Alaska       147.9                44.3                3.2                  N/A
Arizona      152.5                32.7                11.9                 N/A
Arkansas     221.8                57.4                10.2                 N/A
California   177.9                42.2                N/A                  N/A
Colorado     145.3                39                  8.4                  9.25
Here's what I have:
import csv
try:
    risk = open('riskfactors.csv', 'r', encoding="windows-1252").read()  # find the file
except:
    while risk != "riskfactors.csv":  # if the file can't be found or there is an error
        print("Could not open", risk, "file")
        risk = input("\nPlease try to open file again: ")
else:
    with open("riskfactors.csv") as f:
        reader = csv.reader(f, delimiter=' ', quotechar='|')
        data = []
        for row in reader:  # Number of rows including the death rates
            for col in (2, 4):  # The columns I want read B and D
                data.append(row)
                data.append(col)
        for item in data:
            print(item)  # print the rows and columns
I need to read only columns B and D (alongside the state names in column A), with all of their statistics, so the output reads like this:
A            B                    D
State      | Heart Disease Rate | HIV Diagnosis Rate |
Alabama      235.5                16.7
Alaska       147.9                3.2
Arizona      152.5                11.9
Arkansas     221.8                10.2
California   177.9                N/A
Colorado     145.3                8.4
Edited: no errors.
Any ideas on how to tackle this? Nothing I have tried works. Any help or advice is much appreciated.
Accepted answer by max k.
If you're still stuck, there's really no reason you have to read the file with the csv module, since a simple CSV file is just lines of comma-separated strings (as long as no field contains a quoted comma). So, for something simple, you could try this, which gives you a list of tuples of the form (state, heart disease rate, HIV diagnosis rate):
output = []
# Open the file for reading; Python 3 handles universal newlines by default
with open('riskfactors.csv', 'r') as f:
    for line in f:
        cells = line.split(",")
        # Keep the first, second and fourth columns (A, B and D)
        output.append((cells[0], cells[1], cells[3]))
print(output)
Just note that the header rows end up in the list too, so you would have to skip them before doing any sort of data analysis.
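If fields may contain quoted commas, or the header needs to be kept apart from the data, the csv module takes care of both. The following is a minimal sketch, assuming a comma-delimited file laid out like the sample in the question; the helper name select_columns and the inline sample text are made up for illustration and are not part of the original answer.

```python
import csv
import io

# Hypothetical sample standing in for riskfactors.csv (comma-delimited is assumed)
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
"""

def select_columns(lines, indices):
    """Return (header, rows), keeping only the given 0-based column indices."""
    reader = csv.reader(lines)
    header = next(reader)  # the first line is treated as the header row
    picked_header = [header[i] for i in indices]
    # Skip blank lines, keep the requested cells of every other row
    rows = [[row[i] for i in indices] for row in reader if row]
    return picked_header, rows

header, rows = select_columns(io.StringIO(SAMPLE), [0, 1, 3])
print(header)
for row in rows:
    print(row)
```

With a real file, pass open('riskfactors.csv', newline='') in place of the io.StringIO object.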
Answer by Paul Yin
Try this:
data = []
for row in reader:  # reader is the csv.reader object from the question
    data.append([row[1], row[3]])  # columns B and D (indices 1 and 3)
for item in data:
    print(item)  # print the selected columns of each row
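This snippet relies on the reader object from the question. Since the question also asks about reading only certain rows, here is a self-contained sketch that limits both rows and columns. It assumes a comma-delimited file with a single header line; the function name read_rows_and_columns and the sample text are hypothetical.

```python
import csv
import io
from itertools import islice

# Hypothetical sample standing in for riskfactors.csv (comma-delimited is assumed)
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
Arizona,152.5,32.7,11.9,N/A
Arkansas,221.8,57.4,10.2,N/A
"""

def read_rows_and_columns(lines, columns, start, stop):
    """Keep data rows start..stop-1 (0-based, header excluded) and the given columns."""
    reader = csv.reader(lines)
    next(reader)  # skip the header line
    return [[row[c] for c in columns] for row in islice(reader, start, stop)]

# Columns B and D of the second and third data rows (Alaska and Arizona)
data = read_rows_and_columns(io.StringIO(SAMPLE), columns=(1, 3), start=1, stop=3)
for item in data:
    print(item)
```

itertools.islice stops reading once the last requested row has been consumed, so the rest of the file is never parsed.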
Answer by LonelySoul
I hope you have heard of pandas for data analysis.
The following code handles reading the columns; as for reading particular rows, you would need to explain more about what you want.
import pandas
# usecols positions are 0-based, so 0, 1 and 3 select the 1st, 2nd and 4th columns
io = pandas.read_csv('test.csv', sep=",", usecols=[0, 1, 3])
print(io)
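As for rows, pandas.read_csv also accepts nrows (how many data rows to read) and skiprows. A minimal sketch combining usecols with nrows, assuming a comma-delimited file shaped like the sample in the question; the inline sample text is hypothetical, and usecols positions are 0-based.

```python
import io
import pandas

# Hypothetical sample standing in for the CSV file (comma-delimited is assumed)
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
Arizona,152.5,32.7,11.9,N/A
Arkansas,221.8,57.4,10.2,N/A
"""

# Columns A, B and D (positions 0, 1, 3), first three data rows only
df = pandas.read_csv(io.StringIO(SAMPLE), usecols=[0, 1, 3], nrows=3)
print(df)
```

By default, read_csv treats the string "N/A" as a missing value, so those cells appear as NaN in the resulting DataFrame.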

