Original question: http://stackoverflow.com/questions/14931906/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Extract columns from Excel using Python
Asked by mirageservo
I have an Excel file with the following row/column structure:
ID English Spanish French
1 Hello Hilo Halu
2 Hi Hye Ghi
3 Bus Buzz Bas
I would like to read the Excel file, extract the row and column values, and create 3 new files based on the columns English, Spanish, and French.
So I would have something like:
English File:
"1" = "Hello"
"2" = "Hi"
"3" = "Bus"
I've been using xlrd. I can open, read, and print the contents of the file. However, this is what I get using this command (with the Excel file already open):
for index in xrange(0,2):
    theWord = '\n' + str(sh.col_values(index, start_rowx=index, end_rowx=1)) + '=' + str(sh.col_values(index+1, start_rowx=index, end_rowx = 1))
    print theWord
OUTPUT:
[u'Parameter/Variable/Key/String']=[u'ENGLISH'] <-- is this a list?, didn't the str() use to strip it out?
What's the u doing there? How can I remove the square brackets?
Accepted answer by Raufio
The u means it is a unicode string. It shows up because calling str() on a list prints the repr of each element; if you write the string itself out to a file it won't be there. What you are getting is 1 row from the column: because you are using end_rowx=1, col_values returns a list with one element, which is also where the square brackets come from.
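To see where the brackets come from, here is a minimal sketch in Python 3. It uses a hard-coded list as a stand-in for the one-element list that col_values returns, and the variable names are made up for illustration. Note that Python 3 no longer prints the u prefix, because every str is already unicode.

```python
# Stand-in for a one-element list such as the one col_values returns
# when end_rowx=1 (hypothetical data, not read from a real workbook).
values = ['ENGLISH']

as_list_text = str(values)   # str() of a list keeps the brackets and quotes
as_cell_text = values[0]     # indexing the list gives the bare string

print(as_list_text)
print(as_cell_text)
```

Indexing the list (or asking for a single cell) is what removes the brackets; str() alone does not.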
Try getting the column value lists:
ids = sh.col_values(0, start_rowx=1)
english = sh.col_values(1, start_rowx=1)
spanish = sh.col_values(2, start_rowx=1)
french = sh.col_values(3, start_rowx=1)
and then you can zip them into lists of tuples:
english_with_IDS = zip(ids, english)
spanish_with_IDS = zip(ids, spanish)
french_with_IDS = zip(ids, french)
Which are in the form:
("1", "Hello"),("2", "Hi"), ("3", "Bus")
If you want to print the pairs:
for id, word in english_with_IDS:
    print id + "=" + word
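The same idea can be carried through to the file the question asks for. The following is only a sketch in Python 3: the lists are hard-coded stand-ins for the col_values results, format_pairs and out_path are hypothetical names, and it assumes the ID cells are numeric, in which case xlrd hands them back as floats.

```python
import os
import tempfile

# Hypothetical stand-ins for sh.col_values(0, start_rowx=1) and
# sh.col_values(1, start_rowx=1); numeric cells are assumed to be floats.
ids = [1.0, 2.0, 3.0]
english = ['Hello', 'Hi', 'Bus']


def format_pairs(ids, words):
    """Return one line per (id, word) pair, e.g. '"1" = "Hello"'."""
    return ['"%d" = "%s"' % (int(i), w) for i, w in zip(ids, words)]


lines = format_pairs(ids, english)

# Write to a temporary directory so the sketch does not touch real files.
out_path = os.path.join(tempfile.mkdtemp(), 'english.txt')
with open(out_path, 'w', encoding='utf-8') as f:
    f.write('\n'.join(lines) + '\n')
```

Calling format_pairs again with the Spanish and French lists would give the other two files.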
col_values returns a list of column values. If you want single values, you can call sh.cell_value(rowx, colx).
Answer by root
Use pandas:
In [1]: import pandas as pd
In [2]: df = pd.ExcelFile('test.xls').parse('Sheet1', index_col=0) # reads file
In [3]: df.index = df.index.map(int)
In [4]: for col in df.columns:
   ...:     column = df[col]
   ...:     column.to_csv(column.name, sep='=') # writes each column to a file
   ...:                                         # with filename == column name
In [5]: !cat English # English file content
1=Hello
2=Hi
3=Bus
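For readers on a recent pandas, here is a hedged sketch of the same per-column export. It builds the question's sample data in memory instead of reading test.xls, so the DataFrame contents, out_dir, and english_lines are assumptions for illustration; with a real workbook, pd.read_excel('test.xls', sheet_name='Sheet1', index_col=0) would presumably take the place of the constructor. Recent versions write a header row for a Series by default, hence header=False.

```python
import os
import tempfile

import pandas as pd

# Sample data from the question, built in memory (not read from test.xls).
df = pd.DataFrame(
    {
        'English': ['Hello', 'Hi', 'Bus'],
        'Spanish': ['Hilo', 'Hye', 'Buzz'],
        'French': ['Halu', 'Ghi', 'Bas'],
    },
    index=pd.Index([1, 2, 3], name='ID'),
)

out_dir = tempfile.mkdtemp()  # temporary directory for the output files
for col in df.columns:
    # One file per column, named after the column; header=False keeps
    # the column name out of the file contents.
    df[col].to_csv(os.path.join(out_dir, col), sep='=', header=False)

with open(os.path.join(out_dir, 'English')) as f:
    english_lines = f.read().splitlines()
```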
Answer by tigeronk2
import xlrd

sh = xlrd.open_workbook('input.xls').sheet_by_index(0)
english = open("english.txt", 'w')
spanish = open("spanish.txt", 'w')
french = open("french.txt", 'w')
try:
    # Row 0 is the header; column 0 holds the ID, columns 1-3 hold the words.
    for rownum in range(1, sh.nrows):
        row_id = str(sh.cell(rownum, 0).value)
        english.write(row_id + " = " + str(sh.cell(rownum, 1).value) + "\n")
        spanish.write(row_id + " = " + str(sh.cell(rownum, 2).value) + "\n")
        french.write(row_id + " = " + str(sh.cell(rownum, 3).value) + "\n")
finally:
    # Close the files even if reading a cell fails.
    english.close()
    spanish.close()
    french.close()

