Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/10484106/

Time: 2020-09-13 15:42:57  Source: igfitidea

pandas reading csv orientation

python csv pandas

Asked by Nicola Vianello

Hi, I'm trying to read into pandas the CSV file you can download from here (Euribor rates; I think you can imagine why I would like to have this file!). The file is a CSV file, but it is strangely oriented. If you import it into Excel, the file has the format

   02/01/2012,03/01/2012,04/01/2012,,,, 
1w,0.652,0.626,0.606,,,,
2w,0.738,0.716,0.700,,,,

with the first column going up to 12m (I have given you the link where you can download a sample). I would like to read it in pandas, but I'm not able to read it correctly. Pandas has a built-in function for reading CSV files, but it expects the data to be row-oriented rather than column-oriented. What I would like to do is take the row labeled 3m, with its values and dates, in order to plot how this index varies over time, but I can't work out how. I know I can read the data with

import pandas
data = pandas.read_csv("file.csv", parse_dates=True)

but that would only work if the CSV file were somehow transposed.

Answered by Thomas K

A pandas DataFrame has a .transpose() method, but it doesn't like all the empty rows in this file. Here's how to get it cleaned up:

import pandas

df = pandas.read_csv("hist_EURIBOR_2012.csv")  # Read the file
df = df[:15]    # Chop off the empty rows beyond 12m
df2 = df.transpose()
df2 = df2[:88]  # Chop off what were empty columns (increase 88 as more data is added)

Of course, you can chain these together:

df2 = pandas.read_csv("hist_EURIBOR_2012.csv")[:15].transpose()[:88]
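
The slice bounds 15 and 88 are tied to one particular version of the file. A possible alternative, sketched here under assumptions, is to drop whichever rows and columns are entirely empty and then transpose. The inline sample is made up for illustration (it is not the real Euribor file), and it assumes the maturity label is the first field on every row, including an empty first cell in the header; if the real header lacks that cell, the `index_col` argument would need adjusting.

```python
import io

import pandas as pd

# Made-up sample in a layout similar to the question's file (assumption:
# the header row starts with an empty cell above the maturity labels).
sample = io.StringIO(
    ",02/01/2012,03/01/2012,04/01/2012,,,,\n"
    "1w,0.652,0.626,0.606,,,,\n"
    "2w,0.738,0.716,0.700,,,,\n"
    "3m,1.343,1.332,1.320,,,,\n"
    ",,,,,,,\n"
)

# Use the first column (1w, 2w, ...) as the row labels.
raw = pd.read_csv(sample, index_col=0)

# Drop rows and columns that contain no values at all, then transpose
# so that each date becomes a row and each maturity a column.
tidy = raw.dropna(axis=0, how="all").dropna(axis=1, how="all").transpose()

print(tidy["3m"])
```

With this approach nothing has to be updated by hand when new dates are appended to the file, as long as the padding rows and columns stay completely empty.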

Then df2['3m'] is the data you want, but the dates are still stored as strings. I'm not quite sure how to convert the index to a DatetimeIndex.
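
One way the string dates might be converted is `pandas.to_datetime` with an explicit format. The sketch below uses a small made-up stand-in for `df2` rather than the real file, and it assumes the dates are written day/month/year, as the sample in the question suggests.

```python
import pandas as pd

# Made-up stand-in for df2: string dates as the index, one column per maturity.
df2 = pd.DataFrame(
    {"1w": [0.652, 0.626, 0.606], "3m": [1.343, 1.332, 1.320]},
    index=["02/01/2012", "03/01/2012", "04/01/2012"],
)

# Assumption: day/month/year. Passing the format explicitly avoids
# 02/01/2012 being read as February 1st.
df2.index = pd.to_datetime(df2.index, format="%d/%m/%Y")

three_month = df2["3m"].astype(float)
print(three_month)
# three_month.plot() would then draw the rate over time (requires matplotlib).
```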

Answered by Jason Morgan

I've never used pandas for CSV processing. I just use the csv functions from the Python standard library, since these use iterators.

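import csv

myCSVfile = r"c:/Documents and Settings/Jason/Desktop/hist_EURIBOR_2012.csv"
data = []
# newline="" is what the csv module documentation recommends for files passed to csv.reader
with open(myCSVfile, "r", newline="") as f:
    reader = csv.reader(f, delimiter=",")
    for row in reader:
        # Skip blank lines (empty lists) and keep only the 3m row
        if row and row[0].strip() == "3m":
            data.append(row)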
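
The loop above keeps only the 3m values, without their dates. A minimal sketch of how the header row could be paired with that row follows; the inline sample is made up, and it assumes the header starts with an empty cell and that dates are day/month/year.

```python
import csv
import io
from datetime import datetime

# Made-up sample standing in for the real file.
sample = io.StringIO(
    ",02/01/2012,03/01/2012,04/01/2012,,,,\n"
    "1w,0.652,0.626,0.606,,,,\n"
    "3m,1.343,1.332,1.320,,,,\n"
)

reader = csv.reader(sample)
header = next(reader)  # first row holds the dates

series = []
for row in reader:
    if row and row[0].strip() == "3m":
        # Pair each date with its value, skipping the empty padding cells.
        for date_text, value_text in zip(header[1:], row[1:]):
            if date_text.strip() and value_text.strip():
                day = datetime.strptime(date_text.strip(), "%d/%m/%Y").date()
                series.append((day, float(value_text)))
        break

print(series)
```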