Source: http://stackoverflow.com/questions/53022580/ (StackOverflow, licensed under CC BY-SA 4.0; attribute to the original authors)
Handling error "TypeError: Expected tuple, got str" loading a CSV to pandas multilevel and multiindex (pandas)
Asked by Andre Araujo
I'm trying to load a CSV file (this file) to create a DataFrame with a MultiIndex on both the rows and the columns. It has 5 (five) indexes and 3 (three) levels in the columns.
How can I do this? Here is the code:
import pandas as pd

df = pd.read_csv('./teste.csv',
                 index_col=[0, 1, 2, 3, 4],
                 header=[0, 1, 2, 3],
                 skipinitialspace=True,
                 tupleize_cols=True)
df.columns = pd.MultiIndex.from_tuples(df.columns)
Expected output:
variables u \
level 1
days 1 2
times 00h 06h 12h 18h 00h
wsid lat lon start prcp_24
329 -43.969397 -19.883945 2007-03-18 10:00:00 72.0 0 0 0 0 0
2007-03-20 10:00:00 104.4 0 0 0 0 0
2007-10-18 23:00:00 92.8 0 0 0 0 0
2007-12-21 00:00:00 60.4 0 0 0 0 0
2008-01-19 18:00:00 53.0 0 0 0 0 0
2008-04-05 01:00:00 80.8 0 0 0 0 0
2008-10-31 17:00:00 101.8 0 0 0 0 0
2008-11-01 04:00:00 82.0 0 0 0 0 0
2008-12-29 00:00:00 57.8 0 0 0 0 0
2009-03-28 10:00:00 72.4 0 0 0 0 0
2009-10-07 02:00:00 57.8 0 0 0 0 0
2009-10-08 00:00:00 83.8 0 0 0 0 0
2009-11-28 16:00:00 84.4 0 0 0 0 0
2009-12-18 04:00:00 51.8 0 0 0 0 0
2009-12-28 00:00:00 96.4 0 0 0 0 0
2010-01-06 05:00:00 74.2 0 0 0 0 0
2011-12-18 00:00:00 113.6 0 0 0 0 0
2011-12-19 00:00:00 90.6 0 0 0 0 0
2012-11-15 07:00:00 85.8 0 0 0 0 0
2013-10-17 00:00:00 52.4 0 0 0 0 0
2014-04-01 22:00:00 72.0 0 0 0 0 0
2014-10-20 06:00:00 56.6 0 0 0 0 0
2014-12-13 09:00:00 104.4 0 0 0 0 0
2015-02-09 00:00:00 62.0 0 0 0 0 0
2015-02-16 19:00:00 56.8 0 0 0 0 0
2015-05-06 17:00:00 50.8 0 0 0 0 0
2016-02-26 00:00:00 52.2 0 0 0 0 0
I need to handle the error "TypeError: Expected tuple, got str":
TypeError: Expected tuple, got str
Answered by Sandeep Kadapa
You are getting an error because some of your column labels are not tuples: the labels at positions 2368 up to (but not including) 2959 in df.columns are strings.
The column labels that are strings:
df.columns[2368:2959]
Index(['('z', '1', '1', '00h').1', '('z', '1', '1', '06h').1',
'('z', '1', '1', '12h').1', '('z', '1', '1', '18h').1',
'('z', '1', '2', '00h').1', '('z', '1', '2', '06h').1',
'('z', '1', '2', '12h').1', '('z', '1', '2', '18h').1',
'('z', '1', '3', '00h').1', '('z', '1', '3', '06h').1',
...
'('z', '1000', '2', '06h').1', '('z', '1000', '2', '12h').1',
'('z', '1000', '2', '18h').1', '('z', '1000', '3', '00h').1',
'('z', '1000', '3', '06h').1', '('z', '1000', '3', '12h').1',
'('z', '1000', '3', '18h').1', '('z', '1000', '4', '00h').1',
'('z', '1000', '4', '06h').1', '('z', '1000', '4', '12h').1'],
dtype='object', length=591)
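The positions of the string labels can also be found programmatically instead of by inspection. The following is only a minimal sketch: the list `labels` is made up and stands in for `df.columns`, and `string_positions` is a name chosen for illustration.

```python
# Hypothetical labels standing in for df.columns: two tuples, two strings
labels = [
    ('u', '1', '1', '00h'),
    ('u', '1', '1', '06h'),
    "('z', '1', '1', '00h').1",
    "('z', '1', '1', '06h').1",
]

# Collect the positions whose label is a plain string rather than a tuple
string_positions = [i for i, label in enumerate(labels) if isinstance(label, str)]
print(string_positions)  # [2, 3]
```

On the real data, iterating over `df.columns` in the same way should give the positions to inspect.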
Since you want a DataFrame with MultiIndex columns built from tuples, first clean these strings: extract the parenthesized part of each one with re.findall and the regex pattern '(\(.*?\)).', then pass that substring through ast.literal_eval to convert the string into a tuple. Finally, build the columns with pd.MultiIndex.from_tuples:
import ast
import re

import pandas as pd

df = pd.read_csv('teste.csv', index_col=[0, 1, 2, 3, 4], header=[0, 1, 2, 3], parse_dates=True)

column_list = []
for column in df.columns:
    if isinstance(column, str):
        # Keep only the "(...)" part of the label and parse it into a tuple
        column_list.append(ast.literal_eval(re.findall(r'(\(.*?\)).', column)[0]))
    else:
        column_list.append(column)

df.columns = pd.MultiIndex.from_tuples(column_list,
                                       names=('variables', 'level', 'days', 'times'))
print(df.iloc[:, :6].head())
variables u
level 1
days 1 2
times 00h 06h 12h 18h 00h 06h
wsid lat lon start prcp_24
329 -43.969397 -19.883945 2007-03-18 10:00:00 72.0 0 0 0 0 0 0
2007-03-20 10:00:00 104.4 0 0 0 0 0 0
2007-10-18 23:00:00 92.8 0 0 0 0 0 0
2007-12-21 00:00:00 60.4 0 0 0 0 0 0
2008-01-19 18:00:00 53.0 0 0 0 0 0 0
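The cleaning step can be tried without teste.csv. What follows is a minimal sketch, not the answerer's exact code: the labels in `raw_labels` are made up, `to_tuple` is a hypothetical helper, and `re.search` is used in place of `re.findall`.

```python
import ast
import re

import pandas as pd

# Made-up labels: a mix of real tuples and stringified tuples with a ".1" suffix
raw_labels = [
    ('u', '1', '1', '00h'),
    ('u', '1', '1', '06h'),
    "('z', '1', '1', '00h').1",
    "('z', '1', '1', '06h').1",
]


def to_tuple(label):
    """Return label unchanged if it is a tuple; parse it if it is a string."""
    if not isinstance(label, str):
        return label
    # Non-greedy match of the first "(...)" group in the string
    match = re.search(r'\(.*?\)', label)
    if match is None:
        raise ValueError(f'no parenthesized tuple found in {label!r}')
    # literal_eval safely evaluates the text "('z', '1', ...)" into a tuple
    return ast.literal_eval(match.group(0))


cleaned = [to_tuple(label) for label in raw_labels]
columns = pd.MultiIndex.from_tuples(
    cleaned, names=('variables', 'level', 'days', 'times'))
print(columns.nlevels)  # 4
```

Raising an error when no parenthesized group is found is a design choice of this sketch; it makes an unexpected label visible instead of failing later with an index error.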

