Original question: http://stackoverflow.com/questions/35497189/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas Key Error date
Asked by eWizardII
df['ts'] = pd.to_datetime(df['_created_at'])
df = df.set_index('ts')

def f(x):
    x = x.reindex(df.index)
    x = x.sort_values('battery')
    x['ts'] = x['ts'].fillna(method='ffill')
    x['battery'] = x['battery'].combine_first(df['battery'])
    x['model'] = x['model'].combine_first(df['model'])
    x['user'] = x['user'].combine_first(df['user'])
    x['version'] = x['version'].combine_first(df['version'])
    return x
I have the code above, and I run into an error when execution reaches the line x['ts'] = x['ts'].fillna(method='ffill'). This occurs when I run the following command:
df = df.groupby(level=0, sort=False).apply(f).reset_index(level=0, drop=True).reset_index()
My ts values look like 2013-03-04 13:56:29.662 and are datetime64. I don't understand what I am doing wrong to cause this KeyError on ts, as I thought converting them with to_datetime would put the index in a format pandas understands. Any ideas on how to fix this?
Answered by jezrael
I think you have to omit the problematic line. After set_index, ts is the index rather than a column, so x['ts'] raises the KeyError, and the index values are already filled in by x.reindex(df.index). I think you also need to delete the column _created_at with drop:
print(df)
              _created_at user  battery model  version
0 2013-03-04 13:56:29.662    R        3     A        1
1 2013-03-05 13:56:29.662    S        5     B        3
2 2013-03-06 13:56:29.662    J        6     C        2
df['ts'] = pd.to_datetime(df['_created_at'])
df = df.drop('_created_at', axis=1)
df = df.set_index(['ts'])

def f(x):
    # print(x)
    x = x.reindex(df.index)
    x = x.sort_values('battery')
    # ts is the index here, not a column, so this line raised the KeyError:
    # x['ts'] = x['ts'].fillna(method='ffill')
    x['battery'] = x['battery'].combine_first(df['battery'])
    x['model'] = x['model'].combine_first(df['model'])
    x['user'] = x['user'].combine_first(df['user'])
    x['version'] = x['version'].combine_first(df['version'])
    return x

df = df.groupby(level=0, sort=False).apply(f).reset_index(level=0, drop=True).reset_index()
print(df)
                       ts user  battery model  version
0 2013-03-04 13:56:29.662    R        3     A        1
1 2013-03-05 13:56:29.662    S        5     B        3
2 2013-03-06 13:56:29.662    J        6     C        2
3 2013-03-05 13:56:29.662    S        5     B        3
4 2013-03-04 13:56:29.662    R        3     A        1
5 2013-03-06 13:56:29.662    J        6     C        2
6 2013-03-06 13:56:29.662    J        6     C        2
7 2013-03-04 13:56:29.662    R        3     A        1
8 2013-03-05 13:56:29.662    S        5     B        3
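The KeyError from the question can be reproduced in isolation. The following is a minimal sketch, assuming a small sample frame shaped like the one printed earlier; the sample values and the names ts_is_column, index_name and raised are illustrative and not part of the original post. It shows that set_index moves ts out of the columns by default, so selecting it as a column fails.

```python
import pandas as pd

# Illustrative sample data, shaped like the frame in the answer.
df = pd.DataFrame({
    "_created_at": ["2013-03-04 13:56:29.662",
                    "2013-03-05 13:56:29.662",
                    "2013-03-06 13:56:29.662"],
    "user": ["R", "S", "J"],
    "battery": [3, 5, 6],
})

df["ts"] = pd.to_datetime(df["_created_at"])
df = df.set_index("ts")  # by default, set_index removes ts from the columns

ts_is_column = "ts" in df.columns  # ts no longer appears among the columns
index_name = df.index.name         # the values now live in the index named ts

# Selecting ts as a column fails because it only exists as the index.
try:
    df["ts"]
    raised = False
except KeyError:
    raised = True

print(ts_is_column, index_name, raised)
```

The datetime conversion itself is fine here; the lookup fails only because x['ts'] asks for a column, while the values are reachable through x.index.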
But maybe you need fillna for another column, e.g. user:
df['ts'] = pd.to_datetime(df['_created_at'])
df = df.drop('_created_at', axis=1)
df = df.set_index(['ts'])

def f(x):
    # print(x)
    x = x.reindex(df.index)
    x = x.sort_values('battery')
    # x['ts'] = x['ts'].fillna(method='ffill')
    x['battery'] = x['battery'].combine_first(df['battery'])
    x['model'] = x['model'].combine_first(df['model'])
    x['user'] = x['user'].fillna(method='ffill')
    x['version'] = x['version'].combine_first(df['version'])
    return x

df = df.groupby(level=0, sort=False).apply(f).reset_index(level=0, drop=True).reset_index()
print(df)
                       ts user  battery model  version
0 2013-03-04 13:56:29.662    R        3     A        1
1 2013-03-05 13:56:29.662    R        5     B        3
2 2013-03-06 13:56:29.662    R        6     C        2
3 2013-03-05 13:56:29.662    S        5     B        3
4 2013-03-04 13:56:29.662    S        3     A        1
5 2013-03-06 13:56:29.662    S        6     C        2
6 2013-03-06 13:56:29.662    J        6     C        2
7 2013-03-04 13:56:29.662    J        3     A        1
8 2013-03-05 13:56:29.662    J        5     B        3
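A note for newer pandas versions: fillna(method='ffill') emits a deprecation warning in recent releases, and Series.ffill() is the method form of the same forward fill. The sketch below assumes a hypothetical frame with one missing user value (the data and the name flat are illustrative, not from the original post). It shows the forward fill and how reset_index turns the ts index back into an ordinary column.

```python
import pandas as pd

# Illustrative frame: ts is already the index, and one user value is missing.
df = pd.DataFrame(
    {"user": ["R", None, "J"], "battery": [3, 5, 6]},
    index=pd.DatetimeIndex(
        ["2013-03-04 13:56:29.662",
         "2013-03-05 13:56:29.662",
         "2013-03-06 13:56:29.662"],
        name="ts",
    ),
)

# Forward fill: each missing value takes the last valid value above it.
df["user"] = df["user"].ffill()

# reset_index() moves the ts index back into a regular column,
# after which flat["ts"] can be selected without a KeyError.
flat = df.reset_index()

print(flat)
```

If ts has to be available as a column inside the grouped function, calling reset_index first (or reading x.index directly) avoids the KeyError described in the question.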