Pandas TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

Question

Asked by Chris

I've got some order data that I want to analyse. What I'm currently interested in is: how often has each SKU been bought in each month?

Here is a small example:

import datetime
import pandas as pd
import numpy as np

d = {'sku': ['RT-17']}
df_skus = pd.DataFrame(data=d)
print(df_skus)

d = {'date': ['2017/02/17', '2017/03/17', '2017/04/17', '2017/04/18', '2017/05/02'], 'item_sku': ['HT25', 'RT-17', 'HH30', 'RT-17', 'RT-19']}
df_orders = pd.DataFrame(data=d)
print(df_orders)

for i in df_orders.index:
    print("\n toll")
    df_orders.loc[i,'date']=pd.to_datetime(df_orders.loc[i, 'date'])

df_orders = df_orders[df_orders["item_sku"].isin(df_skus["sku"])]
monthly_sales = df_orders.groupby(["item_sku", pd.Grouper(key="date",freq="M")]).size()
monthly_sales = monthly_sales.unstack(0) 

print(monthly_sales)

That works fine, but when I use my real order data (from a CSV), I get the following after a few minutes:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

The problem comes from this line:

monthly_sales = df_orders.groupby(["item_sku", pd.Grouper(key="date",freq="M")]).size()

Is it possible to skip over the error? I tried a try/except block:

try:
    monthly_sales = df_orders.groupby(["item_sku", pd.Grouper(key="date",freq="M")]).size()
    monthly_sales = monthly_sales.unstack(0) 
except:
    print("\n Here seems to be one issue")

Then print(monthly_sales) gives:

Empty DataFrame
Columns: [txn_id, date, item_sku, quantity]
Index: []

So it seems that something in my data empties or breaks the grouping? How can I 'clean' my data?
Or I'd even be fine with losing the data of a sale here and there if I can just 'skip' over the error. Is this possible?

Answer 1

Answered by cs95

When reading your CSV, use the parse_dates argument:

df_orders = pd.read_csv('file.csv', parse_dates=['date'])

parse_dates automatically converts the date column to datetime while the file is read. If that doesn't work, then you'll need to load the column in as a string and then use the errors='coerce' argument with pd.to_datetime, which turns values that cannot be parsed into NaT instead of raising:

df_orders['date'] = pd.to_datetime(df_orders['date'], errors='coerce')
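
To address the "how can I clean my data" part more concretely, here is a minimal sketch of the coerce-then-inspect approach. The CSV text is made up for illustration (only the column names txn_id, date, item_sku and quantity come from the question), and the explicit format string assumes the dates look like the ones in the small example.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real order file;
# rows 3 and 5 have dates that cannot be parsed.
csv_text = """txn_id,date,item_sku,quantity
1,2017/02/17,HT25,1
2,2017/03/17,RT-17,2
3,not a date,HH30,1
4,2017/04/18,RT-17,1
5,,RT-19,3
"""

# Load the date column as plain strings first.
df_orders = pd.read_csv(io.StringIO(csv_text), dtype={"date": str})

# Values that do not match the format become NaT instead of raising.
parsed = pd.to_datetime(df_orders["date"], format="%Y/%m/%d", errors="coerce")

# Rows whose date could not be parsed, kept aside for inspection.
bad_rows = df_orders[parsed.isna()]
print(bad_rows)

# Replace the column with the parsed values and drop the unusable rows.
df_clean = df_orders.assign(date=parsed).dropna(subset=["date"])
print(df_clean.dtypes)
```

Looking at bad_rows before dropping anything shows how many sales would be lost and why, which is usually safer than wrapping the groupby in a bare try/except.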

Note that you can pass Series objects (amongst other things) to pd.to_datetime.

Next, filter and group as you've been doing, and it should work:

df_orders[df_orders["item_sku"].isin(df_skus["sku"])]\
     .groupby(['item_sku', pd.Grouper(key='date', freq='M')]).size()

item_sku  date      
RT-17     2017-03-31    1
          2017-04-30    1
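
For reference, the whole flow on the question's small example might look like the sketch below. Note one deliberate swap: it groups on dt.to_period("M") rather than pd.Grouper(freq="M"), on the assumption that the monthly period alias is the less version-sensitive choice; the resulting index is therefore monthly periods, not month-end timestamps. The names wanted and month are introduced here for illustration.

```python
import pandas as pd

df_skus = pd.DataFrame({"sku": ["RT-17"]})
df_orders = pd.DataFrame({
    "date": ["2017/02/17", "2017/03/17", "2017/04/17", "2017/04/18", "2017/05/02"],
    "item_sku": ["HT25", "RT-17", "HH30", "RT-17", "RT-19"],
})

# Convert the whole column at once instead of looping over rows,
# so the column ends up with a real datetime dtype.
df_orders["date"] = pd.to_datetime(
    df_orders["date"], format="%Y/%m/%d", errors="coerce"
)

# Keep only the SKUs of interest and drop rows without a usable date.
wanted = df_orders[df_orders["item_sku"].isin(df_skus["sku"])]
wanted = wanted.dropna(subset=["date"])

# Group by SKU and calendar month, then pivot SKUs into columns.
month = wanted["date"].dt.to_period("M").rename("month")
monthly_sales = (
    wanted.groupby(["item_sku", month]).size().unstack(0, fill_value=0)
)
print(monthly_sales)
```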
