Python: Setting a MultiIndex on an existing DataFrame in Pandas

Question

Asked by user3527975

I have a DataFrame that looks like this:

  Emp1    Empl2           date       Company
0    0        0     2012-05-01         apple
1    0        1     2012-05-29         apple
2    0        1     2013-05-02         apple
3    0        1     2013-11-22         apple
18   1        0     2011-09-09        google
19   1        0     2012-02-02        google
20   1        0     2012-11-26        google
21   1        0     2013-05-11        google

I want to set a MultiIndex on this DataFrame using the Company and date columns. Currently it has a default index. I am using df.set_index(['Company', 'date'], inplace=True):

df = pd.DataFrame()
for c in company_list:
        row = pd.DataFrame([dict(company = '%s' %s, date = datetime.date(2012, 05, 01))])
        df = df.append(row, ignore_index = True)
        for e in emp_list:
            dataset  = pd.read_sql("select company, emp_name, date(date), count(*) from company_table where  = '"+s+"' and emp_name = '"+b+"' group by company, date, name LIMIT 5 ", con)
            if len(dataset) == 0:
                row = pd.DataFrame([dict(sitename='%s' %s, name = '%s' %b, date = datetime.date(2012, 05, 01), count = np.nan)])
                dataset = dataset.append(row, ignore_index=True)
            dataset = dataset.rename(columns = {'count': '%s' %b})
            dataset = dataset.groupby(['company', 'date', 'emp_name'], as_index = False).sum()

            dataset = dataset.drop('emp_name', 1)
            df = pd.merge(df, dataset, how = '')
            df = df.sort('date', ascending = True)
            df.fillna(0, inplace = True)

df.set_index(['Company', 'date'], inplace=True)            
print df

But when I print this DataFrame, it prints None. I found this solution on Stack Overflow itself. Is this not the correct way of doing it? I also want to arrange the company and date columns so that company becomes the first index level and date becomes the second level in the hierarchy. Any ideas on this?

Answer 1

Accepted answer by Andy Hayden

When you pass inplace=True, set_index makes the changes on the original DataFrame and returns None. The method does not return the modified DataFrame.

is_none = df.set_index(['Company', 'date'], inplace=True)
df  # the dataframe you want
is_none # has the value None

So when you have a line like:

df = df.set_index(['Company', 'date'], inplace=True)

it first modifies df... but then it sets df to None!

That is, you should just use the line:

df.set_index(['Company', 'date'], inplace=True)
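The behavior described above can be checked with a small sketch. The sample data here is hypothetical, loosely modeled on the table in the question, and the variable names (in_place, indexed, and so on) are illustrative only. It also touches on the question's second point: as far as I know, the index levels follow the order of the column list passed to set_index, and swaplevel can exchange them afterwards if needed.

```python
import pandas as pd

# Hypothetical sample data loosely modeled on the table in the question.
df = pd.DataFrame({
    "Emp1": [0, 0, 1, 1],
    "Empl2": [0, 1, 0, 0],
    "date": pd.to_datetime(
        ["2012-05-01", "2012-05-29", "2011-09-09", "2012-02-02"]
    ),
    "Company": ["apple", "apple", "google", "google"],
})

# In-place form: the frame is modified and the call itself returns None.
in_place = df.copy()
result = in_place.set_index(["Company", "date"], inplace=True)

# Assignment form: leave inplace at its default (False) and keep the
# returned frame instead.
indexed = df.set_index(["Company", "date"])

# The level order follows the order of the column list given to set_index.
level_names = list(indexed.index.names)

# swaplevel returns a new frame with the two index levels exchanged.
swapped_names = list(indexed.swaplevel().index.names)

print(result)
print(level_names)
print(swapped_names)
```

Either form works on its own; the problem only appears when the two are combined, that is, when the None returned by the in-place call is assigned back to df.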
