Python Pandas, two rows as column headers?

Notice: This page is a bilingual (Chinese-English) rendering of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/41005577/

Date: 2020-09-14 02:34:57  Source: igfitidea

Tags: python-3.x, pandas

Asked by Stephen

I have seen how to work with a double index, but I have not seen how to work with two-row column headers. Is this possible?

For example, row 1 is a repeating series of dates: 2016, 2016, 2015, 2015.

Row 2 is a repeating series of data labels: Dollar Sales, Unit Sales, Dollar Sales, Unit Sales.

So each "Dollar Sales" heading is actually tied to the date in the row above.

Subsequent rows are individual items with data.

Is there a way to do a groupby, or some other way to have two column headers? Ultimately, I want to line up the "Dollar Sales" values as a series by date so that I can make a nice graph. Unfortunately, there are multiple columns before the next "Dollar Sales" value (more than just the one "Unit Sales" column). Also, if I delete the date row above, nothing links each "Dollar Sales" column to its date.

Answered by squareskittles

If you are using pandas.read_csv() or pandas.read_table(), you can pass a list of row indices as the header argument to specify which rows to use as column headers. pandas will then build a pandas.MultiIndex for you in df.columns:

import pandas
df = pandas.read_csv('DollarUnitSales.csv', header=[0,1])
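
As a minimal sketch of what this produces, the snippet below reads a small in-memory CSV instead of DollarUnitSales.csv. The sample rows, the pd alias, and the names csv_text and dollar_sales are assumptions made for illustration and are not part of the original answer. It then uses DataFrame.xs to select every "Dollar Sales" column by its second-level label, which is roughly what the question asks for before plotting by date.

```python
import io

import pandas as pd

# Hypothetical file contents: two header rows, then one row per item.
csv_text = (
    "2016,2016,2015,2015\n"
    "Dollar Sales,Unit Sales,Dollar Sales,Unit Sales\n"
    "100,10,90,9\n"
    "200,20,180,18\n"
)

# header=[0, 1] tells pandas to combine rows 0 and 1 into a two-level
# MultiIndex on the columns.
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])
print(df.columns.tolist())

# Select every column whose second-level label is "Dollar Sales".
# The result has one column per date, one row per item.
dollar_sales = df.xs("Dollar Sales", axis=1, level=1)
print(dollar_sales)
```

The header labels are read as strings ('2016', not 2016), so convert them first if numeric years are needed on a plot axis.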

You can also use more than two rows, or non-consecutive rows, to specify the column headers. Rows that fall between the listed indices are skipped:

# read_table splits on tabs by default; pass sep=',' if the file is comma-separated
df = pandas.read_table('DataSheet1.csv', header=[0,2,3])
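
To illustrate the non-consecutive case, here is a small sketch that uses tab-separated text held in memory rather than DataSheet1.csv. The file layout (a note row at index 1 and a units row at index 3) and the names tsv_text and df3 are hypothetical, chosen only to show which rows end up in the header.

```python
import io

import pandas as pd

# Hypothetical tab-separated contents with four rows above the data.
tsv_text = (
    "2016\t2016\t2015\t2015\n"                              # row 0: used
    "note\tnote\tnote\tnote\n"                              # row 1: not listed, skipped
    "Dollar Sales\tUnit Sales\tDollar Sales\tUnit Sales\n"  # row 2: used
    "USD\tunits\tUSD\tunits\n"                              # row 3: used
    "100\t10\t90\t9\n"
    "200\t20\t180\t18\n"
)

# Rows 0, 2 and 3 become the three levels of the column MultiIndex.
df3 = pd.read_table(io.StringIO(tsv_text), header=[0, 2, 3])
print(df3.columns.nlevels)
print(df3.columns.tolist())
```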