Pandas dataframe with a 2-row header and export to csv

Question

Asked by Meloun

I have a dataframe:

import pandas as pd

df = pd.DataFrame(columns = ["AA", "BB", "CC"])
df.loc[0]= ["a", "b", "c1"]
df.loc[1]= ["a", "b", "c2"]
df.loc[2]= ["a", "b", "c3"]

I need to add a second row to the header:

df.columns = pd.MultiIndex.from_tuples(zip(df.columns, ["DD", "EE", "FF"]))

My df is now:

  AA BB  CC
  DD EE  FF
0  a  b  c1
1  a  b  c2
2  a  b  c3

But when I write this dataframe to a csv file

df.to_csv("test.csv", index = False)

I get one more row than expected (the blank ",," line after the header):

AA,BB,CC
DD,EE,FF
,,
a,b,c1
a,b,c2
a,b,c3

Answer 1

Accepted answer by DSM

It's an ugly hack, but if you need something that works Right Now(tm), you could write the file out in two parts, first the header rows and then the data:

>>> pd.DataFrame(df.columns.tolist()).T.to_csv("noblankrows.csv", mode="w", header=False, index=False)
>>> df.to_csv("noblankrows.csv", mode="a", header=False, index=False)
>>> !cat noblankrows.csv
AA,BB,CC
DD,EE,FF
a,b,c1
a,b,c2
a,b,c3
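
The two-part write above can be wrapped in a small helper. This is only a minimal sketch: the function name to_csv_two_row_header is made up for illustration, and it writes to an in-memory buffer instead of "noblankrows.csv" so the result is easy to inspect.

```python
import io

import pandas as pd


def to_csv_two_row_header(df, buf):
    """Write df (with MultiIndex columns) to buf, one CSV row per header level.

    Hypothetical helper, not part of pandas.
    """
    # Each tuple in df.columns describes one column; transposing the
    # resulting frame gives one row per header level.
    header_rows = pd.DataFrame(df.columns.tolist()).T
    header_rows.to_csv(buf, header=False, index=False)
    # Append the data rows without pandas' own header.
    df.to_csv(buf, header=False, index=False)


df = pd.DataFrame(
    [["a", "b", "c1"], ["a", "b", "c2"], ["a", "b", "c3"]],
    columns=["AA", "BB", "CC"],
)
df.columns = pd.MultiIndex.from_tuples(list(zip(df.columns, ["DD", "EE", "FF"])))

buf = io.StringIO()
to_csv_two_row_header(df, buf)
csv_text = buf.getvalue()
print(csv_text)
```

Passing an open file object instead of the StringIO buffer should work the same way, since both parts are written to the same handle in order.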

Answer 2

Answered by Andy Hayden

I think this is a bug in to_csv. If you're looking for workarounds, here are a couple.

To read this csv back in, specify the header rows*:

In [10]: from io import StringIO

In [11]: csv = """AA,BB,CC
DD,EE,FF
,,
a,b,c1
a,b,c2
a,b,c3"""

In [12]: pd.read_csv(StringIO(csv), header=[0, 1])
Out[12]:
  AA BB  CC
  DD EE  FF
0  a  b  c1
1  a  b  c2
2  a  b  c3

*Strangely, this seems to ignore the blank line.

To write it out, you could write the header first and then append the data:

with open('test.csv', 'w') as f:
    f.write('\n'.join([','.join(h) for h in zip(*df.columns)]) + '\n')
df.to_csv('test.csv', mode='a', index=False, header=False)

Note the part that stands in for the to_csv header of the MultiIndex columns here:

In [21]: '\n'.join([','.join(h) for h in zip(*df.columns)]) + '\n'
Out[21]: 'AA,BB,CC\nDD,EE,FF\n'
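
To check that a file written this way reads back with both header rows intact, here is a small round-trip sketch. It assumes the df from the question and uses an in-memory buffer rather than 'test.csv'; the variable names are illustrative only.

```python
import io

import pandas as pd

df = pd.DataFrame(
    [["a", "b", "c1"], ["a", "b", "c2"], ["a", "b", "c3"]],
    columns=["AA", "BB", "CC"],
)
df.columns = pd.MultiIndex.from_tuples(list(zip(df.columns, ["DD", "EE", "FF"])))

buf = io.StringIO()
# zip(*df.columns) regroups the column tuples by level: one tuple per header row.
header_text = "\n".join(",".join(level) for level in zip(*df.columns)) + "\n"
buf.write(header_text)
# Append the data rows without pandas' own header.
df.to_csv(buf, index=False, header=False)

# Read it back, telling read_csv that the first two rows form the header.
buf.seek(0)
df_back = pd.read_csv(buf, header=[0, 1])
print(df_back)
```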

Answer 3

Answered by Bluu

Building on top of @DSM's solution:

If you need (as I did) to apply the same hack to an export to Excel, the main change needed (apart from the expected differences with the to_excel method) is to actually remove the multiindex used for your column labels.

That's because, contrary to .to_csv, .to_excel doesn't support writing out a df that has a multiindex for columns but no index (that is, passing index=False to the .to_excel method).

Anyway, here's what it would look like:

>>> writer = pd.ExcelWriter("noblankrows.xlsx")
>>> headers = pd.DataFrame(df.columns.tolist()).T
>>> headers.to_excel(
        writer, header=False, index=False)
>>> df.columns = pd.Index(range(len(df.columns)))  # that's what I was referring to...
>>> df.to_excel(
        writer, header=False, index=False, startrow=len(headers))
>>> writer.close()  # writer.save() in older pandas; it was removed in pandas 2.0
>>> pd.read_excel("noblankrows.xlsx").to_csv(sys.stdout, index=False)
AA,BB,CC
DD,EE,FF
a,b,c1
a,b,c2
a,b,c3

Answer 4

Answered by CT Zhu

Use df.to_csv("test.csv", index = False, tupleize_cols=True) to get the following CSV (note that tupleize_cols only exists in older pandas versions; it was deprecated and later removed):

"('AA', 'DD')","('BB', 'EE')","('CC', 'FF')"
a,b,c1
a,b,c2
a,b,c3

To read it back:

df2=pd.read_csv("test.csv", tupleize_cols=True)
df2.columns=pd.MultiIndex.from_tuples(eval(','.join(df2.columns)))
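
If calling eval on text read from a file is a concern, ast.literal_eval is one possible substitute, since it only accepts Python literals. The sketch below is an illustration rather than the answer's own method: it hard-codes column labels in the tupleized form shown above instead of reading test.csv, and the name flat is made up.

```python
import ast

import pandas as pd

# Flat string labels, mimicking what a tupleized header looks like once read back.
flat = pd.DataFrame(
    [["a", "b", "c1"], ["a", "b", "c2"], ["a", "b", "c3"]],
    columns=["('AA', 'DD')", "('BB', 'EE')", "('CC', 'FF')"],
)

# literal_eval parses each label into a real tuple and rejects arbitrary expressions.
tuples = [ast.literal_eval(label) for label in flat.columns]
flat.columns = pd.MultiIndex.from_tuples(tuples)
print(flat)
```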

To get the exact output you wanted:

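import numpy as np

# Open with 'w' (not 'a') so the rows are not appended to an existing test.csv
with open('test.csv', 'w', newline='') as f:
    pd.DataFrame(np.asanyarray(df.columns.tolist())).T.to_csv(f, index = False, header=False)
    df.to_csv(f, index = False, header=False)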
