Python: Convert a row of a Pandas DataFrame into the column header

Question

Asked by E.K.

The data I have to work with is a bit messy: it has header names inside its data. How can I choose a row from an existing pandas DataFrame and make it the column header (that is, rename the columns to that row's values)?

I want to do something like:

header = df[df['old_header_name1'] == 'new_header_name1']

df.columns = header

Answer 1

Accepted answer by unutbu

In [21]: df = pd.DataFrame([(1,2,3), ('foo','bar','baz'), (4,5,6)])

In [22]: df
Out[22]: 
     0    1    2
0    1    2    3
1  foo  bar  baz
2    4    5    6

Set the column labels to equal the values in the 2nd row (index location 1):

In [23]: df.columns = df.iloc[1]

If the index has unique labels, you can drop the 2nd row using:

In [24]: df.drop(df.index[1])
Out[24]: 
1  foo  bar  baz
0    1    2    3
2    4    5    6
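
The two steps above can be combined as in the following minimal sketch, which assumes the header row sits at position 1 and that the index labels are unique. Note that `df.drop` returns a new DataFrame rather than modifying `df`, so the result is assigned back. Clearing the columns name and resetting the index are optional tidy-ups added here for illustration; they are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame([(1, 2, 3), ('foo', 'bar', 'baz'), (4, 5, 6)])

df.columns = df.iloc[1]           # use the values of the 2nd row as column labels
df = df.drop(df.index[1])         # drop returns a new DataFrame, so assign it back
df.columns.name = None            # optional: clear the axis name left over from the row label
df = df.reset_index(drop=True)    # optional: renumber the remaining rows from 0
print(df)
```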

If the index is not unique, you could use:

In [133]: df.iloc[pd.RangeIndex(len(df)).drop(1)]
Out[133]: 
1  foo  bar  baz
0    1    2    3
2    4    5    6

Using df.drop(df.index[1]) removes all rows with the same label as the second row. Because non-unique indexes can lead to stumbling blocks (or potential bugs) like this, it's often better to make sure the index is unique (even though Pandas does not require it).
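
To make the difference concrete, here is a small sketch with made-up data in which the label 'a' appears twice in the index. The frame and the variable names `by_label` and `by_position` are assumptions for illustration only.

```python
import pandas as pd

# Assumed example data: the label 'a' occurs twice in the index.
df = pd.DataFrame({'x': [10, 20, 30]}, index=['a', 'b', 'a'])

# Label-based drop: removes every row labelled 'a', not just the first one.
by_label = df.drop(df.index[0])
print(by_label)

# Position-based alternative: keep every position except 0.
by_position = df.iloc[pd.RangeIndex(len(df)).drop(0)]
print(by_position)

# A quick check that can be run before dropping by label.
print(df.index.is_unique)
```

Checking `df.index.is_unique` first is one way to decide which of the two forms is safe to use.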

Answer 2

Answered by Zachary Wilson

This works (pandas v'0.19.2'):

df.rename(columns=df.iloc[0])
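
Two details are worth keeping in mind with this approach: `rename` returns a new DataFrame instead of changing `df`, and the row that supplied the labels is still present in the data. A minimal sketch, assuming the header values are in the first row (the sample data and the name `renamed` are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame([('foo', 'bar', 'baz'), (1, 2, 3), (4, 5, 6)])

renamed = df.rename(columns=df.iloc[0])    # new DataFrame; df itself is unchanged
renamed = renamed.drop(renamed.index[0])   # remove the row that supplied the labels

print(list(df.columns))        # the original labels are untouched
print(list(renamed.columns))   # labels taken from the first row
```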

Answer 3

Answered by ccpizza

You can specify the row index in the read_csv or read_html functions via the header parameter, which represents the "Row number(s) to use as the column names, and the start of the data". This has the advantage of automatically dropping all the preceding rows, which are presumably junk.

import pandas as pd
from io import StringIO

In[1]
    csv = '''junk1, junk2, junk3, junk4, junk5
    junk1, junk2, junk3, junk4, junk5
    pears, apples, lemons, plums, other
    40, 50, 61, 72, 85
    '''

    df = pd.read_csv(StringIO(csv), header=2)
    print(df)

Out[1]
       pears   apples   lemons   plums   other
    0     40       50       61      72      85
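
One thing to watch for with comma-plus-space separated data like this: the column labels after the first one keep their leading space (for example ' apples'). The sketch below compares the default behaviour with the `skipinitialspace=True` option of `read_csv`. The variable names `plain` and `stripped` are made up, and whether stripping is wanted depends on the data.

```python
import pandas as pd
from io import StringIO

csv = (
    "junk1, junk2, junk3, junk4, junk5\n"
    "junk1, junk2, junk3, junk4, junk5\n"
    "pears, apples, lemons, plums, other\n"
    "40, 50, 61, 72, 85\n"
)

# Default: the space after each comma becomes part of the column label.
plain = pd.read_csv(StringIO(csv), header=2)
print(list(plain.columns))

# skipinitialspace=True skips spaces that follow the delimiter.
stripped = pd.read_csv(StringIO(csv), header=2, skipinitialspace=True)
print(list(stripped.columns))
```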

Answer 4

Answered by shahar_m

It would be easier to recreate the DataFrame. This would also interpret the column types from scratch.

headers = df.iloc[0]
new_df  = pd.DataFrame(df.values[1:], columns=headers)
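
It is worth checking the resulting dtypes after rebuilding: when the source rows mix strings and numbers, `df.values` is an object array, and the new columns may come back as object dtype rather than numeric. The sketch below, using made-up sample data, inspects the dtypes and then applies `infer_objects()` as one possible way to ask pandas to re-infer them. This is an addition for illustration, not part of the answer above.

```python
import pandas as pd

# Assumed sample data: the header values are in the first row.
df = pd.DataFrame([('a', 'b'), (1, 2.5), (3, 4.5)])

headers = df.iloc[0]
new_df = pd.DataFrame(df.values[1:], columns=headers)
print(new_df.dtypes)                  # inspect what the constructor produced

converted = new_df.infer_objects()    # re-infer dtypes of object columns
print(converted.dtypes)
```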
