Change column values to column headers in pandas
Source: http://stackoverflow.com/questions/22173572/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me).
Asked by juniper-
I have the following code, which takes the values in one column of a pandas dataframe and makes them the columns of a new data frame. The values in the first column of the dataframe become the index of the new dataframe.
In a sense, I want to turn an adjacency list into an adjacency matrix. Here's the code so far:
import numpy as np
import pandas as pa
print("Original Data Frame")
# Create a dataframe
oldcols = {'col1':['a','a','b','b'], 'col2':['c','d','c','d'], 'col3':[1,2,3,4]}
a = pa.DataFrame(oldcols)
print(a)
# The columns of the new data frame will be the values in col2 of the original
newcols = sorted(set(oldcols['col2']))  # sorted for a deterministic column order
rows = sorted(set(oldcols['col1']))  # sorted for a deterministic row order
# Create the new data matrix
data = np.zeros((len(rows), len(newcols)))
# Iterate over each row and fill in the new matrix
for row in zip(a['col1'], a['col2'], a['col3']):
    rowindex = rows.index(row[0])
    colindex = newcols.index(row[1])
    data[rowindex][colindex] = row[2]
newf = pa.DataFrame(data)
newf.columns = newcols
newf.index = rows
print("New data frame")
print(newf)
This works for this particular instance:
Original Data Frame
  col1 col2  col3
0    a    c     1
1    a    d     2
2    b    c     3
3    b    d     4
New data frame
   c  d
a  1  2
b  3  4
It will fail if the values in col3 are not numbers. My question is, is there a more elegant/robust way of doing this?
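The failure can be reproduced in isolation. The following is a minimal sketch, assuming the non-numeric values are plain strings such as 'x' (a value chosen here for illustration): np.zeros builds a float64 array by default, so a cell cannot hold a string that does not parse as a number.

```python
import numpy as np

# np.zeros defaults to float64, so each cell can only hold a float.
data = np.zeros((2, 2))

try:
    # 'x' is an illustrative non-numeric value; it cannot be converted to float.
    data[0][0] = 'x'
    failed = False
except ValueError:
    failed = True

print(failed)
```

Any approach that preallocates a numeric matrix will hit the same limit, which is one reason to let pandas do the reshaping.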
Answered by unutbu
This looks like a job for pivot:
import pandas as pd
oldcols = {'col1':['a','a','b','b'], 'col2':['c','d','c','d'], 'col3':[1,2,3,4]}
a = pd.DataFrame(oldcols)  
newf = a.pivot(index='col1', columns='col2')
print(newf)
yields
      col3   
col2     c  d
col1         
a        1  2
b        3  4
If you don't want a MultiIndex column, you can drop the col3 level using:
newf.columns = newf.columns.droplevel(0)
which would then yield
col2  c  d
col1      
a     1  2
b     3  4

