Pandas - Reindexing so I can keep values

Question

Asked by jwillis0720

Long story short

I have a nested dictionary, which I turn into a DataFrame:

import pandas
pdf = pandas.DataFrame(nested_dict)

 95     96     97     98     99    100   101   102   103    104    105  \
A  70019    102   4243   3083   3540  6311  4851  5938  4140   4659   3100   
C      0    185    427    433   1190   910  3898  3869  2861   2149   3065   
D      8      9  23463   1237   2574  4174  3640  4747  3557   4582   5934   
E    141     89   5034   1576   2303  3416  2377  1252  1204   1703    718   
F      7     12   1937   2246   1687  1154  1317  3473  1881   2221   3060   
G    343   1550  13497  10659  12343  8213  9251  7341  6354   9058   9022   
H      1   1978   1829   1394   1945  1003  1382  1489  4182    932    556   
I      5    772   1361   3914   3255  3242  2808  3765  3284   2127   3120   
K      3  10353    540   2364   1196   882  3439  2107   803    743    621   
L      6     14   1599  11759   4571  4821  3450  5071  4364   1891   3677   
M      1      6    158    211    524  2738   686   443   612    509   1721   
N      6    186    299   2971    791  1440  2028  1163  1689   4296   1535   
P     54     31    726   6208   7160  5494  6184  4282  3587   3727   3821   
Q     10     87   1228   2233   1016  1801  1768  1693  3414    515    563   
R      7  53939   3030   8904   6712  6134  5127  3223  4764   3768   6429   
S     76   5213   3676   7480   9831  7666  5410  8185  7508  11237   8298   
T   4369   1253   3087   2487   6559  4572  6863  3184  7352   6068   4756   
V    732      5   7595   4331   5216  5444  5187  6013  4245   4545   4761   
W      0      6    103   1225    598   888   601   713  1298   1323    908   
Y     12      9   1968   1085   2787  5489  5529  7840  8691   9745  10136

Eventually I want to melt this DataFrame down so that it looks like the following:

residue residue_num count
A       95          70019
A       96          102
A       97          4243
....

The residue column is being used as the index, and I don't know how to replace it with an arbitrary index like 0, 1, 2, 3 while keeping "A C D E F.." as a regular column under another name.

EDIT: Answered myself, as per suggestion.

Answer 1

Answered by jwillis0720

Answered from here and here

import pandas
pdf = pandas.DataFrame(the_matrix)
pdf = pdf.reset_index()
pdf.rename(columns={'index':'aa'},inplace=True)
pandas.melt(pdf,id_vars='aa',var_name="position",value_name="counts")

     aa position counts
0    A   95  70019
1    C   95  0
2    D   95  8
3    E   95  141
4    F   95  7
5    G   95  343
6    H   95  1
7    I   95  5
8    K   95  3
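
The steps in this answer can be tried end to end with a small sketch. The nested_dict below is made-up sample data (outer keys are positions, inner keys are residues), and long_df is just an illustrative name; the calls themselves are the same ones used above, except that rename returns a new frame instead of using inplace=True.

```python
import pandas as pd

# Hypothetical sample data: outer keys are positions, inner keys are residues.
nested_dict = {
    95: {"A": 70019, "C": 0, "D": 8},
    96: {"A": 102, "C": 185, "D": 9},
}

# Outer keys become columns, inner keys become the (unnamed) index.
pdf = pd.DataFrame(nested_dict)

# Move the residue labels out of the index; the new column is called "index".
pdf = pdf.reset_index()
pdf = pdf.rename(columns={"index": "aa"})

# Melt every position column into (position, counts) pairs, keeping "aa".
long_df = pd.melt(pdf, id_vars="aa", var_name="position", value_name="counts")
print(long_df)
```

The result has a fresh 0, 1, 2, ... index, and the residue letters live in the ordinary column aa.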

Answer 2

Answered by tozCSS

Your pdf looks like a pivot table. Let's assume we have a dataframe with three columns. We can pivot it with a single function like this:

pivoted = df.pivot(index='col1',columns='col2',values='col3')

Unpivoting it back without losing the index requires a reset_index dance:

pivoted.reset_index().melt(id_vars=pivoted.index.name)

To get the exact original df:

pivoted.reset_index().melt(id_vars=pivoted.index.name, var_name='col2', value_name='col3')
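
As a rough illustration of the round trip, here is a sketch with a small made-up frame. The data values and the name restored are assumptions for the example; col1, col2 and col3 are the column names used above. Note that the melted rows come back in a different order than the original, so the sketch sorts them before printing.

```python
import pandas as pd

# Hypothetical long-format frame with the three columns named in the answer.
df = pd.DataFrame({
    "col1": ["A", "A", "C", "C"],
    "col2": [95, 96, 95, 96],
    "col3": [70019, 102, 0, 185],
})

# Long -> wide: col1 becomes the index, col2 values become the columns.
pivoted = df.pivot(index="col1", columns="col2", values="col3")

# Wide -> long: reset_index turns col1 back into a column so melt can keep it.
restored = pivoted.reset_index().melt(
    id_vars=pivoted.index.name, var_name="col2", value_name="col3"
)

# melt walks the data column by column, so restore the original row order.
restored = restored.sort_values(["col1", "col2"]).reset_index(drop=True)
print(restored)
```

The values match the original df; the dtype of col2 may differ after the round trip, because the column labels pass through a mixed-type column index along the way.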

PS. To my surprise, melt does not accept a kwarg like keep_index=True. The enhancement suggestion is still open: https://github.com/pandas-dev/pandas/issues/17440
