Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/41693000/

Time: 2020-09-14 02:48:27. Source: igfitidea

How to merge two rows in a pandas DataFrame

Tags: pandas, dataframe, merge

Asked by Carmen

I have a dataframe with two rows, and I'd like to merge them into one row. The df looks as follows:

              PC           Rating CY   Rating PY    HT
0             DE101           NaN            AA     GV
0             DE101           AA+           NaN     GV

I have tried to create two separate dataframes and combine them with df.merge(df2), without success. The result should be the following:

              PC           Rating CY   Rating PY    HT
0             DE101           AA+            AA     GV

Any ideas? Thanks in advance. Could df.update be a possible solution?

EDIT:

df.head(1).combine_first(df.tail(1))

This works for the example above. However, for columns containing numerical values, this approach doesn't yield the desired output, e.g. for

              PC           Rating CY   Rating PY    HT    MV1   MV2
0             DE101           NaN            AA     GV    0     20 
0             DE101           AA+           NaN     GV    10    0

The output should be:

              PC           Rating CY   Rating PY    HT   MV1    MV2
0             DE101           AA+            AA     GV   10     20

The expression above doesn't sum the values in the last two columns; instead, it takes the values from the first row of the dataframe:

              PC           Rating CY   Rating PY    HT   MV1    MV2
0             DE101           AA+            AA     GV   0     20

How could this problem be fixed?

Accepted answer by Nickil Maveli

You can use the DF.combine_first() method after splitting the DF into two parts: the null values in the first part are replaced with the non-null values from the other part, while the first part's existing non-null values are left untouched:

df.head(1).combine_first(df.tail(1))
# In practice this is the same as: df.head(1).fillna(df.tail(1))
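
As a minimal, self-contained sketch of the line above, the question's first example can be rebuilt by hand and run end to end. The frame construction is an assumption based on the table in the question (in particular, both rows sharing the index label 0); it is not taken from the asker's actual code.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's first table; both rows use index label 0.
df = pd.DataFrame(
    {
        "PC": ["DE101", "DE101"],
        "Rating CY": [np.nan, "AA+"],
        "Rating PY": ["AA", np.nan],
        "HT": ["GV", "GV"],
    },
    index=[0, 0],
)

# Nulls in the first row are filled from the second row, matched on the index label.
merged = df.head(1).combine_first(df.tail(1))
print(merged)
```

Because combine_first aligns on index labels, this relies on the two rows carrying the same label, as they do in the question.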

In case the columns have mixed datatypes, you can partition the DF into its constituent dtype groups, apply a different operation to each group, and chain the results back together:

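# np.object was removed from recent NumPy releases, so split on the "number" alias instead
obj_df = df.select_dtypes(exclude=["number"])  # text columns
num_df = df.select_dtypes(include=["number"])  # numeric columns

# Fill nulls in the text columns, sum the numeric columns, then join on the index
obj_df.head(1).combine_first(obj_df.tail(1)).join(num_df.head(1).add(num_df.tail(1)))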
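
The chained expression above can be tried on the question's second example with the sketch below. Two things here are assumptions rather than part of the original answer: the frame is rebuilt by hand from the table in the question, and the columns are split with the "number" selector instead of np.object, since that alias is no longer available in recent NumPy releases. The intermediate names text_part and num_part are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's second table; both rows use index label 0.
df = pd.DataFrame(
    {
        "PC": ["DE101", "DE101"],
        "Rating CY": [np.nan, "AA+"],
        "Rating PY": ["AA", np.nan],
        "HT": ["GV", "GV"],
        "MV1": [0, 10],
        "MV2": [20, 0],
    },
    index=[0, 0],
)

# Split into numeric and non-numeric columns.
num_df = df.select_dtypes(include=["number"])
obj_df = df.select_dtypes(exclude=["number"])

# Non-numeric columns: fill nulls in the first row from the second row.
text_part = obj_df.head(1).combine_first(obj_df.tail(1))
# Numeric columns: add the two rows element-wise (aligned on index label 0).
num_part = num_df.head(1).add(num_df.tail(1))

# Join the two halves back together on the shared index.
result = text_part.join(num_part)
print(result)
```

Splitting by dtype keeps the two behaviours separate: text columns take the first non-null value, while numeric columns are summed.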

Answer by Zero

You could use max with a transpose, like:

In [2103]: df.max().to_frame().T
Out[2103]:
      PC Rating CY Rating PY  HT MV1 MV2
0  DE101       AA+        AA  GV  10  20