pandas: merging CSV files on a column

Question

Asked by Rajat Vij

Hi, I know this has been answered before, but I am getting weird results with those solutions, so I would appreciate an explanation of what's wrong with my approach.

I have two CSV files:

f1

A,B,C
1,2,3
1,2,3
3,3,3

f2

C,D,F
3,3,1
1,1,1

I am trying to merge them with a simple call:

f = pd.merge(left=f1, right=f2, how='outer', on='C')

But instead of giving the expected table:

A,B,C,D,F
1,2,3,3,1
1,2,3,3,1
3,3,3,3,1

I am getting this result:

A,B,C,D,F
1,2,3,,
1,2,3,,
3,3,3,,
,,3,3,1
,,1,1,1

I am not sure why I am getting this.

I am not working with this exact data. I read the real data from CSV files like this:

pd.read_csv('filename.csv', usecols=[colnames])

EDIT:

Here is my code:

import pandas as pd
f2 = pd.read_csv('filename1.csv', usecols=[colnames])
f1 = pd.read_csv('filename2.csv', usecols=[colnames])
f = pd.merge(left=f1, right=f2, how='left', on='MergeCol')

Answer 1

Answered by Scratch'N'Purr

Here's your solution. You want to do a left join instead of an outer join:

import pandas as pd
f1 = pd.DataFrame({'A':[1,1,3], 'B':[2,2,3], 'C':[3,3,3]})
f2 = pd.DataFrame({'C':[3,1], 'D':[3,1], 'F':[1,1]})
f = f1.merge(f2, how='left', on='C')

Output:

   A  B  C  D  F
0  1  2  3  3  1
1  1  2  3  3  1
2  3  3  3  3  1
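
To see how the two join types differ on the frames above, here is a minimal sketch using the `indicator=True` option of `merge`, which adds a `_merge` column recording where each row came from. The variable names `outer` and `left` are illustrative and do not appear in the original post.

```python
import pandas as pd

f1 = pd.DataFrame({'A': [1, 1, 3], 'B': [2, 2, 3], 'C': [3, 3, 3]})
f2 = pd.DataFrame({'C': [3, 1], 'D': [3, 1], 'F': [1, 1]})

# Outer join: keeps every key from both frames, so the C == 1 row from f2,
# which has no partner in f1, appears with NaN in columns A and B.
outer = f1.merge(f2, how='outer', on='C', indicator=True)

# Left join: keeps only the keys that are present in f1.
left = f1.merge(f2, how='left', on='C', indicator=True)

print(outer)
print(left)
```

On this sample data the outer join yields four rows, and only the C == 1 row is unmatched. The five-row result in the question, where nothing matches, suggests that the key values in the real files did not compare equal (for example because of differing dtypes or stray whitespace), although the post does not confirm this.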

If you want to write the result back to a CSV file, just do:

f.to_csv('yourfile.csv', index=False)
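
Putting the pieces together, here is a minimal end-to-end sketch, assuming the two files hold the sample data from the question. `io.StringIO` stands in for the real files and for the output file so the example is self-contained. Reading the key column as a string and stripping whitespace is one possible guard against keys that look equal but do not match; whether that applies to the asker's real data is an assumption.

```python
import io

import pandas as pd

# Stand-ins for the real CSV files (sample data from the question).
csv1 = io.StringIO("A,B,C\n1,2,3\n1,2,3\n3,3,3\n")
csv2 = io.StringIO("C,D,F\n3,3,1\n1,1,1\n")

# Read the merge key as a string in both files so the key dtypes agree.
f1 = pd.read_csv(csv1, dtype={'C': str})
f2 = pd.read_csv(csv2, dtype={'C': str})

# Strip stray whitespace from the key (assumed precaution, see above).
f1['C'] = f1['C'].str.strip()
f2['C'] = f2['C'].str.strip()

# Left join: keep every row of f1 and attach the matching columns of f2.
f = f1.merge(f2, how='left', on='C')

# Write the merged table as CSV text into an in-memory buffer.
out = io.StringIO()
f.to_csv(out, index=False)
result = out.getvalue()
print(result)
```

To write to disk instead, pass a path such as 'yourfile.csv' to `to_csv` in place of the buffer.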
