Original source: http://stackoverflow.com/questions/42583664/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Pandas: merge CSV files based on a column
Asked by Rajat Vij
Hi, I know this has been answered before, but I am getting weird results with those solutions, so I would appreciate an explanation of what is wrong with my approach.
I have 2 csv files
f1
A,B,C
1,2,3
1,2,3
3,3,3
f2
C,D,F
3,3,1
1,1,1
I am trying to merge them with a simple call:
f = pd.merge(left=f1, right=f2, how='outer', on='C')
But instead of giving the expected table:
A,B,C,D,F
1,2,3,3,1
1,2,3,3,1
3,3,3,3,1
I get this result:
A,B,C,D,F
1,2,3
1,2,3
3,3,3
,,3,3,1
,,1,1,1
I am not sure why I am getting this.
I am not working with this exact data. I read the real data from CSV files like this:
pd.read_csv('filename.csv', usecols=[colnames])
EDIT:
Here is my code:
import pandas as pd
f2 = pd.read_csv('filename1.csv', usecols=[colnames])
f1 = pd.read_csv('filename2.csv', usecols=[colnames])
f = pd.merge(left=f1, right=f2, how='left', on='MergeCol')
Answered by Scratch'N'Purr
Here's your solution. You want to do a left join instead of outer:
import pandas as pd
f1 = pd.DataFrame({'A':[1,1,3], 'B':[2,2,3], 'C':[3,3,3]})
f2 = pd.DataFrame({'C':[3,1], 'D':[3,1], 'F':[1,1]})
f = f1.merge(f2, how='left', on='C')
Output:
   A  B  C  D  F
0  1  2  3  3  1
1  1  2  3  3  1
2  3  3  3  3  1
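To see how the two join types differ on this data, here is a small sketch that runs both on the same frames. It uses the `indicator=True` option of `merge`, which adds a `_merge` column marking each row as `both`, `left_only`, or `right_only`. The variable names `outer` and `left` are only for illustration and do not come from the question.

```python
import pandas as pd

f1 = pd.DataFrame({'A': [1, 1, 3], 'B': [2, 2, 3], 'C': [3, 3, 3]})
f2 = pd.DataFrame({'C': [3, 1], 'D': [3, 1], 'F': [1, 1]})

# indicator=True adds a "_merge" column showing where each row came from
outer = f1.merge(f2, how='outer', on='C', indicator=True)
left = f1.merge(f2, how='left', on='C', indicator=True)

print(outer)  # includes the f2 row with C == 1, which has no partner in f1
print(left)   # only the rows of f1, each extended with D and F
```

With `how='outer'` the unmatched `C == 1` row from f2 is kept, with missing values in A and B. With `how='left'` only the rows of f1 appear in the result.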
If you want to write back into a csv, just do:
f.to_csv('yourfile.csv', index=False)
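Putting the pieces together, a minimal sketch of the whole round trip might look like this. It assumes the files contain exactly the sample data from the question, and it uses in-memory `io.StringIO` buffers as stand-ins for the real files so that it runs on its own. Note that `usecols` takes a flat list of column names. The dtype printout is an optional sanity check: if the key column were numeric on one side and text on the other, the rows would not match. Whether that was the cause of the result in the question is not known.

```python
import io
import pandas as pd

# Stand-ins for the two files in the question (contents copied from f1 and f2)
csv1 = io.StringIO("A,B,C\n1,2,3\n1,2,3\n3,3,3\n")
csv2 = io.StringIO("C,D,F\n3,3,1\n1,1,1\n")

f1 = pd.read_csv(csv1, usecols=['A', 'B', 'C'])
f2 = pd.read_csv(csv2, usecols=['C', 'D', 'F'])

# The key column should have the same dtype on both sides
print(f1['C'].dtype, f2['C'].dtype)

# Left join: keep every row of f1 and attach the matching columns of f2
f = f1.merge(f2, how='left', on='C')

# Written to an in-memory buffer here; pass a file path instead to write to disk
out = io.StringIO()
f.to_csv(out, index=False)
result = out.getvalue()
print(result)
```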