Pandas: How can I iterate over two DataFrames that have exactly the same format?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/24709557/
Asked by JonghoKim
My final goal is to build a list that holds, for each location in the DataFrames, the pair of corresponding values together with the column and index labels, like below:
[df_one_first_element, df_two_first_element, column_first, index_first]
For example: [0.619159, 0.510162, 20140109, 0.50], [0.264191, 0.269053, 20140213, 0.50], ...
So I am trying to iterate over the two DataFrames, but I am stuck. How can I iterate over two DataFrames that have exactly the same format but different data?
For example, I have two DataFrames, df_one and df_two, that look like this:
df_one =
      20140109  20140213  20140313  20140410  20140508  20140612  20140710  \
0.50  0.619159  0.264191  0.438849  0.465287  0.445819  0.412582  0.397366
0.55  0.601379  0.303953  0.457524  0.432335  0.415333  0.382093  0.382361
df_two =
      20140109  20140213  20140313  20140410  20140508  20140612  20140710  \
0.50  0.510162  0.269053  0.308494  0.300554  0.294360  0.286980  0.280494
0.55  0.489953  0.258690  0.290044  0.283933  0.278180  0.271426  0.266580
I want to access the same location in both DataFrames while iterating over all of their values.
First I tried iterrows():
i = 0
for index, row in df_one.iterrows():
    j = 0
    for item in row:
        print df_two(i,j)  # raises TypeError: a DataFrame is not callable
        j = j + 1
    i = i + 1
But, as you know, a DataFrame cannot be accessed like this:
df_two(i,j)
So I am currently stuck. Alternatively, is it possible to access the data by index label and column name?
Accepted answer by kimal
The code below lets you read the values at the same position in both DataFrames.
Python 2.x:
for i in range(0, len(df_one.index)):
    for j in range(0, len(df_one.columns)):
        print df_one.values[i,j],df_two.values[i,j],i,j
Python 3.x:
for i in range(0, len(df_one.index)):
    for j in range(0, len(df_one.columns)):
        print(df_one.values[i,j],df_two.values[i,j],i,j)
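The loops above print integer positions i and j, whereas the question asks for a list that carries the column and index labels. A minimal sketch of that variation follows, assuming both frames share the same shape and label order. The sample frames are hypothetical and use only the first three columns of the question's data, and the name `pairs` is introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample frames: the first three columns of the question's data.
columns = [20140109, 20140213, 20140313]
index = [0.50, 0.55]
df_one = pd.DataFrame(
    [[0.619159, 0.264191, 0.438849], [0.601379, 0.303953, 0.457524]],
    index=index, columns=columns)
df_two = pd.DataFrame(
    [[0.510162, 0.269053, 0.308494], [0.489953, 0.258690, 0.290044]],
    index=index, columns=columns)

# Walk both frames by integer position and record the labels alongside the values.
pairs = []
for i, row_label in enumerate(df_one.index):
    for j, col_label in enumerate(df_one.columns):
        pairs.append([df_one.iloc[i, j], df_two.iloc[i, j], col_label, row_label])

for entry in pairs:
    print(entry)
```

Each entry has the form [df_one value, df_two value, column label, index label], which matches the layout the question describes.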
Answer by usual me
You could use itertools.izip (Python 2 only; in Python 3 the built-in zip is lazy in the same way):
import itertools

# Pair up the rows of both frames, then pair up the items within each row
for ( idxRow, s1 ), ( _, s2 ) in itertools.izip( df0.iterrows(), df1.iterrows() ) :
    for ( idxCol, v1 ), ( _, v2 ) in itertools.izip( s1.iteritems(), s2.iteritems() ) :
        print ( v1, v2, idxCol, idxRow )
In:
          X         Y         Z
a  1.171124  0.853229  1.416635
b  0.971665 -1.727410 -0.055180
Out:
(1.1711241491561419, 1.3715317727366974, 'X', 'a')
(0.85322862359611618, 0.72799908412372294, 'Y', 'a')
(1.4166350896829785, 2.0068549773211006, 'Z', 'a')
(0.9716653056530119, 0.94413346620976102, 'X', 'b')
(-1.727409829928936, 2.9839447205351157, 'Y', 'b')
(-0.055180403519242693, 0.0030448769325464513, 'Z', 'b')
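The izip approach can be adapted to Python 3 and current pandas, where itertools.izip does not exist and Series.iteritems() has been removed. This is a sketch rather than part of the original answer: it swaps in the built-in zip and Series.items(), and the two small frames and the name `results` are hypothetical.

```python
import pandas as pd

# Hypothetical frames with identical index and column labels.
df0 = pd.DataFrame({"X": [1.0, 2.0], "Y": [3.0, 4.0]}, index=["a", "b"])
df1 = pd.DataFrame({"X": [10.0, 20.0], "Y": [30.0, 40.0]}, index=["a", "b"])

results = []
# zip pairs the rows of both frames; the inner zip pairs the cells of each row.
for (idx_row, s1), (_, s2) in zip(df0.iterrows(), df1.iterrows()):
    for (idx_col, v1), (_, v2) in zip(s1.items(), s2.items()):
        results.append((v1, v2, idx_col, idx_row))

for item in results:
    print(item)
```

This pairs cells by position only, so it relies on both frames having the same row and column order.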
Answer by JonghoKim
I solved this problem with the get_value method:
http://pandas.pydata.org/pandas-docs/version/0.8.1/indexing.html
Here is my code; it appears to work:
df_columns = df_one.columns.values
for index, row in df_one.iterrows():
    j = 0
    for item in row:
        print df_two.get_value(index, df_columns[j])
        j = j + 1
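DataFrame.get_value is not available in recent pandas releases; the label-based scalar accessor .at covers the same use. The following is a sketch of the same lookup under that substitution, not the original author's code. The sample frames are hypothetical and use the first two columns of the question's data, and the name `values` is introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample frames: the first two columns of the question's data.
columns = [20140109, 20140213]
index = [0.50, 0.55]
df_one = pd.DataFrame([[0.619159, 0.264191], [0.601379, 0.303953]],
                      index=index, columns=columns)
df_two = pd.DataFrame([[0.510162, 0.269053], [0.489953, 0.258690]],
                      index=index, columns=columns)

values = []
# Iterate over the labels of df_one and look up the same labels in df_two.
for row_label in df_one.index:
    for col_label in df_one.columns:
        values.append(df_two.at[row_label, col_label])

print(values)
```

Because the lookup is by label rather than by position, it still finds the matching cell when the two frames store their rows or columns in a different order.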

