Original question: http://stackoverflow.com/questions/36840877/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow.
Slicing Series in pandas
Asked by bcf
Consider the Series obj:
In [50]: obj = Series(np.arange(6,10), index = ['a', 'b', 'c', 'd'])
In [51]: obj
Out[51]:
a 6
b 7
c 8
d 9
dtype: int64
I'd like to take a slice of obj, and I can do that in a couple of ways:
In [52]: obj[1:3]
Out[52]:
b 7
c 8
dtype: int64
In [53]: obj['b' : 'c']
Out[53]:
b 7
c 8
dtype: int64
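The two slices above return the same rows for different reasons: an integer slice is positional and excludes the stop position, while a label slice includes the stop label. A minimal sketch of that difference, assuming pandas and NumPy are imported as pd and np; the explicit .iloc and .loc calls and the names by_position and by_label are illustrative and not part of the original question:

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(6, 10), index=['a', 'b', 'c', 'd'])

by_position = obj.iloc[1:3]   # positions 1 and 2; stop position 3 is excluded
by_label = obj.loc['b':'c']   # labels 'b' through 'c'; stop label is included

print(list(by_position.index))
print(by_position.equals(by_label))
```

Writing .iloc or .loc explicitly states which of the two rules is intended, which matters once the index labels are themselves integers.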
Now consider the DataFrame param_estimates_good:
In [54]: param_estimates_good
Out[54]:
a b sigma a_se b_se sigma_se success
1968 0.648508 1.803889 0.498017 0.784340 0.082366 0.529649 1
1972 0.539485 1.733304 0.451311 1.084170 0.174030 0.677134 1
1973 1.205704 2.054114 1.465606 0.095780 0.052851 0.090562 1
1974 1.398968 2.105287 2.029865 0.451929 0.056154 0.428696 1
1975 1.570900 1.877486 2.016978 0.186177 0.052413 0.183577 1
1976 0.688932 1.651232 0.874860 0.065038 0.099080 0.055247 1
1977 0.816918 1.949563 0.691899 0.516742 0.083973 0.385799 1
1980 0.730454 2.569974 2.297921 0.619403 0.157950 0.439383 1
1986 1.053362 1.770256 1.115229 0.235353 0.063867 0.202970 1
1993 2.531327 1.235418 2.005588 0.107785 0.011513 0.060647 1
1994 -0.759318 2.556910 0.175695 0.052099 0.078433 0.044315 1
1998 1.007787 1.548161 0.911332 2.538235 0.040285 2.001148 1
2000 -0.693261 1.518839 -0.290453 3.763934 1.329302 0.868444 1
2001 0.662391 0.650003 0.854752 0.550188 0.547999 0.376354 1
2002 0.652630 0.424864 0.524909 0.413478 0.334703 0.251172 1
2004 -0.169553 1.290054 -0.040504 0.279700 0.093937 0.115120 1
2005 0.146209 1.610219 -0.233461 0.171832 0.083844 0.123676 1
2007 -0.301397 0.822584 0.309423 1.119639 0.860818 0.377673 1
2008 1.334283 0.065856 1.704950 0.462811 0.489639 0.427041 1
2009 2.082782 -0.727128 1.072343 0.464726 0.093574 0.472603 1
2010 2.309353 -1.202509 0.906165 0.037950 0.080356 0.031981 1
2013 3.490101 -2.033734 1.468027 0.251317 0.030869 0.259732 1
2014 1.820431 -1.961015 -0.050831 0.262710 0.176057 0.266525 1
2016 1.818855 -0.580492 0.312369 0.450659 0.065661 0.474896 1
I take a slice of this to form a Series g:
In [55]: g = param_estimates_good['a']
In [56]: g
Out[56]:
1968 0.648508
1972 0.539485
1973 1.205704
1974 1.398968
1975 1.570900
1976 0.688932
1977 0.816918
1980 0.730454
1986 1.053362
1993 2.531327
1994 -0.759318
1998 1.007787
2000 -0.693261
2001 0.662391
2002 0.652630
2004 -0.169553
2005 0.146209
2007 -0.301397
2008 1.334283
2009 2.082782
2010 2.309353
2013 3.490101
2014 1.820431
2016 1.818855
Name: a, dtype: float64
The Index of g is made up of ints...
In [57]: g.index
Out[57]:
Int64Index([1968, 1972, 1973, 1974, 1975, 1976, 1977, 1980, 1986, 1993, 1994,
1998, 2000, 2001, 2002, 2004, 2005, 2007, 2008, 2009, 2010, 2013,
2014, 2016],
dtype='int64')
... so I try to slice g in a way analogous to obj:
In [58]: g[0:7]
Out[58]:
1968 0.648508
1972 0.539485
1973 1.205704
1974 1.398968
1975 1.570900
1976 0.688932
1977 0.816918
Name: a, dtype: float64
In [59]: g[1968 : 1977]
Out[59]: Series([], Name: a, dtype: float64)
Why does the latter method return an empty Series?
Answered by jezrael
I think it tries to find the rows at positions 1968 to 1977, because [] with an integer slice selects rows by position (see "Slicing ranges" in the docs):
"With Series, the syntax works exactly as with an ndarray, returning a slice of the values and the corresponding labels."
It is the same as selecting with iloc (see "Selection by position" in the docs):
print(g.iloc[1968:1977])
Series([], Name: a, dtype: float64)
With loc it works as expected, because loc slices by label (see "Selection by label" in the docs):
print(g.loc[1968:1977])
1968 0.648508
1972 0.539485
1973 1.205704
1974 1.398968
1975 1.570900
1976 0.688932
1977 0.816918
Name: a, dtype: float64
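The contrast between the two accessors can be reproduced on a small Series. The sketch below copies only the first four rows of g from the question; the names by_position and by_label are illustrative assumptions:

```python
import pandas as pd

# First four rows of the question's g: years as integer labels.
g = pd.Series(
    [0.648508, 0.539485, 1.205704, 1.398968],
    index=[1968, 1972, 1973, 1974],
    name='a',
)

by_position = g.iloc[1968:1974]  # no such positions exist, so the result is empty
by_label = g.loc[1968:1974]      # labels 1968 through 1974, inclusive

print(len(by_position))  # 0
print(len(by_label))     # 4
```

An out-of-range positional slice does not raise an error; it simply returns an empty Series, which is why the question saw no exception.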
Answered by vuvu
You get an empty Series because the slice in g[1968:1977] is interpreted as positions (row numbers running from 0 to N-1, where N is the length of the Series). You seem to have 24 rows in g, so when you ask for all elements between positions 1968 and 1977 you get nothing (your last position is 23).
Use the g.index labels instead, as in g.loc[1968:1977]; then you get all elements between labels 1968 and 1977 (inclusive).
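To make the mapping between labels and positions concrete, the labels can be translated into positions with Index.get_loc. This is only a sketch: the years are the first eight index values from the question, while the Series values are placeholders and the names start, stop and same are illustrative:

```python
import pandas as pd

years = [1968, 1972, 1973, 1974, 1975, 1976, 1977, 1980]
# Placeholder values (0..7), not the question's data.
g = pd.Series(range(len(years)), index=years)

start = g.index.get_loc(1968)  # position of label 1968
stop = g.index.get_loc(1977)   # position of label 1977

# iloc excludes the stop position, so add 1 to match loc's inclusive slice.
same = g.iloc[start:stop + 1].equals(g.loc[1968:1977])
print(start, stop, same)  # 0 6 True
```

In practice g.loc[1968:1977] is the simpler choice; the positional form is shown only to illustrate why positions 1968 to 1977 fall far outside a 24-row Series.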