Original question: http://stackoverflow.com/questions/14487562/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Drop row in Pandas Series and clean up index
Asked by Jonas
I have a Pandas Series and based on a random number I want to pick a row (5 in the code example below) and drop that row. When the row is dropped I want to create a new index for the remaining rows (0 to 8). The code below:
print('Original series: ', sample_mean_series)
print('Length of original series', len(sample_mean_series))
sample_mean_series = sample_mean_series.drop([5], axis=0)
print('Series with item 5 dropped: ', sample_mean_series)
print('Length of modified series:', len(sample_mean_series))
print(sample_mean_series.reindex(range(len(sample_mean_series))))
And this is the output:
Original series:
0 0.000074
1 -0.000067
2 0.000076
3 -0.000017
4 -0.000038
5 -0.000051
6 0.000125
7 -0.000108
8 -0.000009
9 -0.000052
Length of original series 10
Series with item 5 dropped:
0 0.000074
1 -0.000067
2 0.000076
3 -0.000017
4 -0.000038
6 0.000125
7 -0.000108
8 -0.000009
9 -0.000052
Length of modified series: 9
0 0.000074
1 -0.000067
2 0.000076
3 -0.000017
4 -0.000038
5 NaN
6 0.000125
7 -0.000108
8 -0.000009
My problem is that the row number 8 is dropped. I want to drop row "5 NaN" and keep -0.000052 with an index 0 to 8. This is what I want it to look like:
0 0.000074
1 -0.000067
2 0.000076
3 -0.000017
4 -0.000038
5 0.000125
6 -0.000108
7 -0.000009
8 -0.000052
Accepted answer by BrenBarn
Somewhat confusingly, reindex does not mean "create a new index". To create a new index, just assign to the index attribute. So at your last step just do sample_mean_series.index = range(len(sample_mean_series)).
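The difference between the two operations can be seen in a small sketch. The series below is a hypothetical stand-in for the asker's sample_mean_series with made-up values, and the variable names are chosen only for illustration. As far as I understand it, reindex aligns the existing data to the labels you request, which is why label 5 comes back as NaN and the value stored under label 9 disappears, whereas assigning to index simply relabels the rows by position.

```python
import pandas as pd

# Hypothetical stand-in for sample_mean_series (values are made up).
s = pd.Series([10.0 * i for i in range(10)])

dropped = s.drop([5])  # remaining labels: 0-4 and 6-9

# reindex aligns existing data to the requested labels 0..8:
# label 5 has no data (NaN), and label 9 is not requested, so it is discarded.
realigned = dropped.reindex(range(len(dropped)))

# Assigning to .index relabels the rows positionally and keeps every value.
relabeled = dropped.copy()
relabeled.index = range(len(relabeled))

print(realigned)
print(relabeled)
```

Working on a copy here is only so that both results can be compared side by side; assigning to the index of the original series works the same way.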
Answer by Zelazny7
Here's a one-liner:
In [1]: s
Out[1]:
0 -0.942184
1 0.397485
2 -0.656745
3 1.415797
4 1.123858
5 -1.890870
6 0.401715
7 -0.193306
8 -1.018140
9 0.262998
I use the Series.drop method to drop row 5 and then use reset_index to renumber the indices so they are consecutive. Without reset_index, the indices would jump from 4 to 6 with no 5.
By default, reset_index moves the original index into a DataFrame and returns it alongside the series values. Passing drop=True prevents this from happening.
In [2]: s2 = s.drop([5]).reset_index(drop=True)
In [3]: s2
Out[3]:
0 -0.942184
1 0.397485
2 -0.656745
3 1.415797
4 1.123858
5 0.401715
6 -0.193306
7 -1.018140
8 0.262998
Name: 0
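To make the effect of drop=True more concrete, here is a minimal sketch comparing the two calls. The series values are made up and the variable names are hypothetical; the column name "index" is what pandas uses by default for an unnamed index.

```python
import pandas as pd

# Made-up values; any Series with a default integer index behaves the same way.
s = pd.Series([10.0 * i for i in range(10)])
dropped = s.drop([5])

# Without drop=True the result is a DataFrame; the old labels are kept
# in a column named "index" next to the values.
with_old_index = dropped.reset_index()

# With drop=True the result stays a Series and the old labels are discarded.
renumbered = dropped.reset_index(drop=True)

print(type(with_old_index).__name__, list(with_old_index["index"]))
print(type(renumbered).__name__, list(renumbered.index))
```

Keeping the old labels can be useful if you later need to know which original row each value came from; otherwise drop=True gives the clean 0 to 8 numbering the question asks for.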

