Python: Locating the first and last non-NaN values in a Pandas DataFrame

Question

Asked by Jason

I have a Pandas DataFrame indexed by date. There are a number of columns, but many columns are only populated for part of the time series. I'd like to find where the first and last non-NaN values are located so that I can extract the dates and see how long the time series is for a particular column.


Could somebody point me in the right direction as to how I could go about doing something like this? Thanks in advance.


Answer 1

Accepted answer by Jason

@behzad.nouri's solution worked perfectly to locate the first and last non-NaN values, using Series.first_valid_index and Series.last_valid_index, respectively.
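For the date-indexed case described in the question, the two methods can be combined to measure how long a column's series runs. The following is only a minimal sketch: the frame, the column names ("full", "partial"), and the values are made up for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical date-indexed frame; column names and values are invented.
idx = pd.date_range("2020-01-01", periods=6, freq="D")
df = pd.DataFrame(
    {
        "full": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "partial": [np.nan, np.nan, 3.0, np.nan, 5.0, np.nan],
    },
    index=idx,
)

col = df["partial"]
first = col.first_valid_index()  # label (date) of the first non-NaN value
last = col.last_valid_index()    # label (date) of the last non-NaN value

# Both are None when the column has no valid values, so guard before subtracting.
span = last - first if first is not None else None

print(first, last, span)
```

Because the index holds timestamps, subtracting the two labels gives a Timedelta covering the populated part of the column.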


Answer 2

Answered by cs95

Here are some helpful examples.


Series


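import numpy as np
import pandas as pd

# np.nan is used because the np.NaN alias was removed in NumPy 2.0
s = pd.Series([np.nan, 1, np.nan, 3, np.nan], index=list('abcde'))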
s

a    NaN
b    1.0
c    NaN
d    3.0
e    NaN
dtype: float64

# first valid index
s.first_valid_index()
# 'b'

# first valid position
s.index.get_loc(s.first_valid_index())
# 1

# last valid index
s.last_valid_index()
# 'd'

# last valid position
s.index.get_loc(s.last_valid_index())
# 3
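One thing to keep in mind with the get_loc approach is that it returns a plain integer only when the label is unique in the index; with duplicate labels it can return a slice or a boolean mask instead. As a possible alternative (not part of the original answer), the positions can be read straight from the boolean mask with NumPy. This sketch reuses the same example Series:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1, np.nan, 3, np.nan], index=list("abcde"))

# Integer positions of every non-NaN entry, in ascending order.
valid_positions = np.flatnonzero(s.notna().to_numpy())

if valid_positions.size:
    first_pos = int(valid_positions[0])
    last_pos = int(valid_positions[-1])
else:
    # No valid values at all.
    first_pos = last_pos = None

print(first_pos, last_pos)
```

The results can then be passed to s.iloc to fetch the values themselves.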

Alternative solution using notna and idxmax:


# first valid index
s.notna().idxmax()
# 'b'

# last valid index
s.notna()[::-1].idxmax()
# 'd'
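The two approaches differ when a Series contains no valid values at all: first_valid_index returns None, whereas idxmax on an all-False mask simply returns the first label. A small sketch of the difference follows; the helper name first_valid_via_idxmax is hypothetical and only illustrates one way to guard the idxmax variant.

```python
import numpy as np
import pandas as pd

empty = pd.Series([np.nan, np.nan, np.nan], index=list("abc"))

print(empty.first_valid_index())  # None: there is no valid value
print(empty.notna().idxmax())     # 'a': first label of an all-False mask


def first_valid_via_idxmax(s):
    """Hypothetical helper: idxmax-based lookup that returns None when nothing is valid."""
    mask = s.notna()
    return mask.idxmax() if mask.any() else None


print(first_valid_via_idxmax(empty))  # None
```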

DataFrame


df = pd.DataFrame({
    'A': [np.nan, 1, np.nan, 3, np.nan],
    'B': [1, np.nan, np.nan, np.nan, np.nan]
})
df

     A    B
0  NaN  1.0
1  1.0  NaN
2  NaN  NaN
3  3.0  NaN
4  NaN  NaN

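Called directly on a DataFrame, (first|last)_valid_index returns a single label for the frame as a whole rather than one label per column. To get a result for each column, apply the Series methods column by column using apply.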


# first valid index for each column
df.apply(pd.Series.first_valid_index)

A    1
B    0
dtype: int64

# last valid index for each column
df.apply(pd.Series.last_valid_index)

A    3
B    0
dtype: int64

As before, you can also use notna and idxmax. This is slightly more natural syntax.


# first valid index
df.notna().idxmax()

A    1
B    0
dtype: int64

# last valid index
df.notna()[::-1].idxmax()

A    3
B    0
dtype: int64
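To collect both endpoints per column in one place, the Series methods can be called on each column in turn. This is just a sketch building on the example frame; the extra all-NaN column "C" is an assumption added here to show the None result and is not in the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 1, np.nan, 3, np.nan],
    "B": [1, np.nan, np.nan, np.nan, np.nan],
    "C": [np.nan] * 5,  # hypothetical all-NaN column, added for illustration
})

# Map each column name to (first valid label, last valid label).
summary = {
    col: (df[col].first_valid_index(), df[col].last_valid_index())
    for col in df.columns
}

print(summary)
```

Columns with no valid values map to (None, None), which makes them easy to filter out before computing any lengths.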
