Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/42137529/

Date: 2020-09-14 02:56:18  Source: igfitidea

Pandas - find first non-null value in column

python, pandas

Asked by code base 5000

I have a Series whose values are either NULL or some non-null value. How can I find the first row where the value is not NULL, so that I can report the datatype back to the user? All non-null values in the Series have the same datatype.

Thanks

Answered by jezrael

You can use first_valid_index together with selection by loc:

import numpy as np
import pandas as pd

s = pd.Series([np.nan, 2, np.nan])
print(s)
0    NaN
1    2.0
2    NaN
dtype: float64

print (s.first_valid_index())
1

print (s.loc[s.first_valid_index()])
2.0

# If your Series contains ALL NaNs, you'll need to check as follows:

s = pd.Series([np.nan, np.nan, np.nan])
idx = s.first_valid_index()  # Will return None
first_valid_value = s.loc[idx] if idx is not None else None
print(first_valid_value)
None
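
Since the question is about reporting the datatype, the lookup above can be wrapped in a small helper. This is only a sketch: the function name first_valid_type is made up for illustration, and it assumes the index labels are unique (with duplicate labels, loc would return a Series rather than a single value).

```python
import numpy as np
import pandas as pd


def first_valid_type(s):
    """Hypothetical helper: name of the type of the first non-null value.

    Returns None when every value in the Series is null.
    Assumes unique index labels, so s.loc[idx] is a single value.
    """
    idx = s.first_valid_index()  # None if there is no valid value
    if idx is None:
        return None
    return type(s.loc[idx]).__name__


strings = pd.Series([None, "a.ogg", None])
numbers = pd.Series([np.nan, 2, np.nan])
all_null = pd.Series([np.nan, np.nan])

print(first_valid_type(strings))
print(first_valid_type(numbers))
print(first_valid_type(all_null))
```

For a numeric Series, s.dtype already gives the datatype without any lookup; inspecting the first valid value is mainly useful for object columns, where dtype only says "object".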

Answered by PdevG

For a Series, this returns the first non-null value:

Creating Series s:

s = pd.Series(index=[2,4,5,6], data=[None, None, 2, None])

which creates this Series:

2    NaN
4    NaN
5    2.0
6    NaN
dtype: float64

You can get the first non-NaN value by using:

s.loc[~s.isnull()].iloc[0]

which returns

2.0
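
One caveat with this approach: if every value is null, the filtered Series is empty and .iloc[0] raises an IndexError. A minimal sketch of a guarded version follows, using dropna() as an equivalent way to write the same filter; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series(index=[2, 4, 5, 6], data=[None, None, 2, None])

# s.dropna() keeps the same rows as s.loc[~s.isnull()]
non_null = s.dropna()
first = non_null.iloc[0] if not non_null.empty else None

# With an all-null Series the filtered result is empty,
# so the guard returns None instead of raising IndexError.
all_null = pd.Series([None, None], dtype="float64")
remaining = all_null.dropna()
first_or_none = remaining.iloc[0] if not remaining.empty else None

print(first)
print(first_or_none)
```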

If, on the other hand, you have a DataFrame like this one:

df = pd.DataFrame(index=[2,4,5,6], data=np.asarray([[None, None, 2, None], [1, None, 3, 4]]).transpose(), 
                  columns=['a', 'b'])

which looks like this:

    a       b
2   None    1
4   None    None
5   2       3
6   None    4

you can select the first non-null value of a single column like this (for column a):

df.a.loc[~df.a.isnull()].iloc[0]

or, if you want the first row that contains no null values in any column, you can use:

df.loc[~df.isnull().sum(1).astype(bool)].iloc[0]

Which returns:

a    2
b    3
Name: 5, dtype: object
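
The same two lookups can also be written with first_valid_index and notna, which some readers may find easier to follow. This is a sketch, not the answer's own code: the DataFrame is rebuilt here from a dict (an assumption, which yields float64 columns rather than the object columns shown above), and the variable names are illustrative.

```python
import pandas as pd

# Same values as the example above, built from a dict for brevity.
df = pd.DataFrame(
    {"a": [None, None, 2, None], "b": [1, None, 3, 4]},
    index=[2, 4, 5, 6],
)

# Label of the first non-null value in each column.
first_labels = df.apply(lambda col: col.first_valid_index())

# Rows with no null in any column; equivalent to the
# ~df.isnull().sum(1).astype(bool) mask used above.
complete_rows = df[df.notna().all(axis=1)]
first_complete = complete_rows.iloc[0] if not complete_rows.empty else None

print(first_labels)
print(first_complete)
```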

Answered by Daniil Mashkin

You can also use the get method instead of loc:

(Pdb) type(audio_col)
<class 'pandas.core.series.Series'>
(Pdb) audio_col.first_valid_index()
19
(Pdb) audio_col.get(first_audio_idx)
'first-not-nan-value.ogg'
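
The session above does not show where first_audio_idx comes from; presumably it holds the result of first_valid_index(). A self-contained sketch under that assumption, with made-up data standing in for audio_col:

```python
import pandas as pd

# Made-up stand-in for the audio_col Series in the session above.
audio_col = pd.Series([None, None, "first-not-nan-value.ogg", None])

# Assumption: first_audio_idx is the label returned by first_valid_index().
first_audio_idx = audio_col.first_valid_index()

# Guard explicitly against the all-null case, where the label is None.
value = audio_col.get(first_audio_idx) if first_audio_idx is not None else None

# Unlike loc, get returns a default instead of raising KeyError
# when the label is not in the index.
missing = audio_col.get(99, default="n/a")

print(value)
print(missing)
```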