Accessing every first element of a Pandas DataFrame column containing lists

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/37125174/

Time: 2020-09-14 01:12:15  Source: igfitidea

Tags: python, pandas, dataframe

Asked by mkoala

I have a Pandas DataFrame with a column containing list objects:

      A
0   [1,2]
1   [3,4]
2   [8,9] 
3   [2,6]

How can I access the first element of each list and save it into a new column of the DataFrame? To get a result like this:

      A     new_col
0   [1,2]      1
1   [3,4]      3
2   [8,9]      8
3   [2,6]      2

I know this could be done via iterating over each row, but is there any "pythonic" way?

Answered by DSM

As always, remember that storing non-scalar objects in frames is generally disfavoured, and should really only be used as a temporary intermediate step.

That said, you can use the .str accessor even though it's not a column of strings:

>>> import pandas as pd
>>> df = pd.DataFrame({"A": [[1,2],[3,4],[8,9],[2,6]]})
>>> df["new_col"] = df["A"].str[0]
>>> df
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2
>>> df["new_col"]
0    1
1    3
2    8
3    2
Name: new_col, dtype: int64
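
As a small sketch of how the .str accessor behaves on rows that are not well-formed lists: the frame below (df_demo) is made up for illustration and is not part of the original answer. In the pandas versions I am aware of, .str[0] is shorthand for .str.get(0), and an empty list or a missing value should come back as NaN rather than raising, but it is worth checking against your own version.

```python
import pandas as pd

# Hypothetical frame: one empty list and one missing value mixed in.
df_demo = pd.DataFrame({"A": [[1, 2], [], None, [8, 9]]})

# .str[0] indexes each element positionally (same as .str.get(0)).
first = df_demo["A"].str[0]

# Rows holding an empty list or a missing value are expected to be NaN
# here instead of raising IndexError or TypeError.
print(first)
```

This tolerance for missing entries is one reason to prefer .str[0] over a bare x[0] lambda when the column is not guaranteed to be clean.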

Answered by dmb

You can use map and a lambda function:

df.loc[:, 'new_col'] = df.A.map(lambda x: x[0])


Answered by jezrael

Use apply with a lambda that returns x[0]:

df['new_col'] = df.A.apply(lambda x: x[0])
print(df)
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2
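
For comparison, here is a minimal sketch that swaps the lambda for operator.itemgetter(0) from the standard library. This is a substitution of mine, not something the answers above propose, and df_demo is an illustrative name. It also checks that map and apply produce the same Series for this kind of element-wise lookup.

```python
from operator import itemgetter

import pandas as pd

df_demo = pd.DataFrame({"A": [[1, 2], [3, 4], [8, 9], [2, 6]]})

# itemgetter(0) builds a callable that behaves like lambda x: x[0].
via_map = df_demo["A"].map(itemgetter(0))
via_apply = df_demo["A"].apply(itemgetter(0))

# On a Series, both methods call the function once per element,
# so the two results should be identical.
print(via_map.equals(via_apply))
```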

Answered by Alexander

You can just use a conditional list comprehension which takes the first value of any iterable or else uses None for that item. List comprehensions are very Pythonic.

df['new_col'] = [val[0] if hasattr(val, '__iter__') else None for val in df["A"]]

>>> df
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2
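
One caveat with the hasattr(val, '__iter__') check: in Python 3 it is also true for strings, and it does not protect against empty lists, where val[0] raises IndexError. The sketch below uses a hypothetical helper, first_or_none, that only accepts non-empty lists and tuples; the helper and the sample frame are assumptions for illustration, not part of the original answer.

```python
import pandas as pd


def first_or_none(val):
    """Return val[0] for a non-empty list or tuple, otherwise None.

    Hypothetical helper: strings, empty sequences, and missing values
    all fall through to None.
    """
    if isinstance(val, (list, tuple)) and len(val) > 0:
        return val[0]
    return None


# Illustrative data covering the awkward cases.
df_demo = pd.DataFrame({"A": [[1, 2], [], "text", None, (8, 9)]})

# Same list-comprehension shape as above, with the stricter check.
# Note that pandas may store the None results as NaN in the new column.
df_demo["new_col"] = [first_or_none(val) for val in df_demo["A"]]
print(df_demo)
```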

Timings

df = pd.concat([df] * 10000)

%timeit df['new_col'] = [val[0] if hasattr(val, '__iter__') else None for val in df["A"]]
100 loops, best of 3: 13.2 ms per loop

%timeit df["new_col"] = df["A"].str[0]
100 loops, best of 3: 15.3 ms per loop

%timeit df['new_col'] = df.A.apply(lambda x: x[0])
100 loops, best of 3: 12.1 ms per loop

%timeit df.A.map(lambda x: x[0])
100 loops, best of 3: 11.1 ms per loop

Removing the safety check that ensures an iterable:

%timeit df['new_col'] = [val[0] for val in df["A"]]
100 loops, best of 3: 7.38 ms per loop
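
The %timeit lines above need IPython. As a rough sketch, the same comparison can be run as a plain script with the standard timeit module. The candidates and timings names, the repeat counts, and the use of ignore_index=True are my own choices here, and the numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({"A": [[1, 2], [3, 4], [8, 9], [2, 6]]})
# 40,000 rows, as in the timings above; ignore_index=True is an added
# choice so the enlarged frame gets a fresh 0..n-1 index.
df = pd.concat([df] * 10000, ignore_index=True)

# Each candidate computes the new column without assigning it.
candidates = {
    "list comprehension": lambda: [val[0] for val in df["A"]],
    ".str[0]": lambda: df["A"].str[0],
    "apply": lambda: df["A"].apply(lambda x: x[0]),
    "map": lambda: df["A"].map(lambda x: x[0]),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats, 10 calls per repeat, reported per call in ms.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    timings[name] = best * 1e3
    print(f"{name:20s} {timings[name]:.2f} ms per call")
```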