Accessing the first element of each list in a Pandas DataFrame column

Question

Asked by mkoala

I have a Pandas DataFrame with a column containing list objects:

      A
0   [1,2]
1   [3,4]
2   [8,9] 
3   [2,6]

How can I access the first element of each list and save it into a new column of the DataFrame? To get a result like this:

      A     new_col
0   [1,2]      1
1   [3,4]      3
2   [8,9]      8
3   [2,6]      2

I know this could be done by iterating over each row, but is there a more "Pythonic" way?

Answer 1

Answered by DSM

As always, remember that storing non-scalar objects in frames is generally disfavoured, and should really only be used as a temporary intermediate step.

That said, you can use the .str accessor even though it's not a column of strings:

>>> import pandas as pd
>>> df = pd.DataFrame({"A": [[1,2],[3,4],[8,9],[2,6]]})
>>> df["new_col"] = df["A"].str[0]
>>> df
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2
>>> df["new_col"]
0    1
1    3
2    8
3    2
Name: new_col, dtype: int64
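One property of the .str route that may matter in practice: for rows where the list is empty or the value is missing, indexing through .str yields NaN instead of raising an error. A minimal sketch, assuming a column with irregular rows (the data below is invented for illustration and is not from the question):

```python
import pandas as pd

# Hypothetical data: one normal list, one empty list, one missing value.
s = pd.Series([[1, 2], [], None])

# .str[0] is shorthand for .str.get(0); out-of-range and missing rows become NaN.
first = s.str[0]

print(first.tolist())
```

This tolerance is convenient for messy data, but it also means a malformed row will not surface as an exception.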

Answer 2

Answered by dmb

You can use map and a lambda function:

df.loc[:, 'new_col'] = df.A.map(lambda x: x[0])

Answer 3

Answered by jezrael

Use apply with a lambda that returns x[0]:

df['new_col'] = df.A.apply(lambda x: x[0])
print(df)
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2

Answer 4

Answered by Alexander

You can use a conditional list comprehension that takes the first value of any iterable and otherwise uses None for that item. List comprehensions are very Pythonic.

df['new_col'] = [val[0] if hasattr(val, '__iter__') else None for val in df["A"]]

>>> df
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2
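Note that under Python 3 the hasattr(val, '__iter__') check also passes for strings, sets, and empty lists, so val[0] can still return a character or raise. Below is a sketch of a stricter variant, assuming you only want the first item of non-empty lists or tuples; the helper name first_or_none and the sample data are hypothetical, not part of the original answer:

```python
import pandas as pd


def first_or_none(val):
    """Return val[0] for a non-empty list or tuple, otherwise None."""
    if isinstance(val, (list, tuple)) and len(val) > 0:
        return val[0]
    return None


# Hypothetical column mixing lists with an empty list, a missing value, and a string.
df = pd.DataFrame({"A": [[1, 2], [], None, "ab", [8, 9]]})

firsts = [first_or_none(val) for val in df["A"]]
df["new_col"] = firsts
print(df)
```

When the list is assigned to the column, pandas may convert the None entries to NaN, so check for missing values with isna() rather than comparing to None.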

Timings

df = pd.concat([df] * 10000)

%timeit df['new_col'] = [val[0] if hasattr(val, '__iter__') else None for val in df["A"]]
100 loops, best of 3: 13.2 ms per loop

%timeit df["new_col"] = df["A"].str[0]
100 loops, best of 3: 15.3 ms per loop

%timeit df['new_col'] = df.A.apply(lambda x: x[0])
100 loops, best of 3: 12.1 ms per loop

%timeit df.A.map(lambda x: x[0])
100 loops, best of 3: 11.1 ms per loop

Removing the safety check that ensures an iterable:

%timeit df['new_col'] = [val[0] for val in df["A"]]
100 loops, best of 3: 7.38 ms per loop
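The %timeit lines above require IPython. To rerun the comparison as a plain script, something like the following sketch with the standard-library timeit module should work. The candidates dictionary and the ignore_index=True argument are additions for illustration, and the numbers will vary with your machine and pandas version:

```python
import timeit

import pandas as pd

df = pd.DataFrame({"A": [[1, 2], [3, 4], [8, 9], [2, 6]]})
# ignore_index=True (an addition here) gives the 40,000 rows a clean 0..n-1 index.
df = pd.concat([df] * 10000, ignore_index=True)

# Each candidate only computes the first elements; none assigns a new column.
candidates = {
    "list comprehension": lambda: [val[0] for val in df["A"]],
    ".str[0]": lambda: df["A"].str[0],
    "apply": lambda: df["A"].apply(lambda x: x[0]),
    "map": lambda: df["A"].map(lambda x: x[0]),
}

for name, func in candidates.items():
    # Best of 3 repeats of 10 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    print(f"{name}: {best * 1e3:.2f} ms per call")
```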
