Python pandas: select from a DataFrame using startswith

Question

Asked by dartdog

This works (using Pandas 12 dev):

table2=table[table['SUBDIVISION'] =='INVERNESS']

Then I realized I needed to select the field using "starts with", since I was missing a bunch of rows. So, following the pandas docs as closely as I could, I tried:

criteria = table['SUBDIVISION'].map(lambda x: x.startswith('INVERNESS'))
table2 = table[criteria]

That raised AttributeError: 'float' object has no attribute 'startswith'

So I tried an alternate syntax, with the same result:

table[[x.startswith('INVERNESS') for x in table['SUBDIVISION']]]

Reference: http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing (Section 4): "List comprehensions and map method of Series can also be used to produce more complex criteria."

What am I missing?

Answer 1

Accepted answer by Andy Hayden

You can use the Series method str.startswith to give more consistent results:

In [11]: s = pd.Series(['a', 'ab', 'c', 11, np.nan])

In [12]: s
Out[12]:
0      a
1     ab
2      c
3     11
4    NaN
dtype: object

In [13]: s.str.startswith('a', na=False)
Out[13]:
0     True
1     True
2    False
3    False
4    False
dtype: bool

The boolean indexing then works just fine (I prefer to use loc, but it works just the same without it):

In [14]: s.loc[s.str.startswith('a', na=False)]
Out[14]:
0     a
1    ab
dtype: object

It looks like at least one of the elements in your Series/column is a float, which doesn't have a startswith method, hence the AttributeError. The list comprehension should raise the same error.
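One way to check this is to look for entries that are not strings. The sketch below uses a small made-up column (the values are hypothetical, not the asker's data) and assumes the float in question is a missing value (NaN), which is a common cause but not the only possible one.

```python
import numpy as np
import pandas as pd

# Hypothetical sample column: two strings and one missing value.
col = pd.Series(["INVERNESS", "INVERNESS POINT", np.nan], name="SUBDIVISION")

# NaN is a float, so calling x.startswith on it raises AttributeError.
print(type(np.nan))

# Flag every entry that is not a str to see what would break the lambda.
not_str = col.map(lambda x: not isinstance(x, str))
print(col[not_str])
```

If the flagged entries are all NaN, passing na=False to str.startswith (as above) treats them as non-matches.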

Answer 2

Answer by Vinoj John Hosan

To retrieve all the rows which start with the required string:

dataFrameOut = dataFrame[dataFrame['column name'].str.match('string')]

To retrieve all the rows which contain the required string:

dataFrameOut = dataFrame[dataFrame['column name'].str.contains('string')]
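Note that str.match and str.contains interpret the pattern as a regular expression by default, while str.startswith compares it literally. The following sketch, on made-up sample values, shows where that difference matters; the Series name and its contents are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data, including values that differ only at a "." position.
names = pd.Series(["INVERNESS", "NEW INVERNESS", "A.B ESTATES", "AXB ESTATES"])

# str.match anchors the regular expression at the start of each string.
starts_regex = names.str.match("INVERNESS")

# str.contains searches anywhere in the string.
contains_any = names.str.contains("INVERNESS")

# As a regex, "A.B" also matches "AXB" because "." matches any character.
regex_dot = names.str.match("A.B")

# str.startswith compares literally, so only "A.B ESTATES" matches.
literal_dot = names.str.startswith("A.B")

print(names[starts_regex].tolist())
print(names[contains_any].tolist())
print(names[regex_dot].tolist())
print(names[literal_dot].tolist())
```

For plain prefixes without special characters the two approaches select the same rows; str.contains also accepts regex=False for a literal substring search.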

Answer 3

Answer by AleAve81

You can use apply to easily apply any string matching function to your column elementwise.

table2=table[table['SUBDIVISION'].apply(lambda x: x.startswith('INVERNESS'))]

This assumes that your "SUBDIVISION" column is of the correct type (string).
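If the column may contain non-string values, as in the question, the function passed to apply can guard against them. This is a minimal sketch on made-up data; the helper name starts_with_inverness is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical sample table with one missing value in the column.
table = pd.DataFrame(
    {"SUBDIVISION": ["INVERNESS", "INVERNESS POINT", np.nan, "OAKS"]}
)

def starts_with_inverness(value):
    # Only call startswith on real strings; anything else is a non-match.
    return isinstance(value, str) and value.startswith("INVERNESS")

criteria = table["SUBDIVISION"].apply(starts_with_inverness)
table2 = table[criteria]
print(table2)
```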

Edit: fixed missing parenthesis.

Answer 4

Answer by Saurabh

Using startswith for a particular column value:

df  = df.loc[df["SUBDIVISION"].str.startswith('INVERNESS', na=False)]
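To try this line end to end, here is a small self-contained sketch. The DataFrame contents and the LOTS column are invented for illustration; only the SUBDIVISION column name and the 'INVERNESS' prefix come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a missing SUBDIVISION value.
df = pd.DataFrame(
    {
        "SUBDIVISION": ["INVERNESS", "INVERNESS POINT", "OAKS", np.nan],
        "LOTS": [10, 4, 7, 2],
    }
)

# na=False makes missing values count as non-matches in the mask.
mask = df["SUBDIVISION"].str.startswith("INVERNESS", na=False)

# Keep only the rows whose SUBDIVISION starts with the prefix.
selected = df.loc[mask]
print(selected)
```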
