Get substring from pandas DataFrame while filtering
Original question: http://stackoverflow.com/questions/30780742/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Eduardo
Say I have a dataframe with the following information:
Name     Points  String
John     24      FTS8500001A
Richard  35      FTS6700001B
John     29      FTS2500001A
Richard  35      FTS3800001B
John     34      FTS4500001A
Here is the way to get a DataFrame with the sample above:
import pandas as pd
# Column names, in the same order as the Series passed to concat below
keys = ('Name', 'Points', 'String')
names = pd.Series(('John', 'Richard', 'John', 'Richard', 'John'))
points = pd.Series((24, 35, 29, 35, 34))
strings = pd.Series(('FTS8500001A', 'FTS6700001B', 'FTS2500001A', 'FTS3800001B', 'FTS4500001A'))
# Join the Series side by side; the keys become the column labels
df = pd.concat((names, points, strings), axis=1, keys=keys)
I want to select every row that meets the following criteria: Name=Richard and Points=35. For those rows I want to read the 4th and 5th characters of the String column (the two digits just after FTS).
The output I want is the numbers 67 and 38.
I've tried several ways to achieve it, but without success. Can you please help?
Thank you very much.
Eduardo
Accepted answer by EdChum
Answer by firelynx
Pandas string methods
You can build a boolean mask from your criteria, apply it to the DataFrame, and then slice the String column with the pandas string methods (the .str accessor):
mask_richard = df.Name == 'Richard'
mask_points = df.Points == 35
df[mask_richard & mask_points].String.str[3:5]
1 67
3 38
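Note that .str[3:5] returns the two characters as strings, while the question asks for the numbers 67 and 38. If you need actual integers, something along these lines should work. It is only a sketch: the names mask and codes are made up for illustration, and the DataFrame is rebuilt from a dict simply to keep the example self-contained. The slice 3:5 is 0-based, so it picks the 4th and 5th characters.

```python
import pandas as pd

# Sample data from the question, rebuilt from a dict for brevity
df = pd.DataFrame({
    'Name': ['John', 'Richard', 'John', 'Richard', 'John'],
    'Points': [24, 35, 29, 35, 34],
    'String': ['FTS8500001A', 'FTS6700001B', 'FTS2500001A',
               'FTS3800001B', 'FTS4500001A'],
})

# Both conditions must hold; each comparison needs its own parentheses
# because & binds more tightly than ==
mask = (df['Name'] == 'Richard') & (df['Points'] == 35)

# Pick the String column for the matching rows, take characters 3 and 4
# (0-based), then convert the two-character strings to integers
codes = df.loc[mask, 'String'].str[3:5].astype(int)

print(codes.tolist())  # [67, 38]
```

Using .loc with the mask and the column name does the row filter and the column selection in one step, and the original row labels (1 and 3) are kept in the result's index.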