Original question: http://stackoverflow.com/questions/29294017/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Extract substring from string in dataframe
Asked by nicholas.reichel
I have the following dataframe:

               Company Name        Time  Expectation
0  Asta Funding Inc. (ASFI)  9:35 AM ET            -
1         BlackBerry (BBRY)  7:00 AM ET        (.03)
2      Carnival Corp. (CCL)  9:15 AM ET          .09
3        Carnival PLC (CUK)  0:00 AM ET            -

I would like to have the company symbols in their own separate column instead of inside the Company Name column. Right now I iterate over the company names, use a regular expression to pull out each symbol, append it to a list, and then assign that list to the new column, but I'm wondering if there is a cleaner/easier way:

import re

tickers = []
for company in df['Company Name']:
    ticker = re.search(r"\(.*\)", company).group(0)  # e.g. "(ASFI)"
    ticker = ticker[1:-1]  # strip the surrounding parentheses
    tickers.append(ticker)

I'm new to the whole map/reduce/lambda stuff.

df['ticker'] = df['Company Name'].str.extract(r"\((.*)\)", expand=False)  # vectorized alternative to the loop: capture the text inside the parentheses
df['Company Symbol'] = df['Company Name'].str.rstrip(')').str.split('(').str[1]  # Make new column
df['Company Name'] = df['Company Name'].str.replace(r'\s*\(.*?\)$', '', regex=True)  # Remove symbol (and the space before it) from company name
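
As a self-contained illustration of the approaches above, here is a minimal sketch that splits the name and the symbol in a single pass. It rebuilds only the Company Name column from the question. The pattern and the group names `name` and `ticker` are my own choices for illustration, not something taken from the original posts, and the sketch assumes every row ends with exactly one parenthesised symbol.

```python
import pandas as pd

# Sample data copied from the question's "Company Name" column.
df = pd.DataFrame({
    "Company Name": [
        "Asta Funding Inc. (ASFI)",
        "BlackBerry (BBRY)",
        "Carnival Corp. (CCL)",
        "Carnival PLC (CUK)",
    ]
})

# Capture the company name and the parenthesised symbol in one pass.
# With named groups, str.extract returns a DataFrame whose columns
# are named after the groups ("name" and "ticker" are illustrative).
pattern = r"^(?P<name>.*?)\s*\((?P<ticker>[^()]*)\)\s*$"
parts = df["Company Name"].str.extract(pattern)

df["Company Symbol"] = parts["ticker"]  # new column with the symbol only
df["Company Name"] = parts["name"]      # name with the symbol removed

print(df)
```

Rows that do not match the pattern would come back as NaN in both columns, so it may be worth checking `parts.isna().any()` before overwriting the original column.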

