Original question: http://stackoverflow.com/questions/42349572/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Remove first x number of characters from each row in a column of a Python dataframe
Asked by d84_n1nj4
I have a Python dataframe with about 1,500 rows and 15 columns. For one specific column, I would like to remove the first 3 characters of each row. As a simple example, here is a dataframe:
import pandas as pd
d = {
    'Report Number': ['8761234567', '8679876543', '8994434555'],
    'Name': ['George', 'Bill', 'Sally'],
}
d = pd.DataFrame(d)
I would like to remove the first three characters from each field in the Report Number column of dataframe d.
Answered by EdChum
Answered by jpp
It is worth noting that Pandas "vectorised" str methods are no more than Python-level loops.
Assuming clean data, you will often find a list comprehension faster:
# Python 3.6.0, Pandas 0.19.2
d = pd.concat([d]*10000, ignore_index=True)
%timeit d['Report Number'].str[3:] # 12.1 ms per loop
%timeit [i[3:] for i in d['Report Number']] # 5.78 ms per loop
Note these aren't equivalent, since the list comprehension does not deal with null data and other edge cases. For these situations, you may prefer the Pandas solution.
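To make the difference concrete, here is a minimal sketch of both approaches on data with a missing value. The frame name df, the None entry, and the guarded comprehension are assumptions added for illustration; they are not part of the original question or answers.

```python
import pandas as pd

# Same shape as the question's frame, with one missing value added for illustration.
df = pd.DataFrame({
    "Report Number": ["8761234567", "8679876543", None],
    "Name": ["George", "Bill", "Sally"],
})

# pandas string accessor: slices each string and leaves the missing value missing.
via_str = df["Report Number"].str[3:]

# Plain list comprehension: fails with TypeError when it reaches the missing value.
try:
    via_list = [s[3:] for s in df["Report Number"]]
except TypeError:
    via_list = None

# Guarded comprehension (hypothetical variant): slice only real strings,
# pass anything else through unchanged.
via_guarded = [s[3:] if isinstance(s, str) else s for s in df["Report Number"]]

# Write the sliced values back to the column.
df["Report Number"] = via_str
```

The guard costs an isinstance check per element, so it will likely narrow the speed gap shown in the timings above. Whether it is still worth it depends on the data.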