Original source: http://stackoverflow.com/questions/32896387/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Extract string from cell in Pandas dataframe
Asked by robahall
I have a data frame, df:
                Filename  Weight
0  '\file path\file.txt'     NaN
1  '\file path\file.txt'     NaN
2  '\file path\file.txt'     NaN
and I have a function that takes a file name and extracts a float value from that file. What I want is to pass the file path in Filename from each row of df into my function and write the result to the Weight column. My current code is:
df['Weight'] = df['Weight'].apply(x_wgt_pct(df['filename'].to_string()), axis = 1)
My error is:
pandas\parser.pyx in pandas.parser.TextReader.__cinit__ (pandas\parser.c:3173)()
pandas\parser.pyx in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:5912)()
IOError: File 0 file0.txt
1 file1.txt
2 file2.txt
3 file3.txt does not exist
I am not sure whether this error occurs because all the file paths are being passed at once as a single string, or because I did not input the file path correctly.
Accepted answer by Andy Hayden
to_string creates a single string from the whole column, which isn't what you want:
In [11]: df['Filename'].to_string()
Out[11]: "0 '\file path\file.txt'\n1 '\file path\file.txt'\n2 '\file path\file.txt'"
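To see why the error message lists every file at once, here is a small sketch with a made-up frame (the file names are placeholders, not the asker's data). It shows that to_string() returns one str for the whole column, whereas iterating over the column yields one cell value at a time.

```python
import pandas as pd

# Hypothetical data standing in for the asker's frame.
df = pd.DataFrame({"Filename": ["file0.txt", "file1.txt", "file2.txt"]})

# to_string() renders the entire column (index labels included) as ONE string.
rendered = df["Filename"].to_string()
print(type(rendered).__name__)
print(repr(rendered))

# Iterating over the Series yields each cell separately.
cells = list(df["Filename"])
print(cells)
```

Passing `rendered` to a file-reading function is therefore equivalent to asking for a single file whose name is the whole printed column.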
Assuming that x_wgt_pct is the function that takes a filepath and returns a float, you can loop through the entries:
for i, f in enumerate(df["Filename"]):
    weight = x_wgt_pct(f)  # you may have to slice off the quote characters, i.e. f[1:-1]
    # .ix was removed in pandas 1.0; .loc is the label-based replacement
    df.loc[i, "Weight"] = weight
Note: some further care has to be taken if you have duplicate row indices.
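The same idea can also be written with Series.apply, which passes each cell to the function one at a time and avoids the explicit row-by-row indexing. This is a minimal sketch, not the answerer's code. It assumes x_wgt_pct takes a single path string and returns a float. The stand-in below is hypothetical (it just returns the length of the path) because the real function is not shown in the question, and the strip("'") call is only needed if the stored paths really contain the quote characters.

```python
import pandas as pd


def x_wgt_pct(path):
    # Hypothetical stand-in: the real function would read the file at `path`
    # and return a float. Here the path length is returned for illustration.
    return float(len(path))


# Made-up file names that include literal quote characters, as in the question.
df = pd.DataFrame({"Filename": ["'a.txt'", "'bb.txt'", "'ccc.txt'"]})

# apply calls the function once per cell; strip("'") drops surrounding quotes.
df["Weight"] = df["Filename"].apply(lambda f: x_wgt_pct(f.strip("'")))
print(df)
```

Note that apply is called on the Filename column (the input), not on the Weight column as in the question's attempt.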

