pandas BLAST parsing: AttributeError: 'float' object has no attribute 'split'

Question

Asked by Brindha Lekshmisaran

I am trying to write a script to parse an NCBI BLAST report. The column that is causing this error is the genome GI number.

E.g. LT697097.1

There is a decimal point near the end. When I try to split on it and keep only the GI number, I get this error.

The question "Django AttributeError 'float' object has no attribute 'split'" tells me that this error occurs because the value being split is a float rather than a string.

So, I used the advice from "Pandas reading csv as string type" to import the pandas column as a string.

I am using column number as the report doesn't automatically have column names.

import pandas as pd
df = pd.read_csv(
    "out.txt", sep="\t", dtype=object,
    names=['query id', 'subject ids', 'query acc.ver', 'subject acc.ver',
           '% identity', 'alignment length', 'mismatches', 'gap opens',
           'q.start', 'q.end', 's.start', 's.end', 'evalue', 'bit score'])

sacc = df['subject acc.ver']
sacc = [i.split('.',1)[0] for i in sacc]

I still get the error AttributeError: 'float' object has no attribute 'split'.

I then tried astype(str), as suggested by "Convert Columns to String in Pandas".

This fails to read the column; the only output value is the column's name attribute.

Can you please advise me on where I'm going wrong in my approach?

Answer 1

Accepted answer by jezrael

I think you need str.split and then select the first element of each resulting list with .str[0]; this works nicely with NaNs. Another problem could be values without a '.', which this also handles:

df['subject acc.ver'] = df['subject acc.ver'].str.split('.', n=1).str[0]

Sample:

import numpy as np
import pandas as pd

df = pd.DataFrame({'subject acc.ver': ['LT697097.1', np.nan, None, 'LT6']})

df['subject acc.ver'] = df['subject acc.ver'].str.split('.', n=1).str[0]
print(df)
  subject acc.ver
0        LT697097
1             NaN
2            None
3             LT6
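
For context on why dtype=object alone did not help: read_csv still parses empty fields as NaN, and NaN is a float, so a plain Python i.split(...) in a list comprehension fails on those rows. The sketch below reproduces this, assuming (the question does not confirm it) that some rows in out.txt have an empty subject accession. The in-memory data and the three-column layout are hypothetical stand-ins for the real report.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for out.txt; the second row has an empty
# subject accession field. Only three columns are used for brevity.
raw = "q1\tLT697097.1\t98.5\nq2\t\t97.0\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep="\t",
    dtype=object,
    names=["query id", "subject acc.ver", "% identity"],
)

# dtype=object keeps real text as str, but the empty field is still read
# as NaN, which is a Python float. That is what breaks i.split(...).
value_types = [type(v).__name__ for v in df["subject acc.ver"]]
print(value_types)

# The vectorised .str accessor passes missing values through instead of raising.
accessions = df["subject acc.ver"].str.split(".", n=1).str[0]
print(accessions.tolist())
```

If the missing rows are not needed at all, dropping them first with dropna(subset=['subject acc.ver']) would be another option.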
