Python pandas - 'dataframe' object has no attribute 'str'
原文地址: http://stackoverflow.com/questions/51502263/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
pandas - 'dataframe' object has no attribute 'str'
Asked by David Luong
I am trying to filter a dataframe that contains a list of products. However, I get the error 'dataframe' object has no attribute 'str' whenever I run the code.
Here is the line of code:
include_clique = log_df.loc[log_df['Product'].str.contains("Product A")]
If anyone has any ideas or suggestions, please let me know. I've searched many times and I'm quite stuck.
The Product column has the object dtype.
EDIT:
import __future__
import os
import pandas as pd
import numpy as np
import tensorflow as tf
import math
data = pd.read_csv("FILE.csv", header = None)
headerName=["DRID","Product","M24","M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M9","M8","M7","M6","M5","M4","M3","M2","M1"]
cliques = [(Confidential)]
data.columns=[headerName]
log_df = data
log_df = np.log(1+data[["M24","M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M9","M8","M7","M6","M5","M4","M3","M2","M1"]])
copy = data[["DRID","Product"]].copy()
log_df = copy.join(log_df)
include_clique = log_df.loc[log_df['Product'].str.contains("Product A")]
Here is the head:
ID PRODUCT M24 M23 M22 M21
0 123421 A 0.000000 0.000000 1.098612 0.0
1 141840 A 0.693147 1.098612 0.000000 0.0
2 212006 A 0.693147 0.000000 0.000000 0.0
3 216097 A 1.098612 0.000000 0.000000 0.0
4 219517 A 1.098612 0.693147 1.098612 0.0
Edit 2: here is the output of print(data). A is the product. It looks like A does not appear under the Product column when I print it out.
DRID Product M24 M23 M22 M21 M20 \
0 52250 A 0.0 0.0 2.0 0.0 0.0
1 141840 A 1.0 2.0 0.0 0.0 0.0
2 212006 A 1.0 0.0 0.0 0.0 0.0
3 216097 A 2.0 0.0 0.0 0.0 0.0
Answered by hoang tran
Short answer: change data.columns=[headerName] to data.columns=headerName
Explanation: when you set data.columns=[headerName], the columns become a MultiIndex object. As a result, log_df['Product'] is a DataFrame, and a DataFrame has no str attribute.
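To see this for yourself, here is a minimal sketch that reproduces the situation. The rows and the shortened column list are made up for illustration (the real FILE.csv is not shown in the question), and header_name stands in for headerName.

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents; the values are invented.
data = pd.DataFrame([[52250, "Product A", 0.0],
                     [141840, "Product B", 1.0]])
header_name = ["DRID", "Product", "M24"]

# Wrapping the list in another list makes pandas build a one-level MultiIndex.
data.columns = [header_name]
columns_are_multiindex = isinstance(data.columns, pd.MultiIndex)

# Selecting by the top-level label now yields a DataFrame, not a Series.
selection = data["Product"]
selection_is_dataframe = isinstance(selection, pd.DataFrame)

print(type(data.columns).__name__)
print(type(selection).__name__)
print(hasattr(selection, "str"))
```

Checking type(data.columns) like this is a quick way to confirm whether the extra brackets are the cause in your own script.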
When you set data.columns=headerName, log_df['Product'] is a single column (a Series), so you can use the str attribute.
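As a rough illustration of the fix, the sketch below assigns a flat list of names and then filters with str.contains. The sample rows are invented and the column list is shortened; only the column assignment and the filter line mirror the question.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from FILE.csv.
log_df = pd.DataFrame([[52250, "Product A", 0.0],
                       [141840, "Product B", 1.0],
                       [212006, "Product A", 0.693147]])

# A flat list of names gives an ordinary Index, not a MultiIndex.
log_df.columns = ["DRID", "Product", "M24"]

# log_df["Product"] is now a Series, so the .str accessor is available.
product_column = log_df["Product"]
include_clique = log_df.loc[product_column.str.contains("Product A")]

print(include_clique)
```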
If for any reason you need to keep the columns as a MultiIndex object, there is another solution: first convert log_df['Product'] into a Series. After that, the str attribute is available.
products = pd.Series(log_df['Product'].values.flatten())
include_clique = products[products.str.contains("Product A")]
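Note that the snippet above gives you the matching product names rather than the matching rows. If you want whole rows while keeping the MultiIndex columns, one possible variation (not part of the original answer) is to take the single column with iloc and use the resulting boolean mask on the frame. The sample data here is invented for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from FILE.csv.
log_df = pd.DataFrame([[52250, "Product A", 0.0],
                       [141840, "Product B", 1.0],
                       [212006, "Product A", 0.693147]])

# Keep the one-level MultiIndex columns on purpose (nested list).
log_df.columns = [["DRID", "Product", "M24"]]

# log_df["Product"] is a one-column DataFrame; iloc[:, 0] pulls it out as a Series.
products = log_df["Product"].iloc[:, 0]

# The mask shares the row index with log_df, so it can select rows directly.
mask = products.str.contains("Product A")
include_clique = log_df.loc[mask]

print(include_clique)
```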
However, I suspect the first solution is what you're looking for.