Python Pandas DtypeWarning: specify dtype option on import - how?

Question

Asked by Jarad

I have these columns:

['Campaign', 'Ad group', 'Keyword', 'Status', 'Match type', 'Max. CPC', 'Quality score', 'Impressions', 'Clicks', 'CTR', 'Avg. CPC', 'Cost', 'Avg. position', 'Converted clicks', 'Click conversion rate', 'Cost / converted click', 'Bounce rate', 'Pages / session', 'Avg. session duration (seconds)', '% new sessions']

The error I'm receiving says:

Warning (from warnings module):
  File "C:\Python34\lib\site-packages\pandas\io\parsers.py", line 1164
    data = self._reader.read(nrows)
DtypeWarning: Columns (5) have mixed types. Specify dtype option on import or set low_memory=False.

What does the Columns (5) part mean? Is that the column position? Does the Campaign column start at position 0 or 1?

Also, I suspect this error occurs because my Max. CPC column has ' --' in a few places instead of zeros. I want this column's datatype to be float. How do I translate these ' --' values to 0.00 and also set this column as a float datatype when reading the CSV?

I've tried:

import pandas as pd
import numpy as np

df = pd.read_csv('file.csv', dtype={'Max. CPC': np.float64})

print(df.head())

But I get a ValueError:

ValueError: could not convert string to float: ' --'

Answer 1

Accepted answer by EdChum

There are two approaches I can think of. One is to pass a list of values that read_csv should treat as NaN. Those values are then converted to NaN, so the dtype of that column remains float rather than object:

df = pd.read_csv('file.csv', dtype={'Max. CPC': np.float64}, na_values=[' --'])

You can then convert these NaN values to 0.00 by calling fillna:

df['Max. CPC'] = df['Max. CPC'].fillna(0.00)
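
For anyone who wants to try the first approach without the original file, here is a minimal, self-contained sketch. The three-row CSV text is made up for illustration and stands in for 'file.csv'; only the column name 'Max. CPC' and the ' --' placeholder come from the question.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for 'file.csv'.
# The second row uses the ' --' placeholder from the question.
csv_text = (
    "Campaign,Max. CPC\n"
    "A,0.50\n"
    "B, --\n"
    "C,1.25\n"
)

# Treat ' --' as missing so the column can be parsed as float64.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"Max. CPC": np.float64},
    na_values=[" --"],
)

# Replace the resulting NaN values with 0.00.
df["Max. CPC"] = df["Max. CPC"].fillna(0.00)

print(df.dtypes)
print(df)
```

Note that na_values has to match the cell text exactly, including the leading space in ' --'.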

The other is to load as before and replace these values with 0.00:

df['Max. CPC'] = df['Max. CPC'].replace(' --', 0.00)
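
A rough sketch of the second approach follows, with two additions that are my assumptions rather than part of the answer. First, the column is read with dtype object so that every cell arrives as a string. Second, astype(float) is called after replace, because replace alone leaves the column as object. The sample rows are invented; the six column names are the first six from the question. The sketch also checks the other part of the question: the positions reported in a DtypeWarning are zero-based, so Columns (5) points at 'Max. CPC'.

```python
import io

import pandas as pd

# Hypothetical sample data using the first six column names from the question.
csv_text = (
    "Campaign,Ad group,Keyword,Status,Match type,Max. CPC\n"
    "A,g1,shoes,enabled,Broad,0.50\n"
    "B,g2,boots,enabled,Exact, --\n"
    "C,g3,socks,paused,Phrase,1.25\n"
)

# Assumption: read 'Max. CPC' as object so each cell is kept as a string.
df = pd.read_csv(io.StringIO(csv_text), dtype={"Max. CPC": object})

# Column positions are zero-based, so position 5 is the sixth column.
flagged = df.columns[5]
print(flagged)

# Swap the placeholder for 0.00, then convert the whole column to float.
df["Max. CPC"] = df["Max. CPC"].replace(" --", 0.00).astype(float)

print(df["Max. CPC"].dtype)
print(df["Max. CPC"].tolist())
```

If the column may contain other non-numeric placeholders, astype(float) will raise on them, so the na_values route is likely the more forgiving of the two.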
