Python ValueError: could not convert string to float:

Notice: this page reproduces a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/43030363/

Date: 2020-08-19 22:27:20  Source: igfitidea

ValueError: could not convert string to float:

Tags: python, csv, machine-learning, scikit-learn

Asked by Thom Elliott

I am following this tutorial to write a Naive Bayes classifier: http://machinelearningmastery.com/naive-bayes-classifier-scratch-python/

I keep getting this error:

dataset[i] = [float(x) for x in dataset[i]]
ValueError: could not convert string to float: 

Here is the part of my code where the error occurs:

def loadDatasetNB(filename):
    lines = csv.reader(open(filename, "rt"))
    dataset = list(lines)
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

And here is how the file is called:

def NB_Analysis():
    filename = 'fvectors.csv'
    splitRatio = 0.67
    dataset = loadDatasetNB(filename)
    trainingSet, testSet = splitDatasetNB(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingSet), len(testSet)))
    # prepare model
    summaries = summarizeByClassNB(trainingSet)
    # test model
    predictions = getPredictionsNB(summaries, testSet)
    accuracy = getAccuracyNB(testSet, predictions)
    print('Accuracy: {0}%'.format(accuracy))

NB_Analysis()

My file fvectors.csv looks like this: [image of the CSV data]

What is going wrong here and how do I fix it?

Answered by Taras Matsyk

Try skipping the header row; an empty header cell in the first column is causing the issue.

>>> float(' ')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float:
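
Because the message ends right after the colon when the offending value is empty or whitespace, it can help to locate the bad cells before deciding how to handle them. The following is a minimal sketch, not part of the original answer: the helper name `find_bad_cells` and the in-memory sample data are hypothetical.

```python
import csv
import io

def find_bad_cells(fp):
    """Return (row_number, column_number, value) for every field float() rejects."""
    bad = []
    for r, row in enumerate(csv.reader(fp), start=1):
        for c, value in enumerate(row, start=1):
            try:
                float(value)
            except ValueError:
                bad.append((r, c, value))
    return bad

# Hypothetical sample: a header row whose first cell is empty, then one numeric row.
sample = io.StringIO(",x,y\n0,1.5,2.5\n")
bad_cells = find_bad_cells(sample)
for row_number, column_number, value in bad_cells:
    print(row_number, column_number, repr(value))
```

With a real file you would pass an open file object instead of the `io.StringIO` sample.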

(1) To skip the header, call next() on the reader before reading the rows:

import csv

def loadDatasetNB(filename):
    lines = csv.reader(open(filename, "rt"))
    next(lines, None)  # <<- skip the header row
    dataset = list(lines)
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

(2) Or you can just ignore the exception:

try:
    float(element)
except ValueError:
    pass

If you decide to go with option (2), make sure you skip only the first row, or only rows that you know for certain contain text. Otherwise you may silently drop real data.
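
One way to follow that advice is to keep the rows that fail to convert, so you can verify that only the expected ones were skipped. This is a rough sketch under that assumption: the function name `load_numeric_rows` and the sample data are hypothetical and not from the original answer.

```python
import csv
import io

def load_numeric_rows(fp):
    """Return (rows, skipped): converted float rows, plus the raw rows that failed."""
    rows, skipped = [], []
    for raw in csv.reader(fp):
        if not raw:
            continue  # csv.reader yields [] for blank lines; ignore them
        try:
            rows.append([float(x) for x in raw])
        except ValueError:
            # Keep the offending row so it can be inspected afterwards.
            skipped.append(raw)
    return rows, skipped

# Hypothetical sample: a header with an empty first cell, then two numeric rows.
sample = io.StringIO(",a,b\n0,1.5,2.5\n1,3.0,4.0\n")
rows, skipped = load_numeric_rows(sample)
print(len(rows), "rows loaded;", len(skipped), "rows skipped:", skipped)
```

If `skipped` contains anything other than the header, the file probably needs a closer look before training on it.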

Answered by James

Looking at the image of your data, Python cannot convert the last column, which contains the values square and circle, to floats. Also, your data has a header row that you need to skip.

Try using this code:

import csv

def loadDatasetNB(filename):
    with open(filename, 'r') as fp:
        reader = csv.reader(fp)
        # skip the header line
        header = next(reader)
        # save the features and the labels as different lists
        data_features = []
        data_labels = []
        for row in reader:
            # convert everything except the label to a float
            data_features.append([float(x) for x in row[:-1]])
            # save the labels separately
            data_labels.append(row[-1])
    return data_features, data_labels
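
If the rest of the code expects each row to be all floats with the class in the last position, as the loader in the question does, one option is to map each text label to a numeric code and append it back. This is only a sketch under that assumption: `encode_labels` and the sample values are hypothetical.

```python
def encode_labels(labels):
    """Map each distinct label to a float code, in order of first appearance."""
    codes = {}
    encoded = []
    for label in labels:
        if label not in codes:
            codes[label] = float(len(codes))
        encoded.append(codes[label])
    return encoded, codes

# Hypothetical stand-ins for the data_features and data_labels returned above.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
labels = ["square", "circle", "square"]

encoded, codes = encode_labels(labels)
# Rebuild rows with the numeric class code as the last element.
dataset = [row + [code] for row, code in zip(features, encoded)]
```

Keeping the `codes` dictionary lets you translate predictions back to the original label names.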

Answered by Yuval Pruss

There is an empty field in your data.

>>> float('')
ValueError: could not convert string to float:

You can check the value before casting it:

dataset[i] = [float(x) for x in dataset[i] if x != '']
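
Note that filtering out empty strings shortens the row, so the remaining values shift to different column positions. If column alignment matters, one alternative is to substitute a placeholder such as NaN. The sketch below illustrates the difference; the helper name `to_float_or_nan` and the sample row are hypothetical.

```python
import math

def to_float_or_nan(value):
    """Convert a CSV field to float, mapping empty or whitespace-only fields to NaN."""
    return float(value) if value.strip() else math.nan

row = ["1.5", "", "2.5"]

filtered = [float(x) for x in row if x != '']  # drops the empty field, row gets shorter
padded = [to_float_or_nan(x) for x in row]     # keeps every column in place

print(len(filtered), len(padded))
```

Whether NaN is acceptable depends on the downstream code, since means and standard deviations computed over NaN values will themselves be NaN.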

Answered by Julien

You are passing strings to the float constructor here, which raises a ValueError unless the string is a valid number:

dataset[i] = [float(x) for x in dataset[i]]

Instead of using a list comprehension, perhaps it would be better to use a for loop so you can more easily handle this case:

data = []
for x in dataset[i]:
    try:
        value = float(x)
    except ValueError:
        value = x
    data.append(value)
dataset[i] = data

See more about catching exceptions here:

Try/Except in Python: How do you properly ignore Exceptions?
