Python TypeError: argument of type 'float' is not iterable

Question

Asked by jaspreet kaur bassan

I am new to python and TensorFlow. I recently started understanding and executing TensorFlow examples, and came across this one: https://www.tensorflow.org/versions/r0.10/tutorials/wide_and_deep/index.html

I got the error, TypeError: argument of type 'float' is not iterable, and I believe that the problem is with the following line of code:

df_train[LABEL_COLUMN] = (df_train['income_bracket'].apply(lambda x: '>50K' in x)).astype(int)

(income_bracket is the label column of the census dataset. '>50K' is one of the possible label values and '<=50K' is the other. The dataset is read into df_train. The documentation explains the line above as follows: "Since the task is a binary classification problem, we'll construct a label column named "label" whose value is 1 if the income is over 50K, and 0 otherwise.")

If anyone could explain to me what exactly is happening and how I should fix it, that would be great. I tried both Python 2.7 and Python 3.4, so I don't think the problem is with the version of the language. Also, if anyone is aware of great tutorials for someone who is new to TensorFlow and pandas, please share the links.

Complete program:

import pandas as pd
import urllib
import tempfile
import tensorflow as tf

gender = tf.contrib.layers.sparse_column_with_keys(column_name="gender", keys=["female", "male"])
race = tf.contrib.layers.sparse_column_with_keys(column_name="race", keys=["Amer-Indian-Eskimo", "Asian-Pac-Islander", "Black", "Other", "White"])
education = tf.contrib.layers.sparse_column_with_hash_bucket("education", hash_bucket_size=1000)
marital_status = tf.contrib.layers.sparse_column_with_hash_bucket("marital_status", hash_bucket_size=100)
relationship = tf.contrib.layers.sparse_column_with_hash_bucket("relationship", hash_bucket_size=100)
workclass = tf.contrib.layers.sparse_column_with_hash_bucket("workclass", hash_bucket_size=100)
occupation = tf.contrib.layers.sparse_column_with_hash_bucket("occupation", hash_bucket_size=1000)
native_country = tf.contrib.layers.sparse_column_with_hash_bucket("native_country", hash_bucket_size=1000)


age = tf.contrib.layers.real_valued_column("age")
age_buckets = tf.contrib.layers.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
education_num = tf.contrib.layers.real_valued_column("education_num")
capital_gain = tf.contrib.layers.real_valued_column("capital_gain")
capital_loss = tf.contrib.layers.real_valued_column("capital_loss")
hours_per_week = tf.contrib.layers.real_valued_column("hours_per_week")

wide_columns = [gender, native_country, education, occupation, workclass, marital_status, relationship, age_buckets, tf.contrib.layers.crossed_column([education, occupation], hash_bucket_size=int(1e4)), tf.contrib.layers.crossed_column([native_country, occupation], hash_bucket_size=int(1e4)), tf.contrib.layers.crossed_column([age_buckets, race, occupation], hash_bucket_size=int(1e6))]

deep_columns = [
  tf.contrib.layers.embedding_column(workclass, dimension=8),
  tf.contrib.layers.embedding_column(education, dimension=8),
  tf.contrib.layers.embedding_column(marital_status, dimension=8),
  tf.contrib.layers.embedding_column(gender, dimension=8),
  tf.contrib.layers.embedding_column(relationship, dimension=8),
  tf.contrib.layers.embedding_column(race, dimension=8),
  tf.contrib.layers.embedding_column(native_country, dimension=8),
  tf.contrib.layers.embedding_column(occupation, dimension=8),
  age, education_num, capital_gain, capital_loss, hours_per_week]

model_dir = tempfile.mkdtemp()
m = tf.contrib.learn.DNNLinearCombinedClassifier(
    model_dir=model_dir,
    linear_feature_columns=wide_columns,
    dnn_feature_columns=deep_columns,
    dnn_hidden_units=[100, 50])


COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num",
  "marital_status", "occupation", "relationship", "race", "gender",
  "capital_gain", "capital_loss", "hours_per_week", "native_country", "income_bracket"]
LABEL_COLUMN = 'label'
CATEGORICAL_COLUMNS = ["workclass", "education", "marital_status", "occupation", "relationship", "race", "gender", "native_country"]
CONTINUOUS_COLUMNS = ["age", "education_num", "capital_gain", "capital_loss", "hours_per_week"]


train_file = tempfile.NamedTemporaryFile()
test_file = tempfile.NamedTemporaryFile()
urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data", train_file.name)
urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test", test_file.name)


df_train = pd.read_csv(train_file, names=COLUMNS, skipinitialspace=True)
df_test = pd.read_csv(test_file, names=COLUMNS, skipinitialspace=True, skiprows=1)
df_train[LABEL_COLUMN] = (df_train['income_bracket'].apply(lambda x: '>50K' in x)).astype(int)
df_test[LABEL_COLUMN] = (df_test['income_bracket'].apply(lambda x: '>50K' in x)).astype(int)


def input_fn(df):

  continuous_cols = {k: tf.constant(df[k].values)
                     for k in CONTINUOUS_COLUMNS}

  categorical_cols = {k: tf.SparseTensor(
      indices=[[i, 0] for i in range(df[k].size)],
      values=df[k].values,
      shape=[df[k].size, 1])
                      for k in CATEGORICAL_COLUMNS}

  feature_cols = dict(continuous_cols.items() + categorical_cols.items())
  label = tf.constant(df[LABEL_COLUMN].values)
  return feature_cols, label


def train_input_fn():
    return input_fn(df_train)


def eval_input_fn():
    return input_fn(df_test)

m.fit(input_fn=train_input_fn, steps=200)
results = m.evaluate(input_fn=eval_input_fn, steps=1)
for key in sorted(results):
    print("%s: %s" % (key, results[key]))

Thank you

PS: Full stack trace for the error

Traceback (most recent call last):

File "/home/jaspreet/PycharmProjects/TicTacTensorFlow/census.py", line 73, in <module>
    df_train[LABEL_COLUMN] = (df_train['income_bracket'].apply(lambda x: '>50K' in x)).astype(int)

File "/usr/lib/python2.7/dist-packages/pandas/core/series.py", line 2023, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)

File "inference.pyx", line 920, in pandas.lib.map_infer (pandas/lib.c:44780)

File "/home/jaspreet/PycharmProjects/TicTacTensorFlow/census.py", line 73, in <lambda>
    df_train[LABEL_COLUMN] = (df_train['income_bracket'].apply(lambda x: '>50K' in x)).astype(int)

TypeError: argument of type 'float' is not iterable

Answer 1

Accepted answer by jaspreet kaur bassan

The program works verbatim with pandas 0.18.1, the latest version at the time of writing.

Answer 2

Answered by Microos

If you inspect the test data, you will see that the first line has NaN in the income_bracket field.

I further verified that this is the only line that contains NaN by doing:

ib = df_test["income_bracket"]
for idx, i in enumerate(ib):
    # Report every entry that is not a string (a missing value is stored as a float NaN)
    if not isinstance(i, str):
        print("%s %s" % (idx, type(i)))

RESULT: 0 <type 'float'>
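
The same check can also be written with pandas' own missing-value helper, which may be easier to read than a manual loop. The following is only a minimal sketch: df_demo is a hypothetical, hand-made stand-in for df_test (not the real census data), and the second half reproduces the failing membership test from the question on the missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_test: the first row has no income_bracket value.
df_demo = pd.DataFrame({"income_bracket": [np.nan, "<=50K.", ">50K."]})

# Index labels of the rows whose income_bracket is missing (NaN is a float).
missing_rows = df_demo.index[df_demo["income_bracket"].isnull()].tolist()
print(missing_rows)

# The membership test from the question fails on such a value,
# because "in" cannot search inside a float.
try:
    ">50K" in df_demo["income_bracket"].iloc[0]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

On the real data, the same isnull() call on df_train or df_test should show which rows trigger the TypeError.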

So you can simply skip this row when reading the file:

df_test = pd.read_csv(test_file, names=COLUMNS, skipinitialspace=True, skiprows=1)
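
If skipping a fixed row is not enough (for example, when missing values appear elsewhere in a file), the label column can be built in a way that tolerates NaN. This is a sketch under assumptions, not the tutorial's method: df_demo and df_clean are hypothetical names, and the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

LABEL_COLUMN = "label"

# Hypothetical sample: one missing value plus the label spellings used by the dataset.
df_demo = pd.DataFrame({"income_bracket": [np.nan, "<=50K", ">50K", ">50K."]})

# Option 1: drop rows with a missing income_bracket, then apply the original lambda.
df_clean = df_demo.dropna(subset=["income_bracket"]).copy()
df_clean[LABEL_COLUMN] = (
    df_clean["income_bracket"].apply(lambda x: ">50K" in x).astype(int)
)

# Option 2: keep every row and treat a missing value as "not >50K".
# regex=False makes this a plain substring search; na=False maps NaN to False.
df_demo[LABEL_COLUMN] = (
    df_demo["income_bracket"].str.contains(">50K", regex=False, na=False).astype(int)
)

print(df_clean[LABEL_COLUMN].tolist())
print(df_demo[LABEL_COLUMN].tolist())
```

Option 1 changes the number of rows, while Option 2 silently labels missing rows as 0, so which one is appropriate depends on whether those rows should take part in training at all.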

Answer 3

Answered by no7dw

Maybe the value after the "in" keyword is a number rather than a string. Try skipping it with an isinstance test:

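# lines and foo() are placeholders for your own value and per-item work
if isinstance(lines, str):
    for x in lines:
        foo()
else:
    pass  # not a string (e.g. a float NaN), so skip it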
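
Applied to the line from the question, the isinstance idea could look roughly like the sketch below. The helper name is_over_50k and the sample frame df_demo are hypothetical, introduced only for illustration, and the check assumes Python 3 (on Python 2, text may also arrive as unicode, which isinstance(value, str) would not match).

```python
import numpy as np
import pandas as pd


def is_over_50k(value):
    # Only strings can be searched with "in"; anything else
    # (such as a float NaN) is treated as "not >50K".
    return isinstance(value, str) and ">50K" in value


# Hypothetical sample with one missing value in the middle.
df_demo = pd.DataFrame({"income_bracket": ["<=50K", np.nan, ">50K"]})
df_demo["label"] = df_demo["income_bracket"].apply(is_over_50k).astype(int)
print(df_demo["label"].tolist())
```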
