pandas DataFrame values.tolist() data types

Question

Asked by Meng Qian

I have a dataframe like this:

This dataframe has several columns. Two are of type float: price and change, while volme and amount are of type int. I use the method df.values.tolist() to change df to a list and get the data:

datatmp = df.values.tolist()
print(datatmp[0])

[20160108150023.0, 11.12, -0.01, 4268.0, 4746460.0, 2.0]

The int types in df all change to float types. My question is: why do the int types change to float types? How can I get the int data I want?

Answer 1

Accepted answer by Mike Müller

You can convert column-by-column:

by_column = [df[x].values.tolist() for x in df.columns]

This will preserve the data type of each column.

Then convert to the structure you want:

list(list(x) for x in zip(*by_column))

You can do it in one line:

list(list(x) for x in zip(*(df[x].values.tolist() for x in df.columns)))
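
To see the difference between the two approaches side by side, here is a minimal sketch. The frame below is made up for illustration (the column names and values are assumptions, not the asker's real data), and it uses Series.tolist(), which behaves the same as .values.tolist() for these numeric columns.

```python
import pandas as pd

# Hypothetical two-row frame; names and values are illustrative only.
df = pd.DataFrame({
    "price": [11.12, 11.13],
    "change": [-0.01, 0.01],
    "volume": [4268, 1200],
    "amount": [4746460, 1335600],
})

# Whole-frame conversion: df.values is a single float64 array,
# so the integer columns come back as floats.
upcast_rows = df.values.tolist()

# Column-by-column conversion: each column keeps its own dtype.
by_column = [df[name].tolist() for name in df.columns]
rows = [list(row) for row in zip(*by_column)]

print(upcast_rows[0])  # [11.12, -0.01, 4268.0, 4746460.0]
print(rows[0])         # [11.12, -0.01, 4268, 4746460]
```

The transposition with zip(*by_column) is what turns the list of columns back into a list of rows.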

You can check what datatypes your columns have with:

df.info()

Very likely your column amount is of type float. Do you have any NaN in this column? These are always of type float and would make the whole column float.
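
A small sketch of the NaN effect, using two made-up Series (the values are assumptions for illustration): the same integers end up in a float column as soon as one entry is missing.

```python
import numpy as np
import pandas as pd

# Hypothetical data: identical integers, with and without a missing entry.
clean = pd.Series([4746460, 1335600])
with_gap = pd.Series([4746460, np.nan])

print(clean.dtype)           # an integer dtype
print(with_gap.dtype)        # float64, because NaN is itself a float
print(with_gap.isna().any()) # True: a quick way to check for missing values
```

Checking isna().any() per column is one way to find out which column is responsible for the float dtype.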

You can cast to int with:

df.values.astype(int).tolist()
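
One caveat worth illustrating: casting the whole array to int also truncates columns that are meant to stay float, such as price. The sketch below uses a made-up one-row frame (names and values are assumptions) and shows a per-column cast with DataFrame.astype as a possible alternative. It assumes the columns contain no NaN, since NaN has no int representation and would need to be filled or dropped first.

```python
import pandas as pd

# Hypothetical frame where the integer-valued columns arrived as floats.
df = pd.DataFrame({
    "price": [11.12],
    "volume": [4268.0],
    "amount": [4746460.0],
})

# Casting everything: price is truncated along with the rest.
all_int = df.values.astype(int).tolist()
print(all_int)  # [[11, 4268, 4746460]]

# Casting only the columns that should be int leaves price untouched.
converted = df.astype({"volume": int, "amount": int})
rows = [list(row) for row in converted.itertuples(index=False, name=None)]
print(rows)  # [[11.12, 4268, 4746460]]
```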

Answer 2

Answered by Pachelbel

I think the pandas documentation helps:

DataFrame.values
Numpy representation of NDFrame
The dtype will be a lower-common-denominator dtype (implicit upcasting); that is to say if the dtypes (even of numeric types) are mixed, the one that accommodates all will be chosen. Use this with care if you are not dealing with the blocks.
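
The upcasting described in the quoted documentation can be observed directly. This is a minimal sketch on a made-up frame (column names and values are assumptions); np.result_type is used here only to show which dtype NumPy picks when int and float are combined.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one float column and one int column.
df = pd.DataFrame({"price": [11.12], "volume": [4268]})

# Each column has its own dtype inside the DataFrame...
print(df.dtypes)

# ...but .values must be a single array, so one common dtype is chosen.
common = df.values.dtype
print(common)  # float64

# NumPy's promotion rule for int64 combined with float64.
promoted = np.result_type(np.int64, np.float64)
print(promoted)  # float64

# Selecting only the integer column avoids the upcast.
ints_only = df[["volume"]].values
print(ints_only.tolist())  # [[4268]]
```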

So here apparently float is chosen to accommodate all component types. A simple method would be (however, most likely there are more elegant solutions around; I'm not too familiar with pandas):

datatmp = list(map(lambda row: list(row[1:]), df.itertuples()))

Here itertuples() gives an iterator with elements of the form (rownumber, column1_entry, column2_entry, ...). The map takes each such tuple and applies the lambda function, which removes the first component (rownumber) and returns a list containing the components of a single row. The outer list() is needed on Python 3, where map returns a lazy iterator rather than a list. You can also drop the inner list() call (keeping just row[1:]) if it's OK for you to work with a list of tuples.
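
As a possible simplification of the same idea, itertuples accepts index=False to skip the leading index entry and name=None to yield plain tuples, so no slicing is needed. A minimal sketch on a made-up frame (column names and values are assumptions):

```python
import pandas as pd

# Hypothetical frame with one float column and one int column.
df = pd.DataFrame({"price": [11.12, 11.13], "volume": [4268, 1200]})

# index=False omits the index; name=None returns plain tuples
# instead of namedtuples. Each value keeps its column's type.
rows = [list(row) for row in df.itertuples(index=False, name=None)]
print(rows)  # [[11.12, 4268], [11.13, 1200]]
```

Because the rows are built value by value rather than from one combined array, the integers are not upcast to float.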

[1] DataFrame values property: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.values.html#pandas.DataFrame.values
