pandas.read_sql processing speed

Question

Asked by Yann

I need to process the result set of a MySQL query further as a DataFrame. The SQL table contains about 2 million rows and 12 columns (data size = 180 MiB). I'm running OS X 10.9 with 8 GB of memory. Is it normal for pandas.read_sql to take more than 20 seconds to return the DataFrame? How can I implement a chunk size option like the one in pandas.read_csv?

Edit: Python 2.7.6, pandas 0.13.1

Answer 1

Accepted answer by Adrien Pacifico

The pandas documentation shows that read_sql()/read_sql_query() takes about 10 times as long to read the data as read_hdf(), and about 3 times as long as read_csv().
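
To check how the two readers compare on your own machine, a rough timing sketch like the one below may help. It is only an illustration under stated assumptions: a file-based SQLite database stands in for MySQL, the table name `sample` and its columns are made up, and the data set is much smaller than the one in the question, so the measured ratio will not necessarily match the documentation's figures.

```python
import os
import sqlite3
import tempfile
import time

import pandas as pd

# Hypothetical sample data: 50,000 rows and 3 columns.
n_rows = 50_000
df = pd.DataFrame({
    "id": range(n_rows),
    "x": [i * 0.1 for i in range(n_rows)],
    "label": ["row%d" % i for i in range(n_rows)],
})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "sample.csv")
    db_path = os.path.join(tmp, "sample.db")

    # Write the same data to a CSV file and to a SQLite table
    # (assumption: SQLite stands in for the MySQL server).
    df.to_csv(csv_path, index=False)
    conn = sqlite3.connect(db_path)
    df.to_sql("sample", conn, index=False)

    # Time reading the CSV file.
    start = time.perf_counter()
    from_csv = pd.read_csv(csv_path)
    csv_seconds = time.perf_counter() - start

    # Time reading the SQL table.
    start = time.perf_counter()
    from_sql = pd.read_sql("SELECT * FROM sample", conn)
    sql_seconds = time.perf_counter() - start

    conn.close()

print("read_csv: %.3f s, read_sql: %.3f s" % (csv_seconds, sql_seconds))
```

A single run is noisy, so repeating the measurement a few times gives a more reliable picture.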

read_sql() now has a chunksize argument (see the documentation).
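
A minimal sketch of chunked reading follows. It assumes an in-memory SQLite database in place of the MySQL server, and the table name `measurements` and its columns are hypothetical. When chunksize is set, read_sql() returns an iterator of DataFrames rather than one DataFrame.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")

# Hypothetical table `measurements` with 10,000 rows of sample data.
sample = pd.DataFrame({
    "id": range(10_000),
    "value": [i * 0.5 for i in range(10_000)],
})
sample.to_sql("measurements", conn, index=False)

# With chunksize set, read_sql yields DataFrames of at most 2,500 rows each.
chunks = pd.read_sql("SELECT * FROM measurements", conn, chunksize=2_500)

row_count = 0
value_sum = 0.0
for chunk in chunks:
    # Process each chunk here; this sketch just accumulates two totals.
    row_count += len(chunk)
    value_sum += chunk["value"].sum()

conn.close()
print(row_count, value_sum)
```

Processing chunk by chunk keeps only one piece of the result in a DataFrame at a time. Depending on the database driver, the full result set may still be buffered on the client, so the memory savings can vary.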
