pandas: Converting query results into DataFrame in python
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): StackOverFlow
Original: http://stackoverflow.com/questions/31473457/
Asked by maggs
I am trying to manipulate the result of a query made with psycopg2, so I have to convert the result into a pandas DataFrame. But when I use the following code and print, only the column names are printed, not the rows. I also tried 'pd.DataFrame.from_records', but that did not work either.
import psycopg2
import pandas as pd
import numpy as np

conn_string = "Connect_Info"
conn = psycopg2.connect(conn_string)
cursor = conn.cursor()
cursor.execute(query)
rows = pd.DataFrame(cursor.fetchall(), columns=['page_num', 'Frequency'])
for row in rows:
    print row
conn.commit()
conn.close()
The result of cursor.fetchall():
(1L, 90990L)
(3L, 6532L)
(2L, 5614L)
(4L, 4016L)
(5L, 2098L)
(6L, 1651L)
(7L, 1158L)
(8L, 854L)
(9L, 658L)
(10L, 494L)
(11L, 345L)
(12L, 301L)
(13L, 221L)
(15L, 152L)
(14L, 138L)
(16L, 113L)
(17L, 93L)
(18L, 73L)
(20L, 62L)
(19L, 55L)
(22L, 44L)
(21L, 35L)
(23L, 29L)
(25L, 24L)
(27L, 19L)
(26L, 18L)
Answered by joris
This may not directly answer your question, but you should use read_sql_query for this instead of doing the fetchall and wrapping the result in a DataFrame yourself. It would look like:
conn = psycopg2.connect(...)
rows = pd.read_sql_query(query, conn)
instead of all your code above.
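To make the read_sql_query suggestion concrete, here is a minimal sketch. It assumes an in-memory SQLite database in place of the PostgreSQL connection from the question, so it can run without a server; the table name page_hits and the sample rows are hypothetical stand-ins. With psycopg2 you would pass the psycopg2 connection instead, although recent pandas versions may warn that connections other than SQLAlchemy or sqlite3 are not tested.

```python
import sqlite3

import pandas as pd

# Assumption: SQLite stands in for PostgreSQL, and "page_hits" is a
# hypothetical table holding a few rows like those in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_hits (page_num INTEGER, frequency INTEGER)")
conn.executemany(
    "INSERT INTO page_hits VALUES (?, ?)",
    [(1, 90990), (3, 6532), (2, 5614)],
)

# read_sql_query runs the query and builds the DataFrame, taking the
# column names from the result set instead of a hand-written list.
query = "SELECT page_num, frequency AS Frequency FROM page_hits"
df = pd.read_sql_query(query, conn)
conn.close()

print(df)
```

Printing df shows the rows together with the column names, which is usually what you want when inspecting a query result.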
As for your actual question, see http://pandas.pydata.org/pandas-docs/stable/basics.html#iteration for an explanation and the different options.
The basic point is that iterating over a DataFrame iterates over its column names. To iterate over the rows, you can use other functions such as .iterrows() and .itertuples(). But remember that in most cases iterating manually over the rows is not needed.
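The difference between the iteration options can be seen in a short sketch. It uses three rows copied from the fetchall() output in the question; the variable names labels, row_pairs and share are only illustrative.

```python
import pandas as pd

# Three sample rows copied from the question's fetchall() output.
df = pd.DataFrame(
    [(1, 90990), (3, 6532), (2, 5614)],
    columns=["page_num", "Frequency"],
)

# Iterating the DataFrame itself yields the column labels, not the rows.
labels = [item for item in df]

# iterrows() yields (index, Series) pairs, one pair per row.
row_pairs = [
    (idx, row["page_num"], row["Frequency"]) for idx, row in df.iterrows()
]

# Often no loop is needed: a column operation applies to every row at once.
share = df["Frequency"] / df["Frequency"].sum()

print(labels)
print(row_pairs)
print(share.round(3).tolist())
```

The last step is an example of avoiding the loop entirely: the division is applied to the whole Frequency column in one expression.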
Answered by Padraic Cunningham
That is exactly what should happen: when you iterate over a dataframe, you see the column names. If you want to see the df, just print the df. To see the rows:
for ind, row in df.iterrows():
    print(row.values)
Or use .values:
for row in df.values:
    print(row)
Answered by omri_saadon
Another suggestion is to use itertuples, which yields (index, row_value1, row_value2, ...) tuples.
for tup in rows.itertuples():
    print(tup)

(0, 1, 90990)
(1, 3, 6532)
(2, 2, 5614)
(3, 4, 4016)
...
As you can see, the first position is the index, the second is the value of the first column, and the third is the value of the second column.
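As a side note, in recent pandas versions itertuples() returns namedtuples by default, so the fields can also be read by column name. The sketch below assumes the same two columns as in the question and shows both that form and the plain-tuple form obtained with index=False and name=None.

```python
import pandas as pd

# Sample rows copied from the question's fetchall() output.
rows = pd.DataFrame(
    [(1, 90990), (3, 6532), (2, 5614)],
    columns=["page_num", "Frequency"],
)

# Default call: each item is a namedtuple with the index in first position.
first = next(rows.itertuples())
print(first.Index, first.page_num, first.Frequency)

# index=False drops the index; name=None gives plain tuples instead.
plain = list(rows.itertuples(index=False, name=None))
print(plain)
```

Positional access such as first[1] still works on the namedtuples, so code written against the plain-tuple layout described above keeps working.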

