使用 index_col 时 Pandas read_sql 列不起作用 - 而是返回所有列
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/35937579/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Pandas read_sql columns not working when using index_col - returns all columns instead
提问by Henhuy
I'm using pandas.read_sql()
command to get data from my postgresql database.
The SQL query is created generically with many columns from which I only want to get specific columns using one column as index.
Creating an example table test_table
like this:
我正在使用pandas.read_sql()
命令从我的 postgresql 数据库中获取数据。SQL 查询通常使用许多列创建,我只想使用一列作为索引从中获取特定列。创建这样的示例表test_table
:
column1 column2 column3
1 2 3
2 4 6
3 6 9
I tried to use the index_col
and columns
parameter from pandas.read_sql()
to get column1
as index and column2
as data (and neglecting column3
!). But it always returns the whole table. Also when writing columns=['column1', 'column2']
nothing changes...
我尝试使用index_col
andcolumns
参数 frompandas.read_sql()
来获取column1
索引和column2
数据(而忽略了column3
!)。但它总是返回整个表。也当写columns=['column1', 'column2']
什么都没有改变......
I'm using python 2.7.6 with pandas 0.17.1 - Thanks for help!
我正在使用 python 2.7.6 和 pandas 0.17.1 - 感谢您的帮助!
Example Code:
示例代码:
import pandas
import psycopg2
import sqlalchemy
def connect():
connString = (
"dbname=test_db "
"host=localhost "
"port=5432 "
"user=postgres "
"password=password"
)
return psycopg2.connect(connString)
engine = sqlalchemy.create_engine(
'postgresql://',
creator=connect)
sql = (
'SELECT '
'column1, '
'column2, '
'column3 '
'FROM test_table'
)
data = pandas.read_sql(
sql,
engine,
index_col=['column1'],
columns=['column2'])
print(data)
采纳答案by mdls
I think the argument columns
did not work for you because you were using sql statement instead of providing it with your table name.
我认为该参数columns
对您不起作用,因为您使用的是 sql 语句而不是为它提供表名。
As mentioned from pandas website:
如Pandas网站所述:
columns : list, default: None List of column names to select from sql table (only used when reading a table).
列:列表,默认值:无 要从 sql 表中选择的列名列表(仅在读取表时使用)。
Therefore, I think if you try:
因此,我认为如果您尝试:
pandas.read_sql('test_table', engine, index_col=['column1'], columns=['column2'])
columns
argument will actually work.
columns
论点实际上会起作用。