Python: Executing an SQL query over a pandas dataset
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/45865608/
Asked by Miguel Santos
I have a pandas data set called 'df'.
How can I do something like the following?
df.query("select * from df")
Thank you.
For those who know R, there is a library called sqldf that lets you execute SQL code in R. My question is basically: is there a library like sqldf in Python?
Answered by YOBEN_S
This is not what DataFrame.query in pandas is meant to do. You can look at the package pandasql (similar to sqldf in R).
import numpy as np
import pandas as pd
import pandasql as ps

# Sample data; np.nan marks the missing addresses
df = pd.DataFrame([[1234, 'Customer A', '123 Street', np.nan],
                   [1234, 'Customer A', np.nan, '333 Street'],
                   [1233, 'Customer B', '444 Street', '333 Street'],
                   [1233, 'Customer B', '444 Street', '666 Street']],
                  columns=['ID', 'Customer', 'Billing Address', 'Shipping Address'])

# sqldf resolves the table name "df" in the namespace passed as the second argument
q1 = """SELECT ID FROM df"""
print(ps.sqldf(q1, locals()))
ID
0 1234
1 1234
2 1233
3 1233
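If pandasql is not available, roughly the same result can be had with only pandas and the standard-library sqlite3 module: copy the frame into an in-memory SQLite database and read the query result back. This is only a minimal sketch, not how pandasql is necessarily implemented; the helper name run_sql and the trimmed two-column sample data are made up for illustration.

```python
import sqlite3

import pandas as pd

# Trimmed-down version of the sample frame (assumption: only two columns kept)
df = pd.DataFrame(
    [[1234, "Customer A"], [1234, "Customer A"],
     [1233, "Customer B"], [1233, "Customer B"]],
    columns=["ID", "Customer"],
)


def run_sql(query, **tables):
    """Hypothetical helper: load each DataFrame into in-memory SQLite, run the query."""
    con = sqlite3.connect(":memory:")
    try:
        for name, frame in tables.items():
            # The keyword name becomes the SQL table name
            frame.to_sql(name, con, index=False)
        return pd.read_sql_query(query, con)
    finally:
        con.close()


result = run_sql("SELECT ID, COUNT(*) AS n FROM df GROUP BY ID ORDER BY ID", df=df)
print(result)
```

Because every call copies the data into SQLite, this is fine for small frames but probably not something to run in a tight loop over large ones.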
Answered by user1717828
You can use DataFrame.query(condition) to return the subset of the data frame that matches condition, like this:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list('ABC'))
df
A B C
0 0 1 2
1 3 4 5
2 6 7 8
df.query('C < 6')
A B C
0 0 1 2
1 3 4 5
df.query('2*B <= C')
A B C
0 0 1 2
df.query('A % 2 == 0')
A B C
0 0 1 2
2 6 7 8
This has basically the same effect as an SQL statement, except that the SELECT * FROM df WHERE is implied.
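To make the comparison with SQL a bit more concrete, here is a small sketch showing that query() selects the same rows as an ordinary boolean mask, and that picking columns afterwards plays the role of the SELECT list. The variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list("ABC"))

# SELECT * FROM df WHERE C < 6, written two equivalent ways
by_query = df.query("C < 6")
by_mask = df[df["C"] < 6]

# SELECT A, B FROM df WHERE C < 6: filter the rows, then pick the columns
projected = df.query("C < 6")[["A", "B"]]

print(by_query.equals(by_mask))
print(projected)
```

Note that query() only covers the row filter; joins, grouping, and ordering need other DataFrame methods or one of the SQL packages mentioned in the other answers.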
Answered by Zach Brookler
There's actually a new package that was just released, called dataframe_sql. It lets you query pandas dataframes using SQL, just as you want to. You can find the package here: https://github.com/zbrookle/dataframe_sql