Python: executing SQL queries over a Pandas dataset

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/45865608/

Time: 2020-08-19 17:20:43  Source: igfitidea

Executing an SQL query over a pandas dataset

Tags: python, sqlite, pandas

Asked by Miguel Santos

I have a pandas data set called 'df'.

How can I do something like the following?

df.query("select * from df")

Thank you.

For those who know R, there is a library called sqldf that lets you execute SQL code in R. My question is basically: is there a library like sqldf in Python?

Answered by YOBEN_S

This is not what pandas.query is supposed to do. You can look at the pandasql package (the same idea as sqldf in R).

import numpy as np  # needed for np.nan below
import pandas as pd
import pandasql as ps

df = pd.DataFrame([[1234, 'Customer A', '123 Street', np.nan],
                   [1234, 'Customer A', np.nan, '333 Street'],
                   [1233, 'Customer B', '444 Street', '333 Street'],
                   [1233, 'Customer B', '444 Street', '666 Street']],
                  columns=['ID', 'Customer', 'Billing Address', 'Shipping Address'])

# The table name in the SQL text is resolved against the namespace passed to sqldf
q1 = """SELECT ID FROM df"""

print(ps.sqldf(q1, locals()))

     ID
0  1234
1  1234
2  1233
3  1233
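
If installing pandasql is not an option, a similar effect can be had with the sqlite3 module from the standard library. As far as I know pandasql itself loads the frame into SQLite behind the scenes, so the sketch below just does that step by hand. The query, the n_rows alias and the in-memory database are my own illustration, not part of the original answer.

```python
import sqlite3

import numpy as np
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame(
    [[1234, 'Customer A', '123 Street', np.nan],
     [1234, 'Customer A', np.nan, '333 Street'],
     [1233, 'Customer B', '444 Street', '333 Street'],
     [1233, 'Customer B', '444 Street', '666 Street']],
    columns=['ID', 'Customer', 'Billing Address', 'Shipping Address'])

# Copy the DataFrame into a throwaway in-memory SQLite database.
conn = sqlite3.connect(':memory:')
df.to_sql('df', conn, index=False)

# Column names containing spaces must be double-quoted in SQL.
# NaN values are stored as NULL, so IS NOT NULL filters them out.
result = pd.read_sql_query(
    'SELECT ID, COUNT(*) AS n_rows FROM df '
    'WHERE "Billing Address" IS NOT NULL '
    'GROUP BY ID ORDER BY ID',
    conn)
conn.close()

print(result)
```

The result is an ordinary DataFrame, so it can go straight back into the rest of a pandas workflow. The cost is that the data is copied into SQLite on every to_sql call, which may matter for large frames.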

Answered by user1717828

You can use DataFrame.query(condition) to return the subset of the data frame that matches condition, like this:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3,3), columns=list('ABC'))
df
   A  B  C
0  0  1  2
1  3  4  5
2  6  7  8

df.query('C < 6')
   A  B  C
0  0  1  2
1  3  4  5


df.query('2*B <= C')
   A  B  C
0  0  1  2


df.query('A % 2 == 0')
   A  B  C
0  0  1  2
2  6  7  8

This has basically the same effect as an SQL statement, except that the SELECT * FROM df WHERE part is implied.
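
To make the "implied WHERE" concrete, here is a small sketch comparing query() with the equivalent boolean-mask indexing, plus two features that tend to be useful when porting SQL filters: referencing a local variable with @ and combining conditions with and/or. The variable names (limit, via_query and so on) are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list('ABC'))

# query() evaluates the string as a row filter, like a WHERE clause.
via_query = df.query('C < 6')

# The equivalent boolean-mask form of the same filter.
via_mask = df[df['C'] < 6]

# Local Python variables can be referenced inside the string with '@'.
limit = 6
via_variable = df.query('C < @limit')

# Conditions combine with and/or, much like AND/OR in SQL.
combined = df.query('A % 2 == 0 and C > 2')

print(via_query.equals(via_mask))
print(combined)
```

Note that query() only filters rows. Column selection, grouping and joins are handled by other DataFrame methods such as loc, groupby and merge, which is why it is not a general replacement for SQL.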

Answered by Zach Brookler

There's actually a new package that was just released, called dataframe_sql. It lets you query pandas dataframes using SQL, just as you want to. You can find the package here: https://github.com/zbrookle/dataframe_sql