Python SQLAlchemy 中的 GroupBy 和 Sum?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/25558582/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-18 20:22:36  来源:igfitidea点击:

GroupBy and Sum in SQLAlchemy?

pythonsqlflasksqlalchemyflask-sqlalchemy

提问by TH22

I am trying to group a few fields in a table, and then sum the groups, but they are getting double counted.

我正在尝试将表中的几个字段分组,然后对这些组求和,但它们被重复计算了。

My models are as follows:

我的模型如下:

class CostCenter(db.Model):
     __tablename__ = 'costcenter'
     id = db.Column(db.Integer, primary_key=True, autoincrement=True)
     name = db.Column(db.String)
     number = db.Column(db.Integer)

class Expense(db.Model):

    __tablename__ = 'expense'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    glitem_id = db.Column(db.Integer, db.ForeignKey('glitem.id'))
    glitem = db.relationship('GlItem')
    costcenter_id = db.Column(db.Integer, db.ForeignKey('costcenter.id'))
    costcenter = db.relationship('CostCenter')
    value = db.Column(db.Float)
    date = db.Column(db.Date)

I have been using:

我一直在使用:

expenses=db.session.query(Expense,func.sum(Expense.value)).group_by(Expense.date).filter(CostCenter.id.in_([1,2,3]))

When I print expenses it shows the SQL statement that follows. It looks correct to me, but I am not that familiar with SQL. The problem is that the values it outputs as sum_1 are being counted multiple times. If I have [1] item in the "in statment" it will sum all three. If I have [1,2], it will sum all three and then double it, and if i have [1,2,3], it will sum all three and triple it. I am not sure why it is counting multiple times. How do I fix this?

当我打印费用时,它会显示后面的 SQL 语句。在我看来它是正确的,但我对 SQL 不太熟悉。问题是它输出为 sum_1 的值被多次计算。如果我在“in statment”中有 [1] 项,它将对所有三个项求和。如果我有 [1,2],它会将所有三个相加,然后将其加倍,如果我有 [1,2,3],它将所有三个相加并使其三倍。我不知道为什么它会多次计数。我该如何解决?

SELECT expense.id AS expense_id, expense.glitem_id AS expense_glitem_id, expense.costcenter_id AS         expense_costcenter_id, expense.value AS expense_value, expense.date AS expense_date, sum(expense.value) AS sum_1 
FROM expense, costcenter 
WHERE costcenter.id IN (:id_1, :id_2, :id_3) GROUP BY expense.date

Thanks!

谢谢!

采纳答案by davidism

There are a few issues here; you don't seem to be querying the right things. It's meaningless to select an Expense object when grouping by Expense.date. There needs to be some join condition between CostCenter and Expense, otherwise the rows will be duplicated, each count for each cost center but with no relation between the two.

这里有几个问题;你似乎没有在查询正确的东西。按 Expense.date 分组时选择一个 Expense 对象是没有意义的。CostCenter 和 Expense 之间需要有一些连接条件,否则行将重复,每个成本中心都计数,但两者之间没有关系。

Your query should look like this:

您的查询应如下所示:

session.query(
    Expense.date,
    func.sum(Expense.value).label('total')
).join(Expense.cost_center
).filter(CostCenter.id.in_([2, 3])
).group_by(Expense.date
).all()

producing this sql:

产生这个sql:

SELECT expense.date AS expense_date, sum(expense.value) AS total 
FROM expense JOIN cost_center ON cost_center.id = expense.cost_center_id 
WHERE cost_center.id IN (?, ?) GROUP BY expense.date

Here is a simple runnable example:

这是一个简单的可运行示例:

from datetime import datetime
from sqlalchemy import create_engine, Column, Integer, ForeignKey, Numeric, DateTime, func
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

engine = create_engine('sqlite://', echo=True)
session = Session(bind=engine)
Base = declarative_base(bind=engine)


class CostCenter(Base):
    __tablename__ = 'cost_center'

    id = Column(Integer, primary_key=True)


class Expense(Base):
    __tablename__ = 'expense'

    id = Column(Integer, primary_key=True)
    cost_center_id = Column(Integer, ForeignKey(CostCenter.id), nullable=False)
    value = Column(Numeric(8, 2), nullable=False, default=0)
    date = Column(DateTime, nullable=False)

    cost_center = relationship(CostCenter, backref='expenses')


Base.metadata.create_all()

session.add_all([
    CostCenter(expenses=[
        Expense(value=10, date=datetime(2014, 8, 1)),
        Expense(value=20, date=datetime(2014, 8, 1)),
        Expense(value=15, date=datetime(2014, 9, 1)),
    ]),
    CostCenter(expenses=[
        Expense(value=45, date=datetime(2014, 8, 1)),
        Expense(value=40, date=datetime(2014, 9, 1)),
        Expense(value=40, date=datetime(2014, 9, 1)),
    ]),
    CostCenter(expenses=[
        Expense(value=42, date=datetime(2014, 7, 1)),
    ]),
])
session.commit()

base_query = session.query(
    Expense.date,
    func.sum(Expense.value).label('total')
).join(Expense.cost_center
).group_by(Expense.date)

# first query considers center 1, output:
# 2014-08-01: 30.00
# 2014-09-01: 15.00
for row in base_query.filter(CostCenter.id.in_([1])).all():
    print('{}: {}'.format(row.date.date(), row.total))

# second query considers centers 1, 2, and 3, output:
# 2014-07-01: 42.00
# 2014-08-01: 75.00
# 2014-09-01: 95.00
for row in base_query.filter(CostCenter.id.in_([1, 2, 3])).all():
    print('{}: {}'.format(row.date.date(), row.total))