database 在 Qlikview 中处理多个事实表
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/18337272/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Handling multiple fact tables in Qlikview
提问by bdiamante
I have a PostgreSQL database containing various education data such school-level test scores and enrollment figures. I need to separate enrollment from test scores because the data is on different grains. Even though enrollment is on a different granularity from the test-score data, many of the dimensions are the same. For example, I have:
我有一个 PostgreSQL 数据库,其中包含各种教育数据,例如学校级别的考试成绩和入学人数。我需要将入学与考试成绩分开,因为数据来自不同的谷物。尽管注册与测试分数数据的粒度不同,但许多维度是相同的。例如,我有:
~ ---------------------------------------------------------------------------------~
| Test Scores Fact |
|-------------|-----------|----------|-----------|--------------|------------|-----|
| school_code | test_code | grade_id | gender_id | ethnicity_id | subject_id | ... |
|-------------|-----------|----------|-----------|--------------|------------|-----|
~ --------------------------------------------------------~
| Enrollment Fact |
|-------------|----------|-----------|--------------|-----|
| school_code | grade_id | gender_id | ethnicity_id | ... |
|-------------|----------|-----------|--------------|-----|
This structure is fine on the backend, but in Qlikview, this creates a synthetic key. The solution for synthetic keys seems to usually be replacing it with a link table via Qlikview scripting, which has been my approach as well. But this does not seem to scale, as when I add a third fact table (on yet another grain) that contains more of the same dimensions, if I create another link table, now my two link tables start to associate as they contain several commonly named fields, and Qlikview's response is to create more synthetic keys?
这种结构在后端很好,但在 Qlikview 中,这会创建一个合成密钥。合成键的解决方案似乎通常是通过 Qlikview 脚本用链接表替换它,这也是我的方法。但这似乎没有扩展,因为当我添加包含更多相同维度的第三个事实表(在另一个粒度上)时,如果我创建另一个链接表,现在我的两个链接表开始关联,因为它们包含几个常见的命名字段,而 Qlikview 的回应是创建更多合成键?
I'm relatively new to Qlikview and am working by myself. How are multiple facts of different grains with common dimensions usually handled?
我对 Qlikview 比较陌生,我自己工作。通常如何处理具有共同尺寸的不同晶粒的多个事实?
EDIT:
编辑:
I've provided my solution to this problem which has been working in a production environment for just under a year! See my answer below...
我已经为这个问题提供了解决方案,该解决方案已经在生产环境中工作了不到一年!看我下面的回答...
回答by bdiamante
Seeing the popularity of this question, I'm going to add my actual solution to the mix so people have an example to work from, which for some reason is really hard to find for such a common problem...
看到这个问题的受欢迎程度,我将把我的实际解决方案添加到混合中,这样人们就有了一个可以工作的例子,出于某种原因,对于这样一个常见问题来说真的很难找到......
I proceeded with creating a Link Table. This solution still to this day feels like a hack as it creates one huge table containing the Cartesian product of every one of your keys in all of your fact tables... but it does work.
我继续创建一个链接表。这个解决方案直到今天仍然感觉像是一个黑客,因为它创建了一个巨大的表,其中包含所有事实表中每个键的笛卡尔乘积......但它确实有效。
The problem: You have multiple fact tables in your database; an occurrence in almost every database ever. Some (or all) of these fact tables share the same key fields; no problem, right? Wrong. Unfortunately, due to Qlik's associative nature, instead of each one of your fact tables linking nicely to their lookup tables, your fact tables are now associating with each other and wreaking havoc on your data model; creating circular references and untold amounts of synthetic keys.
问题:您的数据库中有多个事实表;几乎在每个数据库中都会发生。这些事实表中的一些(或全部)共享相同的关键字段;没问题吧?错误的。不幸的是,由于 Qlik 的关联性质,您的每个事实表并不是很好地链接到它们的查找表,而是现在彼此关联并对您的数据模型造成严重破坏;创建循环引用和无数的合成键。
The Solution: Create a Link Table. Sounds simple, right? Well, it is, but it's also very poorly documented and hard to understand without an initial explanation. You might be wondering... what's a Link Table? It is the Cartesian product of all keys from all of your fact tables. How does this correct the problem? It removes all of the unwanted associations between your fact tables as each will now only contain a single unique concatenated key. Those unique keys will associate with the Link Table only, which contains all your unique concatenated keys as well as all the individual keys. The Link Table will subsequently associate with your lookup tables and all will be well.
解决方案:创建一个链接表。听起来很简单,对吧?嗯,是的,但它的文档也很差,如果没有初步解释就很难理解。您可能想知道……什么是链接表?它是所有事实表中所有键的笛卡尔积。这如何纠正问题?它删除了事实表之间所有不需要的关联,因为每个事实表现在只包含一个唯一的连接键。这些唯一键将仅与链接表关联,其中包含所有唯一的连接键以及所有单独的键。链接表随后将与您的查找表相关联,一切都会好起来的。
Implementation:
执行:
This implementation will use the two tables contained in my question above; test_scores_factand enrollment_fact.
这个实现将使用我上面问题中包含的两个表;test_scores_fact和enrollment_fact。
test_scores_fact | enrollment_fact | school | gender | ...
---------------- | --------------- | ------ | ------ | ---
school_code (FK) | school_code (FK) | school_code (PK) | gender_id (PK) |
test_code (FK) | grade_id (FK) | school_name (FK) | gender_desc |
grade_id (FK) | ethnicity_id (FK) | address | ... |
gender_id (FK) | gender_id (FK) | ... |
ethnicity_id (FK) | number_enrolled (F) |
subject_id (FK) |
test_score (F) |
FK = Foreign Key
PK = Primary Key
F = Fact
As you can see, the two fact tables have overlapping keys, school_code, grade_id, gender_id, and ethnicity_id. In a relational model, each key field has a corresponding table with additional information about the key. This model does not jive with Qlikview's associative nature as Qlikview associates tables based on the name of a field; even when you don't want it to. You do want like named fields to associate to their lookup tables, however you don't want like named fields in your fact tables to associate. Unfortunately you cannot stop this behavior. You must implement a Link Table...
正如你所看到的,两个事实表有重叠的键,school_code,grade_id,gender_id,和ethnicity_id。在关系模型中,每个键字段都有一个对应的表,其中包含有关该键的附加信息。该模型不符合 Qlikview 的关联性质,因为 Qlikview 根据字段名称关联表;即使你不想要它。您确实希望命名字段与它们的查找表相关联,但是您不希望事实表中的命名字段相关联。不幸的是,您无法阻止这种行为。您必须实现一个链接表...
In your Qlikview script, create a temporary fact table, which loads in all fields from your database table:
[temp_test_scores]: LOAD school_code, test_code, grade_id, gender_id, ethnicity_id, subject_id, test_score; SQL SELECT * FROM <database connection>Concatenate your keys and remove all individual keys:
[test_scores]: LOAD school_code & '_' test_code & '_' grade_id & '_' gender_id & '_' ethnicity_id & '_' subject_id as test_key, test_score RESIDENT [temp_test_scores];Repeat Steps 1 & 2 for each fact table:
[temp_enrollment]: LOAD school_code, grade_id, ethnicity_id, gender_id, number_enrolled; SQL SELECT * FROM <database connection> [enrollment]: LOAD school_code & '_' & grade_id & '_' & ethnicity_id & '_' & gender_id as enrollment_key, number_enrolled RESIDENT [temp_enrollment];Create your Link Table by concatenating your individual keys into a single table:
[temp_link_table]: LOAD DISTINCT school_code, test_code, grade_id, gender_id, ethnicity_id, subject_id RESIDENT [temp_test_scores]; CONCATENATE ([temp_link_table]) LOAD DISTINCT school_code, grade_id, ethnicity_id, gender_id, number_enrolled RESIDENT [temp_enrollment]; /** * The final Link Table will contain all of the individual keys one time as well as your concatenated keys */ [link_table]: LOAD DISTINCT school_code, test_code, grade_id, gender_id, ethnicity_id, subject_id, school_code & '_' test_code & '_' grade_id & '_' gender_id & '_' ethnicity_id & '_' subject_id as test_key, school_code & '_' & grade_id & '_' & ethnicity_id & '_' & gender_id as enrollment_key RESIDENT [temp_link_table]Drop your temp tables so they do not appear in your data model:
DROP TABLE [temp_test_scores]; DROP TABLE [temp_enrollment]; DROP TABLE [temp_link_table];
在您的 Qlikview 脚本中,创建一个临时事实表,该表从您的数据库表中加载所有字段:
[temp_test_scores]: LOAD school_code, test_code, grade_id, gender_id, ethnicity_id, subject_id, test_score; SQL SELECT * FROM <database connection>连接您的密钥并删除所有单个密钥:
[test_scores]: LOAD school_code & '_' test_code & '_' grade_id & '_' gender_id & '_' ethnicity_id & '_' subject_id as test_key, test_score RESIDENT [temp_test_scores];对每个事实表重复步骤 1 和 2:
[temp_enrollment]: LOAD school_code, grade_id, ethnicity_id, gender_id, number_enrolled; SQL SELECT * FROM <database connection> [enrollment]: LOAD school_code & '_' & grade_id & '_' & ethnicity_id & '_' & gender_id as enrollment_key, number_enrolled RESIDENT [temp_enrollment];通过将您的各个键连接到一个表中来创建您的链接表:
[temp_link_table]: LOAD DISTINCT school_code, test_code, grade_id, gender_id, ethnicity_id, subject_id RESIDENT [temp_test_scores]; CONCATENATE ([temp_link_table]) LOAD DISTINCT school_code, grade_id, ethnicity_id, gender_id, number_enrolled RESIDENT [temp_enrollment]; /** * The final Link Table will contain all of the individual keys one time as well as your concatenated keys */ [link_table]: LOAD DISTINCT school_code, test_code, grade_id, gender_id, ethnicity_id, subject_id, school_code & '_' test_code & '_' grade_id & '_' gender_id & '_' ethnicity_id & '_' subject_id as test_key, school_code & '_' & grade_id & '_' & ethnicity_id & '_' & gender_id as enrollment_key RESIDENT [temp_link_table]删除您的临时表,使其不会出现在您的数据模型中:
DROP TABLE [temp_test_scores]; DROP TABLE [temp_enrollment]; DROP TABLE [temp_link_table];
This will remove all associations between your fact tables as there now exist no common field names between them. Each fact table will link to the link table via the created concatenated key. The link table will then associate with each individual lookup table. Your Qlikview data model will not contain any synthetic keys or circular references.
这将删除事实表之间的所有关联,因为它们之间现在不存在公共字段名称。每个事实表将通过创建的连接键链接到链接表。然后链接表将与每个单独的查找表相关联。您的 Qlikview 数据模型将不包含任何合成键或循环引用。
If you create another fact table in the future, just follow steps 1 & 2 again, and add any new individual keys to the Link Table and also add the new concatenated key to the Link Table as well. It scales with little effort.
如果您将来创建另一个事实表,只需再次执行步骤 1 和 2,将任何新的单独键添加到链接表,并将新的连接键也添加到链接表。它可以轻松扩展。
Good Luck!
祝你好运!
回答by Jonas Valleskog
There are two main strategies to modelling data in QlikView to handle multiple fact tables:
在 QlikView 中建模数据以处理多个事实表有两种主要策略:
Append your fact tables into one single fact table - usually referred to as a CONCATENATED FACT as QlikView's syntax for appending data to tables is by use of the CONCATENATE prefix (the equivalent of a SQL UNION operation)
Build a link table (what you have done so far) For a majority of implementations, option 1 is the appropriate method. Attributes of a CONCATENATED fact can be summarised as:
将您的事实表附加到一个单一的事实表中 - 通常称为 CONCATENATED FACT,因为 QlikView 将数据附加到表的语法是使用 CONCATENATE 前缀(相当于 SQL UNION 操作)
构建链接表(到目前为止您所做的) 对于大多数实现,选项 1 是合适的方法。CONCATENATED 事实的属性可以概括为:
Positives:
正面:
- Performs well due to the reduced number of large tables in the data model
- Simple to implement, just append all data to one generic fact table whilst ensuring common dimensions are referenced by common field names
- 由于数据模型中的大表数量减少,因此表现良好
- 实现简单,只需将所有数据附加到一个通用事实表,同时确保通用字段名称引用通用维度
Negatives:
否定:
- The different facts are NOT directly associated with each other. The implication is important to understand. It means that cross-analysis of facts is typically only achievable by the common dimensions. Any fact specific dimensions do not connect in any way to the records of the facts that do not reference these dimensions. Complex 'set analysis' syntax can to some degree mitigate this shortcoming, but if your core requirement is to do indirect analysis of fact A by fact B's fact specific dimensions then you may need to revert to a link table model instead.
- 不同的事实彼此没有直接关联。理解其含义很重要。这意味着事实的交叉分析通常只能通过公共维度来实现。任何特定于事实的维度都不会以任何方式连接到未引用这些维度的事实记录。复杂的“集合分析”语法可以在一定程度上缓解这个缺点,但如果您的核心需求是通过事实 B 的事实特定维度对事实 A 进行间接分析,那么您可能需要转而使用链接表模型。
How to construct Link Tables is a complex subject but relies upon traditional database linking table design techniques. It is easy to go wrong and produce linking tables that may seem to produce the correct results in the front-end but is excessively large, consuming memory and CPU resources.
如何构建链接表是一个复杂的主题,但依赖于传统的数据库链接表设计技术。很容易出错,产生链接表,在前端看似能产生正确结果,但体积过大,消耗内存和CPU资源。
In my experience, a poorly modelled QlikView data model is the most common culprit for causing poor performance.
根据我的经验,建模不当的 QlikView 数据模型是导致性能不佳的最常见罪魁祸首。
I hope this quick, far from exhaustive, introduction to multi-fact modelling in QlikView proves of some help and sets you on the right course.
我希望这篇对 QlikView 中的多事实建模的快速介绍,远非详尽无遗,可以证明对您有所帮助,并使您走上正确的道路。
回答by JMon
The two quickest ways I can think of:
我能想到的两种最快的方法:
A) You can just left join the fact table into the corresponding tables that they are used in. You will just need to rename the fields to avoid conflicts with the other tables.
A) 您可以将事实表左连接到使用它们的相应表中。您只需要重命名字段以避免与其他表发生冲突。
B)You can rename the common fields, which can be done by
B)您可以重命名公共字段,这可以通过
- using a
QUALIFY(before you load the fact tables) andUNQUALIFY(after you load the fact tables) - renaming the field using "[Old Field Name] as [New Field Name]"
- 使用
QUALIFY(在加载事实表之前)和UNQUALIFY(在加载事实表之后) - 使用“[旧字段名称] 作为 [新字段名称]”重命名字段
Assuming that the fact tables have unique id field names that can be linked to the main tables , you shouldn't have to rename anything in the main tables
假设事实表具有可以链接到主表的唯一 id 字段名称,您不应该重命名主表中的任何内容
I'd go with B-1, since that seems a little less of a hassle.
我会选择 B-1,因为这似乎不那么麻烦。
QUALIFY
A,
B,
C,
ID;
FactTable1:
Load ID,
A,
B,
C,
From [FactTable1];
FactTable2:
Load ID,
A,
B,
C,
From [FactTable2];
UNQUALIFY
A,
B,
C,
ID;
EDIT: If you want to create a link table from these, you can concatenate the fact tables into one table where you put all the columns into it (there will be nulls for a lot of the columns, but QlikView is good with nulls).
编辑:如果您想从这些创建链接表,您可以将事实表连接到一个表中,将所有列放入其中(很多列都会有空值,但 QlikView 对空值很好)。
What i usually do is load the fact tables in and create an id field (either RowNo() or autonumberhash128([list of unique id fieldnames]), then when i load them into a link table, I include that id field in the link table as well. Finally, i drop the all the common fields from the fact tables, so they only exist in the link table.
我通常做的是加载事实表并创建一个 id 字段(RowNo() 或 autonumberhash128([唯一 id 字段名列表]),然后当我将它们加载到链接表中时,我在链接中包含该 id 字段表也是如此。最后,我从事实表中删除了所有公共字段,因此它们只存在于链接表中。
回答by Nick.McDermaid
However, each fact table has a different subset of the "shared" fields, so I wouldn't be able to properly key in my fact tables.
但是,每个事实表都有不同的“共享”字段子集,因此我无法正确键入我的事实表。
One of the inputs to your cartesian dimension would be 'N/A' against Subject and Test Code (since that is not in the enrollments table)
笛卡尔维度的输入之一是针对主题和测试代码的“N/A”(因为这不在注册表中)
So when you measure by 'Gender' the Test Scores match against dimension records with valid Subjects and Test Codes, and Enrollment matches against records with 'N/A' Subjects and Test Codes
因此,当您按“性别”衡量时,测试分数与具有有效主题和测试代码的维度记录相匹配,并且注册与具有“N/A”主题和测试代码的记录相匹配
Then when you roll up by Gender, everyhing 'just works' nicely.
然后,当您按性别汇总时,一切都“正常工作”。

