Principle of Simplest Query: The WAFE Style Rule
We present a simple style rule for SQL select queries with the intention
to make it easier to develop queries.
The rule also applies to Access Query Builder in which context
the rule was developed.
I am tempted to call it the Law of Demeter for Queries.
The Law of Demeter promotes simple methods in object-oriented systems
while the Principle of Simplest Query promotes simple queries
in relational data bases.
Formulation
A query is broken down into several simple select-from-where subqueries (producing virtual tables).
A subquery joins the appropriate number of tables (using the appropriate join kinds)
to provide the required fields and their values.
A subquery also projects the appropriate number of fields needed.
In addition, a query does only exactly one of four things
-
Widen : introduce a calculated field which keeps the
number of rows the same.
-
Aggregate : aggregate over rows (sum, average, count etc.), i.e., using a Totals-query in Access.
Also introduces a calculated field but typically reduces the number of rows.
-
Filter: filter a subset of the rows using a (possibly empty) condition or a join.
-
Eliminate duplicates: eliminate duplicate rows.
This leads to queries that are easier to develop and debug.
We want each subquery to be simple and do one thing in a similar way as a method in an OO system should do one thing (see Law of Demeter).
Simple queries make it explicit in which order the steps are applied.
We abbreviate the "Principle of Simplest Query" as WAFE because of the four primitive subquery kinds: Widen, Aggregate, Filter and Eliminate.
The appropriate number of fields are selected (projection) to achieve the correct result.
Each query implements one task involving selecting tables and their joins, a projection and one of widening, aggregation, filtering of rows and elimination of duplicates.
Filtering is done either with a Boolean expression or by relying on a join which implements the "at least one" condition. The join itself might increase the number of rows but a subsequent "eliminate duplicates" operation will choose a subset of rows.
Note that the WAFE style rule does not constrain what can be expressed (we need a formal proof).
However, it requires more subqueries.
But the significant benefits are (1) each subquery
can be easily tested and (2) less knowledge is needed about how query evaluation works and
(3) the confusing feature combinations are eliminated.
Experienced users can put multiple WAFE steps into the same subquery, for example an aggregation and a filter based on the aggregation result.
A join can be any join allowed in SQL. Here is a sketch of the WAFE grammar derived from the MS ACCESS setup.
WAFEQuery = TableList List(WAFESubQuery).
// last subquery gives final result.
WAFESubQuery = QueryKind VirtualTableName SelectedTablesWithJoins ProjectedFields.
QueryKind = W | A | F | E.
W = "widen" = NewFieldName Formula.
A = "aggregate" AggregatedField "=" AggregationExpression.
F = "filter" BooleanExpression.
E = "eliminate".
References
Richard Rasala:
Access Queries.
Focus on:
Our Query Philosophy and
Our Query Style.
Karl Lieberherr and the Demeter Team:
Law of Demeter
Private Communication with Ken Baclawski fall 2014.
Teaching Materials (Google Slides)
PoSQ Introduction
Debugging with PoSQ
Karl Lieberherr, Spring 2015