Relational Algebras for Subset Selection and Optimisation
Abstract
The database community lacks a unified relational query language for subset selection and optimisation queries, which limits both what users can express and how query optimisers can reason about such problems. Decades of research (latterly under the rubric of prescriptive analytics) have produced powerful evaluation algorithms, but their SQL extensions are ad hoc and mutually incompatible, each specifying and filtering through distinct mechanisms. We present the first unified algebraic foundation for these queries, introducing relational exponentiation to complete the fundamental algebraic operations alongside union (addition) and cross product (multiplication). First, we extend relational algebra to complete domain relations (relations defined by characteristic functions rather than explicit extensions), achieving the expressiveness needed for NP-complete and NP-hard problems while simultaneously providing query safety for finite inputs. Second, we introduce solution sets, a higher-order relational algebra over sets of relations that naturally expresses search spaces as functions f: Base → Decision, yielding |Decision|^|Base| candidate relations. Third, we provide structure-preserving translation semantics from solution sets to standard relational algebra, enabling mechanical translation to existing evaluation algorithms. This framework achieves the expressiveness of the most powerful prior approaches while providing the theoretical clarity and compositional properties absent in previous work. We demonstrate the capabilities these algebras open up through a polymorphic SQL in which standard clauses seamlessly express data management, subset selection, and optimisation queries within a single paradigm.
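To make the counting argument concrete, the following is a minimal sketch of a solution set as the set of all functions from a base relation to a decision relation. It is an illustration only, not the paper's algebra or its translation semantics: the helper names (`solution_set`, `total`) and the toy knapsack data (`WEIGHT`, `VALUE`, `CAPACITY`) are assumptions introduced here, and brute-force enumeration stands in for the evaluation algorithms the abstract refers to.

```python
from itertools import product

def solution_set(base, decision):
    """Yield every function f: base -> decision as a dict.

    There are len(decision) ** len(base) such functions, which is the
    |Decision|^|Base| count of candidate relations.
    """
    base = list(base)
    decision = list(decision)
    for choice in product(decision, repeat=len(base)):
        yield dict(zip(base, choice))

# Hypothetical toy data (not from the paper): a small knapsack instance.
WEIGHT = {"a": 2, "b": 3, "c": 4}
VALUE = {"a": 3, "b": 4, "c": 6}
CAPACITY = 6

def total(f, attr):
    """Sum an attribute over the items that f maps to decision 1."""
    return sum(attr[item] for item, chosen in f.items() if chosen == 1)

# Subset selection: Decision = {0, 1}, so each candidate is a subset of Base.
candidates = list(solution_set(WEIGHT.keys(), [0, 1]))

# Filter the search space with a constraint, then optimise an objective.
feasible = [f for f in candidates if total(f, WEIGHT) <= CAPACITY]
best = max(feasible, key=lambda f: total(f, VALUE))
selected = sorted(item for item, chosen in best.items() if chosen == 1)

print(len(candidates))  # 2 ** 3 candidate relations
print(selected, total(best, VALUE))
```

With a two-valued decision relation the candidates are exactly the subsets of the base relation; a larger decision relation (for example, quantities 0 to k) would give (k+1)^|Base| candidates under the same construction.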