Sun, 19 Jan 2014
Notes on a system for abbreviating SQL queries
(This post inaugurates a new section on my blog, for incomplete notes. It often happens that I have some idea, usually for software, and I write up a bunch of miscellaneous notes about it, and then never work on it. I'll use this section to post some of those notes, mainly just because I think they might be interesting, but also in the faint hope that someone might get interested and do something with it.)
Why are simple SQL queries so verbose?
(This is 208 characters.)
I guess about two-thirds of this is unavoidable, but those join-using clauses ought to be omittable, or inferrable, or abbreviatable, or something.
(Only 94 characters.)
Then the question arises of how to join the batches to the clients. This is the only really interesting part of this project, and the basic rule is that it shouldn't do anything really clever. There is a graph, which the program can figure out from looking at the foreign key constraints. And the graph should clearly have a short path from batches through products to clients.
If something is truly ambiguous, we can issue an intelligent request for clarification:
Can 1 and 2 really be separated? They can in the example above, but maybe not in general.
I think separating 3 and putting it at the end is a good idea: don't
try to use field name abbreviations to disambiguate and debreviate
table names. Only go the other way. But this means that we can't
What if something like
There is a tradeoff here: the more different kinds of abbreviations you accept, the more likely there are to be ambiguities.
About table inference
There could also be a preferences file that lists precedences for
tables and fields: if it lists
About join inference
Short join paths are preferred to long join paths.
If it takes a long time to generate the join graph, cache it. Build it automatically on the first run, and then rebuild it on request later on.
(this section blank)
Maybe convert the input to a
Note that this requires that the input be valid SQL. Your original idea for the abbreviated SQL began with
but the original version would probably be ruled out by this implementation. In this case that is not a big deal, but this choice of implementation might rule out more desirable abbreviations in the future.
Correcting dumb mistakes in the SQL language design might be in Quirky's purview. For example, suppose you do
RJBS said he would be reluctant to use the abbreviated version of a query in a program. I agree: it would be very foolish to do so, because adding a table or a field might change the meaning of an abbreviated SQL query that was written into a program ten years ago and has worked ever since. This project was never intended to abbreviate queries in program source code.
Quirky is mainly intended for one-off queries. I picture it going into an improved replacement for the MySQL command-line client. It might also find use in throwaway programs. I also picture a command-line utility that reads your abbreviated query and prints the debreviated version for inserting into your program.
(In the original document this section was blank. I have added here some notes I made in pen on a printout of the foregoing, on an unknown date.)
Maybe also abbreviate
Since debreviation is easier [than join inference] do it first!
Another idea: "
[Other articles in category /notes] permanent link