# Areas of future interest

Here are some areas of future interest. Consider them an enumeration of "R&D areas" rather than a technical roadmap:

* The most obvious work is to **expand the parsing coverage**, both by adding support for more SQL dialects (e.g. Postgres, MySQL, SQLite) and by adding support for the long tail of infrequently-used language features in existing SQL dialects.
* Another logical extension is to add support for **query translation**. It would require implementing a `render` function as the inverse of the `parse` function: `render` would transform an AST into a string in a particular dialect. Once implemented, query translation would be accomplished by parsing a query in one dialect, then rendering it into another dialect.
* A more exploratory project is to implement **type-checking of SQL**, with the goal of detecting errors in queries through static analysis. The principle is to define the types of columns, then use type-inference rules to look for type errors in queries, such as "column A and column B are being compared for equality, but have incompatible types".
    * In practice, this would probably start by seeding the catalog information with annotations for the business types of certain columns. Then, the types of other columns would be inferred, using known or observed relationships between columns. For example, a known foreign-key relationship would generate the inference that the foreign key has the same type as the primary key. Alternatively, a list of candidate relationships could be generated by applying type-inference rules to the stream of queries. For example, if two columns are related by an equality or inequality operator, then they have compatible types.
* There are non-trivial use cases for **concrete evaluation of queries**. At first glance, the idea of "put in sample data, get sample results" may seem redundant when one could just use an actual database. However, concrete evaluation would allow QuickCheck for queries. Imagine an interface that let users specify a query to be tested, as well as post-conditions in the form of SQL queries that relate the original data to the output data. The QuickCheck test would generate arbitrary input data sets, run the query, and assert the post-conditions. It would then produce minimized examples failing those post-conditions. Now, imagine applying that to a set of queries, such as the steps in an ETL to produce a dimensionally modeled table.
    * Post-condition queries would produce one row with one column with a boolean value (True or False). Such a query could also be called an assertion, predicate, or property.
* There are similar use cases for **generating arbitrary queries**. Arbitrary queries would allow for QuickCheck testing of databases themselves, particularly for catching errors in parsing. Subsequently, generating arbitrary table data would permit catching errors in execution.
* We could enrich our understanding of data access patterns by adding support for **fingerprinting of queries**, to categorize similar queries together. A single query could have multiple fingerprints under different fingerprinting algorithms. For example, a "table fingerprint" could be generated by hashing a sorted list of all the tables that appeared in the query. A "template fingerprint" could be generated by removing all constants and literals in the query, then hashing the resulting query string. This algorithm would give the same fingerprint for `SELECT * FROM foo WHERE date > '2018-01-01'` and `SELECT * FROM foo WHERE date > '2018-01-02'`.
* We could **improve SQL hygiene** by adding support for query standardization in the same spirit as [gofmt](https://blog.golang.org/go-fmt-your-code). Query standardization would include auto-formatting of whitespace, like line breaks and indentation, and would afford communally-owned queries all the same benefits described in the gofmt blog post: queries would become easier to read, easier to write, easier to maintain, and less controversial. More aggressive formatting changes would also be possible, such as removing unused clauses in a query.
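
To make the fingerprinting idea above more concrete, here is a minimal Python sketch of the "table fingerprint" and "template fingerprint". It is an illustration under stated assumptions, not part of this project: the function names `template_fingerprint` and `table_fingerprint` are hypothetical, SHA-256 is an arbitrary choice of hash, and crude regular expressions stand in for the parsed AST that a real implementation would presumably walk.

```python
import hashlib
import re

# Assumption: literals are single-quoted strings ('' escapes a quote) or
# standalone numbers. A real implementation would find literals in the AST.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")

# Assumption: a table is the identifier right after FROM or JOIN. This misses
# subqueries, CTEs, and comma-separated table lists.
_TABLE = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def _digest(text: str) -> str:
    """Return the hex SHA-256 digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def template_fingerprint(query: str) -> str:
    """Hash the query after replacing every literal with a placeholder."""
    stripped = _LITERAL.sub("?", query)
    # Collapse whitespace and case so formatting differences do not matter.
    normalized = " ".join(stripped.split()).lower()
    return _digest(normalized)


def table_fingerprint(query: str) -> str:
    """Hash the sorted, de-duplicated list of tables named in the query."""
    tables = sorted({name.lower() for name in _TABLE.findall(query)})
    return _digest(",".join(tables))


if __name__ == "__main__":
    a = "SELECT * FROM foo WHERE date > '2018-01-01'"
    b = "SELECT * FROM foo WHERE date > '2018-01-02'"
    # The two queries differ only in a literal, so the templates match.
    print(template_fingerprint(a) == template_fingerprint(b))  # True
```

Because the two fingerprints are computed independently, a single query can carry both, which matches the idea of one query having multiple fingerprints under different algorithms.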