Schema » History » Revision 2
Revision 1 (Alison Craig, 03/03/2015 09:14 AM) → Revision 2/11 (Alison Craig, 03/03/2015 09:56 AM)
Schema
======
Within the <a href="http://www.postgresql.org">PostgreSQL</a>, the database is divided by Tables. For the CMIP5 dataset, datasets have differing variables (and, therefore, differing Name schemes) from the Models the dataset is based upon. As such, each Table is determined by the Model. From there, we have the Columns which are each component of the Name. And, finally, the Rows are the individual Name entries for each dataset.
While a tempting optimization would be to combine Models with the same Name schemes, it is unknown if the Name schemes will always be the same. As such, it is recommended they stay separate.
<b>Example</b><br>
<b><i>Table A</i></b><br>
<tt>|Institution|Model|Submodel|Instrument|</tt>
<br>
<b><i>Table B</i></b><br>
<tt>|Institution|Model|Submodel|Frequency|</tt>
----
On a search, the standard queries are defined as "[field]=[search]" with [field] being a Column and [search] being any exact matches in those Columns. These are comma-seperated lists, so we can continue to use <tt>JOIN</tt>s to narrow the results down. If <i>any</i> of the fields are Model, then we can identify the Table and eliminate a large portion of the search; Model was chosen as it was observed to be the most commonly and first used for narrowing down results by users.
In the autocomplete query case, the query follows a "?=[search]" pattern. The question mark indicates we do not know what to look for, and the search is what has already been entered by the user. The autocomplete can be used with the standard query to do an initial narrowing down of results, prior to doing a regular expression search for what can come next.