Multics Technical Bulletin MTB 644 To: MTB Distribution From: Ron Barstad Date: 12/20/83 Subject: The View Manager Facility: SQL Parser ABSTRACT The Sequential Query Language (SQL) for the View Manager Facility is described. The use of a LALR parser to convert the SQL info a canonical form is described. This MTB is only one of a group of related documents in the View Manager series. These documents are MTB-641 The View Manager Facility MTB-642 The View Manager Facility: Subroutine Interfaces MTB-643 The View Manager Facility: The View Master Subsystem MTB-644 The View Manager Facility: SQL Parser MTB-645 The View Manager Facility: Data Dictionary Interface Notes for printing: This document contains lines longer than the standard pagewidth of 65. The control argument "-forms elite" should be used for hardcopies from the x9700 and "-rqt pmdc_12c" should be used on the PMDC diablo. Comments may be made via forum: >udd>Multics>meetings>End_User_Data_Access (euda) via electronic mail: RBarstad.Multics at System M via telephone: (HVN) 357-6617 or (602) 862-6617 _________________________________________________________________ Multics Project internal working documentation. Not to be reproduced outside the Multics Project. MTB 644 Multics Technical Bulletin 1. INTRODUCTION The Structured Query Language (SQL) defined by IBM in the mid 70's for their System R project has become the defacto industry standard. Many vendors, including GCOS, are converting to or adding a SQL like interface to their data base products. A Multics dialect of SQL will be used for the View Manager Facility. This dialect will be familiar to users of other SQL systems. Earlier proposals by Jim Gray and others to provide a unified query language which was a combination of SQL, MRDS and LINUS were found to be wanting. It was felt that these approaches provided a lowest-common-denominator language that combined the worst features of all the others. The syntax for Gray's unified language also proved to be ambiguous and incapable of LALR analysis without major change. Multics Technical Bulletin MTB 644 2. FUNCTIONAL DESCRIPTION 2.1 Overview The parser translates SQL into a Canonical Query Language (CQL). The SQL is the user entered text such as "select foo from bar" and the CQL is an internal representation passed on down to the routines that do the user's bidding. (The CQL is not described here.) The parser can be considered a kind of compiler that converts the SQL source statements into the CQL object. This is true even though the parser deals with only a single statement at a time and appears to the user to be a pre-processor. Errors in syntax or semantics will be detected and passed back to the caller. Some checking of column names and expressions may be done to verify their appropriateness for the requested table. Any error would prevent the creation of the CQL. The core of the parser is a LALR translator built from a BNF definition of the SQL. Semantic processing code will build the CQL as each production is scanned and verified. 2.2 About LALR LALR means "Look Ahead, Left to right, Right most derivation". The Multics LALR system translates a BNF description of a grammar into a parser for that language. The parser is efficient compared to traditional "hand coding" and provides an expeditious way to build and test grammars like SQL. After the decision was made to use the LALR parser for the View Manager SQL, we found that IBM made the same decision when building their parser for the original SEQUEL for System R. It too was a PL/1 based interpreter that created a "canonical" form of the SEQUEL. Based on an original design by Dave Ward and Jim Falksen, the Multics LALR system is now maintained by Pat Prange at BCO. Development there is continuing with LALR as part of the SLANG project to support Ada/SIL. Expectations are that LALR will be included in MR11. 2.3 Interfaces, hooks, handles SQL statements are built at the terminal in a variety of the tradition ways: request loop, editor, ec, and so forth. The parser expects a complete statement for parsing and cannot make much sense of fragments. However, it may be possible to take advantage of the error recovery mechanisms to provide some fancy user interfaces. For example, if the parser were warned that the statement passed to it was incomplete, it could (after verifying correctness so far) return a list of possibilities for the next token. MTB 644 Multics Technical Bulletin As in any language, there are three classes of error possible in SQL. First are simple syntax errors. The parser will give up processing a statement on the first one and indicate the errant token and most probable error. Semantic errors will be delt with similarily. The parser will be able to find some, but not all, execution errors. It may be able to verify from the data dictionary that the requested columns are in the referenced table or that the tables even exist but it can't know if the selection condition (where clause) is valid. 2.4 Subroutine Interface To allow for growth, the parser is a free-standing module not bound with the rest of the view manager. This design allows other parsers, as long as they produce CQL, to be substited or added. Multics Technical Bulletin MTB 644 entry: convert_sql_to_cql_ This entry point is used to parse a sql statement string and generate the associated cql description. USAGE dcl convert_sql_to_cql_ entry options (variable); call convert_sql_to_cql_ (dd_path, sql_string, return_area_ptr, replacement_arg_1, ... replacement_arg_N, cql_ptr); ARGUMENTS dd_path is a character string char (*) that contains the pathname of the data dictionary to be used while parsing the sql. (input) sql_string is a character string char (*) that contains the sql statement to be parsed. (input) return_area_ptr is a pointer to an area where this routine will allocate the returned cql structure. (input) replacement_arg_I are a variable number of arguments (possibly none), each char (*), that will be used as substitution arguments for &N in the sql_str. The &N is associated with the Nth replacement_arg. (input) cql_ptr is a pointer to the generated cql_structure that is allocated in the area pointed to by return_area_ptr. (output) MTB 644 Multics Technical Bulletin 3. DESCRIPTION OF SQL The following BNF defines the SQL syntax. This is the same BNF that is passed as input to the lalr command. It has been shown to be a unambiguous grammar and capable of correctly parsing SQL statements. 3.1 Reserved words These reserved words are defined for the SQL. They may not be used in any other context, such as for table or column names. access create_tablespace immediate register_table add create_view in remove admin current index restore all dec insert revoke alter decimal insertion rollback alter_tablespace delete intersect save and deletion into select any deregister is set as deregister_link like share asc deregister_table link special ascending desc lock sum assertion descending max synonym avg does min table between drop minus tablespace bin drop_synonym mode then binary drop_table new to bit drop_tablespace nonvar trigger but drop_view nonvarying uc by exclusive not union char exists null unique character fix of update check fixed old using column float on values comment for option var commit from or varying contain grant order view contains group read where count having records with create if register work create_table image register_link Multics Technical Bulletin MTB 644 3.2 SQL BNF (* Begin Multics View Manager SQL *) <statement> ::= (* 83-12-20.01 *) <query> | <dml_statement> | <ddl_statement> | <control_statement> ! <dml_statement> ::= <insertion> | <deletion> | <update> ! <query> ::= <query_expr> <order_clause> ! <insertion> ::= insert into <receiver> <insert_spec> ! <deletion> ::= delete <from_clause> <where_current_clause> ! <update> ::= update <table_label_name> set <set_clause_list> <where_current_clause> ! <order_clause> ::= order by <ord_spec_list> | ! <receiver> ::= <table_name> | <table_name> ( <column_name_list> ) ! <column_name_list> ::= <column_name> | <column_name_list> , <column_name> ! <insert_spec> ::= <query_expr> | values <literal_row> ! MTB 644 Multics Technical Bulletin <table_label_name> ::= <table_name> <label_name> | <table_name> ! <set_clause_list> ::= <set_clause> | <set_clause_list> , <set_clause> ! <set_clause> ::= <column_name> = <expr> | <column_name> = ( <query_block> ) ! <query_expr> ::= <query_block> | <query_expr> <set_op> <query_block> | ( <query_expr> ) ! <set_op> ::= intersect | union | minus ! <query_block> ::= <sub_query_block> | <sub_query_block> <group_clause> ! <sub_query_block> ::= <select_clause> <from_clause> <where_clause> ! <select_clause> ::= select <sel_list_spec> <sel_expr_list> | select <sel_list_spec> * ! <from_clause> ::= from <table_list> ! <where_current_clause> ::= <where_clause> | where current of <cursor_name> ! <where_clause> ::= where <boolean_expr> | ! <group_clause> ::= group by <column_spec_list> | group by <column_spec_list> <having_clause> ! <having_clause> ::= having <boolean_expr> ! Multics Technical Bulletin MTB 644 <sel_list_spec> ::= all | unique | ! <sel_expr_list> ::= <sel_expr> | <sel_expr_list> , <sel_expr> ! <sel_expr> ::= <expr> | <table_name> . * ! <table_list> ::= <table_label_name> | <table_list> , <table_label_name> ! <column_spec_list> ::= <column_spec> | <column_spec_list> , <column_spec> ! <ord_spec_list> ::= <column_spec> <direction> | <ord_spec_list> , <column_spec> <direction> | <integer> ! <direction> ::= asc | ascending | desc | descending ! <boolean_expr> ::= <boolean_term> | <boolean_expr> <or_op> <boolean_term> ! <boolean_term> ::= <boolean_factor> | <boolean_term> <and_op> <boolean_factor> ! <boolean_factor> ::= <boolean_primary> | <not_op> <boolean_primary> ! <boolean_primary> ::= <predicate> | ( <boolean_expr> ) ! <predicate> ::= <expr> <comparison> <expr_or_table_spec> | <table_spec> <comparison> <table_spec_lit> | exists ( <query_expr> ) | <expr> <maybe_not> like <regular_expr> | MTB 644 Multics Technical Bulletin <expr> is <maybe_not> null | <expr> between <expr> <and_op> <expr> | if <predicate> then <predicate> (* assert only *) ! <expr_or_table_spec> ::= <expr> | <table_spec> ! <regular_expr> ::= <string> ! <table_spec> ::= ( <query_expr> ) | '< <literal_row_list> > | <literal_row> ! <table_spec_lit> ::= ( <query_expr> ) | <literal> ! <expr> ::= <arith_term> | <expr> <add_op> <arith_term> ! <arith_term> ::= <arith_factor> | <arith_term> <mult_op> <arith_factor> ! <arith_factor> ::= <add_op> <primary> | <primary> ! <primary> ::= <column_spec> | old <column_spec> | new <column_spec> | <set_function> ( <maybe_unique> <expr> ) | <builtin> ( <arg_list> ) | count ( * ) | <constant> | ( <expr> ) ! <column_spec> ::= <column_name> | <table_name> . <column_name> ! <comparison> ::= <comp_op> | <comp_op> any | <comp_op> all | contains | does <not_op> contain | Multics Technical Bulletin MTB 644 is <maybe_not> in | <maybe_not> in ! <maybe_unique> ::= unique | ! <comp_op> ::= = | ^= | > | >= | '< | '<= | ^> | ^'< ! <add_op> ::= + | - ! <mult_op> ::= * | / ! <or_op> ::= or | '| ! <and_op> ::= and | & ! <not_op> ::= not | ^ ! <maybe_not> ::= <not_op> | ! <set_function> ::= avg | max | min | sum | count | <symbol> ! <builtin> ::= <symbol> ! MTB 644 Multics Technical Bulletin <arg_list> ::= <arg> | <arg_list> , <arg> ! <literal> ::= '< <literal_row_list> > | <literal_row> | ( <entry_list> ) | <constant> ! <literal_row_list> ::= <literal_row> | <literal_row_list> , <literal_row> ! <literal_row> ::= '< <entry_list> > ! <entry_list> ::= <entry> | <entry_list> , <entry> ! <entry> ::= <constant> | ! <constant> ::= <string> | <real literal> | <integer> | null ! (* Multics Technical Bulletin MTB 644 *) <ddl_statement> ::= <create_table> | <alter_table> | <create_tablespace> | <alter_tablespace> | <create_index> | <create_view> | <create_synonym> | <comment> | <drop> ! <create_table> ::= create table <table_name> ( <column_defn_list> ) ! <column_defn_list> ::= <column_defn_list> , <column_defn> | <column_defn> ! <column_defn> ::= <column_name> <data_type> <maybe_type_mod> ! <data_type> ::= <arith_data_type> | <string_data_type> ! <string_data_type> ::= <string_type> ( <integer> ) <variability> ! <string_type> ::= character | char | bit ! <variability> ::= varying | var | nonvarying | nonvar ! <arith_data_type> ::= <scale> <base> <precision> ! MTB 644 Multics Technical Bulletin <scale> ::= fixed | fix | float ! <base> ::= binary | bin | decimal | dec ! <precision> ::= ( <integer> ) | ( <integer> , <integer> ) ! <maybe_type_mod> ::= , <type_mod> | ! <type_mod> ::= not null | image <image_mod> | image ! <image_mod> ::= unique | uc ! <alter_table> ::= alter table <table_name> add <column_defn> ! <create_tablespace> ::= create tablespace <tablespace_name> in <path> <maybe_with_records> ! <maybe_with_records> ::= with <integer> records | ! <alter_tablespace> ::= alter tablespace <tablespace_name> <alter_mode> <integer> records ! <alter_mode> ::= add | remove ! Multics Technical Bulletin MTB 644 <create_index> ::= create <maybe_unique> index <index_name> on <table_name> ( <ord_spec_list> ) ! <create_view> ::= create view <view_name> <maybe_column_name_list> as <query> <maybe_check_option> ! <maybe_check_option> ::= with check option | ! <drop> ::= drop <system_entity> <name> ! <system_entity> ::= assertion | index | synonym | table | tablespace | trigger | view ! <create_synonym> ::= create synonym <table_name> for <table_name> ! <comment> ::= comment on table <table_name> is <string> | comment on column <table_name>.<column_name> is <string> ! (* MTB 644 Multics Technical Bulletin *) <control_statement> ::= <assert> | <trigger> | <grant> | <revoke> | <lock> | <register_table> | <deregister_table> | <register_link> | <deregister_link> | <commit> | <rollback> <save> | <restore> ! <assert> ::= create assertion <assert_name> <maybe_immed> <maybe_assert_cond> is <boolean_expr> ! <maybe_immed> ::= immediate | ! <maybe_assert_cond> ::= on <assert_condition> | ! <assert_condition> ::= <action_list> | <table_label_name> ! <action_list> ::= <action> | <action_list> , <action> ! <action> ::= insertion of <table_label_name> | deletion of <table_label_name> | update of <table_label_name> <maybe_column_name_list> ! <trigger> ::= create trigger <trigger_name> on <trigger_condition> is ( <statement_list> ) ! <trigger_condition> ::= <action> | read of <table_label_name> ! Multics Technical Bulletin MTB 644 <statement_list> ::= <conditional_statement> | <statement_list> ; <conditional_statement> ! <conditional_statement> ::= <statement> | if <boolean_expr> then <statement> ! <grant> ::= grant <authorization> <maybe_on_table> to <user_list> <maybe_grant_opt> ! <revoke> ::= revoke <authorization> <maybe_on_table> from <user_list> ! <authorization> ::= all | <privilege_list> | all but <privilege_list> ! <maybe_on_table> ::= on <table_name> | ! <user_list> ::= <user_list> , <user_name> | <user_name> ! <maybe_grant_opt> ::= (* not to *.*.* *) with grant option | ! <privilege_list> ::= <privilege_list> , <privilege> | <privilege> ! <privilege> ::= <table_privilege> | <dict_privilege> | <tablespace_privilege> | null ! <table_privilege> ::= alter | (* not views *) delete | index | (* not views *) insert | link | select | update <maybe_column_name_list> ! MTB 644 Multics Technical Bulletin <maybe_column_name_list> ::= ( <column_name_list> ) | ! <dict_privilege> ::= access | admin | create_tablespace | create_view | deregister_link | deregister_table | drop_synonym | drop_tablespace | drop_view | register_link | register_table ! <tablespace_privilege> ::= alter_tablespace | create_table | drop_table ! <lock> ::= lock table <table_name> in <lock_mode> mode ! <lock_mode> ::= exclusive | share ! <register_table> ::= register table <path> <maybe_as_table> <maybe_col_defn_list> <maybe_using> <special> ! <path> ::= <symbol> | <string> ! <maybe_as_table> ::= as <table_name> | ! <maybe_col_defn_list> ::= ( <column_defn_list> ) | ! <maybe_using> ::= using <procedure_name> | ! <special> ::= special <string> | ! Multics Technical Bulletin MTB 644 <deregister_table> ::= deregister table <table_name> ! <register_link> ::= register link <link_name> <maybe_as_link> from <path> ! <maybe_as_link> ::= as <link_name> | ! <deregister_link> ::= deregister link <link_name> ! <commit> ::= commit work ! <rollback> ::= rollback work ! <save> ::= save <save_name> ! <restore> ::= restore | restore <save_name> ! (* MTB 644 Multics Technical Bulletin *) <name> ::= <symbol> ! <assert_name> ::= <symbol> ! <column_name> ::= <symbol> ! <cursor_name> ::= <symbol> ! <index_name> ::= <symbol> ! <label_name> ::= <symbol> ! <link_name> ::= <symbol> ! <procedure_name> ::= <symbol> ! <save_name> ::= <symbol> ! <table_name> ::= <symbol> ! <tablespace_name> ::= <symbol> ! <trigger_name> ::= <symbol> ! <view_name> ::= <symbol> ! <user_name> ::= <symbol> | <symbol> . <symbol> | * . <symbol> | <symbol> . * ! (* End Multics View Manager SQL *) Multics Technical Bulletin MTB 644 4. BIBLIOGRAPHY Astrahan, M.M., et al. "A History and Evaluation of System R", IBM Research Report RJ2843, June 1980. Astrahan, M.M., et al. "System R: Relational Approach to Database Management" In Transactions on Database Systems, Vol. 1, No. 2, June 1976. Astrahan, M.M. and Chamberlin, D.D., "Implementation of a Structured English Query Language", In Communications of the ACM, Vol. 18, No. 10, October 1975. Chamberlin, D.D., et al. "SEQUEL 2: A Unified Approach to Data Definition, Manipulation, and Control", In IBM Journal of Research and Development, Vol. 20, No. 6, November 1976. Codd, E.F. and Blasgen, Michael, "Relational Data Base Management", ACM Professional Development Seminar lecture notes, October 1979. Gray, Jim, "New Translator System for MRDS/LINUS Query Language", MDC Proposal, Rev. 4.0, September, 16, 1981. Kroenke, David M., Database Processing: Fundamentals, Design, Implementation, (2nd ed.) Science Research Associates, 1983. IBM, Database 2 SQL Usage Guide, IBM Document GG24-1583-00, 1983. Prange, P., "LALR, a Translator Construction System", SLANG Project Technical Bulletin, July 26, 1983. Prange, P. and Margulies, Benson, "LALR, a Translator Construction System", MTB 602, October 4, 1982. Reisner, Phyllis, et al. "Human factors evaluation of two data base query languages--Square and Sequel", In Proceedings of the National Computer Conference, AFIPS, 1975. Relational Software Inc., ORACLE User's Guide, Version 2.3, 1981. Rosensteel, K., Relational Access Manager (RAM) EPS-1, LCPD document number 58075029, Rev. 4, September 21, 1982. Shneiderman, Ben, "Improving the Human Factors Aspect of Database Interactions", In ACM Transactions on Database Systems, Vol. 3, No. 4, December 1978.