Multics Technical Bulletin                                        MTB-710
The LALR System

To:       Distribution

From:     Betty Wong

Date:     23 May 1985

Subject:  Multi System and Language Support

1. Abstract

This MTB describes the systems and languages supported by the LALR system other than Multics and Multics PL/1 and ALM. The systems supported are GCOS and DPS 6. The languages supported are GMAP, GCOS PL/1, DPS 6 Assembly Language, Ada/SIL, and C.

Comments on this MTB should be sent to the author -

via Multics mail to:
   BWong.Multics on System M

via posted mail to:
   Betty Wong
   Advanced Computing Technology Centre
   The University of Calgary
   Foothills Professional Building
   Room #301, 1620 - 29th Street N.W.
   Calgary, Alberta T2N 4L7
   CANADA

via telephone to:
   (403)-270-5400
   (403)-270-5408

_________________________________________________________________

Multics project internal documentation; not to be reproduced or distributed outside the Multics project.

TABLE OF CONTENTS

Section   Subject
=======   =======
1         Abstract
2         Introduction
3         Changes to the Control Section of the Grammar Source Segment
4         Changes to Handle Embedded Semantics
4.1       . . DPS 6 Assembly Language
4.2       . . Ada/SIL Language
4.3       . . C Language
5         Changes to Handle Separate Semantics
6         Changes to LALR Commands
6.1       . . The 'lalr' Command
6.2       . . The 'make_dpda' Command
6.3       . . The 'dps6_dpda' Command
7         Parse Tables Produced
7.1       . . GMAP Source Segment Parse Tables
7.2       . . DPS 6 Parse Tables Object Files
7.2.1     . . . . DPS 6 Files for Assembly Language Use
7.2.2     . . . . DPS 6 Files for Ada/SIL Use
7.2.3     . . . . DPS 6 Files for C Use

2. Introduction

LALR translates a BNF-like language description into a parser for the language. The output from LALR is a set of tables that control the operation of a parser procedure.
Because these tables are lists of signed integers, they can be easily transported to computers other than Multics. The parser procedure is a simple routine, and versions of it have been coded in PL/1, C and Assembly language.

LALR has options which allow the tables to be generated as a Multics object segment, an ALM source segment, a GMAP source segment or a DPS 6 Multics Host Resident System object segment. Declarations for these segments can be generated in Multics PL/1, GCOS PL/1, DPS 6 Assembly Language, Ada/SIL, or C. These Multics-generated segments can then be transferred to other systems.

The work for support on Multics systems, Multics PL/1, and ALM has already been done. Therefore, this document contains information describing Multics commands in the LALR system to produce tables for GCOS and DPS 6 systems and files in the languages supported on these systems.

Most of the support for the specified systems and languages is already incorporated into the LALR system. The work remaining is to:

1) document the systems and languages supported

2) create parser procedures written in the supported languages that can be tailored to individual specifications

3) create equivalents of the lalrp command on other systems so that parts of the translator written in the supported languages can be tested.

The LALR system was originally created by J. Falksen and Dave Ward of LISD. The information for this MTB is taken from "LALR, a Translator Construction System", the SLANG Project Technical Bulletin written by Patrick Prange dated July 2, 1984.

3. Changes to the Control Section of the Grammar Source Segment

Control arguments to the 'lalr' command can be included in the control section of the grammar source segment.
Additional control lines to support other systems and languages are:

   -ada_sil
   -asm
   -c, -C
   -dps6_format
   -gmap
   -hrs_format
   -no_ada_sil
   -no_asm
   -no_c, -no_C, -noc, -noC
   -no_gmap

For further information on the meanings of these control arguments, see Section 6.1 - The 'lalr' Command.

4. Changes to Handle Embedded Semantics

In the embedded semantics format, the source segment will be able to contain code for a GCOS PL/1 procedure, a DPS 6 Assembly Language program, an Ada/SIL program unit, or a C function.

The following sections describe the creation of the semantics segment from an embedded semantics source segment for DPS 6 Assembly Language, Ada/SIL, and C. A GCOS PL/1 semantics segment is generated by the same procedure used for a Multics PL/1 semantics segment.

4.1. DPS 6 Assembly Language

If the source segment is a DPS 6 Assembly Language program unit (as indicated by the -semantics control argument), LALR creates the assemblable semantics segment from it by the following steps:

1) If the semantics segment is named X.nml or X.nml.MAC, begins the segment with the title statement:

      title X,'yymmdd00'

   where "yymmdd" is the current date. If the semantics segment is named X.incl.nml, it begins the segment with the comment lines mentioned in step 2 below.

2) Appends comment lines giving the name of the input grammar segment, the date and time it was translated, the version of LALR that was used to translate it, and the user_id of the user who translated it.

3) Appends the following statements defining the semantics procedure's entry point and transferring control to the semantics for the current rule:

             xdef  X
      X      lab   $B4,jtable-1
             ldr   $R1,$B4.$R1
             jmp   $B4.$R1

   These statements assume the parser passes the rule number or production number, as appropriate, by value in register R1.
4) Appends the source segment to the semantics segment making the following changes:

   a) Puts a "*" in front of each line of the control portion, if present.

   b) Puts a "*" in front of each line of each rule. If a rule does not begin at the beginning of a line or end at the end of a line, lines are split as necessary to make the rule do so.

   c) If the -production control is not in effect, each "%%%%" in the semantics is replaced with the 4-digit number of the rule which it represents. If the -production control is in effect, each "%%%%" immediately followed by an unsigned decimal number representing an alternative number, is replaced with the 4-digit number of the production which it represents.

5) Appends a DC statement defining the jump table used by the statements shown in step 3 above.

   If the -production control is not in effect the jump table is as follows:

      jtable  dc  R1-jtable+1;
                  R2-jtable+1;
                  ...
                  Rn-jtable+1

   The jump table contains an entry for each rule of the grammar. If the i-th rule has a significant semantic, Ri used in the i-th line of the DC statement is the letter "r" followed by the value of i as a 4-digit decimal number. Otherwise, Ri is "no_sem". (The user is assumed to have defined the tag "no_sem" somewhere in the semantics segment.)

   If the -production control is in effect the jump table is as follows:

      jtable  dc  P1-jtable+1;
                  P2-jtable+1;
                  ...
                  Pn-jtable+1

   The jump table contains an entry for each production of the grammar. If the i-th production has a significant semantic, Pi used in the i-th line of the DC statement is the letter "p" followed by the value of i as a 4-digit decimal number. Otherwise, Pi is "no_sem". (The user is assumed to have defined the tag "no_sem" somewhere in the semantics segment.)

6) Appends the following end statement to the semantics segment if it is named X.nml or X.nml.MAC.

      end X

4.2.
Ada/SIL Language

If the source segment is an Ada/SIL program unit (as indicated by the -semantics control argument), LALR creates the compilable semantics segment from it by the following steps.

1) Begins the semantics segment with a <subprogram specification> naming the subprogram.

   If the semantics segment is named X.ada, the following <subprogram specification> is generated:

      procedure X (rule_no:       in natural;
                   alt_no:        in natural;
                   lex_stack_ptr: in access;
                   ls_top:        in integer) is

   If the semantics segment is named X.incl.ada, the following <subprogram specification> is generated:

      procedure X (rule_no: in natural;
                   alt_no:  in natural) is

   If the -production control is in effect, the formal parameters rule_no and alt_no in the above <subprogram specification>s are replaced by a single input formal parameter "prod_no" of type natural.

   If the -rule_only control is in effect, the formal parameter alt_no is omitted from the above <subprogram specification>s.

2) Appends a sequence of <comment> lines giving the name of the input grammar segment, the date and time it was translated, the version of LALR that was used to translate it, and the user_id of the user who translated it.

3) Appends the source segment to the semantics segment making the following changes:

   a) Puts "--" in front of each line of the control portion, if present.

   b) Puts "--" in front of each line of each LALR rule. If a rule does not begin at the beginning of a line or end at the end of a line, lines are split as necessary to make each rule do so.

   c) If the -production control is not in effect, each "%%%%" in the semantics is replaced with the zero-suppressed number of the rule which it represents. If the -production control is in effect, each "%%%%" immediately followed by an unsigned decimal number representing an alternative number, is replaced with the zero-suppressed number of the production which it represents.
4) Ends the <subprogram body> with the following text:

      end X;

NOTE: If the -no_semantics_header control has been given, steps 1 and 4 above are skipped.

4.3. C Language

If the source segment is a C function (as indicated by the -semantics control argument), LALR creates the compilable semantics segment from it by the following steps.

1) Begins the semantics segment with a <type specifier> and a <function declarator> naming the function.

   If the semantics segment is named X.c, the following <type specifier> and <function declarator> are generated:

      int X (rule_no, alt_no, lex_stack_ptr, ls_top)

   If the semantics segment is named X.incl.c or X.h, the following <type specifier> and <function declarator> are generated:

      int X (rule_no, alt_no)

   If the -production control is in effect, the parameters rule_no and alt_no in the above <function declarator>s are replaced by the single parameter prod_no.

   If the -rule_only control is in effect, the parameter alt_no in the above <function declarator>s is omitted.

2) Appends a <comment> giving the name of the input grammar segment, the date and time it was translated, the version of LALR that was used to translate it, and the user_id of the user who translated it.

3) Appends a <declaration list> declaring the formal parameters.

   If the semantics segment is named X.c, the <declaration list> is as follows:

      int rule_no;
      int alt_no;
      int *lex_stack_ptr;
      int ls_top;

   If the semantics segment is named X.incl.c or X.h, the declarations of lex_stack_ptr and ls_top are omitted.

   If the -production control is in effect, the declarations of the formal parameters rule_no and alt_no in the above <declaration list>s are replaced by a declaration of a single int parameter "prod_no".

   If the -rule_only control is in effect, the declaration of the formal parameter alt_no in the above <declaration list>s is omitted.

4) Appends a "{" (left brace) to the semantics segment.
5) Appends the source segment to the semantics segment making the following changes:

   a) Puts "/*" and "*/" around the control portion, if present.

   b) Replaces each occurrence of "/*" or "*/" within the control portion with the four character string "*//*".

   c) Puts "/*" and "*/" around each LALR rule.

   d) Replaces each occurrence of "/*" or "*/" within an LALR rule with the four character string "*//*".

   e) If the -production control is not in effect, each "%%%%" in the semantics is replaced with the zero-suppressed number of the rule which it represents. If the -production control is in effect, each "%%%%" immediately followed by an unsigned decimal number representing an alternative number, is replaced with the zero-suppressed number of the production which it represents.

6) Appends a "}" (right brace) to the semantics segment.

NOTE: If the -no_semantics_header control has been given, only steps 2 and 5 above are performed.

5. Changes to Handle Separate Semantics

In separate semantics source segments, the rules have this basic form:

   <var> ::= <prod list> ! <rule semantics>

<prod list>
   represents a production list, where a production is a sequence of terminals and variables. If there is a list of them, they are separated by "|". The production list may be empty.

   If the -production control is in effect, a production may end with the symbols "=> t : p$e", where t is an identifier tagging the production and p$e identifies an entry point in an external procedure to be called to perform the semantic action. If no tag is needed, "t" and the ":" following it may be omitted. There may not be any white-space between "p" and "$" nor between "$" and "e". If "p" and "e" are the same, the "$e" may be omitted.

   When the tables are produced as a GMAP source segment, "p" is ignored and "e" is taken to be an external symbol; i.e., it has been SYMDEF'ed.
   Each "t" generates a word, tagged with "t", containing the corresponding production number. Each "t" is also SYMDEF'ed.

   When the tables are produced as a DPS 6 object unit, "p" is taken to be the name of an object unit and "e" is considered to be an entry point defined within that object unit. If the -asm control is used to request the object unit, each "t" names an external value equal to the corresponding production number. If the -ada_sil control is used to request the object unit, each "t" generates a variable of type integer which is initialized with the corresponding production number.

!
   represents "end of production list". This must always be present.

   If the -rule control is in effect, the "!" of each rule may be followed by the symbols "=> t : p$e", where "t", "p", and "e" are as described above except that they pertain to rules instead of productions.

6. Changes to LALR Commands

The support for other languages and systems requires changes to existing commands or the addition of other commands.

6.1. The 'lalr' Command

This command requires additional control arguments and changes to the interpretation of the -semantics and -table control arguments.

---------
lalr, lrk
---------

SYNTAX:
   lalr path {control_args}

FUNCTION:
   Invokes the LALR compiler to translate a source segment containing the text of the LALR source into a set of tables located in an object segment. The object segment is given two names consisting of the entryname portion of the source segment with the suffixes grammar and result. A listing segment is optionally produced. Packaged forms of the tables may be requested. These segments are placed in your working directory.

ARGUMENTS:

path
   is the pathname of the LALR source segment containing the grammar to be processed. The lalr suffix is assumed if not supplied. This argument may be an archive component pathname.
CONTROL ARGUMENTS:

-semantics {X}, -sem {X}
   produces a semantics file named X. X cannot be an archive component pathname. The equals convention is applied to the entryname of X and the entryname (or component name in case of an archive component) of the source segment. The suffix(es) of the resultant entryname must be pl1 or incl.pl1 (PL/I source), nml, incl.nml, or nml.MAC (DPS 6 Assembly Language source), ada or incl.ada (Ada/SIL source), or c, incl.c, or h (C source). If no suffix is present, incl.pl1 is assumed. If incl is present, it is treated as incl.pl1. If X is not given, "=_s.pl1" is assumed. This control argument is meaningless with a separate semantics format source segment.

-table {X{.incl.pl1}}, -tb {X{.incl.pl1}}
   produces a table named X and appropriately named source files. X may not be an archive component pathname. The equals convention is applied to the entryname of X and the entryname (or component name in case of an archive component) of the source segment. The table is produced as a Multics object segment unless otherwise specified by the controls described below. If X is not given, "=_t" is assumed. This control argument implies the -terminals_hash_list, -terminals_list, -variables_list, -production_names and -synonyms control arguments.

-ada_sil
   produces the table as a DPS 6 Multics Host Resident System object file named X.object or a DPS 6 native object file named X.o, and produces a DPS 6 Ada/SIL package specification named X.spec.ada. X is the name supplied with the -table control argument less all suffixes.

-asm
   produces the table as a DPS 6 Multics Host Resident System object file named X.object or a DPS 6 native object file named X.o, and produces a DPS 6 Assembly Language include file named X.incl.nml. X is the name supplied with the -table control argument less all suffixes.
-c, -C
   produces the table as a DPS 6 Multics Host Resident System object file named X.object or a DPS 6 native object file named X.o, and produces a DPS 6 C Language header file named X.h. X is the name supplied with the -table control argument less all suffixes.

-dps6_format
   causes the DPS 6 object file produced because of the -asm, -ada_sil, or -C control argument to be generated in 'native' format. A native format DPS 6 object file may be transmitted to a DPS 6 running the Mod 400 Operating System via a network_request l6_ftf command specifying data_type binary. No intermediate format conversion step is required before transmitting native format files. Native format is the default format for DPS 6 object files. This control argument is meaningless if none of the control arguments -asm, -ada_sil, or -C are specified.

-gmap
   produces the table as a gmap segment named X.gmap and a GCOS III PL/I include file named X.incl.pl1. X is the name supplied with the -table control argument less all suffixes.

-hrs_format
   causes the DPS 6 object file produced because of the -asm, -ada_sil, or -C control argument to be generated in the Multics Host Resident System format. This is the format required by the various HRS tools. This control argument is meaningless if none of the control arguments -asm, -ada_sil, or -C are specified.

-no_ada_sil
   does not produce the table in the form described above for the -ada_sil control argument.

-no_asm
   does not produce the table in the form described above for the -asm control argument.

-no_c, -no_C, -noc, -noC
   does not produce the table in the form described above for the -c control argument.

-no_gmap
   does not produce the table in the form described above for the -gmap control argument.

-origin N, -org N
   specifies the lower bound, N, to be used with the arrays generated for DPS 6 format tables. N must be 0 or 1.
   The default is 0 if the -c control is present; otherwise it is 1. For the DPDA, the final state (state zero) is materialized when the origin is zero; otherwise it is a fictitious state. For the skip table, a dummy row zero is generated when the origin is zero. For the effect of this control on the terminals list structures, see the language specific discussions below.

Notes:

Options -alm, -gmap and one of -asm, -ada_sil, or -c may occur together. Options -asm, -ada_sil and -c are mutually exclusive.

If -alm, -gmap, -asm, -ada_sil or -c is in effect but the -table parameter is not, -table =_t is assumed.

6.2. The 'make_dpda' Command

This command requires additional control arguments.

-------------
make_dpda, md
-------------

SYNTAX:
   make_dpda result_file_path {table_path} {control_args}

FUNCTION:
   produces a table containing the DPDA extracted from the result file of a previous LALR generation. This table is the same as the one produced by the lalr command when it is invoked with the -table control argument.

ARGUMENTS:

result_file_path
   is the pathname of the result file from a previous LALR generation from which the DPDA is to be extracted. The grammar suffix is assumed if not supplied. This argument may be an archive component pathname.

table_path
   is the pathname of the table to be produced. If this argument is given with the suffix incl.pl1, the suffix is ignored. Any other suffix is retained as given. The default is "=_t".

CONTROL ARGUMENTS:

-ada_sil
   produces the table as a DPS 6 Multics Host Resident System object file named X.object or a DPS 6 native object file named X.o, and produces a DPS 6 Ada/SIL package specification file named X.spec.ada.

-asm
   produces the table as a DPS 6 Multics Host Resident System object file named X.object or a DPS 6 native object file named X.o, and produces a DPS 6 Assembly Language include file named X.incl.nml.
-c, -C
   produces the table as a DPS 6 Multics Host Resident System object file named X.object or a DPS 6 native object file named X.o, and produces a DPS 6 C Language header file named X.h.

-dps6_format
   causes the DPS 6 object file produced because of the -asm, -ada_sil, or -C control argument to be generated in 'native' format. A native format DPS 6 object file may be transmitted to a DPS 6 running the Mod 400 Operating System via a network_request l6_ftf command specifying data_type binary. No intermediate format conversion step is required before transmitting native format files. Native format is the default format for DPS 6 object files. This control argument is meaningless if none of the control arguments -asm, -ada_sil, or -C are specified.

-gmap
   produces the table as a gmap segment named X.gmap and a GCOS III PL/I include file named X.incl.pl1.

-hrs_format
   causes the DPS 6 object file produced because of the -asm, -ada_sil, or -C control argument to be generated in the Multics Host Resident System format. This is the format required by the various HRS tools. This control argument is meaningless if none of the control arguments -asm, -ada_sil, or -C are specified.

-no_ada_sil
   does not produce the table in the form described above for the -ada_sil control argument.

-no_asm
   does not produce the table in the form described above for the -asm control argument.

-no_c, -no_C, -noc, -noC
   does not produce the table in the form described above for the -c control argument.

-no_gmap
   does not produce the table in the form described above for the -gmap control argument.

-origin N, -org N
   specifies the lower bound, N, to be used with the arrays generated for DPS 6 format tables. N must be 0 or 1. The default is 0 if the -c control is present; otherwise it is 1. For the DPDA, the final state (state zero) is materialized when the origin is zero; otherwise it is a fictitious state.
   For the skip table, a dummy row zero is generated when the origin is zero. For the effect of this control on the terminals list structures, see the language specific discussions under the lalr command.

Notes:

As used above, X is the name given, or assumed, for the table.

Options -alm, -gmap and one of -asm, -ada_sil, or -c may occur together. Options -asm, -ada_sil, and -c are mutually exclusive.

If none of the control arguments -alm, -gmap, -asm, -ada_sil, or -c are present, the table is produced as a Multics object segment named X and a Multics PL/I include file named X.incl.pl1.

The -terminals_hash_list control argument is treated as if it were the -terminals_list control argument when producing a DPS 6 (Level 6) object file. The -synonyms control argument is meaningless when producing a DPS 6 object file with the -asm control argument. The -production_names and -variables_list control arguments are ignored when producing a DPS 6 object file.

The DPS 6 object file is produced in LAF mode.

6.3. The 'dps6_dpda' Command

This is an additional command.

------------------
dps6_dpda, l6_dpda
------------------

SYNTAX:
   dps6_dpda result_file_path {object_file_path} {control_args}

FUNCTION:
   produces a DPS 6 Multics Host Resident System object file or a DPS 6 native object file containing the DPDA extracted from the result file of a previous LALR generation. This object file is the same as the one produced by the lalr command when it is invoked with the -table control argument and either the -asm, -ada_sil, or -c control argument.

ARGUMENTS:

result_file_path
   is the pathname of the result file from a previous LALR generation from which the DPDA is to be extracted. If result_file_path does not have a suffix of grammar, one is assumed. However, the suffix grammar must be the last component of the name of the result segment to be used. This argument may be an archive component pathname.
object_file_path
   is the pathname of the object file to be produced. If object_file_path does not have a suffix of object, one is assumed. The default is "=.object".

CONTROL ARGUMENTS:

-ada_sil
   produces an Ada/SIL package specification describing the external variables defined in the object file. This package specification is stored in the same directory as the object file. Its entryname is obtained by changing the object suffix of the object file to spec.ada.

-asm
   produces a DPS 6 Assembly Language include file describing the external variables defined in the object file. This include file is stored in the same directory as the object file. Its entryname is obtained by changing the object suffix of the object file to incl.nml.

-c, -C
   produces a DPS 6 C Language header file describing the external variables and functions defined in the object file. This header file is stored in the same directory as the object file. Its entryname is obtained by changing the object suffix of the object file to h.

-dps6_format
   causes the DPS 6 object file to be generated in 'native' format. A native format DPS 6 object file may be transmitted to a DPS 6 running the Mod 400 Operating System via a network_request l6_ftf command specifying data_type binary. No intermediate format conversion step is required before transmitting native format files. Native format is the default format for DPS 6 object files.

-hrs_format
   causes the DPS 6 object file to be generated in the Multics Host Resident System format. This is the format required by the various HRS tools.

-no_ada_sil
   does not produce the table in the form described above for the -ada_sil control argument.

-no_asm
   does not produce the table in the form described above for the -asm control argument.

-no_c, -no_C, -noc, -noC
   does not produce the table in the form described above for the -c control argument.
-no_terminals_list, -ntl
   does not include the terminals list (TL and TC) in the table. (Default)

-origin N, -org N
   specifies the lower bound, N, to be used with the various arrays generated for parse tables. N must be 0 or 1. The default is 0 if the -c control is present; otherwise it is 1. For the DPDA, the final state (state zero) is materialized when the origin is zero; otherwise it is a fictitious state. For the skip table, a dummy row zero is generated when the origin is zero. For the effect of this control on the terminals list structures, see the language specific discussions under the lalr command.

-synonyms, -syn
   includes the terminal encoding as a field in the terminals list instead of using the index to the terminals list as the encoded value. This option is forced if the grammar contains a -synonyms control. The -synonyms control argument is meaningless unless the -terminals_list control argument is also specified.

-terminals_list, -tl
   includes the terminals list in the object file.

Notes:

The object file is produced in LAF mode.

The control arguments -asm, -ada_sil, and -c are mutually exclusive. If none are specified, -asm is assumed.

7. Parse Tables Produced

7.1. GMAP Source Segment Parse Tables

The gmap source segment produced by the -gmap control argument is equivalent to the data described by the following PL/I declarations. The generated include file X.incl.pl1 contains a copy of these declarations (unless the -alm control argument is also in effect).

When a separate semantics format source segment is used, the gmap source segment also contains a transfer vector with the external name SEMVEC. This vector is used by the parser to call the various semantic actions.
The rule number, or production number if the -production control is in effect, must be passed as the n-th argument, where n is the value specified by the -separate_semantics control argument, in the call to the transfer vector. Any additional arguments desired may be passed. The generated include file does not describe the transfer vector.

   dcl 1 THL (0:xx) bit (12) unaligned external static;

   dcl 1 TL (xx) external static,
         2 lk fixed bin (17) unaligned,
         2 pt fixed bin (17) unaligned,
         2 ln fixed bin (17) unaligned,
         2 cd fixed bin (17) unaligned;

   dcl TC char (xx) external static;

   dcl 1 DPDA (xx) external static,
         2 v1 fixed bin (17) unaligned,
         2 v2 fixed bin (17) unaligned;

   dcl 1 SKIP (xx) external static,
         2 v1 fixed bin (17) unaligned,
         2 v2 fixed bin (17) unaligned;

   dcl PN fixed bin (17) unaligned external static;

   dcl 1 VL (xx) external static,
         2 pt fixed bin (17) unaligned,
         2 ln fixed bin (17) unaligned;

   dcl VC char (xx) external static;

binary(THL(i), 12, 0) is the TL index of the first terminal symbol whose hash value is i. The function lalr_hash_ (contained in the include file lalr_hash_.incl.pl1), when invoked by lalr_hash_ (T, dim (THL, 1)), returns the hash value of the character string T. The THL structure is generated only when the -terminals_hash_list control is in effect.

The TL format shown above is generated when both the -terminals_hash_list and -terminals_list controls are in effect and synonyms have been defined. TL(i).lk is the TL index of the next terminal symbol having the same hash value as the i-th terminal symbol. substr (TC, TL(i).pt, TL(i).ln) is the normalized spelling of the i-th terminal symbol. Finally, TL(i).cd is the encoded value of the i-th terminal symbol.

If the -terminals_hash_list and -terminals_list controls are both in effect but no synonyms are defined, the following structure is generated for the terminals list instead of the one shown above.
When this structure is used, the encoded value of the i-th terminal symbol is i.

   dcl 1 TL external static,
         2 lk fixed bin (10) unaligned,
         2 pt fixed bin (13) unaligned,
         2 ln fixed bin (10) unaligned;

If the -terminals_hash_list control is not in effect but the -terminals_list control is in effect and synonyms are defined, the following structure is generated for the terminals list instead of one of those shown above.

   dcl 1 TL external static,
         2 pt fixed bin (13) unaligned,
         2 ln fixed bin (10) unaligned,
         2 cd fixed bin (10) unaligned;

If the -terminals_hash_list control is not in effect but the -terminals_list control is in effect and no synonyms are defined, the following structure is generated for the terminals list instead of any of those shown above.

   dcl 1 TL external static,
         2 pt fixed bin (17) unaligned,
         2 ln fixed bin (17) unaligned;

If the -terminals_hash_list control is not in effect, the THL structure is omitted. If neither the -terminals_hash_list nor the -terminals_list control is in effect, THL, TL, and TC are all omitted.

DPDA and SKIP are, respectively, the Deterministic Push Down Automaton implementing the parsing algorithm and its associated error recovery table. The DPDA and SKIP structures are always generated.

PN is the production names list. PN(i) is the negation of the VL index for the variable (non-terminal) naming the i-th production (or the i-th rule if the -rule_only control is in effect). If the -production_names control is not in effect, the PN structure is not generated.

VL is the variables list. substr (VC, VL(i).pt, VL(i).ln) is the normalized spelling of the i-th variable. If neither the -production_names control nor the -variables_list control is in effect, PN, VL, and VC are all omitted.

7.2.
DPS 6 Parse Tables Object Files The -terminals_hash_list control argument is treated as if it were the -terminals_list control argument when producing a DPS 6 object file. The -production_names and -variables_list control arguments are ignored when producing a DPS 6 object file. The DPS 6 object file is produced in LAF mode. In the following discussion of DPS 6 Parse Table formats, the symbols N, R, S, T, U and V are used as extent expressions. In the generated data and its declarations, they are replaced by the appropriate constants. N is the lower bound specified by the -origin control or implied by the language intended for use with the parse tables U and V are the upper bounds of the DPDA and skip recovery tables, respectively. R, S, and T are used as upper bounds of the various terminals list tables. If N is zero, the final state (state 0) is materialized in the DPDA; otherwise it is a fictitious state. A dummy row zero is generated in the skip table when N is zero. See the language specific discussions below for the effect of N on the terminals list tables. Multics Technical Bulletin MTB-710 The LALR System 7.2.1. DPS 6 Files for Assembly Language Use The DPS 6 object file produced by the -asm control argument is equivalent to the data described by the PL/I declarations below. When a separate semantics format source segment is used, the object file also contains a transfer vector with the external name SEMVEC. The rule number, or production number if the -production control is in effect, must be passed to the transfer vector by value in register R1. The transfer vector's code destroys registers R1 and B4; all other registers are unchanged. 

dcl OP1C_n fixed binary (15) internal static options (constant) initial (R);
dcl OP2C_n fixed binary (15) internal static options (constant) initial (S);
dcl RSWD_n fixed binary (15) internal static options (constant) initial (T);

dcl LIT_c fixed binary (15) internal static options (constant) initial (xx);
dcl INT_c fixed binary (15) internal static options (constant) initial (xx);
dcl LINT_c fixed binary (15) internal static options (constant) initial (xx);
dcl NUMB_c fixed binary (15) internal static options (constant) initial (xx);
dcl REAL_c fixed binary (15) internal static options (constant) initial (xx);
dcl SYMB_c fixed binary (15) internal static options (constant) initial (xx);
dcl EOL_c fixed binary (15) internal static options (constant) initial (xx);
dcl HEXI_c fixed binary (15) internal static options (constant) initial (xx);
dcl BIT_c fixed binary (15) internal static options (constant) initial (xx);
dcl NIL_c fixed binary (15) internal static options (constant) initial (xx);

dcl OP1C_s (N:R) char (1) external static initial ("x", "x", ... );
dcl OP2C_s (N:S) char (2) external static initial ("xx", "xx", ... );

dcl 1 RSWD (N:T) aligned external static,
      2 RSWD_s char (xx) initial ("xx", "xx", ... ),
      2 RSWD_c fixed bin (15) initial (xx, xx, ... );

dcl DPDA_n fixed binary (15) internal static options (constant) initial (U);
dcl SKIP_n fixed binary (15) internal static options (constant) initial (V);

dcl 1 DPDA (N:U) external static,
      2 v1 fixed binary (15) initial (xx, xx, ... ),
      2 v2 fixed binary (15) initial (xx, xx, ... );

dcl 1 SKIP (N:V) external static,
      2 v1 fixed binary (15) initial (xx, xx, ... ),
      2 v2 fixed binary (15) initial (xx, xx, ... );

The data with internal static options (constant) attributes are generated as "external value definitions" in the DPS 6 object file. The data with external static attributes are generated as "code section" constants with "external location definitions".
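
The extent N:U (or N:V) in the DPDA and SKIP declarations means that row i is stored i - N rows from the start of the table. The following is a minimal sketch of that indexing in C, not part of LALR. It assumes that fixed binary (15) maps onto a 16-bit signed integer and that each row is stored as an adjacent (v1, v2) pair; the names lalr_row, lalr_row_at and sample_table are hypothetical, and the sample values are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a two-column table such as DPDA or SKIP.
   Assumption: fixed binary (15) occupies a 16-bit signed integer. */
struct lalr_row {
    int16_t v1;
    int16_t v2;
};

/* Hypothetical helper: return a pointer to row i of a table whose first
   stored row has index `origin` (N) and whose last row has index `last`
   (U or V).  Returns NULL when i lies outside origin..last. */
static const struct lalr_row *
lalr_row_at(const struct lalr_row *table, int origin, int last, int i)
{
    if (i < origin || i > last)
        return NULL;
    return &table[i - origin];
}

/* Made-up sample data: a three-row table declared with bounds (1:3). */
static const struct lalr_row sample_table[] = {
    { 10, 11 },   /* row 1 */
    { 20, 21 },   /* row 2 */
    { 30, 31 },   /* row 3 */
};
```

With an origin of 1, lalr_row_at (sample_table, 1, 3, 0) returns NULL, which mirrors the case where state 0 is fictitious and has no stored row.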

OP1C_n and OP1C_s are the index of the last one character operator (e.g. +) and the one character operators themselves, respectively. OP2C_n and OP2C_s are the index of the last two character operator (e.g. >=) and the two character operators themselves, respectively.

LIT_c is the code for the nonnumeric literal complicated terminal. This terminal may be specified as <character string>, <string>, <quoted string>, or <nonnumeric literal>.

INT_c is the code for the integer literal complicated terminal. This terminal may be specified as <integer>.

LINT_c is the code for the long integer complicated terminal. This terminal may be specified as <long integer>.

NUMB_c is the code for the fixed-point literal complicated terminal. This terminal may be specified as <number> or <fixed-point literal>.

REAL_c is the code for the floating-point literal complicated terminal. This terminal may be specified as <real> or <floating-point literal>.

SYMB_c is the code for the identifier complicated terminal. This terminal may be specified as <identifier> or <symbol>.

EOL_c is the code for the end of line complicated terminal. This terminal may be specified as <eol>, <end of line>, <nl>, or <newline>.

HEXI_c is the code for the hexadecimal integer literal complicated terminal. This terminal may be specified as <hexadecimal integer> or <hex integer>.

BIT_c is the code for the bit string literal complicated terminal. This terminal may be specified as <bit string> or <boolean aggregate>.

NIL_c is the code for the nil symbol terminal. This terminal may be specified as <nil> or <syntax error>.

For any of the above mentioned complicated terminals not used in the grammar, a code of zero is used.

If a complicated terminal not listed above is encountered, an external value definition is generated for it. The symbol so defined is obtained by removing the enclosing angle brackets from the complicated terminal.
If the resultant symbol is fewer than five characters in length, it is further modified by appending "_c".

RSWD_n, RSWD_k, and RSWD are the index of the last reserved word, the length of each reserved word, and the reserved words themselves, respectively. All terminal symbols which are not complicated terminals and are not one or two character operators as defined above are considered reserved words. In RSWD (i), RSWD_s is the i-th reserved word padded with spaces and RSWD_c is the encoding for that reserved word.

DPDA_n and DPDA are the index of the last DPDA entry and the DPDA itself, respectively. SKIP_n and SKIP are the index of the last SKIP table entry and the skip tables themselves, respectively.

If the -terminals_list control is not in effect, only the declarations of DPDA_n, SKIP_n, DPDA and SKIP are generated.

The text of a generated assembly language include file is shown below.

*
*  SCANNER AND PARSER TABLES FROM SEGMENT
*     >user_dir_dir>SLANG>LANGUAGE>adasil_rel_0.grammar
*  Generated by:   Lo.SLANG.a using LALR 7.3
*                  of Tuesday, December 6, 1983
*  Generated at:   TCO 68/80 Multics Billerica, Ma.
*  Generated on:   12/14/83 1453.2 est Wed
*  Generated from: >udd>slang>LANGUAGE>adasil_rel_0.lrk
*                  >udd>slang>Lo>ada_decl_part.incl.lrk
*                  >udd>slang>include>ada_statements.incl.lrk
*
         xval  OP1C_n    Index of last one character operator
         xval  OP2C_n    Index of last two character operator
         xval  RSWD_n    Index of last reserved word
         xval  RSWD_k    Length of longest reserved word
*
         xval  LIT_c     Code for nonnumeric literal
         xval  INT_c     Code for integer literal
         xval  LINT_c    No complicated terminal for long integer literal
         xval  NUMB_c    No complicated terminal for fixed-point literal
         xval  REAL_c    No complicated terminal for floating-point literal
         xval  SYMB_c    Code for identifier
         xval  EOL_c     No complicated terminal for end-of-line terminal
         xval  HEXI_c    No complicated terminal for hexadecimal literal
         xval  BIT_c     Code for bit string literal
         xval  NIL_c     Code for nil (syntax error)
         xval  EE_c      No complicated terminal for example element
*
         xval  DPDA_n    Index of last DPDA row
         xval  SKIP_n    Index of last SKIP row
*
         xloc  OP1C_s    The one character operators (2 per word)
         xloc  OP1C_c    The corresponding codes (1 per word)
         xloc  OP2C_s    The two character operators (1 per word)
         xloc  OP2C_c    The corresponding codes (1 per word)
         xloc  RSWD      The reserved word table
*
         xloc  DPDA      The DPDA table
         xloc  SKIP      The SKIP table
*

7.2.2. DPS 6 Files for Ada/SIL Use

The DPS 6 file produced by the -ada_sil control argument is equivalent to the data described by the PL/I declarations below. When a separate semantics format source segment is used, the object file also contains a transfer vector with the external name SEMVEC. The rule number, or production number if the -production control is in effect, must be passed to the transfer vector by value in register R1. The transfer vector's code destroys registers R1 and B4; all other registers are unchanged.

dcl 1 Terminal aligned based,
      2 position fixed binary (15),
      2 length fixed binary (15),
      2 code fixed binary (15);

dcl 1 T_List (N:R) aligned like Terminal external static;
dcl T_Char char (S) external static init ("xxx ... ");

dcl DPDAv1 (N:U) fixed binary (15) external static initial (xx, xx, ... );
dcl DPDAv2 (N:U) fixed binary (15) external static initial (xx, xx, ... );
dcl SKIPv1 (N:V) fixed binary (15) external static initial (xx, xx, ... );
dcl SKIPv2 (N:V) fixed binary (15) external static initial (xx, xx, ... );

All of the above external static variables are generated as "code section" constants to allow them to be shared constants. Because of this, this object file must be linked (with a LINKN linker directive) before the object file for any Ada/SIL compilation unit using the generated package specification.

As used in the above declarations, R is the index of the last terminal (including complicated terminals) and S is the length of the T_Char variable. The based variable Terminal describes a single entry in the terminal list array T_List. The i-th terminal is substr (T_Char, T_List.position (i), T_List.length (i)). If the grammar uses synonyms, T_List.code (i) gives the code for the i-th terminal. Otherwise, the code component is omitted from the Terminal structure and the code for the i-th terminal is i. In this case, if N is zero, a dummy row zero with T_List.position (0) = 1 and T_List.length (0) = 0 is generated.

U and V specify the index of the last entry in the DPDA and SKIP tables, respectively. DPDAv1 and DPDAv2 are the two columns of the DPDA. Similarly, SKIPv1 and SKIPv2 are the two columns of the SKIP tables.

If the -terminals_list control is not in effect, Terminal, T_List, and T_Char are not generated.

The text of a generated Ada/SIL package specification is shown below.

-- SCANNER AND PARSER TABLES FROM SEGMENT
--    >user_dir_dir>SLANG>LANGUAGE>adasil_rel_0.grammar
-- Generated by:   Lo.SLANG.a using LALR 7.3
--                 of Tuesday, December 6, 1983
-- Generated at:   TCO 68/80 Multics Billerica, Ma.
-- Generated on:   12/14/83 1453.2 est Wed
-- Generated from: >udd>slang>LANGUAGE>adasil_rel_0.lrk
--                 >udd>slang>Lo>ada_decl_part.incl.lrk
--                 >udd>slang>include>ada_statements.incl.lrk

package adasil_rel_0_t is

   subtype TL_index is Integer range 1..125;
   subtype TC_index is Integer range 1..838;
   subtype DPDA_index is Integer range 1..3591;
   subtype SKIP_index is Integer range 1..77;

   type Terminal is
      record
         position: TC_index;   -- index into T_Char.
         length: Positive;     -- length of terminal.
         code: Integer;        -- code for terminal.
      end record;

   T_list: array (TL_index) of Terminal;
   T_Char: string (TC_index);

   DPDAv1: array (DPDA_index) of Integer;
   DPDAv2: array (DPDA_index) of Integer;
   SKIPv1: array (SKIP_index) of Integer;
   SKIPv2: array (SKIP_index) of Integer;

end adasil_rel_0_t;

7.2.3. DPS 6 Files for C Use

The DPS 6 object file produced by the -c control argument is equivalent to the data described by the PL/I declarations below. When a separate semantics format source segment is used, the object file also contains a transfer vector with the external name SEMVEC. The rule number, or production number if the -production control is in effect, must be passed as the first argument in the call to the transfer vector. The transfer vector assumes B4 is the argument list pointer. It destroys B1 and R7; all other registers are unchanged.

dcl gtoptb entry returns (pointer);
dcl gtrwtb entry returns (pointer);

dcl opmc (N:R) unaligned unsigned fixed bin (8);
dcl rswd_t (N:S) fixed bin (15);
dcl rswd_s (N:T) unaligned unsigned fixed bin (8);

dcl gtdpda entry returns (pointer);
dcl gtskip entry returns (pointer);

dcl dpda (N:U, N:N+1) fixed bin (15);
dcl skip (N:V, N:N+1) fixed bin (15);

All of the above external static variables are generated as "code section" constants to allow them to be shared constants. The external functions gtoptb, gtrwtb, gtdpda and gtskip return pointers to opmc, rswd_t, dpda and skip respectively.

opmc defines all of the terminal symbols that do not consist entirely of letters, digits, dollar signs and underscores. These terminal symbols are ordered by decreasing length. Each is stored as a byte containing the symbol's encoded value followed by a NUL terminated string giving the symbol's spelling. This list of terminal symbols is terminated by a byte containing the value -1.

rswd_t and rswd_s define the remaining terminal symbols (excluding the complicated terminal symbols). rswd_t (N) contains hbound (rswd_t, 1); i.e. the index of the last symbol defined. rswd_t (i), for N < i <= rswd_t (N), gives the index into rswd_s to the definition of a terminal symbol. The entries in rswd_t are ordered so as to permit a binary search. Each symbol defined in rswd_s is stored as a byte containing the symbol's encoded value followed by a NUL terminated string giving the symbol's spelling.

If a terminal symbol which would normally be defined by rswd_t and rswd_s is found to be the same as an initial substring of a terminal defined by opmc, it is placed in opmc instead of the rswd arrays.

If the -terminals_list control is not in effect, only the declarations of gtdpda, gtskip, dpda, and skip are generated.
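
The two terminal lookups implied by these layouts can be sketched in C as follows. This is an illustration only, not the scanner supplied with LALR. It assumes an origin N of zero, that the terminating -1 appears as the byte value 0xFF, that rswd_t holds byte offsets into rswd_s, and that the rswd_t entries are sorted in ascending strcmp order (the text above says only that the order permits a binary search). The names opmc_match and rswd_lookup and all sample data are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Scan an opmc-style list for the first entry whose spelling is a prefix
   of `input`.  Entries are ordered by decreasing length, so the first match
   is also the longest.  Returns the encoded value and stores the matched
   length in *len, or returns -1 if no entry matches. */
static int opmc_match(const unsigned char *opmc, const char *input, size_t *len)
{
    const unsigned char *p = opmc;
    while (*p != 0xFF) {                    /* 0xFF: assumed form of -1 */
        int code = *p++;                    /* encoded value            */
        const char *spelling = (const char *)p;
        size_t n = strlen(spelling);
        if (strncmp(input, spelling, n) == 0) {
            *len = n;
            return code;
        }
        p += n + 1;                         /* skip spelling and its NUL */
    }
    return -1;
}

/* Binary search of rswd_t/rswd_s for `word` (assumes N = 0 and ascending
   strcmp order).  Returns the encoded value, or -1 if the word is absent. */
static int rswd_lookup(const short *rswd_t, const unsigned char *rswd_s,
                       const char *word)
{
    int lo = 1, hi = rswd_t[0];             /* rswd_t[N] holds the last index */
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        const unsigned char *entry = &rswd_s[rswd_t[mid]];
        int cmp = strcmp(word, (const char *)(entry + 1));
        if (cmp == 0)
            return entry[0];                /* byte before the spelling */
        if (cmp < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    return -1;
}

/* Made-up sample tables in the layouts described above. */
static const unsigned char sample_opmc[] = {
    7, '>', '=', 0,
    5, '>', 0,
    3, '+', 0,
    0xFF
};
static const unsigned char sample_rswd_s[] = {
    21, 'b', 'e', 'g', 'i', 'n', 0,         /* offset 0  */
    22, 'e', 'n', 'd', 0,                   /* offset 7  */
    23, 'i', 'f', 0                         /* offset 12 */
};
static const short sample_rswd_t[] = { 3, 0, 7, 12 };
```

Because ">=" precedes ">" in the sample opmc list, input beginning with ">=" yields the two character operator rather than stopping at ">".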

If the -terminals_list control is in effect, a series of #define preprocessor statements is also generated to name the encoded value of the various complicated terminals. The names are chosen as described above for the DPS 6 Assembly Language format parse tables.

The text of a generated C header file is shown below.

/* SCANNER AND PARSER TABLES FROM SEGMENT
      >user_dir_dir>SLANG>LANGUAGE>adasil_rel_0.grammar
   Generated by:   Lo.SLANG.a using LALR 7.3
                   of Tuesday, December 6, 1983
   Generated at:   TCO 68/80 Multics Billerica, Ma.
   Generated on:   12/14/83 1453.2 est Wed
   Generated from: >udd>slang>LANGUAGE>adasil_rel_0.lrk
                   >udd>slang>Lo>ada_decl_part.incl.lrk
                   >udd>slang>include>ada_statements.incl.lrk
*/

char (*gtoptb ()) [];
int (*gtrwtb ()) [];

#define OPmC_n 41 /* Index of last m character operator */
#define RSWD_n 57 /* Index of last reserved word */

#define LIT_c 107 /* Code for nonnumeric literal */
#define INT_c 108 /* Code for integer literal */
#define LINT_c 0 /* No complicated terminal for long integer literal */
#define NUMB_c 0 /* No complicated terminal for fixed-point literal */
#define REAL_c 0 /* No complicated terminal for floating-point literal */
#define SYMB_c 88 /* Code for identifier */
#define EOL_c 0 /* No complicated terminal for end-of-line terminal */
#define HEXI_c 0 /* No complicated terminal for hexadecimal literal */
#define BIT_c 105 /* Code for bit string literal */
#define NIL_c 103 /* Code for nil (syntax error) */
#define EE_c 0 /* No complicated terminal for example element */

#define DPDA_n 3591 /* Index of last DPDA row */
#define SKIP_n 77 /* Index of last SKIP row */

int (*gtdpda ()) [];
int (*gtskip ()) [];
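
The naming rule for complicated terminals that are not in the predefined list (remove the enclosing angle brackets, then append "_c" if fewer than five characters remain) can be sketched in C as follows. This is an illustration under assumptions, not LALR's own code: it performs no case conversion and leaves embedded spaces untouched, since the rule as stated does not address either point. The function name terminal_symbol and the sample terminals <tag> and <label> are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: derive the defined symbol for an unlisted
   complicated terminal such as "<tag>".  Returns 0 on success, or -1 if
   the input is not enclosed in angle brackets or the result does not fit
   in `out`. */
static int terminal_symbol(const char *terminal, char *out, size_t outsz)
{
    size_t len = strlen(terminal);
    if (len < 3 || terminal[0] != '<' || terminal[len - 1] != '>')
        return -1;

    size_t inner = len - 2;                 /* length without the brackets */
    const char *suffix = (inner < 5) ? "_c" : "";

    /* Copy the bracketed text, then the suffix if one applies. */
    int n = snprintf(out, outsz, "%.*s%s", (int)inner, terminal + 1, suffix);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Under these assumptions, "<tag>" becomes "tag_c" because three characters remain after the brackets are removed, while "<label>" becomes "label" with no suffix because five characters remain.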