RSG(1)                    Icon Program Library                    RSG(1)

NAME
     rsg - generate random sentences

SYNOPSIS
     rsg [-s n] [-l n] [-t]

DESCRIPTION
     Rsg generates randomly selected sentences from a grammar
     specified by the user.

     The following options may appear in any order:

     -s n   Set the seed for random generation to n. The default
            seed is 0.

     -l n   Terminate generation if the number of symbols remaining
            to be processed exceeds n. There is no default limit.

     -t     Trace the generation of sentences. Trace output goes to
            standard error output.

     Rsg works interactively, allowing the user to build, test,
     modify, and save grammars. Input to rsg consists of various
     kinds of specifications, which can be intermixed:

     Productions define nonterminal symbols in a syntax similar to
     the rewriting rules of BNF with various alternatives consisting
     of the concatenation of nonterminal and terminal symbols.

     Generation specifications cause the generation of a specified
     number of sentences from the language defined by a given
     nonterminal symbol.

     Grammar output specifications cause the definition of a
     specified nonterminal or the entire current grammar to be
     written to a given file.

     Source specifications cause subsequent input to be read from a
     specified file.

     In addition, any line beginning with # is considered to be a
     comment, while any line beginning with = causes the rest of
     that line to be used as a prompt to the user whenever rsg is
     ready for input (there normally is no prompt). A line
     consisting of a single = stops prompting.

   Productions
     Examples of productions are:

          <expr>::=<term>|<term>+<expr>
          <term>::=<elem>|<elem>*<term>
          <elem>::=x|y|z|(<expr>)

     Productions may occur in any order.
     The definition for a nonterminal symbol can be changed by
     specifying a new production for it.

     There are a number of special devices to facilitate the
     definition of grammars, including eight predefined, built-in
     nonterminal symbols:

          symbol      definition

          <lb>        <
          <rb>        >
          <vb>        |
          <nl>        newline
          <>          empty string
          <&lcase>    any single lowercase letter
          <&ucase>    any single uppercase letter
          <&digit>    any single digit

     In addition, if the string between a < and > begins and ends
     with a single quotation mark, that construction stands for any
     single character between the quotation marks. For example,

          <'xyz'>

     is equivalent to

          x|y|z

     Finally, if the name of a nonterminal symbol between the < and
     the > begins with ?, the user is queried during generation to
     supply a string for that nonterminal symbol. For example, in

          <expr>::=<term>|<term>+<expr>|<?expr>

     if the third alternative is encountered during generation, the
     user is asked to provide a string for <?expr>.

   Generation Specifications
     A generation specification consists of a nonterminal symbol
     followed by a nonnegative integer. An example is

          <expr>10

     which specifies the generation of 10 <expr>s. If the integer is
     omitted, it is assumed to be 1. Generated sentences are written
     to standard output.

   Grammar Output Specifications
     A grammar output specification consists of a nonterminal
     symbol, followed by ->, followed by a file name. Such a
     specification causes the current definition of the nonterminal
     symbol to be written to the given file. If the file is omitted,
     standard output is assumed. If the nonterminal symbol is
     omitted, the entire grammar is written out. Thus,

          ->

     causes the entire grammar to be written to standard output.
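
     The production and generation rules described above can be
     sketched in a few lines of Python. This is only an
     illustration, not the Icon source of rsg: the function names
     parse_production and generate are hypothetical, the nonterminal
     names are chosen for illustration, uniform random choice among
     alternatives is assumed, and the built-in symbols, quoted
     character sets, and ? queries are left out. The limit parameter
     plays the role of the -l option.

```python
import random
import re

# A token is either a nonterminal such as <expr> or one literal character.
TOKEN = re.compile(r"<[^<>]*>|.")


def parse_production(line):
    """Split 'head::=alt|alt|...' into (head, list of token lists)."""
    head, body = line.split("::=", 1)
    return head, [TOKEN.findall(alt) for alt in body.split("|")]


def generate(grammar, goal, rng, limit=None):
    """Expand goal leftmost-first, picking each alternative uniformly."""
    pending = [goal]  # symbols remaining to be processed
    out = []          # terminal characters produced so far
    while pending:
        if limit is not None and len(pending) > limit:
            # Assumed analogue of -l: stop and report the partial sentence.
            raise RuntimeError("limit exceeded after: " + "".join(out))
        sym = pending.pop(0)
        if sym.startswith("<") and sym.endswith(">"):
            if sym not in grammar:
                raise KeyError("undefined " + sym + " after: " + "".join(out))
            # Replace the nonterminal by one randomly chosen alternative.
            pending[:0] = rng.choice(grammar[sym])
        else:
            out.append(sym)
    return "".join(out)


# A later production for the same head would replace the earlier one.
GRAMMAR = dict(parse_production(p) for p in [
    "<expr>::=<term>|<term>+<expr>",
    "<term>::=<elem>|<elem>*<term>",
    "<elem>::=x|y|z|(<expr>)",
])

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(10):  # comparable to the specification <expr>10
        try:
            print(generate(GRAMMAR, "<expr>", rng, limit=50))
        except RuntimeError as err:
            print(err)
```

     Python's random number generator is not the one used by Icon,
     so seeding it with 0 will not reproduce the sentences that rsg
     itself prints.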

   Source Specifications
     A source specification consists of @ followed by a file name.
     Subsequent input is read from that file. When an end of file is
     encountered, input reverts to the previous file. Input files
     can be nested.

DIAGNOSTICS
     Syntactically erroneous input lines are noted, but ignored.

     Specifications for a file that cannot be opened are noted and
     treated as erroneous.

     If an undefined nonterminal symbol is encountered during
     generation, an error message that identifies the undefined
     symbol is produced, followed by the partial sentence generated
     to that point. Exceeding the limit of symbols remaining to be
     generated as specified by the -l option is handled similarly.

CAVEATS
     Generation may fail to terminate because of a loop in the
     rewriting rules or, more seriously, because of the progressive
     accumulation of nonterminal symbols. The latter problem can be
     identified by using the -t option and controlled by using the
     -l option. The problem often can be circumvented by duplicating
     alternatives that lead to fewer rather than more nonterminal
     symbols. For example, changing

          <expr>::=<term>|<term>+<expr>

     to

          <expr>::=<term>|<term>|<term>+<expr>

     increases the probability of selecting <term> from 1/2 to 2/3.
     See the second reference listed below for a discussion of the
     general problem.

SEE ALSO
     Griswold, Ralph E. and Madge T. Griswold. The Icon Programming
     Language, Prentice-Hall, Inc., Englewood Cliffs, New Jersey,
     1983. pp. 211-219, 301-302.

     Wetherell, C. S. ``Probabilistic Languages: A Review and Some
     Open Questions'', Computing Surveys, Vol. 12, No. 4 (1980),
     pp. 361-379.

AUTHOR
     Ralph E. Griswold

Version 5.9          The University of Arizona - 5/16/83