.SH Section 6C: The C Language Yacc Environment .PP The default mode of operation in Yacc is to write actions and the lexical analyzer in C. This has a number of advantages; primarily, it is easier to write character handling routines, such as the lexical analyzer, in a language which supports character-by-character I/O, and has shifting and masking operators. .PP When the user inputs a specification to Yacc, the output is a file of C programs, called ``y.tab.c''. These are then compiled, and loaded with a library; the library has default versions of a number of useful routines. This section discusses these routines, and how the user can write his own routines if desired. The name of the Yacc library is system dependent; see Appendix B. .PP The subroutine produced by Yacc is called ``yyparse''; it is an integer valued function. When it is called, it in turn repeatedly calls ``yylex'', the lexical analyzer supplied by the user (see Section 3), to obtain input tokens. Eventually, either an error is detected, in which case (if no error recovery is possible) yyparse returns the value 1, or the lexical analyzer returns the endmarker token (type number 0), and the parser accepts. In this case, yyparse returns the value 0. .PP Three of the routines on the Yacc library are concerned with the ``external'' environment of yyparse. There is a default ``main'' program, a default ``initialization'' routine, and a default ``accept'' routine, respectively. They are so simple that they will be given here in their entirety: .DS main( argc, argv ) int argc; char *argv[ ] { yyinit( argc, argv ); if( yyparse( ) ) return; yyaccpt( ); } yyinit( ) { } yyaccpt( ) { } .DE By supplying his own versions of yyinit and/or yyaccpt, the user can get control either before the parser is called (to set options, open input files, etc.) or after the accept action has been done (to close files, call the next pass of the compiler, etc.). Note that yyinit is called with the two ``command line'' arguments which have been passed into the main program. If neither of these routines is redefined, the default situation simply looks like a call to the parser, followed by the termination of the program. Of course, in many cases the user will wish to supply his own main program; for example, this is necessary if the parser is to be called more than once. .PP The other major routine on the library is called ``yyerror''; its main purpose is to write out a message when a syntax error is detected. It has a number of hooks and handles which attempt to make this error message general and easy to understand. This routine is somewhat more complex, but still approachable: .DS extern int yyline; /* input line number */ yyerror(s) char *s; { extern int yychar; extern char *yysterm[ ]; printf("\en%s", s ); if( yyline ) printf(", line %d,", yyline ); printf(" on input: "); if( yychar >= 0400 ) printf("%s\en", yysterm[yychar\-0400] ); else switch ( yychar ) { case \'\et\': printf( "\e\et\en" ); return; case \'\en\': printf( "\e\en\en" ); return; case \'\e0\': printf( "$end\en" ); return; default: printf( "%c\en" , yychar ); return; } } .DE The argument to yyerror is a string containing an error message; most usually, it is ``syntax error''. yyerror also uses the external variables yyline, yychar, and yysterm. yyline is a line number which, if set by the user to a nonzero number, will be printed out as part of the error message. yychar is a variable which contains the type number of the current token. yysterm has the names, supplied by the user, for all the tokens which have names. Thus, the routine spends most of its time trying to print out a reasonable name for the input token. The biggest problem with the routine as given is that, on Unix, the error message does not go out on the error file (file 2). This is hard to arrange in such a way that it works with both the portable I/O library and the system I/O library; if a way can be worked out, the routine will be changed to do this. .ul Beware: This routine will not work if any token names have been given redefined type numbers. In this case, the user must supply his own yyerror routine. Hopefully, this ``feature'' will disappear soon. .PP Finally, there is another feature which the C user of Yacc might wish to use. The integer variable yydebug is normally set to 0. If it is set to 1, the parser will output a verbose description of its actions, including a discussion of which input symbols have been read, and what the parser actions are. Depending on the operating environment, it may be possible to set this variable by using a debugging system.