.if \n(xx .bp
.if !\n(xx \{\
.so tmac.p \}
.ND
.nr H1 0
.af H1 A
.NH
Appendix to Wirth's Pascal Report
.PP
This section is an appendix to the definition of the Pascal language
in Niklaus Wirth's
.I "Pascal Report"
and, with that Report, precisely defines the Berkeley implementation.
This appendix includes a summary of extensions to the language,
gives the ways in which the undefined specifications were resolved,
gives limitations and restrictions of the current implementation,
and lists the added functions and procedures available.
It concludes with a list of differences from the commonly available
Pascal 6000\-3.4 implementation,
and some comments on standard and portable Pascal.
.NH 2
Extensions to the language Pascal
.PP
This section defines non-standard language constructs available in
.UP .
The
.B s
standard Pascal option of the translator
.PI
can be used to detect these extensions in programs which are to be transported.
.SH
String padding
.PP
.UP
will pad constant strings with blanks in expressions and as value
parameters to make them as long as is required.
The following is a legal
.UP
program:
.LS
\*bprogram\fP x(output);
\*bvar\fP z : \*bpacked\fP \*barray\fP [ 1 .. 13 ] \*bof\fP char;
\*bbegin\fP
    z := 'red';
    writeln(z)
\*bend\fP.
.LE
The padded blanks are added on the right.
Thus the assignment above is equivalent to:
.LS
z := 'red          '
.LE
which is standard Pascal.
.SH
Octal constants, octal and hexadecimal write
.PP
Octal constants may be given as a sequence of octal digits followed
by the character `b' or `B'.
The forms
.LS
write(a:n \*boct\fP)
.LE
and
.LS
write(a:n \*bhex\fP)
.LE
cause the internal representation of expression
.I a,
which must be Boolean, character, integer, pointer, or a user-defined
enumerated type, to be written in octal or hexadecimal respectively.
.SH
Assert statement
.PP
An
.B assert
statement causes a
.I Boolean
expression to be evaluated each time the statement is executed.
A runtime error results if any of the expressions evaluates to
.I false .
The
.B assert
statement is treated as a comment if run-time tests are disabled.
The syntax for
.B assert
is:
.LS
\*bassert\fP <expr>
.LE
.br
.ne 8
.NH 2
Resolution of the undefined specifications
.SH
File name \- file variable associations
.PP
Each Pascal file variable is associated with a named
.UX
file.
Except for
.I input
and
.I output,
which are exceptions to some of the rules, a name can become associated
with a file in any of three ways:
.IP "\ \ \ \ \ 1)" 10
If a global Pascal file variable appears in the
.B program
statement then it is associated with the
.UX
file of the same name.
.IP "\ \ \ \ \ 2)"
If a file was reset or rewritten using the extended two-argument form of
.I reset
or
.I rewrite
then the given name is associated.
.IP "\ \ \ \ \ 3)"
If a file which has never had a
.UX
name associated is reset or rewritten without specifying a name
via the second argument, then a temporary name of the form `tmp.x'
is associated with the file.
Temporary names start with `tmp.1' and continue by incrementing
the last character in the
.SM USASCII
.NL
ordering.
Temporary files are removed automatically when their scope is exited.
.SH
The program statement
.PP
The syntax of the
.B program
statement is:
.LS
\*bprogram\fP <id> ( <file id> { , <file id> } ) ;
.LE
The file identifiers (other than
.I input
and
.I output )
must be declared as variables of
.B file
type in the global declaration part.
.SH
The files input and output
.PP
The formal parameters
.I input
and
.I output
are associated with the
.UX
standard input and output and have a somewhat special status.
The following rules must be noted:
.IP "\ \ \ \ \ 1)" 10
The program heading
.B must
contain the formal parameter
.I output.
If
.I input
is used, explicitly or implicitly, then it must also be declared here.
.IP "\ \ \ \ \ 2)"
Unlike all other files, the Pascal files
.I input
and
.I output
must not be defined in a declaration,
as they are automatically declared as:
.LS
\*bvar\fP input, output: text
.LE
.IP "\ \ \ \ \ 3)"
The procedure
.I reset
may be used on
.I input.
If no
.UX
file name has ever been associated with
.I input,
and no file name is given, then an attempt will be made to `rewind'
.I input.
If this fails, a runtime error will occur.
.I Rewrite
calls to output act as for any other file, except that
.I output
initially has no associated file.
This means that a simple
.LS
rewrite(output)
.LE
associates a temporary name with
.I output.
.SH
Details for files
.PP
If a file other than
.I input
is to be read, then reading must be initiated by a call to the procedure
.I reset
which causes the Pascal system to attempt to open the associated
.UX
file for reading.
If this fails, then a runtime error occurs.
Writing of a file other than
.I output
must be initiated by a
.I rewrite
call, which causes the Pascal system to create the associated
.UX
file and to then open the file for writing only.
.SH
Buffering
.PP
The buffering for
.I output
is determined by the value of the
.B b
option at the end of the
.B program
statement.
If it has its default value 1, then
.I output
is buffered in blocks of up to 512 characters,
flushed whenever a writeln occurs and at each reference to the file
.I input.
If it has the value 0,
.I output
is unbuffered.
Any value of 2 or more gives block buffering without line or
.I input
reference flushing.
All other output files are always buffered in blocks of 512 characters.
All output buffers are flushed when the files are closed at scope exit,
whenever the procedure
.I message
is called, and can be flushed using the built-in procedure
.I flush.
.PP
An important point for an interactive implementation is the definition
of `input\(ua'.
If
.I input
is a teletype, and the Pascal system reads a character at the beginning
of execution to define `input\(ua', then no prompt could be printed
by the program before the user is required to type some input.
For this reason, `input\(ua' is not defined by the system until its
definition is needed, reading from a file occurring only when necessary.
.SH
The character set
.PP
Seven bit
.SM USASCII
is the character set used on
.UX .
The standard Pascal symbols `and', `or', `not', `<=', `>=', `<>',
and the uparrow `\(ua' (for pointer qualification) are recognized.\*(dg
.FS
\*(dgOn many terminals and printers, the up arrow is represented
as a circumflex `^'.
These are not distinct characters, but rather different graphic
representations of the same internal codes.
.FE
Less portable are the synonyms tilde `~' for
.B not ,
`&' for
.B and ,
and `|' for
.B or .
.PP
Upper and lower case are considered distinct.
Keywords and built-in
.B procedure
and
.B function
names are composed of all lower case letters.
Thus the identifiers GOTO and GOto are distinct both from each other
and from the keyword \*bgoto\fP.
The standard type `boolean' is also available as `Boolean'.
.PP
Character strings and constants may be delimited by the character `\''
or by the character `#';
the latter is sometimes convenient when programs are to be transported.
Note that the `#' character has special meaning
.up
when it is the first character on a line \- see
.I "Multi-file programs"
below.
.SH
The standard types
.PP
The standard type
.I integer
is conceptually defined as
.LS
\*btype\fP integer = minint .. maxint;
.LE
.I Integer
is implemented with 32 bit two's complement arithmetic.
Predefined constants of type
.I integer
are:
.LS
\*bconst\fP maxint = 2147483647; minint = -2147483648;
.LE
.PP
The standard type
.I char
is conceptually defined as
.LS
\*btype\fP char = minchar ..
maxchar;
.LE
Built-in character constants are `minchar' and `maxchar', `bell' and `tab';
ord(minchar) = 0, ord(maxchar) = 127.
.PP
The type
.I real
is implemented using 64 bit floating point arithmetic.
The floating point arithmetic is done in `rounded' mode, and provides
approximately 17 digits of precision
with numbers as small as 10 to the negative 38th power and as large as
10 to the 38th power.
.SH
Comments
.PP
Comments can be delimited by either `{' and `}' or by `(*' and `*)'.
If the character `{' appears in a comment delimited by `{' and `}',
a warning diagnostic is printed.
A similar warning will be printed if the sequence `(*' appears in
a comment delimited by `(*' and `*)'.
The restriction implied by this warning is not part of standard Pascal,
but detects many otherwise subtle errors.
.SH
Option control
.PP
Options of the translator may be controlled in two distinct ways.
A number of options may appear on the command line invoking the translator.
These options are given as one or more strings of letters preceded by the
character `\-' and cause the default setting of each given option to be changed.
This method of communication of options is expected to predominate for
.UX .
Thus the command
.LS
% \*bpi \-ls foo.p\fR
.LE
translates the file foo.p with the listing option enabled (as it normally
is off), and with only standard Pascal features available.
.PP
If more control over the portions of the program where options are enabled
is required, then option control in comments can and should be used.
The format for option control in comments is identical to that used in
Pascal 6000\-3.4.
One places the character `$' as the first character of the comment
and follows it by a comma separated list of directives.
Thus an equivalent to the command line example given above would be:
.LS
{$l+,s+ listing on, standard Pascal}
.LE
as the first line of the program.
The `l' option is more appropriately specified on the command line,
since it is extremely unlikely in an interactive environment
that one wants a listing of the program each time it is translated.
.PP
Directives consist of a letter designating the option,
followed either by a `+' to turn the option on, or by a `\-' to turn the
option off.
The
.B b
option takes a single digit instead of a `+' or `\-'.
.SH
Notes on the listings
.PP
The first page of a listing includes a banner line indicating the version
and date of generation of
.PI .
It also includes the
.UX
path name supplied for the source file and the date of last modification
of that file.
.PP
Within the body of the listing, lines are numbered consecutively and
correspond to the line numbers for the editor.
Currently, two special kinds of lines may be used to format the listing:
a line consisting of a form-feed character, control-l, which causes a page
eject in the listing, and a line with no characters which causes the line
number to be suppressed in the listing, creating a truly blank line.
These lines thus correspond to `eject' and `space' macros found in many
assemblers.
Non-printing characters are printed as the character `?' in the listing.\*(dg
.FS
\*(dgThe character generated by a control-i indents
to the next `tab stop'.
Tab stops are set every 8 columns in
.UX .
Tabs thus provide a quick way of indenting in the program.
.FE
.SH
Multi-file programs
.PP
It is also possible to prepare programs whose parts are placed in more
than one file.
The files other than the main one are called
.B include
files and have names ending with `.i'.
The contents of an \*binclude\fR file are referenced through a
pseudo-statement of the form:
.LS
#\*binclude\fR "file.i"
.LE
The `#' character must be the first character on the line.
The file name may be delimited with `"' or `\'' characters.
Nested
.B include s
are possible up to 10 deep.
More details are given in sections 5.9 and 5.10.
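.PP
A minimal sketch of such an arrangement follows.
The file names `main.p' and `decls.i' and all identifiers are hypothetical,
and the sketch assumes that an include file may hold global declarations;
sections 5.9 and 5.10 remain the authority on what is actually permitted.
The file `decls.i' might contain:
.LS
\*bconst\fP limit = 10;
\*bvar\fP count : integer;
.LE
and the main file `main.p' would then reference it:
.LS
\*bprogram\fP main(output);
#\*binclude\fR "decls.i"
\*bbegin\fP
    { count and limit come from the hypothetical decls.i }
    count := limit;
    writeln(count)
\*bend\fP.
.LE
Note that the `#' begins its line, as required above.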
.SH
The standard procedure write
.PP
If no minimum field length parameter is specified for a
.I write,
the following default values are assumed:
.KS
.TS
center;
l n.
integer	10
real	22
Boolean	10
char	1
string	length of the string
oct	11
hex	8
.TE
.KE
The end of each line in a text file should be explicitly indicated by
`writeln(f)', where `writeln(output)' may be written simply as `writeln'.
For
.UX ,
the built-in function `page(f)' puts a single
.SM ASCII
form-feed character on the output file.
For programs which are to be transported the filter
.I pcc
can be used to interpret carriage control, as
.UX
does not normally do so.
.NH 2
Restrictions and limitations
.SH
Files
.PP
Files cannot be members of files or members of dynamically allocated
structures.
.SH
Arrays, sets and strings
.PP
The calculations involving array subscripts and set elements are done
with 16 bit arithmetic.
This restricts the types over which arrays and sets may be defined.
The lower bound of such a range must be greater than or equal to \-32768,
and the upper bound less than 32768.
In particular, strings may have any length from 1 to 32767 characters,
and sets may contain no more than 32767 elements.
.SH
Line and symbol length
.PP
There is no intrinsic limit on the length of identifiers.
Identifiers are considered to be distinct if they differ in any single
position over their entire length.
There is a limit, however, on the maximum input line length.
This limit is quite generous, currently exceeding 160 characters.
.SH
Procedure and function nesting and program size
.PP
At most 20 levels of
.B procedure
and
.B function
nesting are allowed.
There is no fundamental, translator defined limit on the size of the
program which can be translated.
The ultimate limit is supplied by the hardware and the fact that
the \s-2PDP\s0-11 has a 16 bit address space.
If one runs up against the `ran out of memory' diagnostic the program may yet
translate if smaller procedures are used, as a lot of space is freed
by the translator at the completion of each
.B procedure
or
.B function
in the current implementation.
.SH
Overflow
.PP
There is currently no checking for overflow on arithmetic operations at
run-time.
.br
.ne 15
.NH 2
Added types, operators, procedures and functions
.SH
Additional predefined types
.PP
The type
.I alfa
is predefined as:
.LS
\*btype\fP alfa = \*bpacked\fP \*barray\fP [ 1..10 ] \*bof\fP \*bchar\fP
.LE
.PP
The type
.I intset
is predefined as:
.LS
\*btype\fP intset = \*bset of\fP 0..127
.LE
In most cases the context of an expression involving a constant set
allows the translator to determine the type of the set, even though the
constant set itself may not uniquely determine this type.
In the cases where it is not possible to determine the type of the set
from local context, the expression type defaults to a set over the entire
base type unless the base type is integer\*(dg.
.FS
\*(dgThe current translator makes a special case of the construct
`if ... in [ ... ]' and enforces only the more lax restriction
on 16 bit arithmetic given above in this case.
.FE
In the latter case the type defaults to the current binding of
.I intset,
which must be ``type set of (a subrange of) integer'' at that point.
.PP
Note that if
.I intset
is redefined via:
.LS
\*btype\fP intset = \*bset of\fP 0..58;
.LE
then the default integer set is the implicit
.I intset
of Pascal 6000\-3.4.
.SH
Additional predefined operators
.PP
The relationals `<' and `>' of proper set inclusion are available.
With
.I a
and
.I b
sets, note that
.LS
(\*bnot\fR (\fIa\fR < \fIb\fR)) <> (\fIa\fR >= \fIb\fR)
.LE
As an example consider the sets
.I a
= [0,2] and
.I b
= [1].
The only relation true between these sets is `<>'.
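.PP
As a minimal sketch of this example, the following program evaluates
three of the relations on the two sets.
The program name and the declared base type 0..7 are illustrative
assumptions, not part of the Report or of this appendix.
.LS
\*bprogram\fP setrel(output);
\*bvar\fP a, b : \*bset of\fP 0..7;
\*bbegin\fP
    a := [0,2];
    b := [1];
    { proper inclusion with `<' is the extension described above }
    writeln(a < b, a >= b, a <> b)
\*bend\fP.
.LE
Following the discussion above, the first two values written would be
expected to be false and the third true, so that
\*bnot\fR (\fIa\fR < \fIb\fR) and \fIa\fR >= \fIb\fR differ for these sets.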
.SH
Non-standard procedures
.IP argv(i,a) 25
where
.I i
is an integer and
.I a
is a string variable
assigns the (possibly truncated or blank padded)
.I i \|'th
argument of the invocation of the current
.UX
process to the variable
.I a .
The range of valid
.I i
is
.I 0
to
.I argc\-1 .
.IP date(a)
assigns the current date to the alfa variable
.I a
in the format `dd mmm yy ', where `mmm' is the first three characters
of the month, e.g. `Apr'.
.IP flush(f)
writes the output buffered for Pascal file
.I f
into the associated
.UX
file.
.IP halt
terminates the execution of the program with a control flow backtrace.
.IP linelimit(f,x)\*(dd
.FS
\*(ddCurrently ignored by
.X .
.FE
with
.I f
a textfile and
.I x
an integer expression causes the program to be abnormally terminated
if more than
.I x
lines are written on file
.I f .
If
.I x
is less than 0 then no limit is imposed.
.IP message(x,...)
causes the parameters, which have the format of those to the built-in
.B procedure
.I write,
to be written unbuffered on the diagnostic unit 2,
almost always the user's terminal.
.IP null
a procedure of no arguments which does absolutely nothing.
It is useful as a place holder, and is generated by
.XP
in place of the invisible empty statement.
.IP remove(a)
where
.I a
is a string causes the
.UX
file whose name is
.I a,
with trailing blanks eliminated, to be removed.
.IP reset(f,a)
where
.I a
is a string causes the file whose name is
.I a
(with blanks trimmed) to be associated with
.I f
in addition to the normal function of
.I reset.
.IP rewrite(f,a)
is analogous to `reset' above.
.IP stlimit(i)
where
.I i
is an integer sets the statement limit to be
.I i
statements.
Specifying the
.B p
option to
.I pc
disables statement limit counting.
.IP time(a)
causes the current time in the form `\ hh:mm:ss\ ' to be
assigned to the alfa variable
.I a.
.SH
Non-standard functions
.IP argc 25
returns the count of arguments when the Pascal program was invoked.
.I Argc
is always at least 1.
.IP card(x)
returns the cardinality of the set
.I x,
i.e. the number of elements contained in the set.
.IP clock
returns an integer which is the number of central processor milliseconds
of user time used by this process.
.IP expo(x)
yields the integer valued exponent of the floating-point representation of
.I x ;
expo(\fIx\fP) = entier(log2(abs(\fIx\fP))).
.IP random(x)
where
.I x
is a real parameter, evaluated but otherwise ignored,
invokes a linear congruential random number generator.
Successive seeds are generated as (seed*a + c) mod m and the new random
number is a normalization of the seed to the range 0.0 to 1.0;
a is 62605, c is 113218009, and m is 536870912.
The initial seed is 7774755.
.IP seed(i)
where
.I i
is an integer sets the random number generator seed to
.I i
and returns the previous seed.
Thus seed(seed(i)) has no effect except to yield value
.I i.
.IP sysclock
an integer function of no arguments returns the number of central processor
milliseconds of system time used by this process.
.IP undefined(x)
a Boolean function.
Its argument is a real number and it always returns false.
.IP wallclock
an integer function of no arguments returns the time in seconds since
00:00:00 GMT January 1, 1970.
.NH 2
Remarks on standard and portable Pascal
.PP
It is occasionally desirable to prepare Pascal programs which will be
acceptable at other Pascal installations.
While certain system dependencies are bound to creep in,
judicious design and programming practice can usually eliminate
most of the non-portable usages.
Wirth's
.I "Pascal Report"
concludes with a standard for implementation and program exchange.
.PP
In particular, the following differences may cause trouble when attempting
to transport programs between this implementation and Pascal 6000\-3.4.
Using the
.B s
translator option may serve to indicate many problem areas.\*(dg
.FS
\*(dgThe
.B s
option does not, however, check that identifiers differ
in the first 8 characters.
.I Pi
also does not check the semantics of
.B packed .
.FE
.SH
Features not available in Berkeley Pascal
.IP
Formal parameters which are
.B procedure
or
.B function .
.IP
Segmented files and associated functions and procedures.
.IP
The function
.I trunc
with two arguments.
.IP
Arrays whose indices exceed the capacity of 16 bit arithmetic.
.SH
Features available in Berkeley Pascal but not in Pascal 6000\-3.4
.IP
The procedures
.I reset
and
.I rewrite
with file names.
.IP
The functions
.I argc,
.I seed,
.I sysclock,
and
.I wallclock.
.IP
The procedures
.I argv,
.I flush,
and
.I remove.
.IP
.I Message
with arguments other than character strings.
.IP
.I Write
with keyword
.B hex .
.IP
The
.B assert
statement.
.SH
Other problem areas
.PP
Sets and strings are more general in
.UP ;
see the restrictions given in the Jensen-Wirth
.I "User Manual"
for details on the 6000\-3.4 restrictions.
.PP
The character set differences may cause problems,
especially the use of the function
.I chr,
characters as arguments to
.I ord,
and comparisons of characters, since the character set ordering
differs between the two machines.
.PP
The Pascal 6000\-3.4 compiler uses a less strict notion of type equivalence.
In
.UP ,
types are considered identical only if they are represented
by the same type identifier.
Thus, in particular, unnamed types are unique
to the variables/fields declared with them.
.PP
Pascal 6000\-3.4 doesn't recognize our option flags, so it is wise to
put the control of
.UP
options at the end of option lists or, better yet, restrict the option list
length to one.
.PP
For Pascal 6000\-3.4 the ordering of files in the program statement has
significance.
It is desirable to place
.I input
and
.I output
as the first two files in the
.B program
statement.
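.PP
As an illustration of the last two points, a program intended for both
systems might begin as sketched below.
The program name, the file `data' and the choice of the `s' option
are assumptions made for the example only.
.LS
{$s+}
\*bprogram\fP copy(input, output, data);
\*bvar\fP data : text;
\*bbegin\fP
    { data is listed after input and output in the heading }
    rewrite(data);
    writeln(data, 'portable')
\*bend\fP.
.LE
Here the option list has length one, and
.I input
and
.I output
come first in the
.B program
statement.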