.so tmac.tr .de Ta .ta .8i +.8i +.8i +.8i +.8i +.8i +.8i .. .de Px .ta 3.5i .. .ds CF \s10- \\n(PN - \s0 .de Ap .bp .ce 10 \f3\\$1\f1 .ce 0 .sp 2 .if !''\\$2' 'so \\$2 .. .TR 84-10a .DA "June 27, 1984; Revised August 4, 1984" .Gr .TL Extensions to Version 5 of the Icon Programming Language .AU Ralph E. Griswold .AU Robert K. McConeghy .AU William H. Mitchell .AE .tr *\(** .NH Introduction .PP The standard features of Version 5 of Icon are described in Reference 1. Since Icon is the byproduct of a research effort that is concerned with the development of novel programming language facilities for processing nonnumeric data, it is inevitable that some extensions to the standard language will develop. .PP Some of these extensions are incorporated as features of new releases. Others are available as options that can be selected when the Icon system is installed [2]. This report describes the extensions that are included in Version 5.9 of Icon. .PP All the extensions are upward-compatible with standard Version 5 Icon. Their inclusion should not interfere with any program that works properly under the standard version. .NH New Version 5.9 Features .NH 2 The Link Directive .PP Version 5.9 contains a link directive that simplifies the inclusion of separately translated libraries of Icon procedures. If \fIicont(1)\fR [3] is run with the \*M\-c\fR option, source files are translated into intermediate \fIucode\fR files (with names ending in \*M.u1\fR and \*M.u2\fR). For example, .Ds icont -c libe.icn .De produces the ucode files \*Mlibe.u1\fR and \*Mlibe.u2\fR. The ucode files can be incorporated in another program with the new link directive, which has the form .Ds link libe .De The argument of \*Mlink\fR is, in general, a list of identifiers or string literals that specify the names of files to be linked (without the \*M.u1\fR or \*M.u2\fR). Thus .Ds link libe, "/usr/icon/ilib/collate" .De specifies the linking of \*Mlibe\fR in the current directory and \*Mcollate\fR in \*M/usr/icon/ilib\fR. .PP The environment variable \fIIPATH\fR controls the location of files specified in link directives. \fIIPATH\fR should be have a value of the form \fIp1:p2: .\^.\^. pn\fR where each \fIpi\fR names a directory. Each directory is searched in turn to locate files named in link directives. The default value of \fIIPATH\fR is `.', that is, the current directory. .NH 2 Installation Options .PP When an Icon system is installed, various configuration options are specified [2]. The value of the keyword \*M&options\fR is a string that contains the command line arguments that were used to configure Icon. .NH Optional Extensions .PP There are two extension options: sets (\*M\-sets\fR in \*M&options\fR), and a collection of experimental features (\*M\-xpx\fR in \*M&options\fR). .NH 2 Sets .PP Sets are unordered collections of values and have the properties normally associated with sets in the mathematical sense. The function .Ds set(a) .De creates a set that contains the distinct elements of the list \*Ma\fR. For example, .Ds set(\^["abc",\*b3]) .De creates a set with two members, \*Mabc\fR and 3. Note that .Ds set(\^[\^]) .De creates an empty set. Sets, like other data aggregates in Icon, need not be homogeneous \(em a set may contain members of different types. .PP Sets, like other Icon data aggregates, are represented by pointers to the actual data. Sets can be members of sets, as in .Ds s1 := set(\^[1,\*b2,\*b3]) s2 := set(\^[s1,\*b\^[\^]]) .De in which \*Ms2\fR contains two members, one of which is a set of three members and the other of which is an empty list. .PP Any specific value can occur only once in a set. For example, .Ds set(\^[1,\*b2,\*b3,\*b3,\*b1]) .De creates a set with the three members 1, 2, and 3. Set membership is determined the same way the equivalence of values is determined in the operation .Ds x === y .De For example, .Ds set(\^[\^[\^],\*b\^[\^]]) .De creates a set that contains two distinct empty lists. .PP The functions and operations of Icon that apply to other data aggregates apply to sets as well. For example, if \*Ms\fR is a set, .Ds *s .De is the size of \*Ms\fR (the number of members in it). Similarly, .Ds type(s) .De produces the string \*Mset\fR and .Ds s := set(\^["abc",\*b3]) write(image(s)) .De writes \*Mset(2)\fR. Note that the string images of sets are in the same style as for other aggregates, with the size enclosed in parentheses. .PP The operation .Ds !s .De generates the members of \*Ms\fR, but in no predictable order. Similarly, .Ds ?s .De produces a randomly selected member of \*Ms\fR. These operations produce values, not variables \(em it is not possible to assign a value to \*M!s\fR or \*M?s\fR. .PP The function .Ds copy(s) .De produces a new set, distinct from \*Ms\fR, but which contains the same members as \*Ms\fR. The copy is made in the same fashion as the copy of a list \(em the members themselves are not copied. .PP The function .Ds sort(s) .De produces a list containing the members of \*Ms\fR in sorted order. Sets themselves occur after tables but before records in the sorting order. .PP The customary set operations are provided. The function .Ds member(s,\*bx) .De succeeds and returns the value of \*Mx\fR if \*Mx\fR is a member of \*Ms\fR, but fails otherwise. Note that .Ds member(s1,\*bmember(s2,\*bx)) .De succeeds if \*Mx\fR is a member of both \*Ms1\fR and \*Ms2\fR. .PP The function .Ds insert(s,\*bx) .De inserts \*Mx\fR into the set \*Ms\fR and returns the value of \*Ms\fR (it is similar to \*Mput(a,\*bx)\fR in form). Note that .Ds insert(s,\*bs) .De adds \*Ms\fR as an member of itself. .PP The function .Ds delete(s,\*bx) .De deletes the member \*Mx\fR from the set \*Ms\fR and returns the value of \*Ms\fR. .PP The functions \*Minsert(s,\*bx)\fR and \*Mdelete(s,\*bx)\fR always succeed, whether or not \*Mx\fR is in \*Ms\fR. This allows their use in loops in which failure may occur for other reasons. For example, .Ds s := set(\^[\^]) while insert(s,\*bread()) .De builds a set that consists of the (distinct) lines from the standard input file. .PP The operations .Ds s1 ++ s2 s1 ** s2 s1 -- s2 .De create the union, intersection, and difference of \*Ms1\fR and \*Ms2\fR, respectively. In each case, the result is a new set. .PP The use of these operations on csets is unchanged. There is no automatic type conversion between csets and sets; the result of the operation depends on the types of the arguments. For example, .Ds \&'aeiou' ++ 'abcde' .De produces the cset \*Mabcdeiou\fR, while .Ds set(\^[1,\*b2,\*b3]) ++ set(\^[2,\*b3,\*b4]) .De produces a set that contains 1, 2, 3, and 4. On the other hand, .Ds set(\^[1,\*b2,\*b3]) ++ 4 .De results in Run-time Error 119 (\*Mset expected\fR). .SH Examples .a .LP \fIWord Counting:\fR .PP The following program lists, in alphabetical order, all the different words that occur in the standard input file: .Ds .Px procedure main() letter := &lcase ++ &ucase words := set(\^[\^]) while text := read() do text ? while tab(upto(letter)) do insert(words,\*btab(many(letter))) every write(!sort(words)) end .De .LP \fIThe Sieve of Eratosthenes:\fR .PP The follow program produces prime numbers, using the classical ``Sieve of Eratosthenes'': .Ds .Px procedure main(a) local limit, s, i limit := a\^[1] | 5000 # limit to 5000 if not specified s := set(\^[]) every insert(s,\*b1 to limit) every member(s,\*bi := 2 to limit) do every delete(s,\*bi + i to limit by i) primes := sort(s) write("There are ",\*b*primes,\*b" primes in the first ",\*blimit,\*b" integers.") write("The primes are:") every write(right(!primes,\*b*limit + 1)) end .De .NH Expermental Features .NH 2 PDCO Invocation Syntax .PP The experimental features include the procedure invocation syntax that is used for programmer-defined control operations [4]. In this syntax, when braces are used in place of parentheses to enclose an argument list, the arguments are passed as a list of co-expressions. That is, .Ds p{\*1, \*2, \*(El, \*n} .De is equivalent to .Ds p(\^[create \*1, create \*2, \*(El, create \*n]) .De Note that .Ds p{\^} .De is equivalent to .Ds p(\^[\^\^]) .De .NH 2 Invocation Via String Name .PP The experimental features allow a string-valued expression that corresponds to the name of a procedure or operation to be used in place of the procedure or operation in an invocation expression. For example, .Ds "image"(x) .De produces the same call as .Ds image(x) .De and .Ds "-"(i,\*bj) .De is equivalent to .Ds i - j .De .PP In the case of operations, the number of arguments determines the operation. Thus .Ds "-"(i) .De is equivalent to .Ds -i .De Since \*Mto-by\fR is an operation, despite its reserved-word syntax, it is included in this facility with the string name \*M...\fR . Thus .Ds "..."(1,\*b10,\*b2) .De is equivalent to .Ds 1 to 10 by 2 .De Similarly, range specifications are represented by \*M":"\fR, so that .Ds ":"(s,i,j) .De is equivalent to .Ds s\^[i:j] .De .PP Defaults are not provided for omitted or null-valued arguments in this facility. Consequently, .Ds "..."(1,\*b10) .De results in a run-time error when it is evaluated. .PP The subscripting operation also is available with the string name \*M[\^\^]\fR. Thus .Ds "[\^\^]"(&lcase,\*b3) .De produces \*Mc\fR. .PP String names are available for all the operations in Icon, but not for control structures. Thus .Ds "|"(\*1,\*b\*2) .De is erroneous. Note that string scanning is a control structure. .PP Field references, of the form .Ds \*0 . \fIfieldname\fR .De are not operations in the ordinary sense and are not available via string invocation. .PP String names for procedures are available through global identifiers. Note that the names of functions, such as \*Mimage\fR, are global identifiers. Similarly, any procedure-valued global identifier may be used as the string name of a procedure. Thus in .Ds global q procedure main() q := p "q"("hi") end procedure p(s) write(s) end .De the procedure \*Mp\fR is invoked via the global identifier \*Mq\fR. .NH 2 Conversion to Procedure .PP The experimental features include the function \*Mproc(x,\*bi)\fR, which converts \*Mx\fR to a procedure, if possible. If \*Mx\fR is procedure-valued, its value is returned unchanged. If the value of \*Mx\fR is a string that corresponds to the name of a procedure as described in the preceding section, the corresponding procedure value is returned. The value of \*Mi\fR is used to distinguish between unary and binary operators. For example, \*Mproc("^",\*b2)\fR produces the exponentiation operator, while \*Mproc("^",\*b1)\fR produces the co-expression refresh operator. If \*Mx\fR cannot be converted to a procedure, \*Mproc(x,\*bi)\fR fails. .NH 2 Integer Sequences .PP To facilitate the generation of integer sequences that have no limit, the experimental features include the function \*Mseq(i,\*bj)\fR. This function has the result sequence {\^i, i+j, i+2j, \*(El }. Omitted or null values for \*Mi\fR and \*Mj\fR default to 1. Thus the result sequence for \*Mseq()\fR is {\^1, 2, 3, \*(El }. .SH Acknowledgements .PP The design of sets for Icon was done as part of a class project. In addition to the authors of this paper, the following persons participated in the design: John Bolding, Owen Fonorow, Roger Hayes, Tom Hicks, Robert Kohout, Mark Langley, Susan Moore, Maylee Noah, Janalee O'Bagy, Gregg Townsend, and Alan Wendt. .SH References .LP 1. Griswold, Ralph E. and Madge T. Griswold. \fIThe Icon Programming Language\fR, Prentice-Hall, Inc., Englewood Cliffs, New Jersey. 1983. .LP 2. Griswold, Ralph E. and William H. Mitchell. \fIInstallation and Maintenance Instructions for Version 5.9 of Icon\fR, Technical Report TR 84-13, Department of Computer Science, The University of Arizona. August 1984. .LP 3. Griswold, Ralph E. and William H. Mitchell. \fIICONT(1)\fR, manual page for \fIUNIX Programmer's Manual\fR, Department of Computer Science, The University of Arizona. August 1984. .LP 4. Griswold, Ralph E. and Michael Novak. ``Programmer-Defined Control Operations'', \fIThe Computer Journal\fR, Vol. 26, No. 2 (May 1983). pp. 175-183.