.so tmac.tr
.DA "August 22, 1984"
.TR 84-14
.Gr
.TL
Personalized Interpreters for Icon
.AU
Ralph E. Griswold
.AU
Robert K. McConeghy
.AU
William H. Mitchell
.AE
.tr *\(**
.NH
Introduction
.PP
Although the Icon programming language has a large repertoire of functions and operations for string and list manipulation, as well as for more conventional computations [1], users frequently need to extend that repertoire.
While many extensions can be written as procedures that build on the existing repertoire, there are some kinds of extensions for which this approach is unacceptably inefficient, inconvenient, or simply impractical.
.PP
Icon itself is written primarily in C [2] and its built-in functions are written as corresponding C functions.
Thus the natural way to extend Icon's computational repertoire is to add new C functions to it.
.PP
The Icon system is organized so that this is comparatively easy to do.
Adding a new function does not require changes to the Icon translator, since all functions have a common syntactic form.
An entry must be made in a table that is used by the linker and the run-time system in order to identify built-in functions and connect references to them to the code itself.
.PP
The problem arises in incorporating the C code in the Icon run-time system.
Prior to Version 5.9 of Icon, there were two separate but similar implementations of Icon: a compiler [3] and an interpreter [4].
The primary difference between the two systems is that the linker for the compiler generates assembly-language code, while the linker for the interpreter generates code that is ready to be interpreted.
The interpreter uses a preconstructed run-time system, so that the assembly and loading phases of the compiler implementation are not needed.
.PP
The loading phase in the compiler is quite slow, so that when the compiler implementation of Icon is used, there is a substantial delay before getting into execution.
This is a significant problem during program debugging.
Furthermore, a compiled Icon program runs only slightly faster than an interpreted Icon program.
This is due in large part to the fact that most programs spend only a small percentage of their time in code generated for the program itself; most of the time is spent executing code in the run-time system, which is essentially the same in the two implementations.
.PP
The primary advantage of the compiler is that it is possible to add new functions during the loading phase.
In order to communicate the names of new functions to the linker, it is necessary to include ``external'' declarations in the Icon source programs that use these functions.
There is no way to do this in the interpreter implementation, since the run-time system is preconstructed, rather than being built when the source-language program is processed.
.PP
One disadvantage of the external function approach is that every source program that uses an external function must contain a declaration for that function.
In addition to the burden of remembering these declarations, external functions are, by their nature, not logically part of Icon proper.
This results in problems of documentation and distribution of such functions to other users.
.PP
An alternative method of adding new functions to either the compiler or the interpreter implementation of Icon is to add the corresponding C functions to the Icon system itself and to rebuild the entire system.
This approach is impractical for many applications.
If the extensions are not of general interest, it is inappropriate to include them in the public version of Icon.
On the other hand, Icon is a large and complicated system, and having many private versions may create serious problems of maintenance and disk usage.
Furthermore, rebuilding the Icon system is slow, cumbersome, and comparatively complicated.
This approach therefore is impractical in a situation such as a class in which students implement their own versions of an extension.
.PP
To remedy these problems, a mechanism for building ``personalized interpreters'' has been added to Version 5.9 of Icon.
This mechanism allows a user to add C functions and to build a corresponding interpreter quickly, easily, and without needing a copy of the source code for the entire Icon system.
.PP
To construct a personalized interpreter, the user must perform a one-time setup that copies relevant source files to a directory specified by the user and builds the nucleus of a run-time system.
Once this is done, the user can add and modify C functions and include them in the personalized run-time system with little effort.
.PP
Since the linker must know the names of built-in functions, a personalized linker is constructed.
In order to run Icon programs with the personalized run-time system, a personalized command processor, which knows the location of the personalized linker and run-time system, is also provided.
.PP
The modifications that can be made to Icon via a personalized interpreter essentially are limited to the run-time system: the addition of new functions, modifications to existing functions and operations, and modifications and additions to support routines.
There is no provision for changing the syntax of Icon or for incorporating new operators, keywords, or control structures.
.NH
Building and Using a Personalized Interpreter
.NH 2
Setting Up a Personalized Interpreter System
.PP
To set up a personalized interpreter, a new directory should be created solely for the use of the interpreter; otherwise files may be accidentally destroyed by the set-up process.
For the purpose of example, suppose this directory is named \*Mmyicon\fR.
The set-up process consists of
.Ds
mkdir myicon
cd myicon
icon\-pi
.De
Note that \*Micon\-pi\fR must be run in the area in which the personalized interpreter is to be built.
The location of \*Micon\-pi\fR may vary from site to site [5].
.PP
The shell script \*Micon\-pi\fR constructs three subdirectories: \*Mh\fR, \*Mstd\fR, and \*Mpi\fR.
The subdirectory \*Mh\fR contains header files that are needed in C routines.
The subdirectory \*Mstd\fR contains the portions of the Icon system that are needed to build a personalized interpreter.
The subdirectory \*Mpi\fR contains a \*MMakefile\fR for building a personalized interpreter and also is the place where source code for new C functions normally resides.
Thus work on the personalized interpreter is done in \*Mmyicon/pi\fR.
.PP
The \*MMakefile\fR that is constructed by \*Micon\-pi\fR contains two definitions to facilitate building personalized interpreters:
.IP \*MOBJS\fR .5i
a list of object modules that are to be added to or replaced in the run-time system.
\*MOBJS\fR initially is empty.
.IP \*MLIB\fR
a list of library options that are used when the run-time system is built.
\*MLIB\fR initially is empty.
.LP
See the listing of a generic version of this \*MMakefile\fR in Appendix A.
.NH 2
Building a Personalized Interpreter
.PP
Performing a \fImake\fR in \*Mmyicon/pi\fR creates three files in \*Mmyicon\fR:
.Ds
.ta 1i
picont	\fRcommand processor\*M
pilink	\fRlinker\*M
piconx	\fRrun-time system\*M
.De
A link to \*Mpicont\fR also is constructed in \*Mmyicon/pi\fR so that the new personalized interpreter can be tested in the directory in which it is made.
.PP
The file \*Mpicont\fR normally is built only on the first \fImake\fR.
The file \*Mpilink\fR is built on the first \fImake\fR and is rebuilt whenever the repertoire of built-in functions is changed.
The file \*Mpiconx\fR is rebuilt whenever the source code in the run-time system is changed.
.PP
The user of the personalized interpreter uses \*Mpicont\fR in the same fashion that the standard \*Micont\fR is used [4].
(Note that the accidental use of \*Micont\fR in place of \*Mpicont\fR may produce mysterious results.)
In turn, \*Mpicont\fR translates a source program using the standard Icon translator and links it using \*Mpilink\fR.
The resulting icode file uses \*Mpiconx\fR.
.PP
The relocation bits and symbol tables in \*Mpicont\fR, \*Mpilink\fR, and \*Mpiconx\fR can be removed by
.Ds
make Strip
.De
in \*Mmyicon/pi\fR.
This reduces the sizes of these files substantially but may interfere with debugging.
.PP
If a \fImake\fR is performed in \*Mmyicon/pi\fR before any run-time files are added or modified, the resulting personalized interpreter is identical to the standard one.
Such a \fImake\fR can be performed to verify that the personalized interpreter system is performing properly.
.PP
Note that a personalized interpreter inherits the parameters and configuration of the locally installed version of Icon in \*Mv5g\fR, including optional language extensions [6].
The file \*Mmyicon/h/config.h\fR contains configuration information.
The definitions in this file should not be changed.
.NH 2
Adding a New Function
.PP
To add a new function to the personalized interpreter, it is first necessary to provide the C code, adhering to the conventions and data structures used throughout Icon.
See [2].
Some examples of C functions taken from the Icon program library [7] are included in Appendix B of this report.
The source code for these functions is contained in \*Mv5g/pifunc\fR, where \*Mv5g\fR is the root of the Icon system.
The location of \*Mv5g\fR varies from site to site [5].
The directory \*Mv5g/functions\fR contains the source code for the standard built-in functions, which also can be used as models for new ones.
.PP
Suppose that \*Mgetenv\fR from the Icon program library is to be added to a personalized interpreter.
The source code can be obtained by
.Ds
cp v5g/pifuncs/getenv.c myicon/pi
.De
(Note that the actual paths will be different, depending on the local hierarchy.)
.PP
Three things now need to be done to incorporate this function in the personalized interpreter:
.IP 1.
Add a line consisting of
.Ds
PDEF(getenv)
.De
to \*Mmyicon/h/pdef.h\fR in proper alphabetical order.
This causes the linker and the run-time system to know about the new function.
.IP 2.
Add \*Mgetenv.o\fR to the definition of \*MOBJS\fR in \*Mmyicon/pi/Makefile\fR.
This causes \*Mgetenv.c\fR to be compiled and the resulting object file to be loaded with the run-time system when a \fImake\fR is performed.
.IP 3.
Perform a \fImake\fR in \*Mmyicon/pi\fR.
The result is new versions of \*Mpilink\fR and \*Mpiconx\fR in \*Mmyicon\fR.
.LP
The function \*Mgetenv\fR now can be used like any other built-in function.
.PP
More than one function can be included in a single source file.
See \*Mmath.c\fR in Appendix B.
Note that \*Mmath.c\fR uses the math library.
To add this module to the run-time system of a personalized interpreter, \*MPDEF\fR entries should be made for each function in \*Mmath.c\fR, \*Mmath.o\fR should be added to \*MOBJS\fR, and \*M\-lm\fR should be added to \*MLIB\fR in the \*MMakefile\fR.
.NH 2
Modifying the Existing Run-Time System
.PP
The use of personalized interpreters is not limited to the addition of new functions.
Any module in the standard run-time system can be modified as well.
The run-time system is divided into five parts:
.RS
.IP \*Mv5g/functions\fR 1.2i
built-in functions
.IP \*Mv5g/operators\fR
built-in operators
.IP \*Mv5g/rt\fR
run-time support routines
.IP \*Mv5g/lib\fR
routines called by the interpreter
.IP \*Mv5g/iconx\fR
the interpreter and start-up routines
.RE
.LP
For example, storage allocation routines are contained in \*Mv5g/rt/alc.c\fR.
.PP
To modify an existing portion of the Icon run-time system, copy the source code file from the standard system to \*Mmyicon/pi\fR.
(Source code for a few run-time routines is placed in \*Mmyicon/std\fR when a personalized interpreter is set up.
Check this directory first and use that file, if appropriate, rather than making another copy in \*Mmyicon/pi\fR.)
When a source-code file in \*Mmyicon/pi\fR has been modified, place it in the \*MOBJS\fR list just like a new file and perform a \fImake\fR.
Note that an entire module must be replaced, even if a change is made to only one routine.
Any module that is replaced must contain all the global variables in the original module to prevent \fIld(1)\fR from also loading the original module.
There is no way to delete routines from the run-time system.
.PP
The directory \*Mmyicon/h\fR contains header files that are included in various source-code files.
For example, error message text for a new run-time error can be provided by adding it to \*Mmyicon/h/err.h\fR.
The file \*Mmyicon/h/rt.h\fR contains declarations and definitions that are used throughout the run-time system.
This is where the declaration for the structure of a new type of data object would be placed.
.PP
Care must be taken when modifying header files not to make changes that would produce inconsistencies between previously compiled components of the Icon run-time system and new ones.
.SH
References
.IP 1.
Griswold, Ralph E. and Griswold, Madge T.
\fIThe Icon Programming Language\fR.
Prentice-Hall, Inc., Englewood Cliffs, New Jersey. 1983.
.IP 2.
Griswold, Ralph E., Robert K. McConeghy, and William H. Mitchell.
\fIA Tour Through the C Implementation of Icon; Version 5.9\fR.
Technical Report TR 84-11, Department of Computer Science, The University of Arizona. August 1984.
.IP 3.
Griswold, Ralph E. and William H. Mitchell.
\fIICONC(1)\fR, manual page for \fIUNIX Programmer's Manual\fR, Department of Computer Science, The University of Arizona. July 1983.
.IP 4.
Griswold, Ralph E. and William H. Mitchell.
\fIICONT(1)\fR, manual page for \fIUNIX Programmer's Manual\fR, Department of Computer Science, The University of Arizona. August 1984.
.IP 5.
Griswold, Ralph E. and William H. Mitchell.
\fIInstallation and Maintenance Guide for Version 5.9 of Icon\fR.
Technical Report TR 84-13, Department of Computer Science, The University of Arizona, Tucson, Arizona. August 1984.
.IP 6.
Griswold, Ralph E., Robert K. McConeghy, and William H. Mitchell.
\fIExtensions to Version 5 of the Icon Programming Language\fR.
Technical Report TR 84-10a, Department of Computer Science, The University of Arizona. August 1984.
.IP 7.
Griswold, Ralph E.
\fIThe Icon Program Library\fR, Technical Report TR 84-12, Department of Computer Science, The University of Arizona. August 1984.
.am Ds
.ps 8
.vs 9
..
.am De
.ps 10
.vs 12
..
.de Ta
.ta .8i +.8i +.8i +.8i +.8i +.8i +.8i +.8i
..
.Ap "Appendix A \(em Makefile for Personalized Interpreters"
.sp
.PP
The ``generic'' \*MMakefile\fR for personalized interpreters follows.
The values of \*MPATH\fR and \*MDIR\fR are filled in when \*MPimake\fR is run.
.Ds
CFLAGS=
LDFLAGS=
LIB=
iroot=PATH
V5GBIN=$(iroot)/bin
DIR=
.Dd
#
#  To add or replace object files, add their names to the OBJS list below.
#  For example, to add foo.o and bar.o, use:
#
#	OBJS=foo.o bar.o      (this is a sample line)
#
#  For each object file added to OBJS, add a dependency line to reflect files
#  that are depended on.  For example, if foo.c includes rt.h
#  which is located in the h directory use
#
#	foo.o:	../h/rt.h
#
.Dd
OBJS=
.Dd
PIOBJS=../std/init.o ../std/strprc.o
RTOBJS=$(PIOBJS) $(OBJS)
.Dd
Pi:	../picont ../piconx ../pilink
.Dd
../picont:	../std/icont.c
	rm -f ../picont picont
	cc -o ../picont -DIntBin="\e"$(DIR)\e"" -DIconx="\e"$(DIR)/piconx\e"" \e
		-DIconxEnv="\e"ICONX=$(DIR)/piconx\e"" \e
		-DILINK="\e"$(DIR)/pilink\e"" \e
		-DITRAN="\e"$(V5GBIN)/itran\e"" -DFORK=QFORK \e
		../std/icont.c
	ln ../picont
.Dd
../pilink:	../std/linklib ../std/builtin.o
	cc $(LDFLAGS) -X -o ../pilink ../std/builtin.o ../std/linklib
.Dd
../piconx:	../std/rtlib $(RTOBJS)
	cc $(LDFLAGS) -X -o ../piconx -e start -u start $(RTOBJS) ../std/rtlib $(LIB)
../std/init.o:	../h/rt.h ../h/err.h ../h/config.h ../h/pdef.h
	cd ../std; cc -c init.c
.Dd
../std/builtin.o:	../std/ilink.h ../h/config.h ../h/pdef.h
	cd ../std; cc -c builtin.c
.Dd
../std/strprc.o:	../h/rt.h ../h/pnames.h ../h/config.h ../h/pdef.h
	cd ../std; cc -c strprc.c
.Dd
Strip:	../picont ../piconx ../pilink
	strip ../picont ../piconx ../pilink
.De
.Ap "Appendix B \(em C Functions from the Icon Program Library"
.sp
.SH
getenv.c:
.LP
.Ds
/*
#	GETENV(3.icon)
#
#	Get contents of environment variables
#
#	Stephen B. Wampler
#
#	Last modified 8/19/84
#
*/
.Dd
#include "../h/rt.h"
.Dd
/*
 * getenv(s) - return contents of environment variable s
 */
.Dd
Xgetenv(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   register char *p;
   register int len;
   char sbuf\^[MAXSTRING];
   extern char *getenv();
   extern char *alcstr();
.Dd
   DeRef(arg1)
.Dd
   if (!QUAL(arg1))			/* check legality of argument */
      runerr(103, &arg1);
   if (STRLEN(arg1) \*(<= 0 || STRLEN(arg1) \*(>= MAXSTRING)
      runerr(401, &arg1);
   qtos(&arg1, sbuf);			/* convert argument to C-style string */
.Dd
   if ((p = getenv(sbuf)) != NULL) {	/* get environment variable */
      len = strlen(p);
      sneed(len);
      STRLEN(arg0) = len;
      STRLOC(arg0) = alcstr(p, len);
      }
   else					/* fail if variable not in environment */
      fail();
   }
.Dd
Procblock(getenv,\*b-1)
.De
.bp
.SH
math.c:
.LP
.Ds
/*
#	MATH(3.icon)
#
#	Miscellaneous math functions
#
#	Ralph E. Griswold
#
#	Last modified 8/19/84
#
*/
.Dd
#include "../h/rt.h"
#include <errno.h>
.Dd
int errno;
/*
 * exp(x), x in radians
 */
Xexp(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   int t;
   double y;
   union numeric r;
   double exp();
   if ((t = cvreal(&arg1, &r)) == NULL)
      runerr(102, &arg1);
   y = exp(r.real);
   if (errno == ERANGE)
      runerr(252, NULL);
   mkreal(y,\*b&arg0);
   }
Procblock(exp,\*b1)
.Dd
/*
 * log(x), x in radians
 */
Xlog(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   int t;
   double y;
   union numeric r;
   double log();
   if ((t = cvreal(&arg1, &r)) == NULL)
      runerr(102, &arg1);
   y = log(r.real);
   if (errno == EDOM)
      runerr(251, NULL);
   mkreal(y,\*b&arg0);
   }
Procblock(log,\*b1)
.Dd
/*
 * log10(x), x in radians
 */
Xlog10(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   int t;
   double y;
   union numeric r;
   double log10();
   if ((t = cvreal(&arg1, &r)) == NULL)
      runerr(102, &arg1);
   y = log10(r.real);
   if (errno == EDOM)
      runerr(251, NULL);
   mkreal(y,\*b&arg0);
   }
Procblock(log10,\*b1)
.Dd
/*
 * sqrt(x), x in radians
 */
Xsqrt(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   int t;
   double y;
   union numeric r;
   double sqrt();
   if ((t = cvreal(&arg1, &r)) == NULL)
      runerr(102, &arg1);
   y = sqrt(r.real);
   if (errno == EDOM)
      runerr(251, NULL);
   mkreal(y,\*b&arg0);
   }
Procblock(sqrt,\*b1)
.De
.bp
.SH
seek.c:
.Ds
/*
#	SEEK(3.icon)
#
#	Seek to position in stream
#
#	Stephen B. Wampler
#
#	Last modified 8/19/84
#
*/
.Dd
#include "../h/rt.h"
.Dd
/*
 * seek(file,\*boffset,\*bstart) - seek to offset byte from start in file.
 */
.Dd
Xseek(nargs, arg3, arg2, arg1, arg0)
int nargs;
struct descrip arg3, arg2, arg1, arg0;
   {
   long l1, l2;
   int status;
   FILE *fd;
   long ftell();
.Dd
   DeRef(arg1)
   if (arg1.type != D_FILE)
      runerr(106);
.Dd
   defint(&arg2, &l1, 0);
   defshort(&arg3, 0);
.Dd
   fd = BLKLOC(arg1)->file.fd;
.Dd
   if ((BLKLOC(arg1)->file.status == 0) ||
       (fseek(fd, l1, arg3.value.integr) == -1))
      fail();
   mkint(ftell(fd), &arg0);
   }
.Dd
Procblock(seek,\*b3)
.De
.LP
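The \*MPDEF\fR entries described in Section 2.3 suggest a single list of function names from which both the linker and the run-time system derive their tables.
The following is a minimal, self-contained sketch of that general technique in C.
It is not taken from the Icon source: the names \*MPDEF_LIST\fR, \*Mstruct pentry\fR, \*Mptable\fR, and \*Mplookup\fR are hypothetical, the \*MX\fR routines are empty stubs rather than real run-time functions, and the use of binary search is only an assumption about why alphabetical order might be required.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the contents of h/pdef.h: one PDEF entry per
 * function, kept in alphabetical order.  The names are those of Appendix B.
 */
#define PDEF_LIST \
    PDEF(exp)     \
    PDEF(getenv)  \
    PDEF(log)     \
    PDEF(log10)   \
    PDEF(seek)    \
    PDEF(sqrt)

/* Expansion 1: define a stub routine Xname for every entry in the list. */
#define PDEF(name) static int X##name(void) { return 0; }
PDEF_LIST
#undef PDEF

struct pentry {
    const char *name;       /* name as written in a source program */
    int (*routine)(void);   /* stub standing in for the C routine */
};

/* Expansion 2: build the name table from the same list. */
#define PDEF(name) { #name, X##name },
static const struct pentry ptable[] = { PDEF_LIST };
#undef PDEF

#define PTABLE_SIZE (sizeof ptable / sizeof ptable[0])

/* Comparison routine for bsearch: key is a C string, entry a table slot. */
static int pcompare(const void *key, const void *entry)
{
    return strcmp((const char *)key, ((const struct pentry *)entry)->name);
}

/* Return the table index of name, or -1 if it is not in the list. */
int plookup(const char *name)
{
    const struct pentry *p;

    assert(name != NULL);
    /* Binary search is correct only because PDEF_LIST is sorted. */
    p = bsearch(name, ptable, PTABLE_SIZE, sizeof ptable[0], pcompare);
    return p == NULL ? -1 : (int)(p - ptable);
}
```

In this sketch, adding a function means adding one \*MPDEF\fR line in sorted position; both the stub declarations and the table follow automatically, and a name that is not listed is reported as unknown.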