Oppolzer - Informatik / Stanford Pascal Compiler



The Stanford Pascal Compiler / Evolution Steps

Back to Compiler main page

Porting Stanford Pascal to Windows, OS/2 and Linux - first steps

Since 2012, I have planned to run Stanford Pascal programs on ASCII-based platforms, too. The final goal is, of course, to run the compiler itself on those platforms and to get the same results from the programs compiled there.

A first step, though, is to get some programs running on Windows, for example, so that they behave the same as the original programs on VM/370.

To reach this goal, I wrote a P-Code interpreter program in C, which should be able to run on every platform that supports C. This program, called PCINT.C, first assembles the P-Code source into an internal (binary) representation which is held in storage, and then it interprets (that is: executes) this P-Code program, either controlled by commands from the console, or uncontrolled, that is, fast.
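The two phases described above (assemble first, then interpret) can be illustrated with a very small sketch. This is not PCINT.C: the mnemonics are borrowed from the listings on this page, but the operand syntax and the semantics are simplified assumptions, and the function names `assemble` and `interpret` are hypothetical.

```python
# Toy program in a simplified P-Code-like dialect (an assumption, not real P-Code).
SOURCE = """
    LDC 2
    LDC 3
    ADI
    LDC 5
    NEQ
    FJP L1
    LDC 0
    UJP L2
L1  LAB
    LDC 1
L2  LAB
"""

def assemble(source):
    """Phase 1: build an in-memory program and resolve labels to indexes."""
    program, labels = [], {}
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        if len(parts) == 2 and parts[1] == "LAB":
            labels[parts[0]] = len(program)   # label points at the next instruction
            continue
        program.append((parts[0], parts[1] if len(parts) > 1 else None))
    resolved = []
    for opcode, operand in program:
        if opcode in ("UJP", "FJP"):
            operand = labels[operand]         # forward references work here
        elif opcode == "LDC":
            operand = int(operand)
        resolved.append((opcode, operand))
    return resolved

def interpret(program):
    """Phase 2: execute the assembled program on a simple stack machine."""
    stack, pc = [], 0
    while pc < len(program):
        opcode, operand = program[pc]
        pc += 1
        if opcode == "LDC":                   # push constant
            stack.append(operand)
        elif opcode == "ADI":                 # add two integers
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "NEQ":                 # compare for inequality
            right, left = stack.pop(), stack.pop()
            stack.append(left != right)
        elif opcode == "FJP":                 # jump if top of stack is false
            if not stack.pop():
                pc = operand
        elif opcode == "UJP":                 # unconditional jump
            pc = operand
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return stack

print(interpret(assemble(SOURCE)))   # [1]
```

Resolving labels once, during assembly, means the interpreter loop only deals with numeric jump targets, which is one reason for holding a binary representation in storage instead of re-reading the text.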

I started writing this program in 2012 but soon paused the work, because I discovered some severe portability issues in the P-Code which I couldn't resolve at that time. For example, some char values were represented in the P-Code by their numeric value (the EBCDIC code point), so running this code on an ASCII platform would lead to wrong results. Another example: char set constants are implemented as bit strings (in fact: strings of integer constants), whose bit positions depend on the specific character set used.

Some examples from real Pascal programs with the corresponding P-Code:


/***************************************************/
/*   Pascal code snippets with portability issues  */
/***************************************************/

CSET := [ 'B' , 'E' , 'R' , 'N' , 'D' ] ;
...
for C := 'a' to 'z' do
  if C in CSET then
    CSET := CSET + [ MAJOR ( C ) ] ;

/***************************************************/
/*   generated P-Code                              */
/***************************************************/

     LOC 51
     LDA 1,364
     LCA S,(0,0,0,0,0,0,0,0,0,0,0,0,11264,1088)    -- set constant
     SLD 28,396
     SMV 32,28
     ...
     LOC 56
     LDC C,'a'
     STR C,1,356
     ...
L12  LAB
     LOD C,1,356
     ORD
     LDC I,169                                      -- should be 'z'
     NEQ I
     FJP L11
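To see why this P-Code is tied to EBCDIC, the set constant can be decoded by hand. The sketch below assumes that the 16-bit words form one bit string with the most significant bit first, and that bit number n stands for the character with code point n. It uses Python's `cp037` codec as a stand-in for the mainframe's EBCDIC code page; the helper name `members` is hypothetical.

```python
# Old-style set constant from the LCA instruction above (14 words of 16 bits).
WORDS = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11264, 1088]

def members(words):
    """Return the positions of all one-bits, counted from the leftmost bit.

    Assumption: the words form one bit string, most significant bit first.
    """
    result = []
    for index, word in enumerate(words):
        for bit in range(16):
            if word & (0x8000 >> bit):
                result.append(index * 16 + bit)
    return result

code_points = members(WORDS)

# The same code points read as EBCDIC (cp037 is an assumed code page) ...
as_ebcdic = bytes(code_points).decode("cp037")
# ... and read as Latin-1, whose first 128 code points are ASCII.
as_latin1 = bytes(code_points).decode("latin-1")

print(code_points)   # [194, 196, 197, 213, 217]
print(as_ebcdic)     # BDENR
print(as_latin1)     # ÂÄÅÕÙ

# The loop limit has the same problem: 169 is 'z' only in EBCDIC.
print("z".encode("cp037")[0], ord("z"))   # 169 122
```

Under these assumptions the bit string means B, D, E, N, R only on an EBCDIC machine; on an ASCII platform the same bits select entirely different characters.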

In 2016, now that I was able to make significant changes to the compiler, I resumed work on the P-Code interpreter. Once I had some programs running, including the Fibonacci test program with its recursive procedure calls, I ran into the portability issues from 2012 again. I had to change the compiler on VM/370 (both passes) so that the generated P-Code is more portable across platforms.

For example, the P-Code for the example Pascal statements above now looks like this:


/***************************************************/
/*   better P-Code                                 */
/***************************************************/

     LOC 51
     LDA 1,364
     LCA S,C28'BDENR'
     SLD 28,396
     SMV 32,28
     ...
     LOC 56
     LDC C,'a'
     STR C,1,356
     ...
L12  LAB
     LOD C,1,356
     ORD
     LDC C,'z'
     ORD
     NEQ I
     FJP L11
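With the set members spelled out as characters, the bit string can be built on the target platform using that platform's own code points. The following sketch shows the idea; it assumes that the number after the C (28) is the length of the set constant in bytes and that bits are numbered from the left. The function name `charset_to_bits` is hypothetical, and PCINT.C may well do this differently.

```python
def charset_to_bits(chars, size_in_bytes, encoding):
    """Build a set bit string from its member characters.

    Assumption: bit n (counted from the leftmost bit) represents the
    character with code point n in the given single-byte encoding.
    """
    bits = bytearray(size_in_bytes)
    for ch in chars:
        code = ch.encode(encoding)[0]
        if code >= size_in_bytes * 8:
            raise ValueError(f"{ch!r} does not fit into {size_in_bytes} bytes")
        bits[code // 8] |= 0x80 >> (code % 8)
    return bytes(bits)

# Same P-Code operand, two different platforms.
ebcdic_set = charset_to_bits("BDENR", 28, "cp037")
ascii_set = charset_to_bits("BDENR", 28, "ascii")

print(ebcdic_set.hex())
print(ascii_set.hex())
```

For the EBCDIC case, the last four bytes come out as 2C 00 04 40, which are exactly the old integer constants 11264 and 1088; for ASCII, the one-bits land in bytes 8 to 10 instead.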

When this was finished, two test programs involving char sets ran successfully on Windows, yielding the same results as on Hercules. Well, almost: some differences related to the character code remain visible at the source code level. For example, SUCC ('R') is 'S' on Windows, but not on the mainframe. This is OK in my opinion and has to be handled by the Pascal code.
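The SUCC difference comes from the gaps in the EBCDIC alphabet: the letters are not contiguous there. A short sketch, again using Python's `cp037` codec as an assumed stand-in for the mainframe code page (the function `succ` is a hypothetical helper, not the Pascal built-in):

```python
def succ(ch, encoding):
    """Return the character whose code point follows ch in the encoding."""
    code = ch.encode(encoding)[0]
    return bytes([code + 1]).decode(encoding)

print(succ("R", "ascii"))           # S
print(succ("R", "cp037") == "S")    # False

# 'R' and 'S' are neighbours in ASCII but not in EBCDIC.
print("R".encode("ascii")[0], "S".encode("ascii")[0])   # 82 83
print("R".encode("cp037")[0], "S".encode("cp037")[0])   # 217 226
```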

The two test programs:

TESTSET.PAS
TESTSET2.PAS

As a side effect, I also changed the P-Code representation of sets that are not sets of char.

Before my change, set constants were represented by strings of 16-bit integers, like this:


/***************************************************/
/*   Pascal Code; some subset of a scalar type     */
/***************************************************/

STERMSYMBOLE := [ EOFSY , STRIPU , ELSESY , ENDSY , OTHERWISESY , UNTILSY ] ;

/***************************************************/
/*   set representation in P-Code;                 */
/*   almost unreadable to human readers            */
/***************************************************/

     LOC 2291
     LCA S,(36864,768,4096,4096)
     SLD 8,1608
     SMV 8,8

If the base type of the set is char, the set is now represented by a char string which contains all the characters that are members of the set. If the base type is not char but some scalar or other simple type, the set is still represented as a bit string, but it is now printed using hex digits. The first character after the comma (X or C) indicates the representation type. For example:


/***************************************************/
/*   new set representation in P-Code;             */
/*   you can now spot the six one-bits, IMO        */
/***************************************************/

     LOC 2291
     LCA S,X8'9000030010001000'
     SLD 8,1608
     SMV 8,8
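The old and the new notation describe the same bits, which can be checked with a few lines. The sketch below converts the four 16-bit integers of the old form into the hex string of the new form and lists the one-bits; it assumes that bits are numbered from the left, starting at 0, and the function name `words_to_hex` is hypothetical.

```python
OLD_WORDS = [36864, 768, 4096, 4096]   # from LCA S,(36864,768,4096,4096)

def words_to_hex(words):
    """Write each 16-bit word as four hex digits and concatenate them."""
    return "".join(f"{word:04X}" for word in words)

hex_string = words_to_hex(OLD_WORDS)

# Expand the hex string into a bit string of the full width.
width = len(hex_string) * 4
bit_string = f"{int(hex_string, 16):0{width}b}"

one_bits = bit_string.count("1")
positions = [i for i, bit in enumerate(bit_string) if bit == "1"]

print(hex_string)   # 9000030010001000
print(one_bits)     # 6
print(positions)    # [0, 3, 22, 23, 35, 51]
```

The six positions correspond to the six members of STERMSYMBOLE; which position belongs to which symbol depends on the declaration order of the scalar type, which is not shown here.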

Regarding speed: I compared the (interpreted) FIBOK program with the same program running on Hercules on the same box (which of course is "interpreted", too: by the Hercules engine, which emulates the 370 hardware). The interpreted PRR code was slightly faster than the original Pascal program running on Hercules, which is an encouraging result. It may get slower as I continue the work; on the other hand, there may be room for some improvements, too.

At the moment, the compiler itself doesn't run successfully on Windows; some problems still need to be fixed, and some instructions aren't implemented yet.

Example of a Stanford Pascal program running on Windows

KALENDER.PAS
P-Code Interpreter running KALENDER.PRR on Windows
