Oppolzer - Informatik / Stanford Pascal Compiler



The Stanford Pascal Compiler / Evolution Steps


Allow shorter string constants on const initializers and assignments

When working on my source code formatter, I ran into a strange indentation problem: comments were not indented correctly, and I could not find the reason. The same program worked without problems when compiled with the Free Pascal compiler (on Windows).

I finally found out that the compiler had mixed up two variables with long names whose first 12 characters were identical. That is, only 12 characters of an identifier are significant in this Pascal implementation (IIRC, the standard from the 1970s says that 10 is the minimum). IMO this is an unacceptably low number, so I decided to change it. The new number should be 16 or 20.
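To illustrate how such a mix-up can happen, here is a small Python model of a symbol table that keeps only the first IDLNGTH characters of each name. This is a sketch, not the compiler's own code: the helper functions and the identifier names are made up for illustration, and only the limit of 12 is taken from the compiler.

```python
IDLNGTH = 12  # number of significant characters, as in the compiler's constant


def significant(name, idlngth=IDLNGTH):
    """Return the part of an identifier that the table would keep.

    Hypothetical helper; names are upper-cased because Pascal
    identifiers are case-insensitive.
    """
    return name.upper()[:idlngth]


def find_collisions(names, idlngth=IDLNGTH):
    """Group identifiers that become indistinguishable after truncation."""
    table = {}
    for name in names:
        table.setdefault(significant(name, idlngth), []).append(name)
    return {key: group for key, group in table.items() if len(group) > 1}


# Made-up identifiers; the first two share their first 12 characters.
names = ["COMMENT_INDENT_OLD", "COMMENT_INDENT_NEW", "LINE_COUNT"]

print(find_collisions(names))      # the two COMMENT_INDENT_* names collide
print(find_collisions(names, 16))  # with 16 significant characters they do not
```

With a limit of 16 the two names differ in their last significant character, so the collision disappears.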

There is a constant IDLNGTH = 12 which controls the size of identifiers in the internal tables. When I tried to set it to 16, I got several hundred syntax errors, because the compiler does not accept string constant initializers whose length differs from that of the declared type, for example:


type ALPHA = packed array [ 1 .. IDLNGTH ] of CHAR ;

(*********************************************************)
(* new reserved symbols in the 2011 version:             *)
(* break, return, continue                               *)
(*********************************************************)

const RW : array [ 1 .. NRSW ] of ALPHA =
      ( 'IF          ' , 'DO          ' , 'OF          ' , 'TO          ' ,
        'IN          ' , 'OR          ' , 'END         ' , 'FOR         ' ,
        'VAR         ' , 'DIV         ' , 'MOD         ' , 'SET         ' ,
        'AND         ' , 'NOT         ' , 'THEN        ' , 'ELSE        ' ,
        'WITH        ' , 'GOTO        ' , 'CASE        ' , 'TYPE        ' ,
        'FILE        ' , 'BEGIN       ' , 'UNTIL       ' , 'WHILE       ' ,
        'ARRAY       ' , 'CONST       ' , 'LABEL       ' , 'BREAK       ' ,
        'REPEAT      ' , 'RECORD      ' , 'DOWNTO      ' , 'PACKED      ' ,
        'RETURN      ' , 'FORWARD     ' , 'PROGRAM     ' , 'FORTRAN     ' ,
        'EXTERNAL    ' , 'FUNCTION    ' , 'CONTINUE    ' , 'PROCEDURE   ' ,
        'OTHERWISE   ' ) ;

If the type ALPHA, which depends on IDLNGTH, is changed to length 16, I will get a syntax error for every initializer in the RW definition.

This looked unacceptable to me. As early as 2011 I had the idea that short string constants should be allowed in initializations and assignments (and maybe in function calls). So I examined how the compiler could be extended to do this. It turned out to be not too difficult: each string constant is adjusted to the new length directly after it is read, depending on the length of the referencing type, and the missing blanks are appended to the buffer in the internal constant description. By the time the constant is written to the P-Code file it already looks right, and everything works well.
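The adjustment step described above can be modelled in a few lines of Python. This is only a sketch of the idea, not the compiler's Pascal code: `adjust_string_constant` is a hypothetical name, and rejecting constants that are too long is an assumption that the text does not state.

```python
def adjust_string_constant(value, target_length):
    """Pad a string constant with blanks to the length of the referencing type.

    Hypothetical model of the step described in the text. Raising an
    error for constants that are too long is an assumption.
    """
    if len(value) > target_length:
        raise ValueError(
            f"constant {value!r} is longer than the target type ({target_length})"
        )
    return value + " " * (target_length - len(value))


IDLNGTH = 16  # the new identifier length under consideration
reserved_words = ["IF", "BEGIN", "OTHERWISE"]

# Every entry ends up exactly IDLNGTH characters long, whatever was written.
table = [adjust_string_constant(word, IDLNGTH) for word in reserved_words]
for entry in table:
    print(repr(entry), len(entry))
```

Because the padding depends only on the length of the target type, a table written with short constants would not need to be edited again if IDLNGTH changed.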

BTW: I compare the compiler against FPC (the Free Pascal compiler) on Windows all the time, and this time I ran into a discrepancy: FPC fills the strings with hex zeroes if the initializers are shorter. I don't want this behaviour on the mainframe, where other languages such as PL/1 fill with blanks, and IMO every user of this compiler would expect strings to be handled this way. So it is what it is, and I have to accept this difference from FPC.

When I finished this work, the compiler accepted shorter string constants in const initializers and in string assignments, and I compiled the following program successfully:


program TESTPACK ( INPUT , OUTPUT ) ;

type WORT = packed array [ 1 .. 10 ] of CHAR ;

var X : array [ 1 .. 10 ] of CHAR ;
    Y : packed array [ 1 .. 10 ] of CHAR ;

const Z : array [ 1 .. 4 ] of WORT =
      ( 'Bernd' , 'Sissi' , 'Lukas' , 'Marius' ) ;
      Z2 : WORT = 'OPPOLZER' ;

begin (* HAUPTPROGRAMM *)
  X := 'TEST' ;
  Y := 'TEST2' ;
  WRITELN ( X , Y , Z [ 1 ] , Z [ 2 ] , Z [ 3 ] , Z [ 4 ] ) ;
  WRITELN ( 'TEST: ' , Z [ 1 ] [ 7 ] = ' ' ) ;
  WRITELN ( Z2 ) ;
  WRITELN ( 'TEST: ' , ORD ( Z [ 1 ] [ 7 ] ) ) ;
end (* HAUPTPROGRAMM *) .

The char arrays of length 10 are all padded with blanks to the end; there is no longer any need to write every string constant with length 10, which makes things a lot easier IMO.
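The two checks on Z [ 1 ] [ 7 ] in the test program are exactly where blank filling and zero filling would differ. The following Python sketch models both fill rules on the constant 'Bernd'; it does not run either compiler. The numeric values assume ASCII; on an EBCDIC mainframe the blank is X'40', so ORD would give 64 instead of 32.

```python
WIDTH = 10  # length of the type WORT in the test program

blank_filled = b"Bernd".ljust(WIDTH, b" ")    # fill rule described for this compiler
zero_filled = b"Bernd".ljust(WIDTH, b"\x00")  # fill rule described for FPC

for label, value in (("blank fill", blank_filled), ("zero fill", zero_filled)):
    seventh = value[7 - 1]  # Pascal index 7 corresponds to Python index 6
    # First column models Z[1][7] = ' ', second column models ORD(Z[1][7]).
    print(label, seventh == ord(" "), seventh)
```

Under the blank fill rule the comparison with ' ' is true; under the zero fill rule it is false and the character code is 0.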
