05 April 2010

Implications of ECMA-335 and Partition I, 12.1.2 and 12.

I’ve spent quite some time this weekend refactoring parts of the MOSA compiler and fixing things small and large, and I’ve stumbled once again over our memory model. I was refactoring our internal representation to make load and store operations explicit, and broke almost all of our tests at once. Fixing them was pretty easy, except for the smaller types... Section 12.1.2 states:

„Loading from 1- or 2-byte locations (arguments, locals, fields, statics, pointers) expands to 4-byte values.“

Ouch. We’ve gone to a lot of trouble to ensure correct arithmetic on all types and have badly missed the point: all smaller integral types are handled as 4-byte values on the evaluation stack.
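To make the rule concrete, here is a minimal sketch in Python of what widening on load and truncating on store amounts to. The helper names `load_small` and `store_small` are made up for illustration and are not MOSA code; the sketch assumes little-endian storage, and that signed types are sign-extended while unsigned types are zero-extended.

```python
def load_small(raw: bytes, signed: bool) -> int:
    """Widen a 1- or 2-byte storage value to a 4-byte stack value.

    Hypothetical helper: signed types are sign-extended,
    unsigned types are zero-extended.
    """
    if len(raw) not in (1, 2):
        raise ValueError("expected a 1- or 2-byte location")
    return int.from_bytes(raw, "little", signed=signed)


def store_small(value: int, size: int) -> bytes:
    """Truncate a stack value back to a 1- or 2-byte location."""
    if size not in (1, 2):
        raise ValueError("expected a 1- or 2-byte location")
    mask = (1 << (8 * size)) - 1
    # Keep only the low-order bits, as a narrowing store would.
    return (value & mask).to_bytes(size, "little")


# The same byte pattern widens differently depending on signedness.
as_signed = load_small(b"\xff", signed=True)     # -1
as_unsigned = load_small(b"\xff", signed=False)  # 255

# Arithmetic happens on the widened values; only the store truncates.
total = load_small(b"\xc8", signed=False) + load_small(b"\x64", signed=False)
stored = store_small(total, 1)
```

In this sketch, adding the unsigned bytes 200 and 100 gives 300 on the stack, and only the store back into a 1-byte location cuts it down to 44.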

The next step was to change the CIL load instructions to correctly reflect this fact, and fortunately we already had the appropriate instructions in the IR. So the current state of work is that most of our tests are passing again, but not all yet. Then I started wondering about the floating point specification. Looking at the section for floating point values (12.1.3), it states:

„The supported storage sizes are float32 and float64. Everywhere else (on the evaluation stack, as arguments, as return types, and as local variables) floating-point numbers are represented using an internal floating-point type. In each such instance, the nominal type of the variable or expression is either float32 or float64, but its value can be represented internally with additional range and/or precision. The size of the internal floating-point representation is implementation-dependent, can vary, and shall have precision at least as great as that of the variable or expression being represented. An implicit widening conversion to the internal representation from float32 or float64 is performed when those types are loaded from storage. The internal representation is typically the native size for the hardware, or as required for efficient implementation of an operation.“

So for floating point types we have exactly one stack type F, but the implementation is free to choose the precision of its operations as long as it is at least as large as the storage size of the floating point type. Since we’ve spent a great deal of time on single precision arithmetic, I’m inclined to keep the reduced precision operations there. Any opinions?
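The choice is visible to programs. The following Python sketch evaluates the same float32 expression twice: once rounding every intermediate to single precision, and once keeping intermediates in double precision and rounding only at the end. The helpers `to_f32`, `eval_single` and `eval_wide` are hypothetical names for illustration; Python floats are doubles, and `struct` is used here only to emulate rounding to float32.

```python
import struct


def to_f32(x: float) -> float:
    """Round a double to the nearest float32 value (emulates a float32 store)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def eval_single(a: float, b: float) -> float:
    """(a + b) - a with every intermediate rounded to float32."""
    return to_f32(to_f32(a + b) - a)


def eval_wide(a: float, b: float) -> float:
    """(a + b) - a with intermediates kept in double, rounded once at the end."""
    return to_f32((a + b) - a)


# Both inputs are exactly representable as float32.
a = to_f32(33554432.0)  # 2**25; adjacent float32 values are 4 apart here
b = to_f32(1.0)

narrow = eval_single(a, b)  # the added 1.0 is lost when a + b is rounded
wide = eval_wide(a, b)      # the added 1.0 survives in the wider intermediate
```

As I read the quoted paragraph, both results are permitted, since it only requires the internal precision to be at least that of the nominal type. The same source program can therefore produce different values depending on which option the compiler picks.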

I’ll continue fixing this in the next couple of days.


illuminus said...

I think I've been vaguely aware of the expansion to 4 bytes on 1 or 2 byte values, but not enough to have articulated it or investigated it.

Since it doesn't foreseeably affect any kind of compatibility, here are the obvious choices, in order of my preference:

1) Keep it as we have it.

2) Or change it, BUT find a way to use native-int size instead of a constant 4 bytes.

3) Change it to match specs.

As for the floating point, quietly having reduced precision under the hood does have an impact on developers. I can see skipping this for now for iteration reasons, but it needs to be marked for re-visitation later.

Keep up the great work!

Mosa Framework said...

When I worked on the x86 backend, I tried to handle floating point values as doubles whenever it was in doubt whether they were 32 or 64 bit, and converted them accordingly with the SSE instructions.
I'd vote for using double internally.

As for the integers: I always tried to load them as 4 bytes and truncate them during stores back into memory. I had to fiddle around at some points because of failing test cases at the time.
So some of these may be my mistake, and I apologize for that.
In my opinion we'll need a good rewrite of our load/store mechanisms later on.

Michael said...

Unfortunately I'm doing the rewrite right now. It's not as bad as it looks, though. Basically all of the IR instructions were already there.

No apologies necessary. This is a huge feat and errors happen.