The compiler currently creates an instance of an Instruction-derived object at various points in the pipeline. Mostly this happens during source analysis (IL decoding), during the transformations from IL to IR and from IR to x86, and at various points in between. While this makes the code easy to write, it also causes problems, mainly with performance and non-deterministic memory usage.

A high-priority change in 0.2 is to split the instructions into two pieces: the data and the (platform-specific) logic. In many cases, stage transformations will then no longer allocate additional objects; in some cases they still will. Overall, this should reduce memory consumption and the currently large number of virtual method calls.

One core idea is to make the current instruction objects immutable singletons, i.e. they will not contain changeable data and will not be instantiated more than once.
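As a rough illustration of the singleton idea, the sketch below shows one way a stateless instruction class could expose a single shared instance. The names `AddInstruction`, `Mnemonic` and `SingletonDemo` are hypothetical and not taken from the compiler sources; only the general shape is intended.

```csharp
using System;

// Hypothetical base class: carries logic only, no per-use data.
public abstract class Instruction
{
    public abstract string Mnemonic { get; }
}

// Hypothetical instruction: sealed, stateless, and instantiated exactly once.
public sealed class AddInstruction : Instruction
{
    public static readonly AddInstruction Instance = new AddInstruction();

    // The private constructor prevents further instances.
    private AddInstruction() { }

    public override string Mnemonic { get { return "add"; } }
}

public static class SingletonDemo
{
    // Every use site shares the same object, so this returns true.
    public static bool SameInstance()
    {
        Instruction a = AddInstruction.Instance;
        Instruction b = AddInstruction.Instance;
        return object.ReferenceEquals(a, b);
    }
}
```

Because such an object has no mutable fields, it can be shared freely by every slot that uses the instruction.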

But first the new InstructionData type:

public struct InstructionData
{
  // Reference to the instruction in this slot
  private readonly Instruction instruction;
  // The destination operand of the instruction (where the result is placed into)
  private readonly Operand destination;
  // The first parameter to the instruction
  private readonly Operand first;
  // The second parameter to the instruction
  private readonly Operand second;
  // The source line, IL offset or something else to be able to map debug information
  private readonly int sourceLine;

  // Methods and ctors left out for clarity.
}

When a method is compiled from IL, it is currently represented as a list of instructions. In the future, compilation will build an InstructionCollection, which consists of a list of InstructionData structures. Each structure holds a reference to the instruction in its slot and provides the parameter data for that instruction. The user will only work with the InstructionData structure. Using generics and a value type allows the entire list to be allocated at once, including space for all operands.
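The layout described above could look roughly like the following sketch. The constructor, the properties, the `Name` fields and the `InstructionCollection` members are assumptions made for illustration; the original only specifies the five fields of `InstructionData`. A `List<InstructionData>` stores the structs inline in its backing array, so reserving capacity up front reserves the slots for all instructions in one allocation.

```csharp
using System.Collections.Generic;

// Minimal stand-ins for the real Instruction and Operand types (assumed shape).
public sealed class Instruction
{
    public readonly string Name;
    public Instruction(string name) { Name = name; }
}

public sealed class Operand
{
    public readonly string Name;
    public Operand(string name) { Name = name; }
}

public struct InstructionData
{
    private readonly Instruction instruction;
    private readonly Operand destination;
    private readonly Operand first;
    private readonly Operand second;
    private readonly int sourceLine;

    public InstructionData(Instruction instruction, Operand destination,
                           Operand first, Operand second, int sourceLine)
    {
        this.instruction = instruction;
        this.destination = destination;
        this.first = first;
        this.second = second;
        this.sourceLine = sourceLine;
    }

    public Instruction Instruction { get { return instruction; } }
    public Operand Destination { get { return destination; } }
    public Operand First { get { return first; } }
    public Operand Second { get { return second; } }
    public int SourceLine { get { return sourceLine; } }
}

// Hypothetical collection: one generic list of value-type slots.
public sealed class InstructionCollection
{
    private readonly List<InstructionData> slots;

    public InstructionCollection(int capacity)
    {
        slots = new List<InstructionData>(capacity);
    }

    public int Count { get { return slots.Count; } }
    public void Add(InstructionData data) { slots.Add(data); }
    public InstructionData this[int index] { get { return slots[index]; } }
}

public static class CollectionDemo
{
    public static InstructionCollection Build()
    {
        Instruction add = new Instruction("add"); // shared by both slots
        InstructionCollection code = new InstructionCollection(2);
        code.Add(new InstructionData(add, new Operand("t0"), new Operand("a"), new Operand("b"), 1));
        code.Add(new InstructionData(add, new Operand("t1"), new Operand("t0"), new Operand("c"), 2));
        return code;
    }
}
```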

Some IL instructions use more than three operands. In this case, the InstructionData structure holds the first three operands and the instruction holds the remaining ones. In this (and only this) case, an Instruction instance is allocated as well.
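One possible reading of this overflow rule is sketched below: ordinary instructions stay shared singletons with no extra operands, while an instruction that needs more than three operands is allocated per use and keeps the remainder itself. `NopInstruction`, `CallInstruction`, `ExtraOperandCount` and `GetExtraOperand` are hypothetical names, not part of the described design.

```csharp
using System;

// Minimal stand-in for the real Operand type (assumed shape).
public sealed class Operand
{
    public readonly string Name;
    public Operand(string name) { Name = name; }
}

public abstract class Instruction
{
    // Operands beyond the three stored in InstructionData; none by default.
    public virtual int ExtraOperandCount { get { return 0; } }

    public virtual Operand GetExtraOperand(int index)
    {
        throw new ArgumentOutOfRangeException("index");
    }
}

// The usual case: a stateless singleton without extra operands.
public sealed class NopInstruction : Instruction
{
    public static readonly NopInstruction Instance = new NopInstruction();
    private NopInstruction() { }
}

// Hypothetical instruction with many operands: allocated per use because
// it has to carry the operands that do not fit into InstructionData.
public sealed class CallInstruction : Instruction
{
    private readonly Operand[] extra;

    public CallInstruction(params Operand[] extra)
    {
        // Copy so later changes to the caller's array cannot leak in.
        this.extra = (Operand[])extra.Clone();
    }

    public override int ExtraOperandCount { get { return extra.Length; } }
    public override Operand GetExtraOperand(int index) { return extra[index]; }
}

public static class OverflowDemo
{
    public static int CountExtra()
    {
        Instruction call = new CallInstruction(new Operand("arg2"), new Operand("arg3"));
        return call.ExtraOperandCount;
    }
}
```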

The InstructionData value type is immutable, which means an existing instance never changes. This is important for some code analysis passes. To modify an InstructionData, its slot is overwritten with an entirely new InstructionData instance.
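The overwrite-instead-of-mutate pattern can be sketched as follows. To keep the example short, the struct here is deliberately reduced to two fields, a string stands in for the Instruction reference, and `WithInstruction` is a hypothetical helper rather than a documented member.

```csharp
using System;

// Reduced stand-in for InstructionData (only two of the five fields).
public struct InstructionData
{
    private readonly string instruction; // stand-in for the Instruction reference
    private readonly int sourceLine;

    public InstructionData(string instruction, int sourceLine)
    {
        this.instruction = instruction;
        this.sourceLine = sourceLine;
    }

    public string Instruction { get { return instruction; } }
    public int SourceLine { get { return sourceLine; } }

    // Builds a new value; the current value is left untouched.
    public InstructionData WithInstruction(string replacement)
    {
        return new InstructionData(replacement, sourceLine);
    }
}

public static class OverwriteDemo
{
    public static string Run()
    {
        InstructionData[] slots = { new InstructionData("IL.add", 12) };

        // A pass that captured the old value keeps seeing it unchanged.
        InstructionData before = slots[0];

        // "Modifying" the instruction means overwriting the slot.
        slots[0] = before.WithInstruction("IR.add");

        return before.Instruction + " -> " + slots[0].Instruction;
    }
}
```

Here `OverwriteDemo.Run()` returns `"IL.add -> IR.add"`: the captured copy still names the old instruction, while the slot holds the new one.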

All changes to an instruction happen through the InstructionCollection, which also maintains bits that indicate whether the data flow graphs or control flow graphs need to be rebuilt.
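A minimal sketch of such bookkeeping is given below. The `GraphState` flags, the `IsBranch` property, and the rule that only branch-related replacements invalidate the control flow graph are all assumptions for illustration; the text above does not say which changes invalidate which graph.

```csharp
using System;
using System.Collections.Generic;

[Flags]
public enum GraphState
{
    Valid = 0,
    DataFlowDirty = 1,
    ControlFlowDirty = 2
}

// Reduced stand-in for InstructionData; IsBranch is a hypothetical property.
public struct InstructionData
{
    private readonly string mnemonic;
    private readonly bool isBranch;

    public InstructionData(string mnemonic, bool isBranch)
    {
        this.mnemonic = mnemonic;
        this.isBranch = isBranch;
    }

    public string Mnemonic { get { return mnemonic; } }
    public bool IsBranch { get { return isBranch; } }
}

public sealed class InstructionCollection
{
    private readonly List<InstructionData> slots;
    private GraphState state = GraphState.Valid;

    public InstructionCollection(IEnumerable<InstructionData> initial)
    {
        slots = new List<InstructionData>(initial);
    }

    public GraphState State { get { return state; } }
    public InstructionData this[int index] { get { return slots[index]; } }

    // The only way to change a slot, so the flags cannot be bypassed.
    public void Replace(int index, InstructionData data)
    {
        InstructionData old = slots[index];
        slots[index] = data;

        // Assumption: any replacement may change definitions and uses.
        state |= GraphState.DataFlowDirty;

        // Assumption: only adding or removing a branch changes control flow.
        if (old.IsBranch || data.IsBranch)
            state |= GraphState.ControlFlowDirty;
    }

    // Called by whichever stage rebuilds the graphs.
    public void MarkRebuilt() { state = GraphState.Valid; }
}
```

Routing every change through `Replace` lets later stages check `State` and rebuild a graph only when its flag is set.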

Last edited Jul 25, 2009 at 11:38 PM by __grover, version 1

