Understanding the Microsoft Intermediate Language
page 8 of 12
by Joydip Kanjilal

Microsoft Intermediate Language (MSIL)

MSIL is the CPU-independent instruction set (also known as the Common Intermediate Language, or CIL, which is defined by the Common Language Infrastructure specification) that language compilers emit for programs targeting the .NET managed environment. It is not interpreted; rather, it is compiled to native code before its execution. When a compiler such as the C# compiler compiles managed source code, it produces this intermediate code, which is independent of the underlying OS and the system's hardware. The CLR's Just-in-Time (JIT) compiler in turn converts the intermediate code to native code. The MSIL code is verified for type safety at runtime, before it executes, to ensure security and reliability. Because it is an intermediate code, the same MSIL can be compiled to native code on any supported architecture. Further, the MSIL code is not converted in its entirety in one go. Rather, the JIT compiler converts each method to native code when it is first needed at runtime, and the resulting native code is cached so that subsequent calls reuse it.
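As a rough illustration of on-demand compilation, the sketch below uses RuntimeHelpers.PrepareMethod, which is commonly used to ask the runtime to compile a method before its first call instead of waiting for that call. The class JitDemo and its method Square are hypothetical names introduced only for this example, and the exact compilation behavior depends on the runtime version and its settings.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class JitDemo
{
    // A trivial method; normally its MSIL is compiled to native code
    // the first time the method is called.
    public static int Square(int x)
    {
        return x * x;
    }

    // Asks the runtime to prepare (compile) Square ahead of its first call.
    public static void Precompile()
    {
        MethodInfo method = typeof(JitDemo).GetMethod("Square");
        RuntimeHelpers.PrepareMethod(method.MethodHandle);
    }

    public static void Main()
    {
        Precompile();
        Console.WriteLine(Square(7)); // prints 49
    }
}
```

Without the call to Precompile the program prints the same result; the only difference is when the native code for Square gets produced.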

The compiler also produces metadata along with the MSIL code for any program that targets the CLR's execution environment. The metadata describes the types in the code, and the assembly manifest, which is part of this metadata, describes the assembly itself. The metadata typically contains the following:

·         Definition and signature of all types inside the code

·         The types that are referenced inside the code

·         Runtime information needed for execution

The assembly metadata helps in the following aspects:

·         Verification

·         Object Serialization

·         Garbage Collection

·         Reflection to inspect the types at runtime
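The last point can be tried out directly. The following is a minimal sketch, assuming a hypothetical class Sample and a helper MetadataDemo.Describe that are not part of the article; it uses the standard reflection API to read method signatures back out of the metadata at runtime.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class Sample
{
    public static void Main(string[] args)
    {
        // The base type is also recorded in the metadata.
        Console.WriteLine("Base type: " + typeof(Sample).BaseType.FullName);
        foreach (string signature in MetadataDemo.Describe(typeof(Sample)))
            Console.WriteLine(signature);
    }
}

public static class MetadataDemo
{
    // Returns one "ReturnType Name(ParamType paramName, ...)" string for each
    // public method declared directly on the given type.
    public static string[] Describe(Type type)
    {
        BindingFlags flags = BindingFlags.Public | BindingFlags.Static |
                             BindingFlags.Instance | BindingFlags.DeclaredOnly;
        return type.GetMethods(flags)
            .Select(m => m.ReturnType.Name + " " + m.Name + "(" +
                string.Join(", ", m.GetParameters()
                    .Select(p => p.ParameterType.Name + " " + p.Name)) + ")")
            .ToArray();
    }
}
```

Everything printed here comes from the metadata stored in the compiled assembly; the source file is not consulted.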

Let us now look at the MSIL code that is generated when a simple program is compiled. We consider the simplest possible code here to avoid unnecessary complexity. Consider the code shown in Listing 1.

Listing 1: A simple C# class that displays a line of text

public class test
{
 public static void Main(string[] args)
 {
   System.Console.WriteLine("Joydip Kanjilal");
 }
}

The following is the MSIL code for the class "test" that is generated on compilation of the source code shown in Listing 1.

Listing 2: The MSIL code that is generated on compilation

.class public auto ansi beforefieldinit test extends [mscorlib]System.Object
{
  .method public hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    .maxstack  1
    IL_0000:  ldstr      "Joydip Kanjilal"
    IL_0005:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000a:  ret
  }
 
  .method public hidebysig specialname rtspecialname instance void  .ctor() cil managed
  {
    .maxstack  1
    IL_0000:  ldarg.0
    IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0006:  ret
  } 
}

Explanation

IL instructions are encoded as operation codes (opcodes) that are either 1 byte or 2 bytes long. This section discusses some of the opcodes that are in frequent use. We will now walk through the MSIL code in Listing 2, generated from the source code in Listing 1, to understand these concepts better.
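The opcode sizes also explain the IL_xxxx offsets in Listing 2. The sketch below, which assumes a hypothetical helper named OpCodeDemo.EncodedLength, uses the OpCodes table in System.Reflection.Emit to add up the opcode bytes and the operand bytes of each instruction in Main. It handles only the operand kinds that occur in Listing 2.

```csharp
using System;
using System.Reflection.Emit;

public static class OpCodeDemo
{
    // Encoded length of one instruction: opcode bytes plus operand bytes.
    // Only the operand kinds used in Listing 2 are handled.
    public static int EncodedLength(OpCode op)
    {
        switch (op.OperandType)
        {
            case OperandType.InlineNone:
                return op.Size;          // no operand, e.g. ret
            case OperandType.InlineString:
            case OperandType.InlineMethod:
                return op.Size + 4;      // 4-byte metadata token
            default:
                throw new NotSupportedException(op.Name);
        }
    }

    public static void Main()
    {
        OpCode[] body = { OpCodes.Ldstr, OpCodes.Call, OpCodes.Ret };
        int offset = 0;
        foreach (OpCode op in body)
        {
            Console.WriteLine("IL_{0:x4}:  {1}", offset, op.Name);
            offset += EncodedLength(op);
        }

        // ceq is one of the opcodes that needs two bytes (0xFE 0x01).
        Console.WriteLine("{0} is a {1}-byte opcode", OpCodes.Ceq.Name, OpCodes.Ceq.Size);
    }
}
```

The printed offsets are IL_0000, IL_0005 and IL_000a, matching the Main method in Listing 2: ldstr and call each take one opcode byte plus a four-byte token, and ret takes a single byte.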

It should be noted that every class in .NET derives, directly or indirectly, from the Object class in the System namespace, which is why Listing 2 shows the class test extending [mscorlib]System.Object. The mscorlib.dll assembly contains the core types of the base class library, including System.Object and System.Console. The .entrypoint directive marks the method at which the program's execution starts; an executable can have only one such method. The ret instruction in both methods (the Main method and the default constructor) returns control to the caller, ending the method call. Note that the call instruction that invokes WriteLine carries the full method signature (argument types and return type) as well as the assembly, the namespace, and the class to which WriteLine belongs. This is helpful in validating the code for consistency and integrity at runtime.
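The raw bytes behind a listing like Listing 2 can be read back at runtime. The following is a minimal sketch, assuming hypothetical classes Greeter and IlDump that are not part of the article; it relies on MethodBase.GetMethodBody and MethodBody.GetILAsByteArray from the reflection API. The exact bytes depend on the compiler and the build configuration (a debug build typically adds nop instructions), so no particular output is shown.

```csharp
using System;
using System.Reflection;

public class Greeter
{
    // Same body as Main in Listing 1, under a different name.
    public static void Greet()
    {
        Console.WriteLine("Joydip Kanjilal");
    }
}

public static class IlDump
{
    // Returns the MSIL bytes of the named public method.
    public static byte[] GetIl(Type type, string methodName)
    {
        MethodInfo method = type.GetMethod(methodName);
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    public static void Main()
    {
        byte[] il = GetIl(typeof(Greeter), "Greet");
        // Prints the bytes as hyphen-separated hex pairs.
        Console.WriteLine(BitConverter.ToString(il));
    }
}
```

In the output, 72 is the ldstr opcode, 28 is call, and the final 2A is ret; the four bytes after ldstr and after call are metadata tokens that identify the string literal and the WriteLine method.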

The MSIL instruction ldstr pushes a reference to the string literal given as its operand onto the evaluation stack. The hidebysig attribute states that the method hides inherited members that have the same name and signature, rather than every inherited member with the same name; it is intended for compilers and other tools and is ignored by the runtime. The auto attribute implies that the layout of the class's fields in memory is determined by the runtime, while the ansi attribute specifies that strings are marshaled as ANSI strings during interoperation between managed and unmanaged code.

Needless to say, the public attribute on a class member implies that the member can be accessed from any other part of the program. The static attribute implies that a member belongs to the class and not to its instances, i.e., a static member can be used without instantiating the class and is shared across all instances of it. The .ctor name in Listing 2 denotes an instance constructor; here it is the default constructor that the compiler generated, and it simply calls the constructor of System.Object. The .maxstack directive indicates the maximum number of items that the method can have on the evaluation stack at any point during its execution.

This completes the explanation of the MSIL code shown in Listing 2. Let us now understand what the Portable Executable (PE) and Common Object File Format (COFF) file formats, in which the MSIL code is stored, are. The following section discusses PE and COFF.
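Most of the attributes seen in Listing 2 can also be queried through reflection. The sketch below assumes a hypothetical class Hello, equivalent to the class in Listing 1, and a helper AttributeDemo.Report introduced only for this example. The values reflect what the C# compiler typically emits for such a class.

```csharp
using System;
using System.Reflection;

public class Hello
{
    public static void Main(string[] args)
    {
        Console.WriteLine(AttributeDemo.Report(typeof(Hello)));
    }
}

public static class AttributeDemo
{
    // Builds a short report of the flags discussed for Listing 2.
    public static string Report(Type t)
    {
        MethodInfo main = t.GetMethod("Main");
        ConstructorInfo ctor = t.GetConstructor(Type.EmptyTypes);
        bool beforeFieldInit = (t.Attributes & TypeAttributes.BeforeFieldInit) != 0;

        return string.Join(Environment.NewLine, new[]
        {
            "auto layout:     " + t.IsAutoLayout,
            "ansi:            " + t.IsAnsiClass,
            "beforefieldinit: " + beforeFieldInit,
            "Main hidebysig:  " + main.IsHideBySig,
            "Main static:     " + main.IsStatic,
            "ctor name:       " + ctor.Name,
            "ctor special:    " + ctor.IsSpecialName
        });
    }
}
```

The constructor is reported under the name .ctor even though the C# source never declares it, because the compiler generates a default constructor for the class.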


©Copyright 1998-2021 ASPAlliance.com