28.9.08

Is C#'s Emitted IL Optimized?

Using .NET's Reflection.Emit library to dynamically compile methods at runtime is very powerful. However, I want to know whether I have to optimize my opcodes myself, or whether they will be simplified internally where possible. I will test delegates like these:
const int c_i = 2*i;
int constDel_i() { return c_i; }
int mulDel_i()   { return 2*i; }
int addDel_i()   { return 2 + 2 + ... + 2; }  // i terms

For a set of values of the integer i, I will create a constant delegate that returns the value of 2 times i. I will calculate this value while emitting the delegate and inline it, to avoid any calculation at call time.

I will also create a delegate which actually performs the multiplication of 2 by i and returns the value.

Finally, I will create a delegate which performs i additions of 2, achieving the same value through more operations.

Here's my test code:
using System;
using System.Diagnostics;
using System.Reflection.Emit;

class ILOptimizedTest
{
    public const int count = 600000000;

    public static void RunILOptimizedTest()
    {
        for (int i = 2; i < 1000; i += 50)
        {
            Func<int> constDel = GetConstDelegate(i);
            Func<int> mulDel = GetMulDelegate(i);
            Func<int> addDel = GetAddDelegate(i);

            // Run once to get rid of any startup time
            double constResult = constDel();
            Stopwatch constTime = Stopwatch.StartNew();
            for (int c = 0; c < count; c++)
            {
                constResult = constDel();
            }
            constTime.Stop();

            // Run once to get rid of any startup time
            double mulResult = mulDel();
            Stopwatch mulTime = Stopwatch.StartNew();
            for (int c = 0; c < count; c++)
            {
                mulResult = mulDel();
            }
            mulTime.Stop();

            // Run once to get rid of any startup time
            double addResult = addDel();
            Stopwatch addTime = Stopwatch.StartNew();
            for (int c = 0; c < count; c++)
            {
                addResult = addDel();
            }
            addTime.Stop();

            // Columns: i, three times in ms, then the three returned values
            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{6}",
                i,
                constTime.ElapsedMilliseconds,
                mulTime.ElapsedMilliseconds,
                addTime.ElapsedMilliseconds,
                constResult, mulResult, addResult);
        }
    }

    // eval(2*i): the product is computed here and emitted as a single constant
    public static Func<int> GetConstDelegate(int i)
    {
        DynamicMethod m = new DynamicMethod(
            "Const" + DateTime.Now.ToString(),
            typeof(int),
            new Type[] { },
            typeof(ILOptimizedTest),
            false);
        ILGenerator il = m.GetILGenerator();

        int total = (int)(2 * i);

        il.Emit(OpCodes.Ldc_I4, total);
        il.Emit(OpCodes.Ret);

        Func<int> del = (Func<int>)m.CreateDelegate(
            typeof(Func<int>));
        return del;
    }

    // 2*i: both operands are pushed and multiplied inside the delegate
    public static Func<int> GetMulDelegate(int i)
    {
        DynamicMethod m = new DynamicMethod(
            "Mul" + DateTime.Now.ToString(),
            typeof(int),
            new Type[] { },
            typeof(ILOptimizedTest),
            false);
        ILGenerator il = m.GetILGenerator();

        il.Emit(OpCodes.Ldc_I4, (int)2);
        il.Emit(OpCodes.Ldc_I4, i);
        il.Emit(OpCodes.Mul);
        il.Emit(OpCodes.Ret);

        Func<int> del = (Func<int>)m.CreateDelegate(
            typeof(Func<int>));
        return del;
    }

    // 2+2+2+2 ... i times
    public static Func<int> GetAddDelegate(int i)
    {
        DynamicMethod m = new DynamicMethod(
            "Add" + DateTime.Now.ToString(),
            typeof(int),
            new Type[] { },
            typeof(ILOptimizedTest),
            false);
        ILGenerator il = m.GetILGenerator();

        // Push the first 2, then emit (i - 1) "push 2, add" pairs
        il.Emit(OpCodes.Ldc_I4, (int)2);
        for (int n = i; n > 1; n--)
        {
            il.Emit(OpCodes.Ldc_I4, (int)2);
            il.Emit(OpCodes.Add);
        }
        il.Emit(OpCodes.Ret);

        Func<int> del = (Func<int>)m.CreateDelegate(
            typeof(Func<int>));
        return del;
    }
}

After verifying that all three delegates are returning correct values, I cranked it up and let it go.

Time to calculate 2*i 600,000,000 times (ms)
i     Const   Mult    Add
2     4732    4805    4979
52    4691    4689    5243
102   4643    4670    5255
152   4659    4940    5255
202   4681    4946    5303
252   4716    4990    5368
302   5627    5068    5423
352   4657    5008    5263
402   4698    5316    5401
452   4711    4985    5282
502   4709    4975    5286
552   4702    4960    5261
602   4671    4975    5257
652   4683    4990    5280
702   4671    5562    5323
752   4685    4978    5260
802   4677    4972    5260
852   4668    4959    5261
902   4695    4964    5259
952   4654    4960    5277

The first thing to notice is that these times are the totals for running each delegate 600 million times. So the take-home message is... who cares? Closer inspection reveals some problems:

In nearly every row, the constant delegate runs faster than the multiplication delegate, which runs faster than the addition delegate. However, it's not clear why. As i increases, the addition delegate's time should increase linearly with the number of opcodes, but it doesn't. All three are actually pretty constant, which is expected for the first two methods, but not for addition.
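
To put the totals in perspective, here is a small sketch that converts them to time per call. The class and method names (PerCallSketch, NanosPerCall) are made up for illustration and are not part of the test code; the numbers are the Const, Mult and Add totals for i = 2 and i = 952 from the results above.

```csharp
using System;

class PerCallSketch
{
    // Same iteration count as the benchmark
    const int Count = 600000000;

    // Hypothetical helper: total milliseconds over Count calls -> nanoseconds per call
    public static double NanosPerCall(long totalMs)
    {
        return totalMs * 1000000.0 / Count;
    }

    static void Main()
    {
        string[] names = { "Const", "Mult", "Add" };
        long[] first = { 4732, 4805, 4979 };  // totals in ms for i = 2
        long[] last = { 4654, 4960, 5277 };   // totals in ms for i = 952

        for (int k = 0; k < names.Length; k++)
        {
            Console.WriteLine("{0}\ti=2: {1:F2} ns/call\ti=952: {2:F2} ns/call",
                names[k], NanosPerCall(first[k]), NanosPerCall(last[k]));
        }
    }
}
```

Each total works out to a handful of nanoseconds per call. The addition delegate emits 951 extra ldc/add pairs at i = 952 compared with i = 2, yet its total changes by only about 300 ms over 600 million calls.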

Overall, I'm not convinced these results show anything. I tried adding a dummy parameter to the delegate and passing in the loop variable so that the result couldn't be cached, but it had no effect. My only conclusion is that something is being optimized. If it were the IL, then all three methods should have taken the same amount of time. Given the number of iterations performed and the total time taken, I'm pretty confident in the numbers I got. I just don't have a clue what they mean.
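
For reference, a minimal sketch of what such a dummy-parameter variant might look like. This is an assumption about the shape of that experiment, not the exact code that was run: the names DummyArgSketch and GetMulDelegateWithDummyArg are hypothetical, and the emitted body is the same as in GetMulDelegate, so the argument is accepted but never read.

```csharp
using System;
using System.Reflection.Emit;

class DummyArgSketch
{
    // Hypothetical variant of GetMulDelegate: takes one int argument and ignores it
    public static Func<int, int> GetMulDelegateWithDummyArg(int i)
    {
        DynamicMethod m = new DynamicMethod(
            "MulDummy" + i,
            typeof(int),
            new Type[] { typeof(int) },  // the dummy parameter
            typeof(DummyArgSketch),
            false);
        ILGenerator il = m.GetILGenerator();

        // Same body as the parameterless version: push 2, push i, multiply
        il.Emit(OpCodes.Ldc_I4, 2);
        il.Emit(OpCodes.Ldc_I4, i);
        il.Emit(OpCodes.Mul);
        il.Emit(OpCodes.Ret);

        return (Func<int, int>)m.CreateDelegate(typeof(Func<int, int>));
    }

    static void Main()
    {
        Func<int, int> mulDel = GetMulDelegateWithDummyArg(21);
        int result = 0;
        for (int c = 0; c < 1000; c++)
        {
            // Pass the loop variable so every call has a different argument
            result = mulDel(c);
        }
        Console.WriteLine(result);  // prints 42
    }
}
```

Because the body never loads the argument, the return value still depends only on i. A stricter variant would fold the argument into the result (for example with Ldarg_0), at the cost of changing what is being measured.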

The next step is to compile the dynamic method to a DLL, then use Reflector to inspect it to see exactly what is being returned. I could also inspect my loops to see if any of the code there is being optimized away.
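
A rough sketch of how that could be done, assuming the .NET Framework (AssemblyBuilderAccess.RunAndSave and AssemblyBuilder.Save are not available on .NET Core). A DynamicMethod cannot be written to disk, so the same opcodes are emitted into a MethodBuilder instead. The names SaveIlSketch, SaveAddMethod, "EmittedIL" and "Emitted" are placeholders chosen for this example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

class SaveIlSketch
{
    // Hypothetical helper: emits the "2+2+...+2" body into a saved assembly
    // and returns what the emitted method computes.
    public static int SaveAddMethod(int i, string fileName)
    {
        AssemblyName name = new AssemblyName("EmittedIL");
        AssemblyBuilder asm = AppDomain.CurrentDomain.DefineDynamicAssembly(
            name, AssemblyBuilderAccess.RunAndSave);
        ModuleBuilder mod = asm.DefineDynamicModule("EmittedIL", fileName);
        TypeBuilder type = mod.DefineType("Emitted", TypeAttributes.Public);
        MethodBuilder add = type.DefineMethod(
            "Add",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int),
            Type.EmptyTypes);

        // Same opcodes as GetAddDelegate
        ILGenerator il = add.GetILGenerator();
        il.Emit(OpCodes.Ldc_I4, 2);
        for (int n = i; n > 1; n--)
        {
            il.Emit(OpCodes.Ldc_I4, 2);
            il.Emit(OpCodes.Add);
        }
        il.Emit(OpCodes.Ret);

        Type created = type.CreateType();
        asm.Save(fileName);  // written to the current directory

        // Invoke the emitted method to check it still returns 2*i
        return (int)created.GetMethod("Add").Invoke(null, null);
    }

    static void Main()
    {
        Console.WriteLine(SaveAddMethod(52, "EmittedIL.dll"));
    }
}
```

The saved file should contain the IL exactly as it was emitted, since as far as I know ILGenerator does not rewrite opcodes. That answers what the IL looks like, but not what the JIT compiler does with it at runtime, which would need a look at the generated machine code instead.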

So far, results are inconclusive.
