28.9.08

Is C#'s Emitted IL Optimized?

Using .NET's Reflection.Emit library to dynamically compile methods at runtime is very powerful. However, I want to know whether I have to optimize my opcodes, or whether they will be simplified internally where possible. I will test delegates like these:
const int ci = 2*i;
int constDel() { return ci; }
int mulDel()   { return 2*i; }
int addDel()   { return 2 + 2 + ... + 2; } // 2 added i times

For a set of values of i, I will create a constant delegate that returns the value of 2 times i. I will calculate this value while emitting the delegate and inline it, so the delegate performs no calculation at all.

I will also create a delegate which actually performs the multiplication of 2 by i and returns the value.

Finally, I will create a delegate which performs i additions of 2, achieving the same value through more operations.

Here's my test code:
class ILOptimizedTest
{
public const int count = 600000000;

public static void RunILOptimizedTest()
{
for (int i = 2; i < 1000; i+=50)
{
Func<int> constDel = GetConstDelegate(i);
Func<int> mulDel = GetMulDelegate(i);
Func<int> addDel = GetAddDelegate(i);

// Run once to get rid of any startup time
double constResult = constDel();
Stopwatch constTime = Stopwatch.StartNew();
for (int c = 0; c < count; c++)
{
constResult = constDel();
}
constTime.Stop();

// Run once to get rid of any startup time
double mulResult = mulDel();
Stopwatch mulTime = Stopwatch.StartNew();
for (int c = 0; c < count; c++)
{
mulResult = mulDel();
}
mulTime.Stop();

// Run once to get rid of any startup time
double addResult = addDel();
Stopwatch addTime = Stopwatch.StartNew();
for (int c = 0; c < count; c++)
{
addResult = addDel();
}
addTime.Stop();

Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{6}",
i,
constTime.ElapsedMilliseconds,
mulTime.ElapsedMilliseconds,
addTime.ElapsedMilliseconds,
constResult, mulResult, addResult);
}
}

// eval(2*i)
public static Func<int> GetConstDelegate(int i)
{
DynamicMethod m = new DynamicMethod(
"Const" + DateTime.Now.ToString(),
typeof(int),
new Type[] { },
typeof(ILOptimizedTest),
false);
ILGenerator il = m.GetILGenerator();

int total = (int) (2*i);

il.Emit(OpCodes.Ldc_I4, total);
il.Emit(OpCodes.Ret);

Func<int> del = (Func<int>)m.CreateDelegate(
typeof(Func<int>));
return del;
}

// 2*i
public static Func<int> GetMulDelegate(int i)
{
DynamicMethod m = new DynamicMethod(
"Mul" + DateTime.Now.ToString(),
typeof(int),
new Type[] { },
typeof(ILOptimizedTest),
false);
ILGenerator il = m.GetILGenerator();

il.Emit(OpCodes.Ldc_I4, (int)2);
il.Emit(OpCodes.Ldc_I4, i);
il.Emit(OpCodes.Mul);
il.Emit(OpCodes.Ret);

Func<int> del = (Func<int>)m.CreateDelegate(
typeof(Func<int>));
return del;
}

// 2+2+2+2 ... i times
public static Func<int> GetAddDelegate(int i)
{
DynamicMethod m = new DynamicMethod(
"Add" + DateTime.Now.ToString(),
typeof(int),
new Type[] { },
typeof(ILOptimizedTest),
false);
ILGenerator il = m.GetILGenerator();

il.Emit(OpCodes.Ldc_I4, (int)2);
for (int n = i; n > 1; n--)
{
il.Emit(OpCodes.Ldc_I4, (int)2);
il.Emit(OpCodes.Add);
}
il.Emit(OpCodes.Ret);

Func<int> del = (Func<int>)m.CreateDelegate(
typeof(Func<int>));
return del;
}
}

After verifying that all three delegates are returning correct values, I cranked it up and let it go.

Time to calculate 2*i 600,000,000 times (ms)

i      Const   Mult    Add
2      4732    4805    4979
52     4691    4689    5243
102    4643    4670    5255
152    4659    4940    5255
202    4681    4946    5303
252    4716    4990    5368
302    5627    5068    5423
352    4657    5008    5263
402    4698    5316    5401
452    4711    4985    5282
502    4709    4975    5286
552    4702    4960    5261
602    4671    4975    5257
652    4683    4990    5280
702    4671    5562    5323
752    4685    4978    5260
802    4677    4972    5260
852    4668    4959    5261
902    4695    4964    5259
952    4654    4960    5277

The first thing to notice is that these times are the sum of running each delegate 600 million times. So the take-home message is... who cares? Still, closer inspection reveals some problems:

It's obvious that the constant delegate runs faster than the multiplication delegate, which runs faster than the addition delegate. However, it's not clear why. As i increases, the addition delegate's time should increase linearly with the number of emitted opcodes, but it doesn't. All three are actually pretty constant across i, which is expected for the first two methods, but not for addition.

Overall, I'm not convinced these results show anything. I tried adding a dummy parameter to the delegate, and passing in the loop variable so that the result couldn't be cached, but it had no effect. My only conclusion is that something is being optimized. If it were the IL itself, all three methods should have taken the same amount of time. Due to the number of iterations performed and the total time taken, I'm pretty confident in the numbers I got; I just don't have a clue what they mean.

The next step is to compile the dynamic method to a DLL, then use Reflector to inspect it to see exactly what is being returned. I could also inspect my loops to see if any of the code there is being optimized away.
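
A DynamicMethod itself can't be saved to disk, but the same opcodes can be emitted into a savable assembly via AssemblyBuilder and then opened in Reflector. Here's a rough sketch of how I might dump the multiplication method for inspection (the names here are illustrative, not part of the test harness):
using System;
using System.Reflection;
using System.Reflection.Emit;

static class IlDumper
{
    // Emit the same opcodes into a savable assembly so the result can be inspected in Reflector.
    public static void SaveMulMethod(int i)
    {
        AssemblyName name = new AssemblyName("ILOptimizedTestDump");
        AssemblyBuilder asm = AppDomain.CurrentDomain.DefineDynamicAssembly(
            name, AssemblyBuilderAccess.RunAndSave);
        ModuleBuilder mod = asm.DefineDynamicModule(name.Name, name.Name + ".dll");
        TypeBuilder type = mod.DefineType("Dump", TypeAttributes.Public);

        MethodBuilder method = type.DefineMethod(
            "Mul", MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), Type.EmptyTypes);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldc_I4, 2);
        il.Emit(OpCodes.Ldc_I4, i);
        il.Emit(OpCodes.Mul);
        il.Emit(OpCodes.Ret);

        type.CreateType();
        asm.Save(name.Name + ".dll"); // produces ILOptimizedTestDump.dll next to the executable
    }
}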

So far, results are inconclusive.

Doing Better than Math.Pow()

This is a sequel of sorts to my previous series on expression evaluation and relates to the framework I built there.

The source of .NET's Math.Pow() method isn't publicly available, but it probably uses an exponential/logarithm identity to calculate powers, along the lines of:
a^b = e^(b*ln(a))

This makes it extremely consistent, both value-wise and time-wise. A power of 0.5 takes the same time to calculate as a power of 2, or 2.01, or 53. (This seems to hold for small powers, but not for larger ones; there appears to be a cutoff, determined by the combination of a and b, at which point a different method is used that takes about three times as long.)

However, when calculating integer powers, multiplication can also be used, and it is much faster for small integers. Consider the following data:

Time to calculate 3^i 800,000 times (ms)

i      Control   Math.Pow   Mult
2      11        190        19
3      8         164        24
4      8         161        28
5      8         165        33
6      8         160        41
7      8         159        43
8      8         159        48
9      9         159        54
10     8         158        58
11     8         165        65
12     8         158        71
13     8         161        74
14     8         159        79
15     8         159        86
16     8         159        91
17     8         158        95
18     8         159        99
19     8         158        106
20     8         159        113
21     8         159        118
22     8         158        123
23     8         159        126
24     8         158        136
25     9         161        134
26     8         158        145
27     8         159        149
28     8         161        158
29     8         159        158


The first column is the exponent, and the other three columns are the total number of milliseconds elapsed while evaluating 3^i 800,000 times with each method. The first result column is actually a control, where I just call a static method that returns -1. This indicates how long the static method call itself takes, compared to the actual calculation.

The second result column is the time it took to call Math.Pow(3, i), while the last column is the result of an inline iterative multiplication calculation. It's obvious that until the very highest powers tested, inline multiplication is faster--with a power of 2, it's 10x faster. Obviously inline multiplication cannot be used for non-integer powers, but it would be great if I could take advantage of this speed for integer powers between 2 and 28 (and -2 to -28, by using a^-b = 1/(a^b)).
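
For reference, the timing loop behind those numbers is roughly this shape. This is a minimal sketch, not the actual test harness; MultPow, Control, and the iteration count are illustrative:
using System;
using System.Diagnostics;

static class PowBenchmark
{
    const int Iterations = 800000;

    // Baseline: roughly measures the cost of a static method call by itself.
    static double Control(double a, int b) { return -1; }

    // Inline iterative multiplication: b-1 multiplications for an integer exponent b >= 1.
    static double MultPow(double a, int b)
    {
        double result = a;
        for (int n = 1; n < b; n++)
        {
            result *= a;
        }
        return result;
    }

    public static void Run(int exponent)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int c = 0; c < Iterations; c++) Control(3, exponent);
        long controlMs = sw.ElapsedMilliseconds;

        sw = Stopwatch.StartNew();
        for (int c = 0; c < Iterations; c++) Math.Pow(3, exponent);
        long powMs = sw.ElapsedMilliseconds;

        sw = Stopwatch.StartNew();
        for (int c = 0; c < Iterations; c++) MultPow(3, exponent);
        long multMs = sw.ElapsedMilliseconds;

        Console.WriteLine("3^{0}: control {1} ms, Math.Pow {2} ms, mult {3} ms",
            exponent, controlMs, powMs, multMs);
    }
}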

Easy Pickings

public override Node Simplify(Node node)
{
Number x = node.Left.Value as Number;
Number y = node.Right.Value as Number;

if (x != null && y != null)
{
return new Node(new Number(Math.Pow(x.Value, y.Value)));
}

if (y != null)
{
if (y.Value == -1)
{
// x^-1 => 1/x
Node n = new Node(new Division());
n.Left = Number.One;
n.Right = node.Left;
return n;
}
if (y.Value == 0)
{
return Number.One;
}
if (y.Value == 1)
{
return node.Left;
}
}

// Nothing to simplify
return node;
}

After the expression tree is parsed, it goes through a simplification phase, where I iterate through it in postfix order, simplifying each node. This is the SimplifyNode() implementation for a Power Node. As the section heading says, these are the easy pickings. If both the argument and the power (a and b) are numbers, I just evaluate the power and replace this node with the numerical result. I also replace some simple powers with a shorter form. This will help simplify the tree, but won't provide any further benefit.

In-Place Expansion

if (y.Value >= 2 && y.Value <= 28)
{
// Iteratively construct multiplication tree to replace power operation.
Node n = new Node(new Multiplication());
n.Left = node.Left;
n.Right = node.Left;
for (int i = 0; i < y.Value - 2; i++)
{
Node nParent = new Node(new Multiplication());
nParent.Left = n;
nParent.Right = node.Left;

n = nParent;
}
return n;
}

Adding this to the SimplifyNode() method handles integer powers from 2 through 28, which is the range the earlier table indicated would benefit. It actually converts a Power node into nested Multiplication nodes--in essence, converting x^2 into x*x, x^3 into (x*x)*x, and so on. Let's see how this compares when I compile the tree and execute it:

Time to evaluate an x^i tree 800,000 times (ms)

i      Math.Pow   Mult
2      235        99
3      226        103
4      214        116
5      229        111
6      233        115
7      228        155
8      279        157
9      230        149
10     221        133
11     211        137
12     223        147
13     218        186
14     236        175
15     266        161
16     215        197
17     255        203
18     259        212
19     224        177
20     218        197
21     219        186
22     227        193
23     234        235
24     254        240
25     263        254
26     266        257
27     263        271
28     270        266
29     261        269

Things are good, as expected. However, there is one huge shortcoming: expanding the tree causes repeated calculations:
x^3 = x*x*x
cos(x)^3 = cos(x)*cos(x)*cos(x)

Time to evaluate a cos(x)^i tree 800,000 times (ms)

i      Math.Pow   Mult
2      302        204
3      283        253
4      286        303
5      286        365
6      287        419
7      291        482
8      289        534
9      289        595
10     291        645
11     283        739
12     280        739
13     285        798
14     285        845

The table above makes it clear that literal tree expansion is a terrible way to convert exponents to multiplications. Instead, I want something like this:
double c = Math.Cos(x);
return c*c*c;

This is actually achievable, since I don't need to evaluate the expression being raised to a power more than once. However, my current system doesn't support this. The way I compile from an expression tree into IL is based on the CLR stack, so by the time I'm emitting the power opcodes, the arguments are already on the stack. I need to work within these confines to achieve this.

Enter the PowerExp


I created a new operator, the PowerExp. It won't be created automatically by parsing, but rather by the simplification engine.

Since all the operands are loaded to the stack before the operator, I had to remove the ^2 part from the tree. I could have done a conversion like this:
x^2 => Square(x)
x^3 => Cube(x)

That would require quite a few operators, though. Instead, I made one operator that takes the exponent as a Property. Consider these two methods:
private static Node BuildPowerTree(int n)
{
Node powerNode = new Node(new Power());
powerNode.Left = Parameter.X;
powerNode.Right = Number.Node(n);
return powerNode;
}
private static Node BuildPowerExpTree(int n)
{
Node powerExpNode = new Node(new PowerExp(n));
powerExpNode.Left = Parameter.X;
return powerExpNode;
}

In the second method, I pass n into the operator rather than making it the Right node. This allows me to avoid pushing it to the stack. Now I can implement the PowerExp emitter:
// Eval n^power.  n is already on the stack
public override void Emit(System.Reflection.Emit.ILGenerator ilg)
{
ilg.Emit(OpCodes.Stloc_0); // Store n
ilg.Emit(OpCodes.Ldloc_0); // Load n
for (int i = this.Power; i > 1; i--)
{
ilg.Emit(OpCodes.Ldloc_0); // Load n
ilg.Emit(OpCodes.Mul); // Multiply
} // Repeat to achieve power
}

Timings:

Time to evaluate a cos(x)^i tree 800,000 times (ms)

i      PowerExp   Math.Pow   Mult
2      183        306        221
3      157        301        270
4      154        304        333
5      160        298        366
6      161        298        432
7      164        299        500
8      165        289        556
9      166        289        601
10     165        290        658
11     170        293        712
12     173        292        770
13     170        291        829
14     178        294        877
15     175        283        909
16     175        288        952
17     184        279        1007
18     175        285        1055
19     175        282        1105
20     181        278        1172
21     182        281        1219
22     188        291        1280
23     186        278        1330
24     189        281        1400
25     195        283        1446
26     196        281        1499
27     201        285        1547
28     198        287        1596
29     200        288        1670

The results basically speak for themselves. In fact, PowerExp is faster than Math.Pow() up to around x^60. That's 59 multiplication opcodes in a row! Plus, this method doesn't clutter my tree with extra nodes. For evaluating f(x)^i...
  1. Basic multiplying decreases performance due to redundant evaluations—never a good idea

  2. Math.Pow is constant across f(x) and i (true for small values)

  3. PowerExp is linear with respect to i, regardless of complexity of f(x)

To summarize, I found a way to greatly improve Power evaluation performance for small integer values. By writing rules into my simplification engine, I can convert certain optimal cases from Power into PowerExp, taking advantage of this speed improvement.
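
For completeness, the conversion rule itself is small. Here's a sketch of what it might look like inside the Power node's simplification, reusing the Node, Number, and PowerExp types from above (the integer check and exact member names are assumptions):
// Sketch: convert f(x)^n into PowerExp(n) for small integer exponents.
Number y = node.Right.Value as Number;
if (y != null && y.Value == Math.Floor(y.Value) && y.Value >= 2 && y.Value <= 28)
{
    Node powerExpNode = new Node(new PowerExp((int)y.Value));
    powerExpNode.Left = node.Left;   // keep the single f(x) subtree
    return powerExpNode;
}
return node;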

27.9.08

Dynamic C# Compilation

This is a sequel to my previous series on expression evaluation and follows up on what I talked about there. I recommend reading those articles first.

In my previous post, I determined that the fastest way to repeatedly evaluate a math expression given at runtime was to dynamically compile it, by parsing it into a tree and emitting opcodes. The sample code given in that post is functional, but hardly complete. Having every operation in a giant switch is not practical considering there are over 25 operations that I should be supporting. I need to refactor. I also need a way to enter parameters, so I can graph equations in x and y.

Here's the new version:
public static EvaluatorDelegate GetDynamicMethod(Tree<IToken> e, string[] parameters)
{
DynamicMethod m = new DynamicMethod(
"Evaluation" + DateTime.Now.ToString(),
typeof(double),
new Type[] { typeof(double[]) },
typeof(Evaluator),
false);
ILGenerator il = m.GetILGenerator();

LocalBuilder temp1 = il.DeclareLocal(typeof(double));

foreach (IToken t in e.PostFix)
{
// Try op
IOperator op = t as IOperator;
if (op != null)
{
op.Emit(il);
continue;
}

// Try number
Number n = t as Number;
if (n != null)
{
il.Emit(OpCodes.Ldc_R8, n.Value);
continue;
}

// Try parameter
Parameter p = t as Parameter;
if (p != null)
{
// Check to see if we know which parameter that is.
int pindex = Array.IndexOf<string>(parameters, p.Name);
if (pindex < 0)
{
throw new ApplicationException(
"Did not receive information on parameter: " + p.ToString());
}

il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldc_I4, pindex);
il.Emit(OpCodes.Ldelem_R8);

continue;
}

throw new ApplicationException(
"Invalid parse tree element: " + t.ToString());
}

il.Emit(OpCodes.Ret);

EvaluatorDelegate del = (EvaluatorDelegate)m.CreateDelegate(
typeof(EvaluatorDelegate));
return del;
}


The first thing to notice is that the large operator switch has been replaced with some type switching. I have a base interface IOperator, which takes care of emitting its own opcodes.
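
The interface itself is tiny; roughly something like this (whether it derives from IToken is my assumption, based on the cast in the loop above):
public interface IOperator : IToken
{
    // Operands are already on the CLR stack when this is called.
    void Emit(System.Reflection.Emit.ILGenerator ilg);
}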

Numbers are emitted the same as before—it just pushes them to the stack as a 64-bit floating-point type.

Parameters are handled through an array of doubles. The method now takes a parameter array which is used as a reference. To make this a bit easier to understand, I'll start with how this code is used:
EvaluatorDelegate del = GetDynamicMethod(tree, new string[] { "x", "y" });
double result = del(new double[] { x, y });

The delegate created by this method takes a double array whose elements correspond to the string array passed into GetDynamicMethod(). Looking at the code, when it finds a parameter, it checks the string array for the name of the parameter and saves that index. Then it loads the parameter value from the double array with the following:
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ldc_I4, pindex);
il.Emit(OpCodes.Ldelem_R8);

These opcodes push the double array (which is passed into the delegate as the first method argument) to the stack, followed by the desired index (found by searching the string array). The last opcode pops both of those and replaces them with the desired array element.

This could probably be improved by creating a delegate with an argument for each desired parameter. The DynamicMethod constructor takes a Type[] detailing these, so I could create that from the string array. This would remove the array indexing and reduce the opcodes to a single load—increasing performance. I'll do that later.
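
A sketch of that idea, untested: build the argument types from the parameter names and load each argument directly.
// One double argument per parameter name instead of a single double[] argument.
Type[] argTypes = new Type[parameters.Length];
for (int a = 0; a < argTypes.Length; a++)
{
    argTypes[a] = typeof(double);
}
DynamicMethod m = new DynamicMethod("Evaluation", typeof(double), argTypes, typeof(Evaluator), false);

// ...and where a Parameter token is handled, one opcode replaces the three array opcodes:
int pindex = Array.IndexOf<string>(parameters, p.Name);
il.Emit(OpCodes.Ldarg, (short)pindex);
The delegate type would have to change to match, taking one double per parameter rather than a double[].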

Those are basically the only changes to this method. The nature of postfix notation, where an operator comes after its arguments, is exactly how the .NET CLR's evaluation stack works. Assuming an expression is syntactically correct and parsed into a tree correctly, an operator is guaranteed to have its operands already on the stack by the time its Emit() is called. Look at this implementation:
public class Addition : BaseOperator
{
public override void Emit(System.Reflection.Emit.ILGenerator ilg)
{
ilg.Emit(OpCodes.Add);
}
}

Painfully easy. Method calls are also possible:
public class Sine : BaseFunction
{
public override void Emit(System.Reflection.Emit.ILGenerator ilg)
{
ilg.Emit(OpCodes.Call, typeof(System.Math).GetMethod("Sin"));
}
}

Some are not quite as simple:
public class Secant : BaseFunction
{
public override void Emit(System.Reflection.Emit.ILGenerator ilg)
{
ilg.Emit(OpCodes.Call, typeof(System.Math).GetMethod("Cos"));
ilg.Emit(OpCodes.Stloc_0);
ilg.Emit(OpCodes.Ldc_R8, 1.0D);
ilg.Emit(OpCodes.Ldloc_0);
ilg.Emit(OpCodes.Div);
}
}

Secant is defined as 1/cos(x), so I evaluate cosine, then store that in the first local variable (this is the temp1 variable created above). I push 1 on the stack, followed by the temp1 variable, and divide. Floating-point division by zero doesn't throw; it just produces Infinity (or NaN for 0/0), so no error checking is required.

21.9.08

Translating Manually-Buffered Graphics

This is a sequel of sorts to my previous post on double-buffering and builds on what I did there.

When I was 14 or so, I wrote a game in Visual Basic 3. It was a snake game, very originally named Nybbles, where multiple players steered their hungry, ever-increasing snake around the board to eat targets. Each target caused the snake to increase in size, making it more difficult to move around the board without hitting itself (or the other players!). When I was making this, I ran into a classic redraw problem: it took too long to redraw the entire board.

Perhaps it wasn't the most efficient method, but I stored the snakes' position on the game board in a large 2-dimensional array, where each position had a number, 1-4 for a player, or 0 if the square was blank. I used this for drawing the board and collision detection. Drawing the board was technically simple: iterate through every square and draw the correct colour segment in each one.

Of course, since each snake moved 1 square per turn, I quickly realized I could merely draw the new head segment, and redraw the tail segment as a normal square. This method only required drawing two squares per snake, which was much more responsive.

Fast-forward a decade or so, and I have returned to the same problem. I'm working on my math graphing program, and just added zoom and translation control via the mouse.



The goal: To enable dragging the graph around to view a different part of it.

The problem: Recalculating the points and redrawing the entire screen can potentially be too slow.

The solution: Translate the valid portion of the already-drawn graph, and invalidate and redraw the rest. Since the MouseMove event can be fired for every pixel change if the mouse is moving slow enough, this should provide serious time savings.



Here's a basic look at step 1, which is shown above. This graph was dragged down and to the right, leaving the black area as an invalidated region. First I need to translate the graph in the direction of the mouse movement; second, I can fill in the black area to complete the graph.
protected override void OnMouseMove(MouseEventArgs e)
{
if (this.MouseMode == MouseMode.Move && e.Button == MouseButtons.Left)
{
int xOffset = (e.X - this.startDragLocation.X);
int yOffset = (e.Y - this.startDragLocation.Y);

Rectangle translated = new Rectangle(new Point(xOffset, yOffset), this.Size);
using (BufferedGraphics newBuffer =
this.context.Allocate(this.buffer.Graphics, translated))
{
// Draw the current image translated into the newBuffer,
// then copy it back, but with everything shifted.
this.buffer.Render(newBuffer.Graphics);
newBuffer.Render(this.buffer.Graphics);
}

using (Graphics graphics = this.CreateGraphics())
{
buffer.Render(graphics);
// ...the method continues below with the invalidated-region handling

This is my (slightly simplified) code for moving the buffer. I create a new buffer with the offset passed in as the targetRectangle, and when I render the old buffer into it, it gets drawn with that offset. Then I render that back into the original buffer, overwriting the previous image. After this swap, the original buffer has been translated, so I can discard the new buffer.

This may not be the best or most efficient way to achieve this, but it's the only way I could find. From reading the documentation on MSDN, one would have a hard time believing this actually works... but it does. I looked into the TranslateTransform of the buffer's graphics, but that didn't yield any results. This appears to be very fast, so until I hear otherwise, I'm keeping it.

Rendering the original (now current) buffer to the screen gives something like the screenshot above, except that remnants of the previous screen are visible instead of the black border. This is the next phase, where I draw the invalidated regions.
Region invalidedRegion = new Region(new Rectangle(Point.Empty, this.Size));
invalidedRegion.Exclude(translated);

this.Invalidate(invalidedRegion); <---

The region to invalidate is pretty easy to calculate. I start with the footprint of the control, and exclude the location of the translated image. I planned to use the built-in handling for invalidated regions, but it turned out to be too slow. The issue is simply that it queues up OnPaint() messages to redraw all the rectangles contained in the region--but these will execute after any pending OnMouseMove messages. Rapid movement will cause multiple OnMouseMove events, and the invalid region for each needs to be redrawn in sync with the location of the graph.
RectangleF[] invalidRectangles = invalidedRegion.GetRegionScans(new Matrix());
foreach (RectangleF ir in invalidRectangles)
{
this.OnPaint(new PaintEventArgs(graphics,
new Rectangle((int)ir.X, (int)ir.Y,
(int)ir.Width, (int)ir.Height)));
}

This piece of code replaces the Invalidate() call. It breaks up the region into rectangles (only one or two rectangles can result from the moves we are doing), and forces an immediate OnPaint() of each rectangle.

Of course, this entire operation requires me to rewrite the DrawAxis and DrawGraph methods to take a region, otherwise they will redraw everything, which defeats the purpose of these changes. OnPaint takes a PaintEventArgs which contains a clipping rectangle--the same one I constructed above. I will pass that to the Draw methods, so that they draw everything whenever the entire control is invalidated, and they will redraw selectively when dragging the image around.
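
Roughly, OnPaint would then look something like this, assuming DrawAxis and DrawGraph grow overloads that accept the clip rectangle (those overloads are the planned change, not existing code):
protected override void OnPaint(PaintEventArgs e)
{
    // Only clear and redraw the portion that was actually invalidated.
    Rectangle clip = e.ClipRectangle;
    this.buffer.Graphics.FillRectangle(Brushes.White, clip);

    this.DrawAxis(clip);
    this.DrawGraph(clip);

    this.buffer.Render(e.Graphics);
}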

The result: This is much faster than before. There is still some fine-tuning required, as it slows down a bit if I maximize the window. This causes some glitching at the edges, but overall it looks pretty smooth thanks to the double buffering.

20.9.08

Manual Double-Buffering in C#

I believe double buffering is enabled by default on most .NET controls, which is convenient for the majority of simple graphics. I intend to draw once and reuse the result repeatedly, which makes manual buffering a requirement.

This program plots mathematical equations (i.e. y=sin(x)) on the surface of a custom Control. The actual rendering of a plot could take from 14ms to 200ms or even more, depending on the quality and complexity. However, I want to draw tangent lines to follow the curve in real-time--I cannot afford 14ms between MouseMove events.



The process is simple:
  1. Render the graph to a buffer stored in memory

  2. Copy the buffer to the screen

  3. When the mouse is clicked and moved, re-copy the buffer to the screen, then draw a new tangent line


Here's an excerpt from the constructor. Note, this class extends Control.
private BufferedGraphicsContext context;
private BufferedGraphics buffer;

public GraphPanel()
{
this.DoubleBuffered = false;
this.SetStyle(ControlStyles.UserPaint, true);
this.SetStyle(ControlStyles.OptimizedDoubleBuffer, true);
this.SetStyle(ControlStyles.AllPaintingInWmPaint, true);

this.context = new BufferedGraphicsContext();
using (Graphics graphics = this.CreateGraphics())
{
this.buffer = this.context.Allocate(graphics,
new Rectangle(new Point(0, 0), this.Size));
}
}

The first four lines disable automatic double buffering. I found out the hard way that all of these are actually needed.

The context allows me to manage my buffers. I use it to create a new buffer which matches the graphics of the Control, and size it to the control. I can just reuse this buffer every time I recalculate the graph, or when I'm drawing tangent lines on top of it.

Of course, I need to put similar code in the OnResize method, so I can adjust the buffer size when the window is resized.
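
A minimal sketch of that OnResize handler (disposing the old buffer is my addition; the Allocate call mirrors the constructor):
protected override void OnResize(EventArgs e)
{
    base.OnResize(e);

    // Throw away the old buffer and allocate one that matches the new size.
    if (this.buffer != null)
    {
        this.buffer.Dispose();
    }
    using (Graphics graphics = this.CreateGraphics())
    {
        this.buffer = this.context.Allocate(graphics,
            new Rectangle(new Point(0, 0), this.Size));
    }
    this.Invalidate();
}
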
protected override void OnPaint(PaintEventArgs e)
{
this.buffer.Graphics.FillRectangle(Brushes.White, this.ClientRectangle);

this.DrawAxis();
this.DrawGraph();

this.buffer.Render(e.Graphics);
}

OnPaint gets fired every time the window (or any subregion) gets invalidated. Since I set all those styles in the constructor, this is the place for things to get drawn. This event occurs when the program starts up, is resized, comes back from being minimized, or is invalidated programmatically. It will not be fired if I move the window around, or click on things in it. Consequently, I do a complete recalculation and drawing of the axis and graph in this method. This is important to keep in mind, since this is the most time-intensive part of the entire process.

The first thing I do is whitewash the buffer. Then I have two methods, DrawAxis and DrawGraph, which do exactly what they say. They are written to use this.buffer as well, so by the time they are finished, I can copy the buffer to the screen using the Graphics object provided in the EventArgs.
protected override void OnMouseMove(MouseEventArgs e)
{
if (this.MouseMode == MouseMode.Tangent && e.Button == MouseButtons.Left)
{
// Restore the clean version of the graph
using (Graphics graphics = this.CreateGraphics())
{
this.buffer.Render(graphics);
}

// Verify mouse is inside control
if (this.ClientRectangle.Contains(e.Location))
{
// Plot a line tangent to the graph at the location of the mouse
this.DrawTangentLine(Graph.ConvertToRealCoords(e.Location));
}
}
}

If the user presses the left button and moves the mouse, I want to draw a tangent line. I have to ensure the left button is pressed, since the same event is also fired if the mouse is moving idly across the screen. The first step is to copy the graph from the buffer onto the screen, to remove any previous tangent line. Next, I verify the mouse is inside the control and draw the tangent. The DrawTangentLine method, unlike the DrawGraph and DrawAxis methods, does not draw to the buffer, but rather to the screen directly.

The screenshot above shows this code in action. Using a buffer like this reduces drawing a tangent to just copying the buffer and drawing a line--of course I have to calculate the tangent, but the majority of the work has been removed.

Generating C# at Runtime: Casting Your Args

I spent the morning (like so many mornings before this one) trying to debug some Reflection.Emit code. Naturally, the actual implementation is much more complicated than this example I whipped up, which explains why it took so long to isolate the issue. Take a look at this sample code:

DynamicMethod m = new DynamicMethod("Demo", typeof(double), 
new Type[] { typeof(double[]) }, typeof(Program), false);
ILGenerator il = m.GetILGenerator();

il.Emit(OpCodes.Ldc_R8, 1.0);
il.Emit(OpCodes.Ldc_R8, 1); // <---
il.Emit(OpCodes.Add);
il.Emit(OpCodes.Ret);

Func<double> del = (Func<double>)m.CreateDelegate(typeof(Func<double>));
Console.WriteLine("Result: " + del.Invoke());


When executed, the dynamically-compiled delegate throws an InvalidProgramException with the message, "Common Language Runtime detected an invalid program." The problem here is that the literal 1 is not actually a 64-bit floating-point number. This can be quickly rectified by casting 1 to a double, or by using type specifiers such as 1.0 or 1D.
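
Any of these emit a proper 64-bit floating-point constant:
il.Emit(OpCodes.Ldc_R8, 1.0);        // double literal
il.Emit(OpCodes.Ldc_R8, 1D);         // D suffix
il.Emit(OpCodes.Ldc_R8, (double)1);  // explicit cast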

That's basically the moral of this story. Perhaps Emit is at fault, in that it doesn't check or typecast arguments for you, but until that gets fixed, if ever, make sure you type your arguments correctly!

I was further surprised while debugging this issue when it allowed me to change the Ldc_R8 to Ldc_I4 and keep the value of 1. According to the documentation on the Add opcode, adding floats and integers is not allowed and SHOULD throw an exception. However, it does not.

il.Emit(OpCodes.Ldc_I4, 1);
il.Emit(OpCodes.Ldc_R8, 1D);


Perhaps there is another explanation for this, or perhaps the documentation is just plain incorrect.

23.3.08

Bah Vista

I've replaced Notepad with Notepad2 at work, and I couldn't be happier. I get syntax highlighting, ability to toggle read-only from a menu (seriously awesome), and encoding detection/changing from one program. Awesome.

I installed it in the C:\Windows folder as notepad2, and changed most of the default file types (that don't open in VS2008) to use notepad2, but Notepad still appears in every Open As... dialog. What I should have done was replace Notepad.exe entirely.

So, this is what I'm trying to do now on my personal laptop with Vista. And I can't.

When I try to rename notepad.exe to _notepad.exe, the UAC pops up and asks me for my credentials. I sign in with my admin account, and then it pops up a friendly "You need permission to perform this action" box, which reappears again and again when I click OK.

Checking the permissions on notepad, the only thing that has full control is TrustedInstaller. Administrators just have read & execute permission. Moreover, I can't change permissions, even as an administrator. This is where the Bah Vista comes from.


In fairness, I'm the one messing up and trying to do what I shouldn't, and Vista is entirely correct in its actions. System files shouldn't be replaced, and this will prevent viruses, et cetera.. If I want to install Notepad2, I should do it somewhere else. It still doesn't make me any happier about it.



UPDATE: You can take ownership of the file from TrustedInstaller (with admin rights), which allows you to change the permissions. So I did, but I can't set the ownership back to TrustedInstaller, and I'm also having trouble limiting my user account's permissions on the file now. I also hope that Vista doesn't notice I've messed with her files and come after me..

This is NOT recommended. I should have just installed it somewhere else and dealt with the hassle of reassigning file types.



UPDATE: I feel so bamboozled. All links to Notepad go to c:\windows\system32\notepad.exe, not c:\windows. So I basically did nothing. I'm not sure why there is another copy of Notepad, or why it is in the system32 folder since it's a 64-bit executable. Mysteries of Windows I guess.

Even more mysteriously, it doesn't run while named _notepad.exe...but I rename it back to notepad.exe and it opens just fine.

I have finally put Notepad2 in its own folder and we can all just forget about this little mess..

25.2.08

Using Predicate<T> in Your Methods

If you ever use any of the generic List<> functions like Find, RemoveAll, et cetera, you should be familiar with Predicate<T>. In short, it is a delegate that returns true or false, based on whatever criteria is needed. So if you are looking through a List<int> to find an element that is evenly divisible by 7, you can do it with a foreach (or for) loop:
List<int> myList;
int x = -1;
foreach (int i in myList)
{
if (i % 7 == 0)
{
x = i;
}
}
or by using a predicate:
int x = myList.Find(i => i % 7 == 0);

Since my example uses a value type, you have to watch out for confusing no match with an actual match. In the foreach, x is initialized to -1. If -1 is a possible match (it isn't, in this case, since -1 % 7 = -1), you won't be able to determine whether or not it worked. You can always add a boolean flag, separate from the int value.

The Find() method returns default(T) if there is no match. For numeric value types, default is 0, so if 0 is a possible match, you will have problems. There is a FindIndex() method which works better in this case.
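
For example, FindIndex returns -1 when nothing matches, which can't be confused with a real element:
int index = myList.FindIndex(i => i % 7 == 0);
if (index < 0)
{
    // No element was divisible by 7; -1 is unambiguous here.
}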

If you are using reference types, default is null--so you don't have to worry!

Now that I've illustrated Predicates, let's look at how you can use them in your code.
public T FindSomethingOrOther(Predicate<T> match)
This function will find something or other, either in a static property somewhere or within the instance it lives in--whichever you want.

To use the predicate, all you have to do is use it like a function:
public T FindSomethingOrOther(Predicate<T> match)
{
foreach (T obj in this.Objects)
{
if (match(obj))
return obj;
}
return default(T); // null for reference types, 0/empty for value types
}

This is a powerful way to write utility functions that are not limited to one use. The above example is very basic--not really the best example. However, if the method wasn't iterating through a local collection, but through rows in a database, there would be plenty of setup and teardown code. Instead of writing a separate method to match on each column, you could write one method that takes a Predicate<DataRow>.
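
Here's a sketch of what that might look like; the table name, connection string, and column names are made up for illustration:
using System;
using System.Data;
using System.Data.SqlClient;

public class OrderRepository
{
    private readonly string connectionString;

    public OrderRepository(string connectionString)
    {
        this.connectionString = connectionString;
    }

    // One query helper reused for any column; the setup and teardown live in one place.
    public DataRow FindRow(Predicate<DataRow> match)
    {
        using (SqlConnection connection = new SqlConnection(this.connectionString))
        using (SqlDataAdapter adapter = new SqlDataAdapter("SELECT * FROM Orders", connection))
        {
            DataTable table = new DataTable();
            adapter.Fill(table);

            foreach (DataRow row in table.Rows)
            {
                if (match(row))
                    return row;
            }
        }
        return null;
    }
}
Calling code then matches on whatever column it cares about, for example repository.FindRow(row => (DateTime)row["DueDate"] < DateTime.Today).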

22.2.08

LINQ performance

Often times, I'm not really on the cutting-edge bandwagon. Actually, it seems like I never am. But I like being the late guy to the party--I show up for the good stuff.

Anyways, I haven't actually used LINQ for anything, but I've seen it around. You know, caught a glimpse here and there, so that I have a vague idea what it's for and what it looks like. Naturally the biggest question I have is, How fast is it?

From what I understand, it is most efficient when used on an IQueryable, but it works on anything that's IEnumerable. Okay, fair enough. The C# foreach keyword works on IEnumerables too. Consequently, every iteration has to advance the enumerator and check whether it's done. Compared to a for loop, it's expensive--but not hugely so.

I ran across this post from Steve Hebert today, thanks to Google, and it verified much of what I was thinking and fearing about LINQ: it's inefficient.

But let's face it, it's not LINQ's fault, and I think that article (and the comments) make that abundantly clear. It has limitations, and if you don't know about them, and don't take advantage of LINQ when it has the upper hand, you will fail.

Fair enough. You should always know how your underlying data structures and algorithms work, or you'll end up making bad choices, with or without LINQ. A linked list, for example, would make a wonderful priority queue, because removing the first element is an O(1) operation. However, adding would be O(n), since it may have to traverse the entire list to keep things ordered. If you use a generic List<T>, you can add to the end immediately, but every time you remove from the front, it has to shift every remaining element down.

The real point I'm trying to make here is that LINQ has its limitations. It might be very easy to write, but it can perform very badly if it is unable to take advantage of the knowledge that you, as a developer, have about the data structure you are using and the nature of the data inside it.

On the other hand, there are situations where it will compile into the exact same code, in which case you might actually appreciate the concise and straight-forward syntax that it offers you.

18.2.08

Iteration with Lambda Expressions

I started using lambda expressions recently, and converted all my anonymous delegates. I like the syntax for a lot of things, but not all. Basically, any simple, one or two line predicates or actions are very neatly rewritten into a lambda expression.

But then we hit the question of performance. Anonymous delegates and lambda expressions are essentially the same thing in C#. The transformation between the two is ridiculously trivial, so it's not surprising that they are compiled into the same IL.

delegate(int i)
{
DoWork(i);
}
i => DoWork(i)


But let's talk about real-world applications and their performance implications. Let's say I have an array, and I want to iterate through it and call some method with each element. Thanks to C#'s wonderful language constructs, I can think of several ways to accomplish this:

for (int i = 0; i < Program.array.Length; i++)
{
Program.DoWork(Program.array[i]);
}
foreach (int i in Program.array)
{
Program.DoWork(i);
}
Array.ForEach(Program.array,
delegate(int i)
{
Program.DoWork(i);
}
);
Array.ForEach(Program.array, x => DoWork(x));

As I already mentioned, the last two are functionally equivalent, but I'll include both just for demonstration and completeness. Here are the results for an array of 100,000,000 integers.

Method                     Time
for loop                   0.6318259 s
foreach                    0.9067668 s
Array.ForEach (delegate)   1.284918 s
Array.ForEach (lambda)     1.2216206 s


As you can see, a simple for loop is the fastest. Foreach is more efficient over arrays than over collections--it actually uses indexing to go through the array rather than an enumerator. Array.ForEach takes twice as long as the for loop. Most of the slow-down comes from making a delegate call for every element.

In conclusion, nothing beats a good ol' for loop. On the other hand, the simplicity and clarity of lambda expressions is similar to what foreach brought us. Foreach always ran slower than a for loop, but removed the need to manage indexes, create a temp variable, etc. I would argue that it reduces errors and makes the code more readable. In a way, lambda expressions are just a continuation of the same.

3.2.08

C# 3.0 - var keyword

There's a new keyword on the block: var. And while many people are praising its wondrous benefits, I don't like it. Let's consider the facts:
  • It can only be used on variables initialized on the same line

  • It is statically typed when initialized, so strong typing is maintained

Visual Studio 2008 is smart enough to show you the inferred type when you go to use a var variable later, which is nice, but it doesn't solve the problem of code readability. There are plenty of articles with good examples of this, so I won't reiterate. However, today I came up with a new scenario that would lend some credibility to the use of var:
for (var i = 0; i < 400; i++)
{
if (i == 0)
Console.Write(i.GetType().Name);
}


Since the var i is in the range of 0 to 399, I hoped the compiler would realize that it could use a short (or even a ushort) when it typed it. It does not.

The reason is pretty simple, really. It's not intelligent. Since I assigned 0, which is by default an Int32, the variable i was statically typed as an Int32. If I change it to read var i = 0u, it will type it as a UInt32.
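
The literal's suffix drives the inferred type:
var a = 0;    // System.Int32
var b = 0u;   // System.UInt32
var c = 0L;   // System.Int64
var d = 0.0;  // System.Double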

This isn't really surprising, but it's a little disappointing. My basic idea was that if the size of the loop changed (statically) in the future, I could save the trouble of changing the variable type manually. That is, if the compiler automatically picked the smallest variable type required.

In any case, I dislike the use of var immensely. I understand it is required for LINQ, but in any other case, it looks like pure laziness. One article I read said that typing "var" instead of "Dictionary" was a savings. However, any software engineer will tell you the bottleneck is not in the typing, it's in the thinking. If I have to spend more time thinking the next time I look at this code to figure out what my new var type actually is, I've wasted time.

9.1.08

Method Hiding

I ran into a surprise today in C#, with a probably overly-complex system of interfaces and abstract classes. Here's some sample code:
    class Program
{
static void Main(string[] args)
{
IFoo b = new B();
b.DoWork();
((B)b).DoWork();

Console.ReadKey();
}
}

interface IFoo
{
void DoWork();
}

class A : IFoo
{
public void DoWork()
{
Console.WriteLine("A Does Work");
}
}

class B : A
{
public new void DoWork()
{
Console.WriteLine("B Does Work");
}
}

Output is:
A Does Work
B Does Work


I mistakenly expected both lines to say "B Does Work." After all, that's the point of inheritance, right? I instantiated a B, so it should map to B.DoWork(). Well, it doesn't.

The explanation lies in the vtables AND the fact that I defined B.DoWork as new. I explicitly stated (without realizing it) that it was not connected to IFoo's method. The vtable tried to resolve DoWork and got as far as it could... to A.

The correct way to implement this is by defining DoWork as virtual in A, then marking it override in B.
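
With that change, both calls resolve to B's implementation:
class A : IFoo
{
    public virtual void DoWork()
    {
        Console.WriteLine("A Does Work");
    }
}

class B : A
{
    public override void DoWork()
    {
        Console.WriteLine("B Does Work");
    }
}

// Now both b.DoWork() and ((B)b).DoWork() print "B Does Work".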

The code I present here would be useful if one wanted an implementation of a method that executed only when the object was cast to that specific type, rather than accessed through a base class or interface. Not that I know of any such scenarios...

8.1.08

Visual Studio 2008 as a WPF editor

I've started learning WPF and writing XAML in Visual Studio 2008. Granted, I haven't used any other editors like Expression Blend, but I find VS2008 extremely lacking in both the visual editor and the text editor.

I never liked using Visual Studio's (2003 and 2005) visual web page editor either. It never really created the HTML the way I wanted, and it tended to destroy my beautifully-indented and formatted markup. Moreover, once you start doing complex Ajax events, everything falls to pieces. But the real problem is handling CSS.

The WPF editor is terrible for essentially the same reason--it's trying to model the old-school absolute positioning in a language which is designed to be non-absolute. You can drag and drop all of your elements onto a window, but they get ridiculous margins. Instead, you need to add some StackPanels, DockPanels, and so forth. Can you do this with the editor? Yes. Is it easier by hand? Yes.

Then my second complaint is that the Intellisense sucks. It works for most elements and attributes, but there is no popup documentation provided inline, like there is for C# and ASP.NET. And the Intellisense just doesn't work for anything inside strings.

<Button Margin="0,0,3,0" Content="{Binding Foo.Bar}" />


It doesn't inform you that Binding is possible inside the brace. It will actually color-code the first thing you type in a brace, regardless of whether it's actually valid. Secondly, this will only work if the DataContext for the Button (or its parent) has a Foo property, and if that Foo property has a Bar property. You will get a nice compile-time error if that's not the case, BUT I would really appreciate a nice intellisense list so I didn't spend half my time looking up class definitions.

I'm really liking WPF's model, and it only gets more interesting as I learn more. However, I wish the tool was a little more advanced so that it could actually enable me.

5.1.08

Browser Wars

I was doing some reading about the recent re-ignition of the browser wars (okay, it's a 2004 article, but still relevant). One argument for the still-dominating popularity of Internet Explorer is that it comes installed on most computers sold in the world.

That's true, it does, and until Microsoft is slapped on the wrist again for monopolizing the market, it will continue to do so.

I think that while this was a huge factor in the past, we are going to see a distinct change in the next few years. In fact, this has already happened. Ever since the spyware boom, Firefox has been handed to the masses as a, "Here, this will keep your computer from getting infected," preventative measure. And it worked.

IE6 is the most popular browser in the world, but also the most spyware/malware-riddled. Firefox may have many slick features that IE6 doesn't, but it's possibly best known as a way to keep your computer running smoothly. And those features don't hurt, either, which is why average computer users are downloading it to install on their newly purchased computers.

IE7 was a much-needed upgrade, fixing countless CSS problems, improving rendering and DOM performance, and tightening security all around. However, Microsoft didn't require the upgrade, releasing it instead as a downloadable installer. This is critical because it places IE7 on the same level as Firefox--users have to look for it, download it, and install it themselves.

Similarly, there are also photo enthusiasts who download software like Google's Picasa to edit their digital photos and upload them to the web. They aren't necessarily technologically savvy, but they have heard about this software/service, and they want it.

I think it's clear that we are in a period where ordinary people expect to get software and services off the web. Vista does include IE7, so Internet Explorer's market share isn't going anywhere too quickly. However, I think the domination of the OEM bundle is nearing its end, and this will have a significant effect on browser popularity trends.