28.9.08

Doing Better than Math.Pow()

This is a sequal of sorts to my previous series on expression evaluation and relates to the framework I built there.

The source to C#'s Math.Pow() method isn't publically available, but it probably uses an exponential approximation to calculate powers of numbers, using the following:
a^b = e^(b*ln(a))

This makes it extremely dependable, value-wise and time-wise. A power of .5 takes the same time to calculate as a power of 2, or 2.01, or 53 (This seems to hold for small powers, but not for larger ones. There appears to be a cutoff determined by the combination of a and b, at which point a different method is used, which takes about three times as long).

However, when calculating integer powers, multiplication can also be used, and it is much faster for small integers. Consider the following data:

































Time to calculate 3^i 800,000 times
iControlMath.PowMult
21119019
3816424
4816128
5816533
6816041
7815943
8815948
9915954
10815858
11816565
12815871
13816174
14815979
15815986
16815991
17815895
18815999
198158106
208159113
218159118
228158123
238159126
248158136
259161134
268158145
278159149
288161158
298159158


The first column is the exponent, and other three columns are the total number of milliseconds elapsed while evaluating 3^i 800000 times for each method. The first result is actually a control, where I just call a static method which returns -1. This indicates how long the static method call actually takes, in comparison to the actual calculation.

The second result is the time it took to call Math.Pow(3,i), while the last column is the result of an inline interative multiplication calculation. It's obvious that until the very highest powers tested, inline multiplication is faster--with a power of 2, it's 10x faster. Obviously inline multiplication cannot be used for decimal powers, but it would be great if I could take advantage of this speed for integer powers between 2 and 28 (and -2..-28, by using a^-b = 1/(a^b))

Easy Pickings

public override Node Simplify(Node node)
{
Number x = node.Left.Value as Number;
Number y = node.Right.Value as Number;

if (x != null && y != null)
{
return new Node(new Number(Math.Pow(x.Value, y.Value)));
}

if (y != null)
{
if (y.Value == -1)
{
Node n = new Node(new Division());
n.Left = Number.One;
n.Right = node.Left;
}
if (y.Value == 0)
{
return Number.One;
}
if (y.Value == 1)
{
return node.Left;
}
}
}

After the expression tree is parsed, it goes through a simplification phase, where I iterate through it in postfix order, simplifying each node. This is the SimplifyNode() implementation for a Power Node. As the section heading says, these are the easy pickings. If both the argument and the power (a and b) are numbers, I just evaluate the power and replace this node with the numerical result. I also replace some simple powers with a shorter form. This will help simplify the tree, but won't provide any further benefit.

In-Place Expansion

if (y.Value >= 2 && y.Value <= 28)
{
// Iteratively construct multiplication tree to replace power operation.
Node n = new Node(new Multiplication());
n.Left = node.Left;
n.Right = node.Left;
for (int i = 0; i < y.Value - 2; i++)
{
Node nParent = new Node(new Multiplication());
nParent.Left = n;
nParent.Right = node.Left;

n = nParent;
}
return n;
}

Adding this to the SimplifyNode() method will handle the cases of integer powers greater than 2 and less than 28, which is what the earlier table indicated would benefit. It actually converts a Power node into a Multiplication node with nested Multiplication nodes--in essence, converting x^2 into x*x, and x^3 into x*(x*x), etc. Let's see how this compares when I compile the tree and execute it:































Time to evaluate an x^i tree 800,000 times
iMath.PowMult
223599
3226103
4214116
5229111
6233115
7228155
8279157
9230149
10221133
11211137
12223147
13218186
14236175
15266161
16215197
17255203
18259212
19224177
20218197
21219186
22227193
23234235
24254240
25263254
26266257
27263271
28270266
29261269

Things are good, as expected. However, there is one huge shortcoming: expanding the tree causes repeated calculations:
x^3 = x*x*x
cos(x)^3 = cos(x)*cos(x)*cos(x)
















Time to evaluate an cos(x)^i tree 800,000 times
iMath.PowMult
2302204
3283253
4286303
5286365
6287419
7291482
8289534
9289595
10291645
11283739
12280739
13285798
14285845

The table above makes it clear that actual tree expansion is a terrible way to convert exponents to multiplications. Instead, I want something like this:
double x = Math.Cos(x);
return x*x*x

This is actually achievable, since I don't need to evaluate the expression being raised to a power more than once. However, my current system doesn't support this. The way I compile from an expression tree into IL is based on the CLR stack, so by the time I'm emitting the power opcodes, the arguments are already on the stack. I need to work within these confines to achieve this.

Enter the PowerExp


I created a new operator, the PowerExp. It won't be created automatically by parsing, but rather by the simplification engine.

Since all the operands are loaded to the stack before the operator, I had to remove the ^2 part from the tree. I could have done a conversion like this:
x^2 => Square(x)
x^3 = Cube(x)

That would require quite a few operators, though. Instead, I made one operator that takes the exponent as a Property. Consider these two methods:
private static Node BuildPowerTree(int n)
{
Node powerNode = new Node(new Power());
powerNode.Left = Parameter.X;
powerNode.Right = Number.Node(n);
return powerNode;
}
private static Node BuildPowerExpTree(int n)
{
Node powerExpNode = new Node(new PowerExp(n));
powerExpNode.Left = Parameter.X;
return powerExpNode;
}

In the second method, I pass n into the operator rather than making it the Right node. This allows me to avoid pushing it to the stack. Now I can implement the PowerExp emitter:
// Eval n^power.  n is already on the stack
public override void Emit(System.Reflection.Emit.ILGenerator ilg)
{
ilg.Emit(OpCodes.Stloc_0); // Store n
ilg.Emit(OpCodes.Ldloc_0); // Load n
for (int i = this.Power; i > 1; i--)
{
ilg.Emit(OpCodes.Ldloc_0); // Load n
ilg.Emit(OpCodes.Mul); // Multiply
} // Repeat to achieve power
}

Timings:































Time to evaluate an cos(x)^i tree 800,000 times
iPowerExpMath.PowMult
2183306221
3157301270
4154304333
5160298366
6161298432
7164299500
8165289556
9166289601
10165290658
11170293712
12173292770
13170291829
14178294877
15175283909
16175288952
171842791007
181752851055
191752821105
201812781172
211822811219
221882911280
231862781330
241892811400
251952831446
261962811499
272012851547
281982871596
292002881670

The results basically speak for themselves. In fact, PowerExp is faster than Math.Pow() up to around x^60. That's 59 multiplication opcodes in a row! Plus, this method doesn't clutter my tree with extra nodes. For evaluating f(x)^i...
  1. Basic multiplying decreases performance due to redundant evaluations—never a good idea

  2. Math.Pow is constant across f(x) and i (true for small values)

  3. PowerExp is linear with respect to i, regardless of complexity of f(x)

To summarize, I found a way to greatly improve Power evaluation performance for small integer values. By writing rules into my simplification engine, I can convert certain optimal cases from Power into PowerExp, taking advantage of this speed improvement.

No comments: