Computer Dave: Doing Better than Math.Pow()

This is a sequal of sorts to my previous series on expression evaluation and relates to the framework I built there.

The source to C#'s Math.Pow() method isn't publically available, but it probably uses an exponential approximation to calculate powers of numbers, using the following:

a^b = e^(b*ln(a))

This makes it extremely dependable, value-wise and time-wise. A power of .5 takes the same time to calculate as a power of 2, or 2.01, or 53 (This seems to hold for small powers, but not for larger ones. There appears to be a cutoff determined by the combination of a and b, at which point a different method is used, which takes about three times as long).

However, when calculating integer powers, multiplication can also be used, and it is much faster for small integers. Consider the following data:

Time to calculate 3^i 800,000 times
i	Control	Math.Pow	Mult
2	11	190	19
3	8	164	24
4	8	161	28
5	8	165	33
6	8	160	41
7	8	159	43
8	8	159	48
9	9	159	54
10	8	158	58
11	8	165	65
12	8	158	71
13	8	161	74
14	8	159	79
15	8	159	86
16	8	159	91
17	8	158	95
18	8	159	99
19	8	158	106
20	8	159	113
21	8	159	118
22	8	158	123
23	8	159	126
24	8	158	136
25	9	161	134
26	8	158	145
27	8	159	149
28	8	161	158
29	8	159	158

The first column is the exponent, and other three columns are the total number of milliseconds elapsed while evaluating 3^i 800000 times for each method. The first result is actually a control, where I just call a static method which returns -1. This indicates how long the static method call actually takes, in comparison to the actual calculation.

The second result is the time it took to call Math.Pow(3,i), while the last column is the result of an inline interative multiplication calculation. It's obvious that until the very highest powers tested, inline multiplication is faster--with a power of 2, it's 10x faster. Obviously inline multiplication cannot be used for decimal powers, but it would be great if I could take advantage of this speed for integer powers between 2 and 28 (and -2..-28, by using a^-b = 1/(a^b))

Easy Pickings

public override Node Simplify(Node node)
{
    Number x = node.Left.Value as Number;
    Number y = node.Right.Value as Number;

    if (x != null && y != null)
    {
        return new Node(new Number(Math.Pow(x.Value, y.Value)));
    }

    if (y != null)
    {
        if (y.Value == -1)
        {
            Node n = new Node(new Division());
            n.Left = Number.One;
            n.Right = node.Left;
        }
        if (y.Value == 0)
        {
            return Number.One;
        }
        if (y.Value == 1)
        {
            return node.Left;
        }
    }
}

After the expression tree is parsed, it goes through a simplification phase, where I iterate through it in postfix order, simplifying each node. This is the SimplifyNode() implementation for a Power Node. As the section heading says, these are the easy pickings. If both the argument and the power (a and b) are numbers, I just evaluate the power and replace this node with the numerical result. I also replace some simple powers with a shorter form. This will help simplify the tree, but won't provide any further benefit.

In-Place Expansion

if (y.Value >= 2 && y.Value <= 28)
{
    // Iteratively construct multiplication tree to replace power operation.
    Node n = new Node(new Multiplication());
    n.Left = node.Left;
    n.Right = node.Left;
    for (int i = 0; i < y.Value - 2; i++)
    {
        Node nParent = new Node(new Multiplication());
        nParent.Left = n;
        nParent.Right = node.Left;

        n = nParent;
    }
    return n;
}

Adding this to the SimplifyNode() method will handle the cases of integer powers greater than 2 and less than 28, which is what the earlier table indicated would benefit. It actually converts a Power node into a Multiplication node with nested Multiplication nodes--in essence, converting x^2 into x*x, and x^3 into x*(x*x), etc. Let's see how this compares when I compile the tree and execute it:

Time to evaluate an x^i tree 800,000 times
i	Math.Pow	Mult
2	235	99
3	226	103
4	214	116
5	229	111
6	233	115
7	228	155
8	279	157
9	230	149
10	221	133
11	211	137
12	223	147
13	218	186
14	236	175
15	266	161
16	215	197
17	255	203
18	259	212
19	224	177
20	218	197
21	219	186
22	227	193
23	234	235
24	254	240
25	263	254
26	266	257
27	263	271
28	270	266
29	261	269

Things are good, as expected. However, there is one huge shortcoming: expanding the tree causes repeated calculations:

x^3 = x*x*x
cos(x)^3 = cos(x)*cos(x)*cos(x)

Time to evaluate an cos(x)^i tree 800,000 times
i	Math.Pow	Mult
2	302	204
3	283	253
4	286	303
5	286	365
6	287	419
7	291	482
8	289	534
9	289	595
10	291	645
11	283	739
12	280	739
13	285	798
14	285	845

The table above makes it clear that actual tree expansion is a terrible way to convert exponents to multiplications. Instead, I want something like this:

double x = Math.Cos(x);
return x*x*x

This is actually achievable, since I don't need to evaluate the expression being raised to a power more than once. However, my current system doesn't support this. The way I compile from an expression tree into IL is based on the CLR stack, so by the time I'm emitting the power opcodes, the arguments are already on the stack. I need to work within these confines to achieve this.

Enter the PowerExp

I created a new operator, the PowerExp. It won't be created automatically by parsing, but rather by the simplification engine.

Since all the operands are loaded to the stack before the operator, I had to remove the ^2 part from the tree. I could have done a conversion like this:

x^2 => Square(x)
x^3 = Cube(x)

That would require quite a few operators, though. Instead, I made one operator that takes the exponent as a Property. Consider these two methods:

private static Node BuildPowerTree(int n)
{
    Node powerNode = new Node(new Power());
    powerNode.Left = Parameter.X;
    powerNode.Right = Number.Node(n);
    return powerNode;
}
private static Node BuildPowerExpTree(int n)
{
    Node powerExpNode = new Node(new PowerExp(n));
    powerExpNode.Left = Parameter.X;
    return powerExpNode;
}

In the second method, I pass n into the operator rather than making it the Right node. This allows me to avoid pushing it to the stack. Now I can implement the PowerExp emitter:

// Eval n^power.  n is already on the stack
public override void Emit(System.Reflection.Emit.ILGenerator ilg)
{    
    ilg.Emit(OpCodes.Stloc_0);  // Store n
    ilg.Emit(OpCodes.Ldloc_0);  // Load n
    for (int i = this.Power; i > 1; i--)
    {
        ilg.Emit(OpCodes.Ldloc_0);  // Load n
        ilg.Emit(OpCodes.Mul);      // Multiply
    } // Repeat to achieve power
}

Timings:

Time to evaluate an cos(x)^i tree 800,000 times
i	PowerExp	Math.Pow	Mult
2	183	306	221
3	157	301	270
4	154	304	333
5	160	298	366
6	161	298	432
7	164	299	500
8	165	289	556
9	166	289	601
10	165	290	658
11	170	293	712
12	173	292	770
13	170	291	829
14	178	294	877
15	175	283	909
16	175	288	952
17	184	279	1007
18	175	285	1055
19	175	282	1105
20	181	278	1172
21	182	281	1219
22	188	291	1280
23	186	278	1330
24	189	281	1400
25	195	283	1446
26	196	281	1499
27	201	285	1547
28	198	287	1596
29	200	288	1670

The results basically speak for themselves. In fact, PowerExp is faster than Math.Pow() up to around x^60. That's 59 multiplication opcodes in a row! Plus, this method doesn't clutter my tree with extra nodes. For evaluating f(x)^i...

Basic multiplying decreases performance due to redundant evaluations—never a good idea

Math.Pow is constant across f(x) and i (true for small values)

PowerExp is linear with respect to i, regardless of complexity of f(x)

To summarize, I found a way to greatly improve Power evaluation performance for small integer values. By writing rules into my simplification engine, I can convert certain optimal cases from Power into PowerExp, taking advantage of this speed improvement.

Computer Dave

28.9.08

Doing Better than Math.Pow()

Easy Pickings

In-Place Expansion

Enter the PowerExp

No comments:

Blog Archive

About Me

i	Control	Math.Pow	Mult
2	11	190	19
3	8	164	24
4	8	161	28
5	8	165	33
6	8	160	41
7	8	159	43
8	8	159	48
9	9	159	54
10	8	158	58
11	8	165	65
12	8	158	71
13	8	161	74
14	8	159	79
15	8	159	86
16	8	159	91
17	8	158	95
18	8	159	99
19	8	158	106
20	8	159	113
21	8	159	118
22	8	158	123
23	8	159	126
24	8	158	136
25	9	161	134
26	8	158	145
27	8	159	149
28	8	161	158
29	8	159	158

i	Math.Pow	Mult
2	235	99
3	226	103
4	214	116
5	229	111
6	233	115
7	228	155
8	279	157
9	230	149
10	221	133
11	211	137
12	223	147
13	218	186
14	236	175
15	266	161
16	215	197
17	255	203
18	259	212
19	224	177
20	218	197
21	219	186
22	227	193
23	234	235
24	254	240
25	263	254
26	266	257
27	263	271
28	270	266
29	261	269

i	Math.Pow	Mult
2	302	204
3	283	253
4	286	303
5	286	365
6	287	419
7	291	482
8	289	534
9	289	595
10	291	645
11	283	739
12	280	739
13	285	798
14	285	845

i	Control	Math.Pow	Mult
2	11	190	19
3	8	164	24
4	8	161	28
5	8	165	33
6	8	160	41
7	8	159	43
8	8	159	48
9	9	159	54
10	8	158	58
11	8	165	65
12	8	158	71
13	8	161	74
14	8	159	79
15	8	159	86
16	8	159	91
17	8	158	95
18	8	159	99
19	8	158	106
20	8	159	113
21	8	159	118
22	8	158	123
23	8	159	126
24	8	158	136
25	9	161	134
26	8	158	145
27	8	159	149
28	8	161	158
29	8	159	158

i	Math.Pow	Mult
2	235	99
3	226	103
4	214	116
5	229	111
6	233	115
7	228	155
8	279	157
9	230	149
10	221	133
11	211	137
12	223	147
13	218	186
14	236	175
15	266	161
16	215	197
17	255	203
18	259	212
19	224	177
20	218	197
21	219	186
22	227	193
23	234	235
24	254	240
25	263	254
26	266	257
27	263	271
28	270	266
29	261	269

i	Math.Pow	Mult
2	302	204
3	283	253
4	286	303
5	286	365
6	287	419
7	291	482
8	289	534
9	289	595
10	291	645
11	283	739
12	280	739
13	285	798
14	285	845

i	Control	Math.Pow	Mult
2	11	190	19
3	8	164	24
4	8	161	28
5	8	165	33
6	8	160	41
7	8	159	43
8	8	159	48
9	9	159	54
10	8	158	58
11	8	165	65
12	8	158	71
13	8	161	74
14	8	159	79
15	8	159	86
16	8	159	91
17	8	158	95
18	8	159	99
19	8	158	106
20	8	159	113
21	8	159	118
22	8	158	123
23	8	159	126
24	8	158	136
25	9	161	134
26	8	158	145
27	8	159	149
28	8	161	158
29	8	159	158

i	Math.Pow	Mult
2	235	99
3	226	103
4	214	116
5	229	111
6	233	115
7	228	155
8	279	157
9	230	149
10	221	133
11	211	137
12	223	147
13	218	186
14	236	175
15	266	161
16	215	197
17	255	203
18	259	212
19	224	177
20	218	197
21	219	186
22	227	193
23	234	235
24	254	240
25	263	254
26	266	257
27	263	271
28	270	266
29	261	269

i	Math.Pow	Mult
2	302	204
3	283	253
4	286	303
5	286	365
6	287	419
7	291	482
8	289	534
9	289	595
10	291	645
11	283	739
12	280	739
13	285	798
14	285	845