7.4 Performance
This section collects notes on performance issues in simulators.
It is not intended to be complete or well organized.
7.4.1 Virtual functions
How much does the use of virtual functions cost in speed? To find
out, I ran the circuit eq4-2305.ckt from the examples directory
through several modified versions of the program. The test used a
100-point dc sweep on a development version between 0.20 and 0.21.
I chose this circuit because it has little else to mask the effect,
so it is close to a worst case.
I added an int member, foo, to the element class, and made the
function il_trload_source call a virtual function, virtual_test,
storing the result in foo. The body of the local version contains a
print call, which should never produce output; this confirms that
the other version is the one called. Each version simply returns a
constant that identifies which version was called. Run times are
compared with and without the added call.
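The setup described above can be sketched roughly as follows. This is a minimal reconstruction, not the actual test code: the class names ELEMENT and TEST_ELEMENT, the helper run_load, the n_calls parameter, and the returned constants are all assumptions made for illustration, and "local version" is taken here to mean the one in the element class. Only foo, il_trload_source and virtual_test come from the text.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the simulator's element class.
class ELEMENT {
public:
  virtual ~ELEMENT() = default;

  int foo = 0;  // added member; receives the result of the test call

  // "Local" version. It should never run during the test, so any output
  // from this print call means the wrong version was dispatched.
  virtual int virtual_test() {
    std::printf("ELEMENT::virtual_test called\n");
    return 1;
  }

  // Stand-in for the load step. n_calls (an assumption) selects the
  // 1-call or 10-call variant; 0 corresponds to "no extra calls".
  void il_trload_source(int n_calls) {
    assert(n_calls >= 0);
    for (int i = 0; i < n_calls; ++i) {
      foo = virtual_test();  // virtual dispatch, result stored
    }
    // The real function would go on to load the matrix here.
  }
};

// Hypothetical derived element whose override is the one expected to run.
class TEST_ELEMENT : public ELEMENT {
public:
  int virtual_test() override { return 2; }
};

// Runs the load step through a base-class reference and returns foo.
int run_load(ELEMENT& e, int n_calls) {
  e.il_trload_source(n_calls);
  return e.foo;
}
```

Calling run_load on a TEST_ELEMENT leaves foo holding the derived version's constant and prints nothing, which is the check the print call is there to provide.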
With 1 virtual function call (included in load)
            user    sys   total
evaluate   13.45   0.11   13.56
load       13.40   0.06   13.47
lu          1.91   0.09    2.00
back       22.35   0.27   22.61
review      0.00   0.00    0.00
output      0.11   0.11    0.22
overhead    0.23   0.19    0.42
total      51.45   0.83   52.28
With 10 virtual function calls (included in load)
            user    sys   total
evaluate   13.47   0.09   13.57
load       24.69   0.17   24.87
lu          2.09   0.02    2.11
back       22.17   0.35   22.51
review      0.00   0.00    0.00
output      0.14   0.11    0.25
overhead    0.25   0.25    0.50
total      62.82   0.99   63.81
No extra function calls (included in load)
            user    sys   total
evaluate   13.41   0.09   13.50
load       11.75   0.05   11.79
lu          2.04   0.03    2.07
back       22.51   0.33   22.84
review      0.00   0.00    0.00
output      0.08   0.11    0.19
overhead    0.31   0.25    0.56
total      50.10   0.86   50.96
My conclusion is that, in this context, even a single virtual function
call is significant (10-15% of the load time), but not so significant
as to prohibit the use of virtual functions. The load loop itself
already calls one virtual function per element, and that function in
turn calls an ordinary member function. If each of those two calls
costs about as much as the test call, then about 30% of the load time
is function call overhead.
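The arithmetic behind these percentages can be checked directly from the load rows of the three tables. The sketch below is only a worked calculation; the constant and function names are made up for illustration, and the last function rests on the assumption that the two calls already in the load loop each cost as much as the added test call.

```cpp
// Load "total" times copied from the three tables above.
constexpr double kLoadNoExtraCalls = 11.79;
constexpr double kLoadOneVirtual   = 13.47;
constexpr double kLoadTenVirtual   = 24.87;

// Extra load time from the single added virtual call.
constexpr double cost_from_one_call() {
  return kLoadOneVirtual - kLoadNoExtraCalls;
}

// Average extra load time per call in the 10-call run.
constexpr double cost_from_ten_calls() {
  return (kLoadTenVirtual - kLoadNoExtraCalls) / 10.0;
}

// Share of the 1-call load time spent in the added call.
constexpr double share_of_load() {
  return cost_from_one_call() / kLoadOneVirtual;
}

// Assumption: the two calls already in the load loop each cost the same
// as the added test call. Result is relative to the unmodified load time.
constexpr double estimated_existing_overhead() {
  return 2.0 * cost_from_one_call() / kLoadNoExtraCalls;
}
```

With these numbers the added call costs 1.68 in the 1-call run and about 1.31 per call in the 10-call run, the single call is roughly 12% of the load time, and the two existing calls come to roughly 28% under the stated assumption.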
The impact should be smaller for complex models such as transistors,
where the much longer calculation time hides the call overhead.
Spice uses a different architecture, where a single function
evaluates and loads all elements of a given type. This avoids
both of these per-element calls.
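The difference between the two architectures can be illustrated with a toy example. This is a heavily simplified sketch, not code from either simulator: the "matrix" is reduced to a vector of diagonal entries, and every name in it (Diagonal, Element, Conductance, ConductanceData, load_per_element, load_all_conductances) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, heavily simplified "matrix": one diagonal entry per node.
using Diagonal = std::vector<double>;

// Style A: each element is an object with its own virtual load function.
struct Element {
  virtual ~Element() = default;
  virtual void load(Diagonal& diag) const = 0;
};

struct Conductance : Element {
  std::size_t node;
  double g;
  Conductance(std::size_t n, double value) : node(n), g(value) {}
  void load(Diagonal& diag) const override {
    assert(node < diag.size());
    diag[node] += g;
  }
};

// One virtual call per element, whatever its type.
void load_per_element(const std::vector<std::unique_ptr<Element>>& elements,
                      Diagonal& diag) {
  for (const auto& e : elements) {
    e->load(diag);
  }
}

// Style B: plain data per type, and one function that handles all
// elements of that type in a single loop with no per-element call.
struct ConductanceData {
  std::size_t node;
  double g;
};

void load_all_conductances(const std::vector<ConductanceData>& items,
                           Diagonal& diag) {
  for (const ConductanceData& c : items) {
    assert(c.node < diag.size());
    diag[c.node] += c.g;
  }
}
```

Both styles produce the same matrix entries. The per-type style trades the per-element dispatch for one function per element type, which is the tradeoff the timing results above are weighing.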
7.4.2 Inline functions
For this test, il_trload_source is not inline. Contrast this with
the "No extra function calls" and "1 virtual function call" cases
above, in which this function is inline.
            user    sys   total
evaluate   13.44   0.15   13.60
load       13.85   0.14   13.99
lu          1.73   0.02    1.75
back       22.89   0.35   23.24
review      0.00   0.00    0.00
overhead    0.45   0.17    0.63
total      52.50   0.94   53.44
This shows (crudely) that the overhead of an ordinary private member
function call (called from another member function in the same
class) is significant here: the load time rises from 11.79 with the
function inline to 13.99 without. One added virtual function call
raised it to 13.47, so the cost of a virtual function call is
comparable to the cost of an ordinary private member function call.
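For readers less familiar with the distinction, the two builds compared here differ roughly as in the sketch below. Only the name il_trload_source comes from the text; the class names, tr_load, the loads counter and the run_loads helper are hypothetical, and the real function bodies are of course not a simple counter.

```cpp
#include <cassert>

// Variant 1: the private helper is defined in the class body, which
// makes it implicitly inline; the compiler may expand it at the call site.
class ELEMENT_INLINE {
public:
  int loads = 0;
  void tr_load() { il_trload_source(); }
private:
  void il_trload_source() { ++loads; }
};

// Variant 2: the private helper is only declared in the class and is
// defined out of line, as it would be in a separate source file.
class ELEMENT_OUTLINE {
public:
  int loads = 0;
  void tr_load();
private:
  void il_trload_source();
};

// In a real project these definitions would sit in their own translation
// unit. In a single file, an optimizing compiler may still choose to
// inline them, so this sketch shows the structure, not the timing.
void ELEMENT_OUTLINE::il_trload_source() { ++loads; }
void ELEMENT_OUTLINE::tr_load() { il_trload_source(); }

// Calls tr_load n times and returns how many loads were performed.
template <class T>
int run_loads(T& element, int n) {
  assert(n >= 0);
  for (int i = 0; i < n; ++i) {
    element.tr_load();
  }
  return element.loads;
}
```

Both variants behave identically; the only difference is whether each load pays for an actual function call.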