The Sun Constant Performance Metric

The Sun Constant Performance Metric (SCPM) is a metric that is designed for
capacity planning purposes. It indicates the processing potential of a
combination of hardware and operating system. The specific intent of the
metric is to enable the comparison of dissimilar systems, in order to
provide the user with an indication of how to plan capacity in a consistent
and methodical fashion. In addition to comparing systems, the metric
provides a convenient way of expressing the amount of work being done by a
system.

Historical Context

Traditionally, comparing systems has been done by comparing similar
benchmarks, such as SPECint92 or TPC-C. However, getting a comparable set of
benchmarks on each platform is difficult at best, because the benchmarks
evolve at a different rate than the platforms. The current UltraSPARC-based
platforms are benchmarked with SPEC95 and TPC-C, while older
SuperSPARC-based systems such as the SPARCstation 20 were benchmarked with
SPEC92 and TPC-B. Because of the very large semantic differences between
successive generations of benchmarks, there isn't any realistic way to
compare a SPEC92 result with a SPEC95 result, or a TPC-B with a TPC-C.

Multiprocessor systems further complicate the problem by enabling many
possible configurations for each basic model. Which would deliver higher
performance, an E6000 with 13x 167MHz/1MB modules or an E4500 with 7x
250MHz/4MB modules? With multicomputer processing systems beginning to
appear, it was clear that a new way of comparing machine capacity had to be
created.

Metrics

The SCPM indicates the processing potential of the system as a whole. It is
derived from a combination of benchmarking and quantitative estimations.  It
is neither practical nor necessary to benchmark every single configuration.
Fortunately, systems are sufficiently well-behaved that interpolations from
extreme configurations need only spot-checking to ensure the required
accuracy. (For example, all of the configurations using an odd number of
processors are interpolated, with the exception of uniprocessors.)
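
As a rough illustration of filling in an unmeasured configuration, the
sketch below estimates an M-value by linear interpolation between the two
nearest measured processor counts. Linear interpolation, the function name
and the sample values are all assumptions made for illustration; the method
actually used to build the tables is not specified here.

```python
def interpolate_m(ncpu, measured):
    """Estimate M for ncpu processors from measured points.

    measured is a dict {processor_count: m_value}. Linear interpolation
    between the nearest lower and upper measured counts is an assumption.
    """
    if ncpu in measured:
        return measured[ncpu]
    lower = max(k for k in measured if k < ncpu)
    upper = min(k for k in measured if k > ncpu)
    fraction = (ncpu - lower) / (upper - lower)
    return measured[lower] + fraction * (measured[upper] - measured[lower])


# Hypothetical measured points, not taken from any table.
measured = {2: 2000.0, 4: 3900.0}
print(interpolate_m(3, measured))  # 2950.0
```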

Also known as an "M-value", the units of SCPM are quanta. Both the term 
"M-value" and the unit "quanta" are historical, and reflect the terms used
in the literature[1]. A configuration with an SCPM of 10,000 is said to have
an M-value of 10,000. If it is running at 50% utilization, the same system is
said to be expending 5,000 quanta. Comparing systems is straightforward: if
system A has twice the M-value of system B, it is about twice as fast. In the
example above, the 13x 167/1MB system has M=19,997, while the 7x 250/4MB
system has M=18,049.  So the older system delivers greater throughput, by a
small margin.
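
The arithmetic in this paragraph can be sketched as follows. The function
names are hypothetical; the figures are the ones quoted above.

```python
def quanta_expended(m_value, utilization):
    """Quanta being expended by a system running at the given utilization (0..1)."""
    return m_value * utilization


def relative_capacity(m_a, m_b):
    """How many times the capacity of system B that system A offers."""
    return m_a / m_b


print(quanta_expended(10_000, 0.50))                 # 5000.0
print(round(relative_capacity(19_997, 18_049), 2))   # 1.11
```

A ratio of about 1.11 is what "by a small margin" amounts to here.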

It is often useful to characterize existing workloads in terms of the quanta
expended per user.  A system with M=10,000 that serves 300 users is
expending an average of about 33 quanta per user, while another system with
M=71 but only one user is working more than twice as hard on a per-user
basis. By knowing the basic characteristics of an operational workload, it
is relatively straightforward to plan platform transitions or predicted
growth in user populations. If a particular application is known to consume
about 50 quanta per user, a reasonable estimate is that 1,000 users will
require a system with M=50,000.
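
A minimal sketch of the per-user arithmetic, using the figures from the
paragraph above. The function names are hypothetical, and the per-user
figures assume, as the text does, that each system's full M-value is being
expended.

```python
def quanta_per_user(quanta, users):
    """Average quanta expended per user."""
    return quanta / users


def required_m(per_user_quanta, users):
    """First-order estimate of the M-value needed for a user population."""
    return per_user_quanta * users


print(round(quanta_per_user(10_000, 300), 1))  # 33.3
print(quanta_per_user(71, 1))                  # 71.0
print(required_m(50, 1_000))                   # 50000
```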

Similarly, an entire methodology has been developed that characterizes the
demand on peripheral resources as a function of expended processing effort.
For example, the term relative I/O content (denoted R) is the ratio of disk
I/Os to CPU expenditure, and similarly for relative network content (N).
Because workloads usually have fairly predictable trends, these and other
related metrics are very useful for predicting workload characteristics [2].
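
The sketch below shows how a ratio such as R might be used to project disk
load when the processing load changes. The units (disk I/Os per second per
quantum), the function names and the sample numbers are assumptions chosen
for illustration; they are not taken from the referenced methodology.

```python
def relative_io_content(disk_ios_per_sec, quanta):
    """R: ratio of disk I/O rate to CPU expenditure (assumed units)."""
    return disk_ios_per_sec / quanta


def predicted_disk_ios(r, quanta):
    """Project the disk I/O rate at a new level of CPU expenditure."""
    return r * quanta


# Hypothetical observation: 400 I/Os per second while expending 5,000 quanta.
r = relative_io_content(400.0, 5_000.0)
projected = predicted_disk_ios(r, 8_000.0)
print(round(r, 3), round(projected, 1))
```

The same pattern would apply to relative network content (N), with network
operations in place of disk I/Os.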

None of these metrics should be taken as absolute gospel, nor should one
expect them to be minutely accurate. Getting to within 10%-15% is about all
that can be expected, since there are far too many variables not taken into
account. However, the metrics do provide a structured, methodical framework
in which to begin the capacity planning process.
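
One way to carry this uncertainty into a plan is to treat every estimate as
a range rather than a point. The sketch below applies a 15% tolerance to an
estimated requirement; the function name is hypothetical and the choice of
a symmetric band is an assumption.

```python
def estimate_range(m_estimate, tolerance_pct=15):
    """Return (low, high) bounds for an M-value estimate.

    tolerance_pct is a whole-number percentage; a symmetric band is assumed.
    """
    low = m_estimate * (100 - tolerance_pct) / 100
    high = m_estimate * (100 + tolerance_pct) / 100
    return low, high


print(estimate_range(50_000))  # (42500.0, 57500.0)
```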

Metric Tables

Because the operating system is a significant component of the operating
platform's capability, the SCPM is provided as a series of tables, one for
each operating system. These are:

     Solaris 7
     Solaris 2.6
     Solaris 2.5 (and Solaris 2.5.1)
     Solaris 2.4

No tables have been measured or computed for Solaris 1, nor for any
competitive platforms.

Latency

Like every other performance metric, the SCPM has some idiosyncrasies. In
particular, it is biased toward analysis of throughput, rather than
processing latency. Two systems that have the same M-value will process
approximately the same amount of work in a given period of time. However,
this does not mean that the user experience on the two systems will be
precisely the same. In particular, the metric does not account for wide
variances in the basic processor speed. For example, consider a uniprocessor
with M=10,000 and a dual-processor with M=10,000. In most circumstances,
they run the same number of transactions per minute, but the uniprocessor's
basic processing speed is twice as fast, so response time will probably feel
much different to a user. Fortunately, this type of analysis isn't always
mandatory, since most capacity planners are focused on throughput.
Furthermore, the differences in basic CPU speed are usually not so disparate
as to create serious problems with the analysis.
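
A crude way to flag configurations whose latency may differ despite equal
throughput is to divide the M-value by the processor count. This
per-processor figure is an assumption made for illustration: it ignores
multiprocessor overhead and is not part of the SCPM itself.

```python
def per_cpu_m(m_value, ncpu):
    """Rough per-processor speed proxy: total M divided by processor count."""
    return m_value / ncpu


uni = per_cpu_m(10_000, 1)    # uniprocessor with M=10,000
dual = per_cpu_m(10_000, 2)   # dual-processor with M=10,000
print(uni, dual, uni / dual)  # 10000.0 5000.0 2.0
```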

Scalability of large systems

Another major issue is scalability. The scalability of an application is
dependent on a large number of variables, including the design of the
application, the way it uses a database management system and the inherent
interaction of data and users within the application. Naturally, the
scalability of the hardware platform and its operating system must also be
considered. Fortunately, the hardware and Solaris are mature and scale
sufficiently well that it is unlikely that either will be the limiting
factor in scalability. The SCPM tables necessarily make assumptions about
the scalability of the application-DBMS-OS-hardware stack. The specific
assumptions are rarely an issue for small and even mid-range platforms, but
are of critical importance to E10000 users. The scalability represented in
the table reflects typical user experience. However, applications have been
observed to be both much less and much more scalable.
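
The scalability built into a table can be read off by computing a scaling
efficiency, M(n) divided by n times M(1). The sketch below does this for a
few entries from the Enterprise 10000 400/8MB column of the Solaris 8
relative-performance table later in this document. The function name and
this definition of efficiency are illustrative assumptions.

```python
def scaling_efficiency(m_n, ncpu, m_1):
    """Fraction of ideal linear scaling achieved at ncpu processors."""
    return m_n / (ncpu * m_1)


# Entries from the Enterprise 10000 (400/8MB) relative-performance column.
e10k_400 = {1: 1.00, 16: 14.85, 32: 27.49, 64: 47.41}

for ncpu, m in e10k_400.items():
    print(ncpu, round(scaling_efficiency(m, ncpu, e10k_400[1]), 2))
# Efficiencies: 1.0 at 1 CPU, 0.93 at 16, 0.86 at 32, 0.74 at 64.
```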

The chart at left shows the scalability of a number of benchmarks.  The
typical scalability of user benchmarks is in the middle of the range, near
the theoretical linear (dashed line). The user benchmarks included SAP,
PeopleSoft, ad-hoc decision support and some home-grown transaction
processing applications (their specific identities cannot be disclosed due
to non-disclosure requirements). In the absence of any other information, a
first-order approximation of scalability can be deduced from the ratio of
processing to shared code. The more processing done per unit of data, the
less likely it is that the user will encounter other users in shared code.
TPC-C scales less well than the user applications because it has
proportionately less processing per unit of shared work in the database. The
NFS server benchmark LADDIS does not scale very well past about 16-20 CPUs.
It is nearly a worst case, because 100% of its code is running under the
operating system's locks, compared with about 15% in the other
commercially-oriented benchmarks.

The last curve illustrates a case which in fact is superlinear. The code is
a heavily multithreaded HPC program that requires approximately 50 MB of CPU
cache to run efficiently. Each processor has 4MB of cache, so the
performance of small configurations is terrible (and also not very
scalable). The application is missing cache constantly and therefore running
at memory speeds. But when the cache configuration exceeds 50 MB, the code
runs from cache and at far higher speed. Once the code runs from cache, it
is far more efficient than the base case, resulting in superlinear
performance. This type of code is uncommon, but does exist. Counterexamples
also exist, of course. We have seen at least one code that ran about as fast
on a 16-CPU system as on a uniprocessor. In fact, when we turned off
fourteen of the CPUs, response time got slightly better (gulp)! It turned
out that the application was using a very crude locking strategy. Users had
to obtain exclusive access to the primary table in the database in order to
make progress. Even worse, the table was locked for essentially the entire
duration of a transaction, resulting in a system that could effectively run
only a single user's transaction at any given time. Scalability was almost
literally zero.
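
For the cache-bound HPC example above, the point at which the code starts
to run from cache can be estimated from the figures given. The sketch
assumes the working set divides evenly across the processors' caches, which
is a simplification; the function name is hypothetical.

```python
import math


def cpus_to_fit_in_cache(working_set_mb, cache_per_cpu_mb):
    """Smallest processor count whose combined cache holds the working set."""
    return math.ceil(working_set_mb / cache_per_cpu_mb)


# About 50 MB of working set, 4 MB of cache per processor.
print(cpus_to_fit_in_cache(50, 4))  # 13
```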

Scalability of "small" systems

One last comment on scalability. Scalability is almost never an issue on
small to midrange platforms. The graph below zooms in on the same data
presented above, but concentrates on the 1-24 processor range. With the
exception of the LADDIS curve, it is nearly impossible to distinguish
the various curves.

We've used LADDIS as an example of a code that does not scale well. It's
worth noting that despite the relatively poor scalability, Sun's LADDIS
scores are excellent, and are the highest non-cluster results. See the
SPEC reporting page for LADDIS for an illustration.

------------------------------------------------------------------------
NOTE: 
	to convert between x.xx and xxxx values (new versus old), use the 
	multiplier of 3899 for x.xx to xxxx or divide xxxx by 3899.
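
The conversion can be sketched as below. The function names are
hypothetical; the sample values are the single-processor 400/8MB entries
for the Enterprise 3000-6500 family (1.08 on the new scale, 4210 on the
old).

```python
OLD_PER_NEW = 3899  # multiplier given in the note above


def new_to_old(x_new):
    """Convert a new-style (x.xx) value to the old (xxxx) scale."""
    return x_new * OLD_PER_NEW


def old_to_new(x_old):
    """Convert an old-style (xxxx) value to the new (x.xx) scale."""
    return x_old / OLD_PER_NEW


print(round(new_to_old(1.08)))     # 4211
print(round(old_to_new(4210), 2))  # 1.08
```

The old tables are rounded to three significant figures, so round trips
agree only approximately.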

------------------------------------------------------------------------
Solaris 8 Relative Performance (relative to E10K/1-400MHz)

Sun Fire Servers (F3800-F6800)

NCPU 900/8MB

1 1.92
2 3.82
3 5.71
4 7.59
5 9.45
6 11.31
7 13.15
8 14.96
9 16.79
10 18.57
11 20.36
12 22.14
13 23.90
14 25.66
15 27.39
16 29.12
17 30.83
18 32.54
19 34.22
20 35.90
21 37.55
22 39.21
23 40.87
24 42.50

Sun Fire F15000  (15K)

NCPU 900/8MB

4 7.26
8 14.22
12 20.92
16 27.34
20 33.50
24 39.44
28 45.15
32 50.62
36 55.87
40 60.92
44 65.78
48 70.45
52 74.90
56 79.21
60 83.34
64 87.31
68 91.13
72 94.78

Sun Fire V880

NCPU 750/8MB

1 1.71
2 3.39
3 5.04
4 6.70
5 8.36
6 9.96
7 11.59
8 13.20

Enterprise Servers (E3000-E6500)

NCPU 464/8MB 400/8MB

1 1.18 1.08
2 2.34 2.14
3 3.49 3.18
4 4.61 4.20
5 5.73 5.22
6 6.83 6.22
7 7.90 7.18
8 8.94 8.15
9 9.99 9.12
10 11.01 10.04
11 12.03 10.98
12 13.02 11.87
13 13.99 12.79
14 14.96 13.66
15 15.90 14.52
16 16.82 15.39
17 17.73 16.23
18 18.65 17.07
19 19.54 17.89
20 20.41 18.68
21 21.27 19.49
22 22.11 20.25
23 22.93 21.02
24 23.77 21.78
25 24.56 22.55
26 25.35 23.29
27 26.14 24.00
28 26.90 24.71
29 27.67 25.43
30 28.41 26.11

Enterprise 10000 (Starfire)

NCPU 466/8MB 400/8MB

1 1.10 1.00
2 2.19 1.99
3 3.26 2.98
4 4.33 3.95
5 5.38 4.89
6 6.42 5.86
7 7.46 6.80
8 8.48 7.72
9 9.48 8.64
10 10.47 9.55
11 11.46 10.47
12 12.43 11.36
13 13.40 12.25
14 14.37 13.12
15 15.31 13.99
16 16.25 14.85
17 17.17 15.69
18 18.09 16.54
19 19.01 17.38
20 19.90 18.22
21 20.79 19.03
22 21.66 19.82
23 22.52 20.64
24 23.39 21.43
25 24.23 22.22
26 25.07 22.98
27 25.91 23.77
28 26.73 24.51
29 27.54 25.27
30 28.33 26.01
31 29.15 26.75
32 29.94 27.49
33 30.70 28.23
34 31.46 28.94
35 32.23 29.66
36 32.99 30.34
37 33.73 31.03
38 34.47 31.75
39 35.21 32.41
40 35.92 33.10
41 36.64 33.76
42 37.35 34.42
43 38.06 35.08
44 38.75 35.72
45 39.44 36.36
46 40.10 36.99
47 40.76 37.63
48 41.43 38.27
49 42.09 38.88
50 42.75 39.49
51 43.39 40.10
52 44.03 40.69
53 44.64 41.27
54 45.27 41.86
55 45.89 42.45
56 46.50 43.03
57 47.11 43.59
58 47.69 44.15
59 48.28 44.71
60 48.87 45.27
61 49.43 45.81
62 50.01 46.34
63 50.57 46.88
64 51.13 47.41

Workgroup Servers E250/E450

NCPU 480/8MB 400/4MB

1 1.24 1.08
2 2.42 2.11
3 3.54 3.08
4 4.64 4.03

E220/E420

NCPU 450/4MB

1 1.19
2 2.34
3 3.41
4 4.46

Sun Fire 280R

NCPU 750/8MB

1 1.63
2 3.24

------------------------------------------------------------------------
Sun Constant Performance Metrics
NOTE:
	The tables below use the old (xxxx) scale. To convert between x.xx and
	xxxx values (new versus old), multiply x.xx by 3899 or divide xxxx by 3899.
------------------------------------------------------------------------

Enterprise 3000 - Enterprise 6500

NCPU 400/8MB 400/4MB 336/4MB 250/4MB 250/1MB 167/1MB 167/512K

1 4210 3900 3300 2710 2360 1880 1670
2 8360 7730 6550 5380 4620 3710 3260
3 12400 11400 9740 8000 6790 5460 4790
4 16400 15100 12800 10500 8870 7160 6240
5 20400 18700 15900 13100 10800 8800 7630
6 24300 22200 18900 15600 12700 10300 8960
7 28200 25700 21900 18000 14600 11900 10200
8 32000 29100 24800 20400 16300 13300 11400
9 35700 32400 27700 22800 18000 14700 12600
10 39400 35700 30500 25100 19600 16100 13700
11 43000 38800 33200 27400 21200 17400 14700
12 46600 42000 35900 29600 22700 18700 15700
13 50100 45000 38600 31800 24100 19900 16700
14 53600 48000 41200 34000 25500 21100 17600
15 57000 51000 43800 36100 26800 22300 18500
16 60300 53900 46300 38200 28100 23400 19300
17 63600 56700 48800 40300 29300 24500 20100
18 66900 59500 51200 42300 30400 25500 20900
19 70100 62200 53600 44300 31500 26500 21600
20 73300 64800 55900 46300 32600 27400 22300
21 76400 67400 58200 48200 33600 28400 23000
22 79500 70000 60500 50100 34600 29300 23600
23 82500 72500 62700 52000 35500 30100 24200
24 85500 74900 64900 53800 36400 31000 24800
25 88400 77400 67000 55600 37300 31800 25300
26 91300 79700 69100 57400 38100 32500 25900
27 94100 82000 71200 59100 38900 33300 26400
28 96900 84300 73200 60800 39700 34000 26900
29 99700 86500 75200 62500 40400 34700 27300
30 102000 88700 77200 64100 41100 35400 27800


Enterprise 10000 (Starfire)

NCPU 400/8MB 400/4MB 336/4MB 250/4MB 250/1MB

1 3920 3630 2970 2360 2120
2 7800 7220 5920 4700 4200
3 11600 10700 8840 7020 6250
4 15400 14200 11700 9330 8250
5 19200 17700 14600 11600 10200
6 22900 21100 17400 13800 12100
7 26600 24500 20200 16100 14000
8 30300 27800 23000 18300 15800
9 33900 31100 25700 20600 17600
10 37500 34400 28400 22800 19400
11 41000 37600 31100 25000 21100
12 44500 40800 33800 27100 22800
13 48000 43900 36400 29300 24500
14 51400 47000 39100 31400 26100
15 54900 50100 41600 33600 27700
16 58200 53100 44200 35700 29300
17 61600 56100 46700 37800 30800
18 64900 59100 49300 39800 32400
19 68200 62000 51700 41900 33800
20 71400 64900 54200 43900 35300
21 74600 67700 56600 46000 36700
22 77800 70600 59100 48000 38100
23 80900 73300 61400 50000 39500
24 84000 76100 63800 52000 40800
25 87100 78800 66100 53900 42100
26 90200 81500 68500 55900 43400
27 93200 84200 70800 57800 44600
28 96200 86800 73000 59800 45900
29 99100 89400 75300 61700 47100
30 102000 91900 77500 63600 48300
31 105000 94500 79700 65400 49400
32 107000 97000 81900 67300 50600
33 110000 99400 84000 69200 51700
34 113000 101000 86200 71000 52800
35 116000 104000 88300 72800 53800
36 119000 106000 90400 74700 54900
37 121000 109000 92500 76500 55900
38 124000 111000 94500 78200 56900
39 127000 113000 96600 80000 57900
40 129000 115000 98600 81800 58900
41 132000 118000 100000 83500 59800
42 135000 120000 102000 85200 60700
43 137000 122000 104000 87000 61600
44 140000 124000 106000 88700 62500
45 142000 126000 108000 90400 63400
46 145000 129000 110000 92100 64300
47 147000 131000 112000 93700 65100
48 150000 133000 114000 95400 65900
49 152000 135000 115000 97000 66700
50 154000 137000 117000 98700 67500
51 157000 139000 119000 100000 68300
52 159000 141000 121000 101000 69100
53 162000 143000 123000 103000 69800
54 164000 144000 124000 105000 70500
55 166000 146000 126000 106000 71200
56 168000 148000 128000 108000 71900
57 171000 150000 129000 109000 72600
58 173000 152000 131000 111000 73300
59 175000 154000 133000 112000 74000
60 177000 156000 134000 114000 74600
61 179000 157000 136000 115000 75200
62 181000 159000 138000 117000 75900
63 184000 161000 139000 118000 76500
64 186000 162000 141000 120000 77100


Enterprise 250 / Enterprise 450

NCPU 400/2MB 300/2MB 250/2MB
1 3800 3120 2490
2 7500 6150 4810
3 0 0 0 
4 14300 11700 9360


Ultra2 / Ultra60 / Ultra30

NCPU 360/2MB 300/2MB 200/2MB
1 3680 3120 2520
2 6830 5790 4690


Ultra5 / Ultra10

NCPU 440/2MB 360/512K 300/512K 266/512K
1 3280 2410 2050 2050 1840


Ultra1

NCPU 167/512K 143/512K
1 1500 1280


SPARCcenter 2000E / SPARCcenter 2000

NCPU 85/2MB 60/2MB 60/1MB 50/2MB 50/1MB 40/1MB

1 837 770 707 651 451 397
2 1660 1530 1400 1290 898 790
3 2480 2280 2090 1930 1340 1170
4 3290 3020 2780 2560 1770 1560
5 4090 3760 3460 3180 2200 1940
6 4880 4490 4120 3800 2630 2310
7 5660 5210 4790 4410 3050 2690
8 6440 5920 5440 5010 3470 3050
9 7200 6630 6090 5610 3880 3420
10 7960 7320 6730 6200 4290 3780
11 8710 8010 7360 6780 4700 4130
12 9450 8690 7990 7360 5100 4480
13 10100 9370 8610 7930 5490 4830
14 10900 10000 9220 8490 5880 5180
15 11600 10600 9830 9050 6270 5520
16 12300 11300 10400 9600 6650 5850
17 13000 11900 11000 10100 7030 6190
18 13700 12600 11600 10600 7400 6520
19 14400 13200 12100 11200 7770 6840
20 15000 13800 12700 11700 8140 7160


SPARCserver 1000E / SPARCserver 1000

NCPU 85/1MB 60/1MB 50/1MB 40/1MB

1 662 623 441 387
2 1280 1210 829 724
3 1880 1770 1170 1010
4 2430 2310 1470 1270
5 2960 2810 1730 1490
6 3460 3290 1960 1680
7 3930 3750 2170 1850
8 4380 4190 2350 2000


SPARCstation20

NCPU  75/1MB 60/1MB 50/1MB 50/noE$

1 638 530 402 290
2 1230 1020 782 563
3 0 0 0 0
4 0 0 1480 0


SPARCstation10

NCPU  50/1MB 40/1MB 40/noE$

1 338 284 204
2 657 552 397
3 0 0 0
4 1240 0 0 


SPARCstation2, SPARCstation IPX

NCPU  40/64K

1 149