Scale testing the world's largest PKI, all on WS08/R2 and Hyper-V

Published 25 September 09 12:31 PM | EEC 

This week, we’ve been in the EEC scale testing the world’s largest PKI, which issues hundreds of millions of certificates from hundreds of CAs to devices around the world. The entire design is built on WS08R2 Hyper-V and WS08/WS08R2 CAs.

To simulate one of their Hyper-V hosts, we used a machine similar to the ones currently in use in the hosting facilities: a Sun Microsystems 2.4GHz, 4-socket, quad-core machine with 64GB of memory.  We loaded our host with 10 VMs, each assigned a single VCPU and 6GB.  All 10 of these VMs were connected to an nCipher netHSM 2000.  To generate load, each CA VM was paired with a single DC and 5 client machines, each assigned a single VCPU and 2GB and separated from the CA by a WAN simulator that added latency and throughput constraints based on the VPNs used to link bases to the hosting facility.  We used the EntGenReq tool to have each client machine open 4 request sessions and request 1,000,000 2K key certs per session.
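To make the size of the requested load concrete, here is a minimal Python sketch that multiplies out the figures from the setup above. The constant and function names are ours, chosen for illustration; they are not part of EntGenReq or any other tool.

```python
# Figures taken from the test setup described above.
CA_VMS = 10                    # CA VMs on the Hyper-V host
CLIENTS_PER_CA = 5             # client machines paired with each CA
SESSIONS_PER_CLIENT = 4        # request sessions opened per client
CERTS_PER_SESSION = 1_000_000  # certificates requested per session
CA_VM_MEMORY_GB = 6            # memory assigned to each CA VM
HOST_MEMORY_GB = 64            # memory in the physical host


def load_profile():
    """Multiply out the per-CA figures into totals for the whole test."""
    clients = CA_VMS * CLIENTS_PER_CA
    sessions = clients * SESSIONS_PER_CLIENT
    return {
        "clients": clients,
        "sessions": sessions,
        "certs_requested": sessions * CERTS_PER_SESSION,
        "ca_memory_gb": CA_VMS * CA_VM_MEMORY_GB,
    }


if __name__ == "__main__":
    for name, value in load_profile().items():
        print(f"{name}: {value:,}")
```

In other words, the clients were configured to ask for far more certificates than a one-day run could issue, so the CAs stayed under continuous load for the whole test.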

In under 24 hours, we had issued more than 20 million certificates from this single physical chassis.  During these tests, we found that:

  • Per-VM CPU load was ~25%; total host CPU load was ~20%
  • The CA VMs required relatively little memory, even under this high stress; we’re therefore optimizing the design to increase the density of CA VMs per chassis to 30:1 (2GB per VM)
  • The performance bottleneck in this design is the HSM; as we increased the number of CA VMs being stressed, our requests per second per CA fell significantly, from >100 to ~18-20, giving a net issuance rate for the entire chassis of ~200 per second
  • When we investigated the HSM, it became clear that it was the gating component (its 150-request queue remained saturated and its CPU was pegged consistently at 85%)
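The arithmetic behind the last three findings can be checked with a short Python sketch. It uses only the numbers quoted above; the helper names are hypothetical, and the memory check deliberately ignores whatever the Hyper-V parent partition itself needs, which we did not state.

```python
# Figures quoted in the findings above.
CA_VMS_TESTED = 10
PER_CA_RATE_RANGE = (18, 20)  # requests/sec per CA with all CAs under stress
HOST_MEMORY_GB = 64
TARGET_CA_VMS = 30            # planned density per chassis
TARGET_VM_MEMORY_GB = 2       # planned memory per CA VM


def aggregate_rate_range(vm_count, per_ca_range):
    """Chassis-wide issuance rate (low, high) in requests per second."""
    low, high = per_ca_range
    return vm_count * low, vm_count * high


def memory_headroom_gb(host_gb, vm_count, per_vm_gb):
    """Host memory left after assigning per_vm_gb to each VM.

    Assumption: overhead for the Hyper-V parent partition is not modeled.
    """
    return host_gb - vm_count * per_vm_gb


if __name__ == "__main__":
    low, high = aggregate_rate_range(CA_VMS_TESTED, PER_CA_RATE_RANGE)
    print(f"Chassis issuance rate: {low}-{high} requests/sec")
    headroom = memory_headroom_gb(HOST_MEMORY_GB, TARGET_CA_VMS, TARGET_VM_MEMORY_GB)
    print(f"Memory left at {TARGET_CA_VMS} VMs x {TARGET_VM_MEMORY_GB}GB: {headroom}GB")
```

The upper end of the aggregate range matches the ~200 per second figure, and 30 VMs at 2GB each fit within the 64GB host on paper.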

Overall, this testing was a great validation of the performance of ADCS.  Microsoft software ran as fast as the HSM would allow, and we gracefully handled the response delays it introduced.  Also, the fact that we’re able to run this configuration entirely on Hyper-V and get ~30 CAs per physical host provides an efficient scale story for even the very largest and most complex environments.

Dustin Clarkson,

Program Manager - EEC
