04-16-2018 02:52 PM - last edited on 11-26-2018 09:55 AM by Larry
Got this question forwarded from a customer. I think the general tuning question is too broad, but maybe someone has seen this come up before and knows what they're after.
This is a theoretical question from a customer.
Assuming the HW is capable enough, how could we tune the server to 1 M read ops/sec?
What is the relationship between Worker Threads and performance? And if we were to tune the server to 1 M read ops/sec, what would the optimal worker thread count be?
Currently, a customer has tuned the server to 120 K read ops/sec.
04-17-2018 10:16 AM - edited 04-17-2018 10:16 AM
Since the hardware can theoretically achieve the performance, and so can the software, problem solved.
In reality, it is actually quite possible to reach these levels, but it does depend on the hardware.
At best, I reckon our internal processing time has a hard floor at around 45 µs per transaction.
Even if you eliminated every network, memory bus, data bus, storage and interrupt contention bottleneck, each core could process at most 1/0.000045 ≈ 22,222 transactions per second.
We have actually come close to that during a collaborative test campaign with Cisco (via VCE), where we reached 21,250 transactions per second per core, for a total of around 255,000 per second, completely maxing out the CPU. That is pretty close to the theoretical limit.
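To put those figures next to the 1 M read ops/sec target, here is a rough back-of-the-envelope sketch. It only uses the numbers quoted above (45 µs floor, 21,250 per core, 255,000 total) and assumes throughput scales linearly with core count, which is an optimistic assumption and not something the test results confirm. The function names are made up for illustration.

```python
# Back-of-the-envelope capacity math using only the figures from this thread.
# ASSUMPTION: throughput scales linearly with cores (no NIC, bus or storage
# bottleneck). Real systems will fall short of this.

US_PER_SECOND = 1_000_000

SERVICE_TIME_US = 45        # estimated per-transaction processing floor
MEASURED_PER_CORE = 21_250  # transactions/sec/core reached in the test
MEASURED_TOTAL = 255_000    # transactions/sec reached in the test
TARGET_OPS = 1_000_000      # the customer's hypothetical target


def per_core_ceiling(service_time_us: int) -> int:
    """Whole transactions one core can finish per second at the floor."""
    return US_PER_SECOND // service_time_us


def cores_at_floor(target_ops: int, service_time_us: int) -> int:
    """Cores needed if every transaction costs exactly the floor time."""
    busy_us = target_ops * service_time_us   # CPU-microseconds needed per second
    return -(-busy_us // US_PER_SECOND)      # integer ceiling division


def cores_at_rate(target_ops: int, per_core_rate: int) -> int:
    """Cores needed at a measured per-core rate."""
    return -(-target_ops // per_core_rate)   # integer ceiling division


ceiling = per_core_ceiling(SERVICE_TIME_US)
cores_in_test = MEASURED_TOTAL // MEASURED_PER_CORE
efficiency = MEASURED_PER_CORE * SERVICE_TIME_US / US_PER_SECOND

print(f"per-core ceiling:       {ceiling} tx/s")
print(f"cores implied by test:  {cores_in_test}")
print(f"measured vs. ceiling:   {efficiency:.1%}")
print(f"cores for 1 M at floor: {cores_at_floor(TARGET_OPS, SERVICE_TIME_US)}")
print(f"cores for 1 M measured: {cores_at_rate(TARGET_OPS, MEASURED_PER_CORE)}")
```

So under that linear-scaling assumption, 1 M ops/sec would take somewhere in the region of 45 to 48 fully saturated cores, roughly four times the CPU used in the test. Treat that as a lower bound on CPU alone, not a sizing recommendation.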
I found a quick overview of the testing, attached here.
Bottom line, we had to work with Cisco engineering to cut us a special build of the NIC driver to remove the largest bottleneck. Maxing out performance isn't always just about increasing memory or the number of threads...