The cost of running High Performance Computing (HPC) engineering simulation workloads in the cloud is a major concern for many manufacturers. In a recent market survey conducted by a third party for AWS, almost half of the participants named cost and cost management as barriers to cloud adoption. This impression often rests on a simplified calculation: the number of core hours needed for their Computer Aided Engineering (CAE) simulations multiplied by the hourly price per core.
With such a rough estimate, however, it is difficult to get a realistic picture of the total cost of cloud and of the return on investment (ROI) the company would gain from it.
Therefore, in the following, we calculate the total cost of an engineering simulation workplace in the cloud and contrast it with the list of potential benefits the company gains from such an investment.
How Much Does an Engineering Simulation Workplace in the Cloud Cost?
According to ZipRecruiter, the average annual salary of a senior simulation engineer in 2024, for example in the Detroit area, is $127K. The total cost to the engineer's company, including all additional labor costs, is usually estimated at roughly 2x that salary. With about 250 workdays in the US in 2024, the engineer's total cost is 2 x $127K / 250 = $1,016 per day.
In addition, we assume that the engineer uses commercial simulation software (e.g. from Ansys, Cadence, Dassault, Siemens) at a license cost of $100K per year, or $400 per day. Suppose the engineer submits ten 1-hour simulation jobs per workday, each running on 4 compute nodes at $5 per compute node per hour. Each job then costs $20, for a cloud consumption of $200 per day.
In total, an engineer’s workplace, simulation software, and cloud consumption result in the following cost estimation (per day):
- Engineer: 2 x $127K / 250 days = $1,016
- Software license: $100K / 250 days = $400
- Cloud usage: $20 per job x 10 jobs *) = $200
- Total engineering workplace cost = $1,616 per day
*) A comparison with the total cost of ownership (TCO) of an on-premises HPC system should include, among others, air and liquid cooling systems, employee salaries and training, energy consumption, facilities-related costs, system downtime, subscription-based licensing, and system maintenance and support over the lifetime of the HPC system; see https://www.ansys.com/blog/understanding-total-cost-ownership-hpc-ai-systems.
“The cost of an average engineering simulation workplace in the Midwest is $1616 per day.”
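The daily estimate above can be reproduced with a few lines of Python. This is only a minimal sketch using the figures from this article; the function name and parameter names are illustrative assumptions, and the $20 per job is taken from the breakdown above.

```python
# Daily cost of one engineering simulation workplace (figures from the article).
# All names here are illustrative; adjust the defaults to your own situation.

def daily_workplace_cost(salary=127_000, overhead_factor=2.0,
                         license_per_year=100_000,
                         cost_per_job=20.0, jobs_per_day=10,
                         workdays=250):
    """Return the per-day cost breakdown in US dollars."""
    engineer = overhead_factor * salary / workdays   # fully loaded labor cost
    software = license_per_year / workdays           # simulation software license
    cloud = cost_per_job * jobs_per_day              # cloud compute consumption
    return {
        "engineer": engineer,
        "software": software,
        "cloud": cloud,
        "total": engineer + software + cloud,
    }

costs = daily_workplace_cost()
for item, usd in costs.items():
    print(f"{item:>8}: ${usd:,.0f} per day")
```

Changing a single input, for example the number of jobs per day or the salary for a different region, shows immediately how sensitive the total is to each cost component.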
What if You Collaborate with an HPC Cloud Service Provider (CSP)?
The above cost calculation does not include any cost for a do-it-yourself (DIY) setup of the HPC cloud environment. Such a setup can easily amount to several person-months. And according to a recent Gartner study, 83% of DIY cloud migration projects either fail or exceed their budgets and schedules.
An alternative is to work with an expert team from one of the Cloud Service Providers (CSPs, e.g. Simr, Rescale, SimScale) that are familiar with the SimOps (Simulation Operations Automation) best-practices framework. Although their cloud-based simulation offerings often differ in software platform and services, comparing and weighing them against each other is not too difficult.
Whether you work for a small company (SME) with one or just a few commercial codes from independent software vendors (ISVs), or for a large manufacturing company with a dozen or more ISV and/or in-house ‘homegrown’ multi-physics simulation workflows, the choice might be relatively clear: an SME might favor a SaaS solution, e.g. Ansys Cloud, 3DEXPERIENCE, Rescale, or SimScale, while a large enterprise tends to look for more agility, control, and customization, e.g. the Simr Platform.
Either way, this CSP cost should be added to the above total workplace cost. Suppose a cloud simulation platform license costs $15K per engineer per year, or $15K / 250 workdays = $60 per workday. This amounts to just 60 / (1616 + 60) x 100 = 3.6% of a simulation engineer's total workplace cost per day in 2024 in the Detroit area.
“The cost of a simulation cloud platform is just 3.6% of an engineer’s total workplace cost per day in the Detroit area.”
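The 3.6% figure can be checked with a short Python sketch. The helper name `share_of_total` is a hypothetical name chosen for illustration; the $1,616 base and the $15K platform license are the article's own assumptions.

```python
# Share of an additional yearly cost in the engineer's total daily workplace cost.
# `share_of_total` is an illustrative helper, not part of any library.

def share_of_total(extra_per_year, base_per_day=1616.0, workdays=250):
    """Return (extra cost per day, its percentage of the new daily total)."""
    extra_per_day = extra_per_year / workdays
    pct = 100 * extra_per_day / (base_per_day + extra_per_day)
    return extra_per_day, pct

per_day, pct = share_of_total(15_000)  # $15K platform license per engineer per year
print(f"Platform cost: ${per_day:.0f} per day, {pct:.1f}% of the total")
```

Note that the percentage is taken relative to the new total (workplace cost plus platform cost), which is how the article computes it.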
And What About Cloud License Cost?
A frequent concern is that using more compute resources in the cloud (to increase the engineer’s productivity) requires additional ISV licenses, either as BYOL (Bring Your Own License) or as the flexible, pay-per-use licensing that many ISVs offer.
But even this additional license cost is relatively low compared to the overall cost of an engineering workplace: for example, $100K / 250 workdays = $400 per day per engineer, or 400 / (1616 + 400) x 100 = 20% of a simulation engineer's total cost per day in 2024 in the Detroit area.
“The cost of an ISV cloud license is only about 20% of an engineer’s workplace per day, but comes with a 100% productivity increase for the engineer.”
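As a quick plausibility check, the license share can be computed in Python. This sketch uses only the article's figures ($1,616 base cost per day, $100K additional license per year); the variable names are illustrative. It also shows the same amount expressed as an increase over the base cost, which is a different and slightly larger number than the share of the new total.

```python
# Additional ISV cloud license cost relative to the daily workplace cost.
BASE_PER_DAY = 1616.0   # workplace cost per day from the earlier breakdown
WORKDAYS = 250

extra_license_per_day = 100_000 / WORKDAYS  # $400 per day

# Share of the new total (the article's "about 20%").
share_pct = 100 * extra_license_per_day / (BASE_PER_DAY + extra_license_per_day)

# The same $400 expressed as an increase over the original base cost.
increase_pct = 100 * extra_license_per_day / BASE_PER_DAY

print(f"Share of new total: {share_pct:.1f}%")
print(f"Increase over base: {increase_pct:.1f}%")
```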
What are the Benefits and Cost Savings from this Cloud Investment?
But what Return on Investment (ROI) does the company get for the additional 3.6% cloud service provider cost, or for the 20% of additional simulation software license cost? Here is a short list of additional benefits for the company and its engineers, their managers, corporate IT, and decision makers:
For Engineers and Their Managers:
- Fully automated simulation cloud set-up, access, and usage process.
- Conversion of traditional on-premises HPC to “HPC as a Service” or “CAE as a Service”.
- Run more, and more accurate, simulations in the same (or even shorter) time, resulting in an average 10x increase in the engineer’s productivity.
- Access at all times to the latest (fastest) HPC hardware and CAE simulation software.
- User friendly: “HPC Cloud with one click”, “no learning needed”.
- Secure access to the engineering simulation environment from anywhere in the world.
- Flexibility in the choice and use of a wide variety of compute resources (AMD, Intel, NVIDIA).
- Containers with the engineer's simulation workflow and an interactive virtual desktop.
- Remote interactive GPU-accelerated visualization of simulation results.
For Corporate IT:
- Save time and money by avoiding unforeseeable efforts due to “Do It Yourself”.
- 20%-50% IT cost savings by using CSP tools for managing, monitoring, health-checking, maintaining, operating, and supporting the engineers' simulation environment in the cloud.
- Interact with the CSP's engineering team to continuously update and improve cloud services.
- Benefit from SimOps (Simulation Operations Automation) best practices based on real use cases.
- Few corporate IT teams have the combined cloud, HPC, and CAE experience of a CSP’s development, DevOps, FinOps, and SimOps teams, and such experts are difficult to find in today's job market.
And for Corporate Decision Makers:
- Shortening of product development cycles and thus time to market.
- Improvement of product quality through more detailed, faster, and more numerous parameter studies.
- Thus strengthening competitiveness and ability to innovate fast.
- Cost savings: expensive upfront hardware acquisitions and maintenance are no longer necessary.
- Integration of the HPC Cloud environment into the company’s IT environment (as part of corporate digital transformation), thus eliminating internal IT silos.
- No cloud lock-in through standard containerization and Kubernetes portability.
- Bridging the skills gap by making the company's own engineers and IT specialists more productive with SimOps.
Finally, Cost Savings with Cloud HPC:
The additional cloud-related investments in CSP and software (as mentioned above) can result in the following cost savings:
- Using reserved and spot instances can save 40% to 80%.
- 20% to 50% cost savings through the CSP's monitoring, analysis, and optimization tools.
- Increase CAE license efficiency, e.g. by 2x, with 2x faster (and more) cloud hardware: where one engineer occupies a CAE license for 2 hours on-prem to run one job, the same engineer can run 2 jobs with the same license in the same time on the 2x faster cloud hardware. This is useful for Design of Experiments (DoE), machine learning, parameter studies, etc.
- Cost savings through more productive engineers: suppose 1 engineer (cost $250K p.a.) on 1 server needs 10 hours for 10 simulations on premises, while she needs 1 hour for the same 10 simulations on 10 cloud servers running in parallel. This is equivalent to 10 engineers each running one simulation on premises. Savings: 9 x $250K = $2.25M p.a.
- More simulations (in the cloud) allow you to discover potential failures in your next-gen products earlier in the design/development cycle, thus potentially saving millions of dollars, e.g. by avoiding expensive recalls.
- Cloud OPEX replaces most on-prem CAPEX, eliminating large upfront expenses and long procurement, implementation, and quality testing times.
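Two of the savings estimates above are simple enough to sketch in Python. The function names are hypothetical, and the $200 per day on-demand cloud cost is carried over from the workplace breakdown earlier in this article. Treat the results as rough illustrations under the article's assumptions, not as a pricing model.

```python
# Rough illustrations of two savings estimates from the list above.

def equivalent_engineer_savings(engineer_cost=250_000, parallel_servers=10):
    """One engineer with N parallel cloud servers does the work of N engineers
    on premises, so the saving is the cost of the other N - 1 engineers."""
    return (parallel_servers - 1) * engineer_cost

def discounted_cloud_cost(on_demand_per_day=200.0, discount=0.4):
    """Daily cloud cost after a reserved/spot discount (0.4 means 40% off)."""
    return on_demand_per_day * (1 - discount)

savings = equivalent_engineer_savings()
print(f"Engineer-equivalent savings: ${savings:,.0f} per year")

# The 40% to 80% range quoted for reserved and spot instances.
for discount in (0.4, 0.8):
    cost = discounted_cloud_cost(discount=discount)
    print(f"{discount:.0%} discount: ${cost:.0f} per day instead of $200")
```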
“More and faster compute resources in the cloud enable more simulations with more parameters, producing better results (finding better materials, geometries, physics) and higher quality products.”