Engineering · 10 min read

Cloud server sizing guide: how much CPU, RAM, and storage?

Over-provisioning wastes money. Under-provisioning takes your site down. Here is how to size your cloud server correctly for the actual workload.

Ioana Dragomir

Marketing Team · February 14, 2026

Server resource monitoring dashboard with CPU and RAM metrics

Photo by Lukas · Pexels

Why Sizing Matters

Choosing the right cloud server configuration is one of the most consequential decisions in any hosting project. Over-provision and you waste money every month on resources that sit idle. Under-provision and your application suffers from slow response times, timeouts, or outright crashes during traffic spikes. Neither outcome is acceptable for a production workload, yet a surprising number of businesses choose their server size based on gut feeling, vendor defaults, or whatever configuration a tutorial blog recommended three years ago.

The financial impact of incorrect sizing is significant. Our analysis of over 200 client migrations shows that the average business over-provisions cloud resources by 35% to 50% when configuring servers without data-driven guidance. On a monthly bill of 200 euros, that represents 70 to 100 euros wasted every month, or 840 to 1,200 euros per year. Conversely, under-provisioned servers cause performance issues that directly affect revenue. For e-commerce sites, a one-second increase in page load time can reduce conversions by 7%, a cost that far exceeds the savings from choosing a smaller server.
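The arithmetic behind those figures is easy to reproduce for your own bill. This is a minimal sketch; the function name is ours, and it assumes, as the example above does, that the over-provisioned share maps directly onto a share of the bill.

```python
def monthly_waste_eur(monthly_bill_eur: float, overprovisioned_share: float) -> float:
    """Spend attributable to idle capacity, assuming cost scales with provisioned resources."""
    return monthly_bill_eur * overprovisioned_share


low = monthly_waste_eur(200, 0.35)
high = monthly_waste_eur(200, 0.50)
print(f"monthly waste: {low:.0f} to {high:.0f} EUR")
print(f"yearly waste:  {low * 12:.0f} to {high * 12:.0f} EUR")
```

Swap in your own bill and an over-provisioning estimate taken from monitoring data rather than the averages quoted here.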

Proper sizing is not a one-time decision. Workloads evolve as applications grow, traffic patterns shift, and feature sets expand. A server configuration that was perfect six months ago may be inadequate today. This guide provides a framework for making informed sizing decisions at launch and adjusting them over time based on real performance data. Whether you are deploying a simple marketing site or a complex application stack, these principles will help you allocate resources efficiently.

Understanding CPU Requirements

CPU cores determine how many operations your server can execute simultaneously. Applications that process many concurrent requests (web servers, API endpoints, real-time data processing) benefit from higher core counts. Applications that perform intensive single-threaded computations (video encoding, scientific calculations, certain database operations) benefit more from faster individual cores. Understanding which pattern your workload follows is the first step in CPU sizing.

For most web applications, CPU utilization should average between 30% and 60% under normal load, with headroom to handle traffic spikes up to 80% before performance degrades. If your average CPU utilization is below 20%, you are likely over-provisioned and could safely downsize. If it regularly exceeds 70%, you should consider scaling up or out before users experience latency. A WordPress site serving 50,000 monthly visitors typically runs comfortably on 2 vCPUs. A Node.js API handling 500 requests per second may need 4 to 8 vCPUs depending on the complexity of each request.
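As a rough illustration, the thresholds above can be turned into a small classifier. The function name and the "borderline" label for the gaps between the stated bands (20% to 30% and 60% to 70%) are our assumptions, not part of any monitoring product.

```python
def classify_cpu(avg_utilization_pct: float) -> str:
    """Map average CPU utilization under normal load to a sizing hint."""
    if avg_utilization_pct < 20:
        return "likely over-provisioned: consider downsizing"
    if avg_utilization_pct > 70:
        return "consider scaling up or out"
    if 30 <= avg_utilization_pct <= 60:
        return "healthy"
    # 20-30% and 60-70% are not covered by the guidance above; treat as a watch zone.
    return "borderline: keep monitoring"


for pct in (12, 45, 65, 82):
    print(pct, "->", classify_cpu(pct))
```

Feed it an average taken over normal load, not a single peak reading, since the bands describe sustained utilization.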

Modern cloud providers offer different CPU types optimized for different workloads. Compute-optimized instances provide higher clock speeds and are ideal for CPU-bound tasks. General-purpose instances balance CPU, memory, and networking for typical web workloads. At GRADAX, we default to general-purpose instances for most client deployments and recommend compute-optimized only when profiling confirms that CPU is the primary bottleneck. Choosing the right CPU type can deliver 20% to 30% better performance at the same price point, making it worth the extra analysis.

RAM Sizing by Workload

Memory is often the resource that determines whether an application runs smoothly or crashes under load. A saturated CPU queues work, so responses slow down but still arrive. Exhausted RAM triggers the kernel's out-of-memory killer, which terminates processes abruptly and can bring down your entire application. For this reason, we recommend treating RAM headroom as a hard requirement: target 50% to 70% average utilization, leaving substantial room for traffic spikes and memory leaks.

Different application stacks have vastly different memory requirements. A static site served by Nginx needs as little as 256 MB. A PHP WordPress site with object caching and a database running on the same server needs 2 to 4 GB. A Java Spring Boot application with an embedded Elasticsearch instance may require 8 to 16 GB. Node.js applications fall somewhere in between, typically requiring 1 to 4 GB per process depending on payload sizes and concurrent connections. When multiple services share a server, add their individual requirements plus a 20% buffer for the operating system and background processes.
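The "sum plus 20%" rule for shared servers can be sketched as follows. The service names, their memory figures, and the list of plan sizes are hypothetical placeholders; substitute numbers measured on your own stack and the sizes your provider actually sells.

```python
def shared_server_ram_gb(service_ram_gb: dict[str, float], buffer: float = 0.20) -> float:
    """Sum per-service requirements and add a buffer for the OS and background processes."""
    return sum(service_ram_gb.values()) * (1 + buffer)


def next_plan_size_gb(required_gb: float, plan_sizes_gb=(1, 2, 4, 8, 16, 32, 64)) -> int:
    """Smallest plan that covers the requirement (plan sizes are illustrative)."""
    for size in plan_sizes_gb:
        if size >= required_gb:
            return size
    raise ValueError("requirement exceeds the largest plan; consider scaling out")


services = {"nginx": 0.25, "node_api": 2.0, "redis": 1.0}  # hypothetical measurements, in GB
total = shared_server_ram_gb(services)
print(f"required: {total:.2f} GB, plan: {next_plan_size_gb(total)} GB")
```

Note that this only sizes the requirement itself; whether the chosen plan also lands in the 50% to 70% utilization band is a separate check.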

Database servers deserve special attention for RAM sizing. Databases perform dramatically better when their working set (the data that is actively queried) fits entirely in memory. A MySQL database with a 5 GB working set running on a server with 4 GB of RAM will constantly read from disk, pushing query times from a few milliseconds to tens of milliseconds or more. The same database with 8 GB of RAM keeps the working set cached, delivering consistent sub-millisecond reads. For dedicated database servers, we recommend allocating 1.5x to 2x the working set size in RAM as a starting point.
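The 1.5x to 2x rule is simple enough to encode directly. This sketch uses our own function names and deliberately ignores how much of the server's RAM the database is actually configured to use for caching, which you would need to check separately.

```python
def db_ram_range_gb(working_set_gb: float) -> tuple[float, float]:
    """Starting RAM range for a dedicated database server: 1.5x to 2x the working set."""
    return (1.5 * working_set_gb, 2.0 * working_set_gb)


def working_set_fits(working_set_gb: float, ram_gb: float) -> bool:
    """True if the working set could, at best, be held entirely in memory."""
    return ram_gb >= working_set_gb


print(db_ram_range_gb(5))      # (7.5, 10.0)
print(working_set_fits(5, 4))  # False: the 4 GB server must keep reading from disk
print(working_set_fits(5, 8))  # True: 8 GB sits inside the recommended range
```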

Storage Types and Sizing

Cloud storage comes in three primary tiers: NVMe SSD, standard SSD, and HDD-based block storage. NVMe SSDs deliver the highest performance with read speeds exceeding 3,000 MB/s and random I/O operations in the hundreds of thousands per second. Standard SSDs offer a balance of performance and cost, typically delivering 500 to 1,000 MB/s throughput. HDD-based storage is the most affordable but introduces significant latency for random reads, making it suitable only for archival data, logs, and sequential workloads like backups.

For storage capacity sizing, start with your current data footprint and project growth over 12 to 18 months. A typical WordPress site with a media library needs 5 to 20 GB. An e-commerce platform with product images might need 50 to 200 GB. A SaaS application with user-generated content can grow unpredictably and benefits from scalable storage that can expand without downtime. Always reserve at least 20% free space on any volume: filesystems and databases degrade in performance as they approach capacity, and running out of disk space is one of the most common causes of unplanned outages.
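A capacity projection along these lines might look like the sketch below. It assumes linear monthly growth, which is a simplification, and the example inputs are hypothetical.

```python
import math


def required_volume_gb(current_gb: float, monthly_growth_gb: float,
                       months: int = 18, min_free_share: float = 0.20) -> int:
    """Volume size that holds the projected data while keeping min_free_share empty."""
    projected_gb = current_gb + monthly_growth_gb * months  # linear growth assumption
    # Round before ceil so floating-point noise cannot bump the result up by 1 GB.
    return math.ceil(round(projected_gb / (1 - min_free_share), 6))


# Hypothetical store: 50 GB today, growing by about 5 GB per month.
print(required_volume_gb(50, 5))             # 18-month horizon
print(required_volume_gb(50, 5, months=12))  # 12-month horizon
```

Dividing by `1 - min_free_share` rather than multiplying by 1.2 matters: the goal is 20% of the volume free, not 20% extra on top of the data.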

IOPS (Input/Output Operations Per Second) is the storage metric that most impacts application performance, yet it is frequently overlooked during sizing. A database-heavy application might need 3,000 to 10,000 IOPS to maintain acceptable query performance. A content-heavy site with mostly read operations might need only 500 IOPS. Many cloud providers tie IOPS to volume size: a 100 GB volume might be capped at 3,000 IOPS while a 500 GB volume gets 15,000. This means you sometimes need to provision more storage than your data requires purely to get adequate IOPS, a nuance that catches many first-time cloud users off guard.
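To see how an IOPS target can dictate volume size, here is a sketch using the 30 IOPS per GB ratio implied by the example figures (3,000 IOPS at 100 GB). That ratio is illustrative only; real providers publish their own ratios, baselines, and caps.

```python
import math


def volume_gb_for_iops(required_iops: int, iops_per_gb: float = 30) -> int:
    """Smallest volume whose size-based IOPS cap meets the requirement."""
    return math.ceil(required_iops / iops_per_gb)


def provisioned_volume_gb(data_gb: int, required_iops: int, iops_per_gb: float = 30) -> int:
    """Provision for whichever is larger: the data itself or the IOPS target."""
    return max(data_gb, volume_gb_for_iops(required_iops, iops_per_gb))


# Hypothetical database: 80 GB of data that needs 6,000 IOPS.
print(provisioned_volume_gb(80, 6000))  # 200: IOPS, not capacity, sets the size
```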

Bandwidth and Network Considerations

Network bandwidth determines how much data your server can send and receive per second. For web applications, outbound bandwidth is typically the bottleneck because servers send far more data to users than they receive. A 2 MB page requested by 100 users within the same second requires 200 MB/s (about 1.6 Gbit/s) of outbound bandwidth at that peak; the same 100 users spread over a longer interval need proportionally less. A CDN in front of your server reduces this dramatically by caching static assets at edge locations, but dynamic content still needs to travel from your origin server.
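The peak estimate can be parameterized as below. The CDN offload share is a hypothetical input you would measure from your own cache hit ratio, not a figure from this guide.

```python
def peak_outbound_mbit_per_s(page_mb: float, page_loads_per_second: float,
                             cdn_offload_share: float = 0.0) -> float:
    """Origin bandwidth in Mbit/s for a given page weight and request rate."""
    origin_mb_per_s = page_mb * page_loads_per_second * (1 - cdn_offload_share)
    return origin_mb_per_s * 8  # megabytes to megabits


no_cdn = peak_outbound_mbit_per_s(2, 100)
with_cdn = peak_outbound_mbit_per_s(2, 100, cdn_offload_share=0.9)  # assumed 90% served at the edge
print(f"without CDN: {no_cdn:.0f} Mbit/s, with CDN: {with_cdn:.0f} Mbit/s")
```

Provider port speeds are usually quoted in Mbit/s or Gbit/s, which is why the result is converted from megabytes.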

Most cloud providers include a baseline bandwidth allocation with each instance size and charge overage fees beyond that limit. At GRADAX, our web hosting plans include generous bandwidth allowances because we have observed that unexpected bandwidth charges are one of the top billing surprises for cloud newcomers. We recommend monitoring your monthly bandwidth consumption during the first three months and alerting at 80% of your allocation to avoid bill shock. For media-heavy sites or applications that serve large file downloads, factor bandwidth costs into your sizing decision alongside CPU, RAM, and storage.

Network latency is equally important, particularly for geographically distributed audiences. A server in Frankfurt delivers sub-20ms latency to users across central Europe but 150ms or more to users in Asia-Pacific. If your audience spans multiple continents, a single server location will not provide a consistent experience. Multi-region deployment or a CDN with edge caching is essential. We typically recommend starting with a single region closest to the majority of your user base and expanding to additional regions only when performance data justifies the added complexity and cost.

Sizing for Common Workloads

For WordPress sites with up to 100,000 monthly visitors, we recommend starting with 2 vCPUs, 4 GB RAM, and 40 GB NVMe SSD storage. This configuration handles WordPress, MySQL, Redis object cache, and Nginx comfortably with headroom for traffic spikes. WooCommerce stores with active inventories should add an extra 2 GB of RAM and consider a separate database instance once traffic exceeds 200,000 monthly visitors or the product catalog surpasses 10,000 items.

Node.js applications vary widely depending on their architecture. A simple Express.js API serving JSON responses runs efficiently on 1 vCPU and 1 GB RAM for moderate traffic. A Next.js application with server-side rendering requires 2 to 4 vCPUs and 2 to 4 GB RAM because each render is CPU-intensive. Real-time applications using WebSockets need more memory to maintain persistent connections. Budget approximately 10 KB of RAM per concurrent connection, meaning 100,000 concurrent connections require roughly 1 GB of RAM for connection state alone, plus whatever the application logic demands.
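The WebSocket estimate is a one-line calculation. This sketch uses decimal units (1 GB = 1,000,000 KB) to match the rounding in the text, and the 10 KB default is the rule of thumb above rather than a measured value.

```python
def connection_state_gb(concurrent_connections: int, kb_per_connection: float = 10) -> float:
    """RAM for connection state alone, excluding application logic."""
    return concurrent_connections * kb_per_connection / 1_000_000


print(connection_state_gb(100_000))  # 1.0
print(connection_state_gb(250_000))  # 2.5
```

Measure actual per-connection memory under load before relying on this, since message buffers and per-user state can push the figure well above 10 KB.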

E-commerce platforms with dedicated server infrastructure have the most demanding requirements due to the combination of dynamic page rendering, database queries for product catalogs and inventory, real-time cart and checkout processing, and image serving. A mid-size e-commerce site handling 500,000 monthly visitors typically needs 4 to 8 vCPUs, 8 to 16 GB RAM, 100 GB or more of SSD storage, and a separate database server with at least 8 GB RAM. During peak events like flash sales or holiday promotions, these requirements can double or triple, making auto-scaling capabilities essential.

When to Scale Up vs Scale Out

Scaling up (vertical scaling) means adding more CPU, RAM, or storage to an existing server. Scaling out (horizontal scaling) means adding more servers behind a load balancer. Each approach has trade-offs that should inform your decision. Scaling up is simpler: no application architecture changes, no load balancer configuration, no session management concerns. It works well for databases, monolithic applications, and workloads that are difficult to distribute across multiple nodes.

Scaling out is more resilient and cost-effective at larger scales. If one server in a horizontally scaled cluster fails, the remaining servers absorb the load with minimal impact. Horizontal scaling also allows you to use smaller, less expensive instances that collectively provide more compute power than a single large instance at a lower total cost. However, horizontal scaling requires your application to be stateless or to externalize session state to a shared store like Redis. It also requires a load balancer and introduces complexity in deployments, logging, and debugging.

Our general recommendation is to scale up first until you hit a natural ceiling, typically around 8 to 16 vCPUs and 32 to 64 GB RAM, and then scale out. This approach minimizes operational complexity while your workload is small enough to fit on a single machine. Once you outgrow vertical scaling, invest in the architecture changes needed for horizontal scaling, which will serve you well as traffic continues to grow. Contact our infrastructure team for a personalized sizing assessment based on your specific workload characteristics.

Monitoring and Right-Sizing over Time

Initial sizing is an educated estimate. Right-sizing is the ongoing practice of adjusting resources based on actual usage data. We recommend collecting at least two weeks of performance metrics before making sizing adjustments, and ideally four to six weeks to capture weekly traffic cycles, monthly billing cycles, and any seasonal variations. Key metrics to track include average and peak CPU utilization, memory usage and swap activity, disk I/O wait time, and network throughput.

Automated right-sizing tools can analyze usage patterns and recommend optimizations. At GRADAX, our monitoring platform flags instances where average CPU utilization stays below 20% for more than 14 consecutive days, indicating potential over-provisioning. It also alerts when any resource consistently exceeds 75% utilization, signaling that a scale-up or scale-out event should be planned before performance degrades. These alerts have helped our clients reduce infrastructure spend by an average of 28% within the first six months of monitoring.
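The two alert rules can be approximated in a few lines. This is not the GRADAX platform's implementation; in particular, reading "consistently exceeds 75%" as seven consecutive days is our assumption, and the input is assumed to be one average CPU reading per day.

```python
from typing import Callable, Sequence


def longest_run(values: Sequence[float], predicate: Callable[[float], bool]) -> int:
    """Length of the longest streak of consecutive values satisfying predicate."""
    best = current = 0
    for value in values:
        current = current + 1 if predicate(value) else 0
        best = max(best, current)
    return best


def right_sizing_flag(daily_avg_cpu_pct: Sequence[float], low_pct: float = 20,
                      low_days: int = 14, high_pct: float = 75, high_days: int = 7) -> str:
    """Flag sustained high load first, then sustained idleness."""
    if longest_run(daily_avg_cpu_pct, lambda v: v > high_pct) >= high_days:
        return "plan scale-up or scale-out"
    # "More than 14 consecutive days" means a streak of at least 15.
    if longest_run(daily_avg_cpu_pct, lambda v: v < low_pct) > low_days:
        return "possible over-provisioning"
    return "no action"


print(right_sizing_flag([15] * 15))             # possible over-provisioning
print(right_sizing_flag([15] * 14 + [40]))      # no action
print(right_sizing_flag([50] * 10 + [80] * 7))  # plan scale-up or scale-out
```

The same streak logic applies to memory or disk metrics by passing a different series and thresholds.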

Right-sizing is not just about cutting costs. It is equally about ensuring adequate performance as workloads grow. A server that ran perfectly six months ago may struggle today because the database grew, traffic increased, or new features added computational overhead. Schedule quarterly sizing reviews where you examine resource utilization trends and compare them against application performance metrics like response time and error rate. This practice catches gradual resource exhaustion before it manifests as user-facing performance issues and ensures your infrastructure investment stays aligned with actual demand.