Lately a few clients have asked how scalable PrestaShop is and what kind of setup is needed to scale it to a site that receives a lot of traffic. By a lot of traffic, I mean thousands or tens of thousands of concurrent users browsing or making purchases at once. Configuring a site like this is a whole different ball game from configuring a PrestaShop site that gets a moderate amount of traffic and sales.
To make PrestaShop scalable, the first thing you need to do is optimize it. That is also the last thing you need to do. Weird, right? I will talk about the final optimization at the end of the article; let's talk about the first one now. I follow general optimization best practices when preparing a site for scalability. Here is a brief list of things that I do.
- Unhook modules that are not being used
- Set the server up to run PHP through FastCGI (fcgi)
- Install mod_spdy
- Optimize the CSS, JS, and HTML content of the site
- Optimize the images
- Install an opcode cache; I still like APC
- Install a caching module; I like the one from Xtendify
- Make sure all of PrestaShop’s caching is enabled, including the APC cache if you run APC, as I do
Once those things are out of the way, the site should run fairly quickly with no load on it. If it is set up properly, you should see page load times under 1 second, but they will grow as more users hit the site.
I generally choose AWS as the platform for high-traffic sites because it has a lot of built-in features that handle load well. In the tests I have run, the M3 instance loads PrestaShop sites the fastest. However, the M3 is expensive for the performance gain you get, so I usually recommend C3 instance types. They are the best all-around instances for PrestaShop when price is a factor.
The Setup for High Traffic
From the testing I have done, a C3 XLarge instance will handle around 600 concurrent users, depending on how well the site is optimized, how many products are displayed on a page, and what modules are running. If you have a resource-intensive site, you will need something bigger. But 600 concurrent users is not a lot in the grand scheme of things if you run a high-traffic site. How do you handle more? Do you just run a load balancer in front of your instance and spawn more? Not as a first step; you will wreck your PrestaShop installation if you do that without the groundwork that follows.
The first thing you need to do when planning for high traffic with PrestaShop is to separate your database from your PHP files. Set up an RDS instance and run your database from there. I have found RDS instances to be slow in general, but a large RDS instance with 2000 provisioned IOPS (PIOPS) is a good mix that returns results quickly. That should handle around 3000-5000 concurrent users, with performance dropping rapidly as more users join the site.
So the database can now handle up to around 5000 concurrent users, but the instance that holds the main site files cannot. What do you do? You can try growing the instance vertically, but it will not scale the way you might expect: doubling the instance size adds only about 25% more concurrent users. The best practice in this situation is to put an Elastic Load Balancer in front of the instance so that you can spawn multiple instances of the same size to deal with the load. When you set up a load balancer in Amazon, you have to take into account that PrestaShop uses session cookies. If a user is moved to another node during their session, the cart will be lost. Elastic Load Balancers have an option called sticky sessions; enable it so that users are kept on the same node and cart data is not lost. Normally you can run about 5-8 C3 XL instances off of one properly configured M3 XL RDS instance.
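As a rough sketch of the sticky-session step, the function below assumes a Classic Elastic Load Balancer managed through the boto3 SDK, neither of which is named in this article. The load balancer name, policy name, and port are placeholders, and the client is passed in so the function can be tried against a stub before touching a real account.

```python
def enable_sticky_sessions(elb_client, lb_name, listener_port=80,
                           policy_name="prestashop-sticky"):
    """Attach a cookie-based stickiness policy to one Classic ELB listener.

    elb_client is assumed to be boto3.client("elb"). The load balancer
    name, port, and policy name are placeholders, not values from the
    article.
    """
    # Create a policy that makes the load balancer issue its own cookie.
    # Leaving out CookieExpirationPeriod keeps the user pinned to a node
    # for the length of the browser session.
    elb_client.create_lb_cookie_stickiness_policy(
        LoadBalancerName=lb_name,
        PolicyName=policy_name,
    )
    # Enable the policy on the listener. This call replaces the listener's
    # current policy list, so pass along any other policies it should keep.
    elb_client.set_load_balancer_policies_of_listener(
        LoadBalancerName=lb_name,
        LoadBalancerPort=listener_port,
        PolicyNames=[policy_name],
    )
    return policy_name
```

With boto3 installed and credentials configured, a call would look something like `enable_sticky_sessions(boto3.client("elb"), "my-shop-elb")`, where `my-shop-elb` is a made-up name.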
When you start spawning instances behind a load balancer, there are several major considerations to take into account. The best practice for operating a site like this is to keep the site files in a repository such as GitHub so that you can push changes live. You also have to consider the spawned instances and their dependencies. Once you have your deployment code, I recommend building an Amazon Machine Image (AMI) from it that can be spawned. This is essential when deploying with an Elastic Load Balancer in front of AWS instances, because everything needs to be in sync. The way to keep the instances in sync is to use an S3 bucket for a few directories. Since we have sticky sessions turned on and are using one database, if someone uploads a new product, the image will land on a single instance. The product itself is added globally, though, so no one on any other instance will be able to load the product image. So you have to create an S3 bucket for the following directories.
When you build your AMI, you will want to dynamically link in the S3 bucket so that each instance, when it boots, attaches the buckets that hold those shared files.
So we have covered how to handle 3-5k users at once, but what if you need more? At that point the weak link is going to be your database instance. You can grow it and it will handle more, but AWS is set up to scale horizontally, not vertically, so you will need to start spawning read replicas. An RDS instance can have what are called read replicas, or slaves. These database instances are used only for reads; all writes go to the main database and are replicated across the network of databases to be read. They are the backbone of scaling up from the base of 3-5k users. Using read replicas, you can scale a site to around 20k concurrent users before any major code changes need to take place. To give an idea of what 20k concurrent users means in normal traffic, that equals about 12 million users a week, or close to 50 million users a month.
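To make the read replica idea concrete, here is a minimal sketch of read/write splitting in Python. It is an illustration only, not how PrestaShop or RDS routes queries: the `ReplicaRouter` class and the host names are hypothetical, and real code would also have to cope with replication lag and transactions.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and rotate reads across read replicas."""

    # Statements treated as reads. This is a simplification: something
    # like SELECT ... FOR UPDATE should really go to the primary.
    READ_VERBS = ("select", "show")

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def pick(self, sql):
        """Return the endpoint that should run this SQL statement."""
        words = sql.split()
        verb = words[0].lower() if words else ""
        if verb in self.READ_VERBS:
            return next(self._readers)
        return self.primary


# Hypothetical endpoints; RDS gives each replica its own host name.
router = ReplicaRouter("primary.example.internal",
                       ["replica-1.example.internal",
                        "replica-2.example.internal"])
print(router.pick("SELECT * FROM ps_product"))        # goes to a replica
print(router.pick("UPDATE ps_cart SET id_lang = 1"))  # goes to the primary
```

Rotating through the replicas keeps the sketch short; a production setup would more likely weight replicas by load or put a proxy in front of them.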
The Last Optimization
When I started this article I mentioned that there was a final optimization. It does not become apparent until you are load testing the site, so it varies greatly from site to site. One site we worked on, for instance, did not need to respect stock because everything was custom made. So instead of using PrestaShop’s Smarty cache, we wrote a custom system that wrote every page to disk, at its path, as an HTML file. Pages were then served as static HTML, which almost doubled the request throughput per instance. Other times we have modified the default queries that PrestaShop uses. If you drop a table from a query here or there, you will never notice the difference on a site that does not have a lot of traffic. But when things are scaled up, everything is scaled up. If an extra table adds 1 millisecond to a query with 1 person on the site, with 5000 people it adds 5000 milliseconds across their queries. It becomes much more noticeable.
The fact of the matter is that every site is different, and the approach to developing a high-capacity site varies greatly depending on the site's needs. One thing I left out, because there was no good place to insert it, is that I run CDNs in front of these sites. PrestaShop calls them media servers, but they are the same thing. The reason I run them is that the latency of external requests to an S3 bucket is too high. The files could be served natively from Amazon, but I have found that it does not deliver the best performance; you are hitting the bucket through several round robins, and buckets are not meant to serve that much data that quickly.
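The write-every-page-to-disk idea can be sketched in a few lines. This is not the system we built for that site, just a minimal Python illustration of the technique: `cache_file_for` and `write_page` are hypothetical helpers, and it assumes the rendered HTML is already in hand and that the web server is configured to serve files from the cache directory before falling back to PHP.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit


def cache_file_for(url, cache_root):
    """Map a page URL to the HTML file that mirrors its path on disk."""
    path = urlsplit(url).path  # the query string is ignored
    # Drop the root and any "." or ".." segments so a crafted URL cannot
    # escape the cache directory.
    parts = [p for p in PurePosixPath(path).parts
             if p not in ("/", ".", "..")]
    target = Path(cache_root).joinpath(*parts)
    # The home page and directory-style URLs become index.html.
    if not parts or target.suffix != ".html":
        target = target / "index.html"
    return target


def write_page(url, html, cache_root):
    """Store rendered HTML so the web server can serve it as a static file."""
    target = cache_file_for(url, cache_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file and rename it, so a visitor never
    # receives a half-written page.
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(html, encoding="utf-8")
    tmp.replace(target)
    return target
```

A cache like this only works when pages do not depend on live data such as stock levels, which is why it suited a made-to-order catalog; cached files also have to be deleted or rewritten whenever a product or price changes.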
Are you looking to run a high capacity PrestaShop site? Are you wondering if PrestaShop is the best solution for your high traffic site? Contact us, we will give you a free assessment and let you know what we think.