Some years ago, Google reported that a 500 ms (half a second) delay caused a 20% drop in its traffic, and Amazon found that 100 ms of extra latency reduced its sales by 1%.
So in any software application that uses persistent data, introducing a cache is a very common technique for improving performance by reducing latency. But there are several things to consider before settling on a suitable caching mechanism: what to cache, where to cache it, and how to cache it.
Let’s take a three-tier application for a brief discussion:
What to cache
Any data that is persisted on disk (file system or database) is a candidate for caching. But caching all of it may not be efficient. To shortlist the data to be cached, think along the following lines:
- Is consistency more important for this data? Example: displaying the price of an item during check-out. Here we should not use cached data but show up-to-date data. However, on other pages where products are listed, a cached price can be shown.
- Effectiveness of the cache: if the data keeps changing or is continuously generated, caching it is of little use. Example: showing the real-time price of a stock or the number of seconds remaining on a bid.
- Structure of the data: sometimes it’s better to cache several pieces of data combined into one entry, in the form the application consumes them, instead of caching each piece on its own.
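The first two points can be illustrated with a minimal sketch. It assumes a small in-process cache with a time-to-live (TTL) per entry; `TTLCache`, `load_price_from_db`, `get_price`, the `PRICES` table, and the 60-second TTL are all hypothetical names and values chosen for illustration, not part of any particular library.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-process cache with a per-entry time-to-live (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expires_at, value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)


# Hypothetical stand-in for the data tier; a real application would query a database.
PRICES = {"sku-1": 19.99}
STATS = {"db_reads": 0}


def load_price_from_db(sku: str) -> float:
    STATS["db_reads"] += 1
    return PRICES[sku]


cache = TTLCache()


def get_price(sku: str, *, for_checkout: bool = False) -> float:
    """Return a price, using the cache only where slightly stale data is acceptable."""
    if for_checkout:
        # Consistency matters more than latency here: always read the source of truth.
        return load_price_from_db(sku)
    key = f"price:{sku}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    price = load_price_from_db(sku)
    cache.set(key, price, ttl_seconds=60)  # assumed TTL for listing pages
    return price
```

A listing page would call `get_price("sku-1")` and tolerate a price that is up to a minute old, while the check-out page would call `get_price("sku-1", for_checkout=True)` and always hit the data tier. For data that changes every second, such as a live stock price, a TTL short enough to be correct would leave almost no cache hits, which is the second point above.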
Where to cache
The main purpose of a cache is to reduce the latency of your application, so it should be kept as close as possible to the application that uses it. The data tier can be in a different location or data center, but the cache should preferably sit within the same low-latency network as the application. In a cloud deployment such as AWS, the cache node(s) should be kept in the same Availability Zone (AZ) as the application.
How to cache
Many in-memory databases are available that can easily be used as a cache. This wiki page lists almost all of them. Two of the most common are Memcached and Redis. To decide between them, consider the following points:
Use Memcached for:
· Simple object caching, i.e. just offloading the database
· Multi-threaded performance
· Horizontal scalability
Use Redis for:
· Data types such as lists, sets, and hashes
· Sorting, ranking, etc.
· Publish-subscribe capabilities
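As an illustration of the ranking use case, here is a rough sketch built on a Redis sorted set. It assumes `client` behaves like a redis-py `redis.Redis` instance and uses only its `zadd` and `zrevrange` methods; the helper names `record_score` and `top_players` and the board name are hypothetical.

```python
from typing import Any, List, Tuple


def record_score(client: Any, board: str, player: str, score: float) -> None:
    """Store or update a player's score in the sorted set named `board`."""
    client.zadd(board, {player: score})


def top_players(client: Any, board: str, count: int) -> List[Tuple[Any, float]]:
    """Return up to `count` (member, score) pairs, highest score first."""
    if count <= 0:
        return []
    # ZREVRANGE indexes are inclusive, so the last index is count - 1.
    return client.zrevrange(board, 0, count - 1, withscores=True)


# Usage against a real server (assumes redis-py is installed and Redis is
# reachable on localhost:6379):
#
#   import redis
#   client = redis.Redis(host="localhost", port=6379, decode_responses=True)
#   record_score(client, "leaderboard", "alice", 120)
#   print(top_players(client, "leaderboard", 3))
```

Redis keeps the set ordered by score, so the application does not have to sort on every read. With Memcached, the same feature would typically mean caching a pre-sorted list and rebuilding it whenever a score changes.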
PS: This was originally published on LinkedIn – https://www.linkedin.com/pulse/application-performance-improvement-caching-anil-g-kurian
Reference and further reading: