This article discusses several changes we have made to the EXPIRE command and how it works. We have added the ability to expire individual members of a set with the EXPIREMEMBER command and have also enabled active deletion to operate in near real time which has big advantages for those who rely heavily on using expires in production. Throughout the work with the EXPIRE command we have actually achieved 5-10% memory savings with these.
This work has been done in KeyDB which is an open source, high performance fork of Redis, where we are able to implement features that may never become a part of Redis. Features such as multithreading (5x perf increase) and active-replication reduce sharding (up to 10x) benefiting a range of users, especially those who value simplicity in their production setup.
As an example, adding in the ability to expire individual members of a set has been a highly requested feature, yet its something the creators of Redis fundamentally do not want to add (RE:Github issue#135):
This option has been requested for years, even recently, with complex workarounds being required to achieve this need. We will discuss current methods/workarounds and the new EXPIREMEMBER feature later in this blog.
KeyDB is a database that supports features that make life easier, and we believe that with the right approach you don’t have to compromise on performance. We are also actively seeking feedback and requests on our open source project
This section discusses how we were able to update the EXPIRE function to actually remove expired keys in an O(N) time with a deterministic algorithm. You may not think this is significant, however if you rely heavily on the EXPIRE feature, this can mean considerable memory savings.
The EXPIRE command is widely used in applications where you are keeping track of user sessions and for applications where temporary data is important but also where memory consumption needs to be moderated.
The EXPIRE command uses passive and active deletion. Passive deletion means that if we call an expired key that is overdue, it will be deleted at that time. However some keys may never be called again, which is where active deletion takes place. Redis uses a randomized approach to search 20 random keys ten times per sec with an associated expire. This is a probabilistic algorithm vs KeyDB’s deterministic algorithm.
This past approach is fine for someone using a marginal amount of keys to expire. However what if you rely heavily on EXPIRE? What if the timed intervals on your expires varies drastically? Well lets take a few examples and see how long keys might remain in memory after expiration:
The length it takes to remove overdue expires through active deletion is non-linear with Redis and due the randomized algorithm, this length gets worse the higher the ratio of active expires to overdue expires. With KeyDBs deterministic algorithm we are able to expire in a near linear relationship. In the scenario of the chart above the KeyDB keys were removed in less than 1 sec .
To get a better understanding of how KeyDB has optimized the feature, lets look at 2 scenarios below where there are enough keys to notice a delay in KeyDB active deletion. In the first chart we run a test where we have 10 million keys with the same expiry (same unix timestamp). When they become overdue both Redis and KeyDB will remove them in a reasonable amount of time (same CPU usage also). The big difference in methods comes into play when there are a lot of expires, but a smaller percentage of them are overdue.
In the second chart, you can see that with 20 million expires, if 10 million become overdue (TTL<=0) at the same time, KeyDB removes them all in a linear pattern and short amount of time. Redis has only removed a small percentage (and this is steepest part of the curve – reference model above).
Now if you rely heavily on EXPIRE and are continually adding new keys with expirations, what you might theoretically calculate for memory usage could vary drastically from what your memory consumption holds. Being able to remove expired keys in almost real time ensures your database does not have a huge delay in memory rampdown and you don’t get a buildup of overdue expires that could linger around for days.
Why isn’t this how Redis does EXPIRES? You may be thinking ‘this must be resource intensive, or consume more memory’. Redis and KeyDB use the same CPU intensity for this when benchmarking this command head to head, however keep in mind KeyDB is multithreaded, and if more than 1 thread is enabled for a node it can get up to 5X the perf. Regarding memory we are happy to say that the improvements to the EXPIRE feature actually saves memory! If you measure the memory consumption of the same set of keys with expires, KeyDB comes in at approximately 10% less memory usage than the same set used on Redis! So not only do you immediately remove overdue expires to optimize memory use, you save an extra 10% on the expires still queued! For heavy EXPIRE users this feature is highly beneficial, not to mention the INFO KEYSPACE will also now contain accurate information. This can improve the accuracy of your metrics if applicable.
Is it always best to use external tools to minimize memory consumption, complexity, or size? Is it better to think more about the codebase than the applications that use it? Not always. You as the user should have the option to choose the tradeoffs that you value. KeyDB has a different philosophy that promotes simplicity and drives to have less moving parts… and we still prioritize performance!
Lets look first at KeyDB’s new EXPIREMEMBER command and then compare the alternatives that people have had to put into adoption to achieve this:
In the above example we set the member, member3, to expire in 10 seconds. 10 seconds later it is gone and the other members remain.Past work arounds to achieve the need for this feature include (simplified descriptions):
- Store the set element together with a timestamp – ie. store "foo:unix-timestamp".
- Use a sorted set (a single, global one for all the sets you have) where the score is the unix time
For both of these scenarios it is up to you to find a way to detect when a unix timestamp has been exceeded and remove the keys. People have used the KEYS command to search for them and then remove, others have used ZRANGEBYSCORE to minimize CPU cycles in searching. You then must find your own way to delete them. Decide how frequently you will run these queries and either set up a cron job to do this or some other client side method. Using only unix timestamps, you must ensure your server clock is synced with the client to avoid inaccuracy, and you cant rely on the server doing this for you by specifying TTL.
Setting up additional moving parts to control such a feature increases the complexity, setup and maintenance of your system and if these parts fail you can end up ballooning out your database size. The expiration of keys is something already built into the database and should follow through to members of sets instead of workarounds.
The simplicity and choice for a built in command such as EXPIREMEMBER seemed to us like a no-brainer to add as an option. Regarding “it being too memory consuming and complex”, well the above options are not necessarily considered “a simple solution” either. Relating to memory consumption, EXPIRE and EXPIREMEMBER use less memory than Redis EXPIRE, however if you are already using one of the more complex methods above, it may or may not save you memory. But is the tradeoff worth it?
- Due to KeyDB memory saving on keys with EXPIRES, if you were to compare ‘x’ amount of keys with EXPIREs set in Redis, to the equivalent number of keys that are members of a set with EXPIREMEMBER command set, there is actually a 5% savings in memory using EXPIREMEMBER
- Comparing the workarounds, a set with attached unix timestamps to each member takes up 33% less memory than a set that has expires set for each member of that set using EXPIREMEMBER. So in this case and with sorted sets, the workaround may arguably save some memory, but at the expense of external workarounds to the base.
The ability to trim down your datasets and ensure expired tags are removed immediately, to us, outweighs any workaround. With the memory savings on the EXPIRE feature itself, we feel these changes were a good improvement to the project.
If you like these features, want to try it out, or want to learn more, you can check out the open source project, KeyDB here
You can check out our docker image here, and if you want to see the other features just released such as updates to OBJECT LASTMODIFIED, and the addition of LSHIFT and RSHIFT to BITOP, click here. Another popular feature is KeyDB's active-replica feature which you can see an overview of here. To see the rest of KeyDB’s commands, or more detail on how to use these commands, you can find them here.
KeyDB has some cutting edge plans in store and we will be discussing some of these features in the coming months. To keep up to date with where the KeyDB project is going please subscribe, we feel strongly against spamming and try to keep our emails informative about the project.