When I tested the MPaof function, when there was an error on the disk, multiple incr files were generated, like this, the total data volume was about 6GiB
# ls -l appendonlydir
-rw-r--r-- 1 jud jud 1761442651 Apr 3 15:44 appendonly.aof.4.base.aof
-rw-r--r-- 1 jud jud 2376015194 Apr 3 16:12 appendonly.aof.4.incr.aof
-rw-r--r-- 1 jud jud 21334104 Apr 3 16:12 appendonly.aof.5.incr.aof
-rw-r--r-- 1 jud jud 702739358 Apr 3 16:14 appendonly.aof.6.incr.aof
-rw-r--r-- 1 jud jud 563456877 Apr 3 16:16 appendonly.aof.7.incr.aof
-rw-r--r-- 1 jud jud 137416787 Apr 4 00:04 appendonly.aof.8.incr.aof
-rw-r--r-- 1 jud jud 264 Apr 3 16:16 appendonly.aof.manifest
after I restarted Redis, I found that a rewrite will be triggered only when the incremental data exceeds the sum of all the above files, and the disk consumes 13GiB at peak, although my Redis actually only uses about 1.5GB of memory
I also read the code: redis version < 4, the aof_write_base_size was the memory dump size redis version < 7, it was the sum of memory dump size and aof_rewrite_buf size redis version = 7, it was the sum of memory dump size and incr file size
my question is that why aof_write_base_size is not equal the base.aof's size, but equal the sum of base.aof and all incr.aof files' size? Is it better to make aof_write_base_size equal to the size of base.aof?
Comment From: oranagra
it looks like a bug, but it was actually the same before 7.0. on startup, after loading the AOF file, the aof_current_size and aof_rewrite_base_size are both set to the full size of the AOF file.
loaded_ok: /* DB loaded, cleanup and return C_OK to the caller. */
....
aofUpdateCurrentSize();
server.aof_rewrite_base_size = server.aof_current_size;
server.aof_fsync_offset = server.aof_current_size;
now in 7.0, we're in a better place to set it to just the base file size (which is how it was just after the last rewrite completed)
@chenyang8094 please take a look too.
Comment From: chenyang8094
@oranagra I don't think this is a bug. IIUC server.aof_current_size is always the total size of whole AOF, it is not server.base_aof_size. MP-AOF just splits the previous single AOF into BASE+INCRs.
@judeng Regarding the scenario you mentioned, even before 7.0, if there is a problem with your disk, you will get a 6GB AOF (single file), and the next time redis restart, it will still automatically trigger AOFRW at 13GB, I don't think there's anything suspicious about it, it behaves the same as before, right?
Comment From: judeng
@oranagra indeed,on startup ,aof_rewrite_base_size is equal the full size of the AOF file, I missed it! I just only considered the running state: in redis3.0
void aofUpdateCurrentSize(void) {
....
server.aof_current_size = sb.st_size;
}
void backgroundRewriteDoneHandler(int exitcode, int bysignal) {
....
server.aof_rewrite_base_size = server.aof_current_size;
}
on runing state in redis3, aof_rewrite_base_size not include the incr size.
@judeng Regarding the scenario you mentioned, even before 7.0, if there is a problem with your disk, you will get a 6GB AOF (single file), and the next time you restart, it will still automatically trigger AOFRW at 13GB, I don't think there's anything suspicious about it, it behaves the same as before, right?
you are right, before 7.0 we have to use a bigger disk for rewriting. but at 7.0, by your MP-AOF feature, we could optimize it :-)
Comment From: oranagra
@chenyang8094 server.aof_current_size should be the whole file including the incremental parts. but server.base_aof_size should be just the base, same as we do after rewrite. i.e. a restart should bring us to the same state we were before the restart.
the base file is used to calculate the automatic rewrite according to the auto-aof-rewrite-percentage config.
it makes no sense that it'll be based on a base that includes the whole file that was loaded on startup.
Comment From: chenyang8094
It sounds reasonable, the current logic is :
server.aof_current_size = total_size;
server.aof_rewrite_base_size = server.aof_current_size;
server.aof_fsync_offset = server.aof_current_size;
what we hope is:
server.aof_current_size = total_size;
server.aof_rewrite_base_size = base_aof_size;
server.aof_fsync_offset = server.aof_current_size;
@oranagra right ?
Comment From: chenyang8094
I will make a PR to fix it.
Comment From: oranagra
sounds good.. i'm actually surprised it was like that in previous versions. at least with preamble, it was quite easy to solve...
Comment From: judeng
I am very happy to contribute a pr for this asap
Comment From: chenyang8094
@judeng Thank you for your report, this is a very meaningful question.