Crash report

Redis cache keeps on terminating on my Arch server ever since moving up from Redis 6.0.7.

Redis service will launch fine, but soon after an application starts using it will terminate. 1. Start redis service 2. Navigate to Browser running Nextcloud UI 3. Sometimes redis will terminate the moment F5 is hit, sometimes it needs 3 - 4 more menu changes until Redis will terminate and Nextcloud won't function any more

Nov 05 15:03:56 homeserver redis-server[34502]: 34502:C 05 Nov 2020 15:03:56.924 # Configuration loaded
Nov 05 15:03:56 homeserver redis-server[34502]:                 _._
Nov 05 15:03:56 homeserver redis-server[34502]:            _.-``__ ''-._
Nov 05 15:03:56 homeserver redis-server[34502]:       _.-``    `.  `_.  ''-._           Redis 6.0.9 (00000000/0) 64 bit
Nov 05 15:03:56 homeserver redis-server[34502]:   .-`` .-```.  ```\/    _.,_ ''-._
Nov 05 15:03:56 homeserver redis-server[34502]:  (    '      ,       .-`  | `,    )     Running in standalone mode
Nov 05 15:03:56 homeserver redis-server[34502]:  |`-._`-...-` __...-.``-._|'` _.-'|     Port: 0
Nov 05 15:03:56 homeserver redis-server[34502]:  |    `-._   `._    /     _.-'    |     PID: 34502
Nov 05 15:03:56 homeserver redis-server[34502]:   `-._    `-._  `-./  _.-'    _.-'
Nov 05 15:03:56 homeserver redis-server[34502]:  |`-._`-._    `-.__.-'    _.-'_.-'|
Nov 05 15:03:56 homeserver redis-server[34502]:  |    `-._`-._        _.-'_.-'    |           http://redis.io
Nov 05 15:03:56 homeserver redis-server[34502]:   `-._    `-._`-.__.-'_.-'    _.-'
Nov 05 15:03:56 homeserver redis-server[34502]:  |`-._`-._    `-.__.-'    _.-'_.-'|
Nov 05 15:03:56 homeserver redis-server[34502]:  |    `-._`-._        _.-'_.-'    |
Nov 05 15:03:56 homeserver redis-server[34502]:   `-._    `-._`-.__.-'_.-'    _.-'
Nov 05 15:03:56 homeserver redis-server[34502]:       `-._    `-.__.-'    _.-'
Nov 05 15:03:56 homeserver redis-server[34502]:           `-._        _.-'
Nov 05 15:03:56 homeserver redis-server[34502]:               `-.__.-'
Nov 05 15:03:56 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:03:56.926 # Server initialized
Nov 05 15:03:56 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:03:56.926 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
Nov 05 15:03:56 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:03:56.927 * Loading RDB produced by version 6.0.9
Nov 05 15:03:56 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:03:56.927 * RDB age 367 seconds
Nov 05 15:03:56 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:03:56.927 * RDB memory usage when created 0.77 Mb
Nov 05 15:03:56 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:03:56.927 * DB loaded from disk: 0.000 seconds
Nov 05 15:03:56 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:03:56.927 * The server is now ready to accept connections at /var/run/redis/redis.sock
--- Switch to browser and start changing Nextcloud screens
Nov 05 15:04:57 homeserver systemd[1]: redis.service: start operation timed out. Terminating.
Nov 05 15:04:57 homeserver redis-server[34502]: 34502:signal-handler (1604585097) Received SIGTERM scheduling shutdown...
Nov 05 15:04:57 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:04:57.162 # User requested shutdown...
Nov 05 15:04:57 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:04:57.162 * Saving the final RDB snapshot before exiting.
Nov 05 15:04:57 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:04:57.170 * DB saved on disk
Nov 05 15:04:57 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:04:57.170 * Removing the pid file.
Nov 05 15:04:57 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:04:57.170 * Removing the unix socket file.
Nov 05 15:04:57 homeserver redis-server[34502]: 34502:M 05 Nov 2020 15:04:57.170 # Redis is now ready to exit, bye bye...
Nov 05 15:04:57 homeserver systemd[1]: redis.service: Failed with result 'timeout'.
Nov 05 15:04:57 homeserver systemd[1]: Failed to start Advanced key-value store.

Aditional information

OS: Arch linux, Kernel 5.4.74-1-lts If have tried upgrading whenever new Redis versions came along but up to now always ended up with downgrading to 6.0.7 which reliably fixes the issue.

Thanks for looking into this and let me know if there is any additional information I can provide.

Comment From: oranagra

Hi, It seems like someone's killing redis (SIGTERM) after exactly 1 minute.

This could be some bug in 6.0.9 which fails to notify systemd that it's ready. but by looking at the diff between 6.0.7 and 6.0.9 i don't see any change in that area.

maybe it has something to do with the way redis was built? maybe they where built differently?

Comment From: zell-mbc

I didn't notice the exact minute yet, but you are right, my Nextcloud clicking is irrelevant, the process get's killed after exactly one minute even if I just do nothing.

The exact minute made me look at my redis.service file and I found the following two in the [Service] section:

TimeoutStartSec=60
TimeoutStopSec=60

The SIGTERM corresponds with the TimeoutStartSec setting, I changed it to different values and the process termination matched the numbers. I actually don't know why they are there and could possibly just delete them. Maybe 6.0.7 ignored this value?

Edit: tried without the TimeouStartSec value which leads to the process to end after exactly 90 seconds.

Comment From: oranagra

@zell-mbc where did you get redis and this redis.service file? Do you use a PPA? which one? @yossigo maybe you have some input?

Comment From: zell-mbc

It's Arch, so no PPAs but the official Arch community repository:

9 community/redis 6.0.9-1 (839.4 KiB 5.3 MiB) (Installed)
    An in-memory database that persists on disk

As far as the redis.service file is concerned I can't remember to be honest, I am running Pleroma, NextCloud, Synapse on this server and one of these probably provided the template:

[Unit]
Description=Advanced key-value store
After=network.target

[Service]
Type=notify
User=redis
Group=redis
ExecStart=/usr/bin/redis-server /etc/redis.conf
ExecStop=/usr/bin/redis-cli shutdown
#TimeoutStartSec=60
#TimeoutStopSec=60
CapabilityBoundingSet=
PrivateTmp=true
PrivateDevices=true
ProtectSystem=full
ProtectHome=true
NoNewPrivileges=true
RuntimeDirectory=redis
RuntimeDirectoryMode=755
LimitNOFILE=10032

[Install]
WantedBy=multi-user.target

Comment From: yossigo

@zell-mbc If you use systemd with Type=notify you need to make sure that: 1. You build Redis with libsystemd support (that's done automatically if you have the development libsystemd and pkg-config). Otherwise, you may need to set USE_SYSTEMD. 2. Your redis.conf contains daemonize no and supervised auto. Otherwise Redis does not notify systemd it's up, so it gets killed.

Comment From: zell-mbc

I didn't build myself but rather use the binary from the Arch repo. So that means that either the Arch team started to build the binary differently after 6.0.7 or something changed in Redis to not tolerate my possible wrong configuration any longer. In either case, I now have things to test. I will report back. Thanks for getting back!

Comment From: zell-mbc

I kept "daemonised=no" and changed "supervised no" to "systemd" and bingo. Kind of obvious but then again :-)

Redis is up now for more than 2 hours, so I think I can safely close this case. Thanks for helping me along!

Comment From: yossigo

@zell-mbc It looks like the Arch package started building Redis with systemd support in 6.0.8, that explains the problem.