Describe the bug
I want to set up Redis Sentinel in my AKS cluster using the Bitnami Helm chart (https://artifacthub.io/packages/helm/bitnami/redis/18.1.2). After deployment, the Sentinel container spams the following message:
waitpid() returned a pid (996) we can't find in our scripts execution queue!
To reproduce
I am using an image built from the following Dockerfile:
FROM redis:7.0-alpine3.18
RUN apk add --no-cache bash
RUN apk add openssl
I am using the following YAML to deploy Redis Sentinel with the image shown above in the StatefulSet.
Expected behavior
Redis and Sentinel run without the log messages shown under Additional information.
Additional information
To be clear, I am NOT using the official Bitnami images referenced by the Helm chart (docker.io/bitnami/redis:7.2.1-debian-11-r0 / docker.io/bitnami/redis-sentinel:7.2.1-debian-11-r0), because those images contain critical CVEs, which are not allowed in my environment.
Some logs from the pod
Comment From: koenigfa1
sentinel 1:X 07 Nov 2023 14:28:18.174 # waitpid() returned a pid (144) we can't find in our scripts execution queue!
sentinel 1:X 07 Nov 2023 14:28:22.757 * +sentinel sentinel 817333fc9cfeb5cf8d7595ba084eac43e846dd09 10.248.64.245 26379 @ mymaster xx-yy-redis-n
sentinel 1:X 07 Nov 2023 14:28:22.769 * Sentinel new configuration saved on disk
sentinel 1:X 07 Nov 2023 14:28:23.109 # waitpid() returned a pid (152) we can't find in our scripts execution queue!
sentinel 1:X 07 Nov 2023 14:28:28.123 # waitpid() returned a pid (169) we can't find in our scripts execution queue!
Sentinel seems to be working correctly, doesn't it? Is the log message just informational rather than an error? If so, what is its impact? For more Sentinel logs, refer to the beginning of 1.Log File
Comment From: enjoy-binbin
Can you share the INFO SENTINEL output? My guess is that Sentinel has entered TILT mode. (Could this fsync be slow?)
sentinel 1:X 07 Nov 2023 14:28:22.769 * Sentinel new configuration saved on disk
Also, can you check whether the logs contain the TILT keyword?
TILT mode source code:
/* This function checks if we need to enter the TILT mode.
 *
 * The TILT mode is entered if we detect that between two invocations of the
 * timer interrupt, a negative amount of time, or too much time has passed.
 * Note that we expect that more or less just 100 milliseconds will pass
 * if everything is fine. However we'll see a negative number or a
 * difference bigger than SENTINEL_TILT_TRIGGER milliseconds if one of the
 * following conditions happen:
 *
 * 1) The Sentinel process for some time is blocked, for every kind of
 *    random reason: the load is huge, the computer was frozen for some time
 *    in I/O or alike, the process was stopped by a signal. Everything.
 * 2) The system clock was altered significantly.
 *
 * Under both this conditions we'll see everything as timed out and failing
 * without good reasons. Instead we enter the TILT mode and wait
 * for SENTINEL_TILT_PERIOD to elapse before starting to act again.
 *
 * During TILT time we still collect information, we just do not act. */
void sentinelCheckTiltCondition(void) {
    mstime_t now = mstime();
    mstime_t delta = now - sentinel.previous_time;

    if (delta < 0 || delta > sentinel_tilt_trigger) {
        sentinel.tilt = 1;
        sentinel.tilt_start_time = mstime();
        sentinelEvent(LL_WARNING,"+tilt",NULL,"#tilt mode entered");
    }
    sentinel.previous_time = mstime();
}
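The check above boils down to comparing two timestamps on every timer tick. A minimal Python sketch of the same logic may make the two trigger cases (a stall and a backwards clock jump) easier to see. The class name, the function name, and the 2000 ms trigger are assumptions made for illustration; the real value is whatever sentinel_tilt_trigger holds in the running Sentinel.

```python
# Assumed trigger for illustration only; not read from a real Sentinel.
TILT_TRIGGER_MS = 2000


class SentinelState:
    """Hypothetical stand-in for the fields of the C `sentinel` struct used here."""

    def __init__(self, previous_time: int) -> None:
        self.previous_time = previous_time  # ms timestamp of the previous tick
        self.tilt = False
        self.tilt_start_time = -1


def check_tilt_condition(state: SentinelState, now_ms: int,
                         trigger_ms: int = TILT_TRIGGER_MS) -> bool:
    """Return True if this tick triggers TILT (negative or too large delta)."""
    delta = now_ms - state.previous_time
    triggered = delta < 0 or delta > trigger_ms
    if triggered:
        state.tilt = True
        state.tilt_start_time = now_ms
    state.previous_time = now_ms
    return triggered


if __name__ == "__main__":
    state = SentinelState(previous_time=0)
    # Normal 100 ms ticks, then a 5 s stall, then the clock jumping backwards.
    for tick in (100, 200, 5200, 5300, 4000):
        print(tick, check_tilt_condition(state, tick))
```

The sketch only covers entering TILT; leaving it after SENTINEL_TILT_PERIOD is handled elsewhere in Sentinel and is not modelled here.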
Comment From: koenigfa1
@enjoy-binbin thank you for your answer. How can I get the INFO SENTINEL output? I looked through all my logs, including the ones linked in my post, and never saw anything like TILT. That is why I am confused. How can we check for TILT mode? Do I need to configure something?
Comment From: enjoy-binbin
INFO is a command: https://redis.io/commands/info/. Or you can post the full INFO command output.
Comment From: enjoy-binbin
can you issue the info command to the sentinel node?
Comment From: koenigfa1
I kubectl exec into the sentinel container and authenticate with redis-cli, but I don't get any output:
- k exec -it xy -c sentinel -- bash
- redis-cli -a xyz
Comment From: enjoy-binbin
you should try the sentinel port 26379, not 6379
Comment From: koenigfa1
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_tilt_since_seconds:-1
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.248.64.250:6379,slaves=2,sentinels=3
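To read this output programmatically, for example to confirm that sentinel_tilt is 0 and the master is reported as ok, a small parser is enough. This is a sketch with hypothetical function names; the sample text is copied from the output above, and the parser splits on whitespace so it works whether the fields arrive one per line or on a single line.

```python
SAMPLE = (
    "sentinel_masters:1 sentinel_tilt:0 sentinel_tilt_since_seconds:-1 "
    "sentinel_running_scripts:0 sentinel_scripts_queue_length:0 "
    "sentinel_simulate_failure_flags:0 "
    "master0:name=mymaster,status=ok,address=10.248.64.250:6379,slaves=2,sentinels=3"
)


def parse_info(text: str) -> dict:
    """Split INFO output into key/value pairs; tokens without ':' are skipped."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition(":")
        if sep:
            fields[key] = value
    return fields


def parse_master(value: str) -> dict:
    """Parse a masterN value such as 'name=mymaster,status=ok,...'."""
    result = {}
    for item in value.split(","):
        key, _, val = item.partition("=")
        result[key] = val
    return result


def in_tilt(fields: dict) -> bool:
    """True if the parsed output reports TILT mode (sentinel_tilt:1)."""
    return fields.get("sentinel_tilt") == "1"


if __name__ == "__main__":
    info = parse_info(SAMPLE)
    master = parse_master(info["master0"])
    print("tilt:", in_tilt(info))
    print("master:", master["name"], master["status"], master["address"])
```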
Comment From: enjoy-binbin
Sorry, I don't have any clues right now. @hwware do you have any suggestions?
Comment From: koenigfa1
So you think the Redis and Sentinel containers are running correctly and are production ready? I don't know how to interpret the log message... Is it an error or just a warning/info? Could it prevent Redis/Sentinel from running correctly now or in the future?
Comment From: enjoy-binbin
It is just a warning. As far as I know, it won't affect Redis or Sentinel, but it's better to test it (for example, try a failover in your environment).
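For background on where the message comes from: as far as the Sentinel source shows, the warning is logged by the routine that reaps finished notification and reconfiguration scripts. It calls waitpid(-1, ...), which returns any exited child process, and prints this message when the returned pid is not one of the scripts it started. The 1:X prefix in the logs suggests Sentinel runs as PID 1 in the container, so processes reparented to it would be reaped the same way; whether that is what happens in this deployment is an assumption, not something confirmed in this thread. The POSIX-only Python sketch below illustrates the mechanism with hypothetical helper names, forking the stray child directly for simplicity.

```python
import os
import time


def spawn_child() -> int:
    """Fork a child that exits immediately and return its pid."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit at once without running any cleanup
    return pid


def collect_terminated(tracked: set) -> list:
    """Reap every exited child; return the pids that were not tracked."""
    unknown = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no child processes left
        if pid == 0:
            break  # children exist, but none has exited yet
        if pid in tracked:
            tracked.discard(pid)
        else:
            print(f"waitpid() returned a pid ({pid}) we can't find "
                  "in our scripts execution queue!")
            unknown.append(pid)
    return unknown


def demo() -> tuple:
    """Start one tracked and one untracked child, then reap both."""
    tracked = {spawn_child()}  # started through the "queue"
    stray = spawn_child()      # a child the queue knows nothing about
    unknown = []
    deadline = time.monotonic() + 5
    while (tracked or stray not in unknown) and time.monotonic() < deadline:
        unknown += collect_terminated(tracked)
        time.sleep(0.01)
    return unknown, stray


if __name__ == "__main__":
    demo()
```

In this sketch the untracked child is still reaped correctly; the only visible effect is the log line, which matches the observation that the message is a warning rather than a failure.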
Comment From: koenigfa1
Alright. Can you give me a hint on how to test a failover scenario in my environment?
Comment From: enjoy-binbin
You can find more information at https://redis.io/docs/management/sentinel/, including how to use Sentinel and how it works.
Comment From: koenigfa1
Wow, nice, now I understand. Everything seems fine:
The logs show the health check failing and the failover from 10.248.64.250 to 10.248.65.12 completing successfully, right?
Comment From: koenigfa1
More logs from the failover
From my point of view everything works fine!