Activity log for bug #2003189

Date Who What changed Old value New value Message
2023-01-18 11:31:25 Sistemi CeSIA bug added bug
2023-01-19 12:30:48 Paride Legovini bug watch added https://bz.apache.org/bugzilla/show_bug.cgi?id=66302
2023-01-19 12:30:48 Paride Legovini bug task added apache2
2023-01-19 12:56:08 Paride Legovini tags patch-accepted-upstream patch-accepted-upstream server-todo
2023-01-19 12:57:47 Paride Legovini nominated for series Ubuntu Jammy
2023-01-19 12:57:47 Paride Legovini bug task added apache2 (Ubuntu Jammy)
2023-01-19 12:57:47 Paride Legovini nominated for series Ubuntu Kinetic
2023-01-19 12:57:47 Paride Legovini bug task added apache2 (Ubuntu Kinetic)
2023-01-19 12:57:52 Paride Legovini apache2 (Ubuntu): status New Triaged
2023-01-19 12:57:55 Paride Legovini apache2 (Ubuntu Jammy): status New Triaged
2023-01-19 12:58:00 Paride Legovini apache2 (Ubuntu Kinetic): status New Triaged
2023-01-19 13:00:48 Paride Legovini tags patch-accepted-upstream server-todo needs-merge patch-accepted-upstream server-todo
2023-01-19 13:04:45 Paride Legovini bug added subscriber Ubuntu Server
2023-01-19 13:04:47 Paride Legovini bug added subscriber Paride Legovini
2023-01-19 13:29:05 Bug Watch Updater apache2: status Unknown Confirmed
2023-02-16 09:46:05 Christian Ehrhardt  apache2 (Ubuntu Jammy): assignee Michał Małoszewski (michal-maloszewski99)
2023-02-16 09:46:13 Christian Ehrhardt  apache2 (Ubuntu Kinetic): assignee Michał Małoszewski (michal-maloszewski99)
2023-02-16 09:51:25 Christian Ehrhardt  apache2 (Ubuntu): status Triaged Fix Released
2023-03-22 13:58:42 Launchpad Janitor merge proposal linked https://code.launchpad.net/~michal-maloszewski99/ubuntu/+source/apache2/+git/apache2/+merge/439391
2023-03-22 14:00:25 Launchpad Janitor merge proposal linked https://code.launchpad.net/~michal-maloszewski99/ubuntu/+source/apache2/+git/apache2/+merge/439392
2023-03-28 16:31:41 Bug Watch Updater apache2: status Confirmed Fix Released
2023-04-28 23:23:45 Bryce Harrington tags needs-merge patch-accepted-upstream server-todo patch-accepted-upstream server-todo
2023-05-11 19:37:20 Michał Małoszewski apache2 (Ubuntu Jammy): status Triaged In Progress
2023-05-11 19:37:22 Michał Małoszewski apache2 (Ubuntu Kinetic): status Triaged In Progress
2023-05-12 09:17:04 Paride Legovini removed subscriber Paride Legovini
2023-05-12 21:01:12 Steve Langasek apache2 (Ubuntu Kinetic): status In Progress Fix Committed
2023-05-12 21:01:13 Steve Langasek bug added subscriber Ubuntu Stable Release Updates Team
2023-05-12 21:01:15 Steve Langasek bug added subscriber SRU Verification
2023-05-12 21:01:18 Steve Langasek tags patch-accepted-upstream server-todo patch-accepted-upstream server-todo verification-needed verification-needed-kinetic
2023-05-12 21:02:06 Steve Langasek apache2 (Ubuntu Jammy): status In Progress Fix Committed
2023-05-12 21:02:11 Steve Langasek tags patch-accepted-upstream server-todo verification-needed verification-needed-kinetic patch-accepted-upstream server-todo verification-needed verification-needed-jammy verification-needed-kinetic
2023-05-17 16:14:09 Michał Małoszewski description

Old value:

[Original Report]

While we were in the process of enabling mod_proxy_hcheck on some of our apache2 nodes we encountered an unusual behavior: sometimes, after rebooting a backend, its worker status remains marked as "Init Err" (balancer manager) until another request is made to the backend, no matter how many health checks complete successfully.

The following list shows the sequence of events leading to the problem:

1. Watchdog triggers health check, request is successful; worker status is "Init Ok"
2. HTTP request to apache2 with unreachable backend (rebooting); status becomes "Init Err"
3. Watchdog triggers another health check, request is again successful because the backend recovered; worker status remains "Init Err"
4. same as 3
5. same as 4

The only way for the worker status to recover is to wait for "hcfails" unsuccessful health checks and then again for "hcpasses" requests to be completed, or just wait for legitimate traffic to retry the failed worker, which may not happen for a long time for rarely used applications. This was surprising to us, since we were expecting the worker status to be recovered after "hcpasses" successful health checks; however, this doesn't seem to happen when the error status is triggered by ordinary traffic to the backend (i.e. not health checks).

[Test Case]

# apt update && apt dist-upgrade -y
# apt install -y apache2 python3
# cat > /etc/apache2/sites-available/httpd-hcheck-initerr.conf << '__EOF__'
<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host2.example.com
    DocumentRoot "/var/www/html"
    ServerName myapp.example.com
    ErrorLog "${APACHE_LOG_DIR}/myapp.example.com-error_log"
    CustomLog "${APACHE_LOG_DIR}/myapp.example.com-access_log" common
    ProxyPass / balancer://myapp stickysession=JSESSIONID
</VirtualHost>
__EOF__
# cat > /etc/apache2/conf-available/balancers.conf << '__EOF__'
<Proxy balancer://myapp>
    BalancerMember http://127.0.0.1:8080/ route=app-route hcmethod=GET hcinterval=5 hcpasses=1 hcfails=1
</Proxy>
__EOF__
# a2enmod status
# a2enmod proxy
# a2enmod proxy_balancer
# a2enmod proxy_http
# a2enmod proxy_hcheck
# a2enmod lbmethod_byrequests
# a2enconf balancers
# a2ensite httpd-hcheck-initerr
# python3 -m http.server --bind 127.0.0.1 8080 &
# PYTHON_PID=$!
# systemctl restart apache2
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# kill -9 $PYTHON_PID
# curl -s localhost -H 'host: myapp.example.com' -o /dev/null
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# python3 -m http.server --bind 127.0.0.1 8080 &
# sleep 10
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status

Example failed output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 203609 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:24:05] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:24:10] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Err

Example of (expected) successful output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 202190 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:23:12] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:23:17] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Ok

Upstream bug: https://bz.apache.org/bugzilla/show_bug.cgi?id=66302
Upstream fix: https://svn.apache.org/viewvc?view=revision&revision=1906496

We would like to see this fix backported to Jammy.

New value:

[Impact]

The mod_proxy_hcheck module provides dynamic health checking of workers. The error described in the original report below is encountered in jammy and kinetic when enabling mod_proxy_hcheck on some Apache2 nodes, which causes unusual behavior and is simply confusing. This is caused by the lack of a defined macro (PROXY_WORKER_IS_ERROR) and the lack of an operation on it in the source code. The fix adds the missing part.

[Test Plan]

Make a container for testing:

$ lxc launch ubuntu-daily:jammy jammy-test
$ lxc shell jammy-test

Do the same two steps from above for kinetic:

$ lxc launch ubuntu-daily:kinetic kinetic-test
$ lxc shell kinetic-test

Run the commands below inside both the jammy and kinetic containers:

# apt update && apt dist-upgrade -y
# apt install -y apache2 python3
# cat > /etc/apache2/sites-available/httpd-hcheck-initerr.conf << '__EOF__'
<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host2.example.com
    DocumentRoot "/var/www/html"
    ServerName myapp.example.com
    ErrorLog "${APACHE_LOG_DIR}/myapp.example.com-error_log"
    CustomLog "${APACHE_LOG_DIR}/myapp.example.com-access_log" common
    ProxyPass / balancer://myapp stickysession=JSESSIONID
</VirtualHost>
__EOF__
# cat > /etc/apache2/conf-available/balancers.conf << '__EOF__'
<Proxy balancer://myapp>
    BalancerMember http://127.0.0.1:8080/ route=app-route hcmethod=GET hcinterval=5 hcpasses=1 hcfails=1
</Proxy>
__EOF__
# a2enmod status
# a2enmod proxy
# a2enmod proxy_balancer
# a2enmod proxy_http
# a2enmod proxy_hcheck
# a2enmod lbmethod_byrequests
# a2enconf balancers
# a2ensite httpd-hcheck-initerr
# python3 -m http.server --bind 127.0.0.1 8080 &
# PYTHON_PID=$!
# systemctl restart apache2
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# kill -9 $PYTHON_PID
# curl -s localhost -H 'host: myapp.example.com' -o /dev/null
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# python3 -m http.server --bind 127.0.0.1 8080 &
# sleep 10
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status

Example failed output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 203609 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:24:05] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:24:10] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Err

Example of (expected) successful output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 202190 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:23:12] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:23:17] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Ok

[Where problems could occur]

The patch itself modifies the code of mod_proxy_hcheck, so any new bugs involving that module would be suspect.
The patch changes the state of the workers, so issues cropping up that seem related to workers and their state could be suspect.
Finally, since the patch modifies C code, issues typical of C code (segfaults, memory leaks, …) would be possible; however, since it moves a chunk of code unmodified, this seems unlikely.

----------------------------------------------------------------------

[Original Report]

While we were in the process of enabling mod_proxy_hcheck on some of our apache2 nodes we encountered an unusual behavior: sometimes, after rebooting a backend, its worker status remains marked as "Init Err" (balancer manager) until another request is made to the backend, no matter how many health checks complete successfully.

The following list shows the sequence of events leading to the problem:

1. Watchdog triggers health check, request is successful; worker status is "Init Ok"
2. HTTP request to apache2 with unreachable backend (rebooting); status becomes "Init Err"
3. Watchdog triggers another health check, request is again successful because the backend recovered; worker status remains "Init Err"
4. same as 3
5. same as 4

The only way for the worker status to recover is to wait for "hcfails" unsuccessful health checks and then again for "hcpasses" requests to be completed, or just wait for legitimate traffic to retry the failed worker, which may not happen for a long time for rarely used applications. This was surprising to us, since we were expecting the worker status to be recovered after "hcpasses" successful health checks; however, this doesn't seem to happen when the error status is triggered by ordinary traffic to the backend (i.e. not health checks).

[Test Case]

# apt update && apt dist-upgrade -y
# apt install -y apache2 python3
# cat > /etc/apache2/sites-available/httpd-hcheck-initerr.conf << '__EOF__'
<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host2.example.com
    DocumentRoot "/var/www/html"
    ServerName myapp.example.com
    ErrorLog "${APACHE_LOG_DIR}/myapp.example.com-error_log"
    CustomLog "${APACHE_LOG_DIR}/myapp.example.com-access_log" common
    ProxyPass / balancer://myapp stickysession=JSESSIONID
</VirtualHost>
__EOF__
# cat > /etc/apache2/conf-available/balancers.conf << '__EOF__'
<Proxy balancer://myapp>
    BalancerMember http://127.0.0.1:8080/ route=app-route hcmethod=GET hcinterval=5 hcpasses=1 hcfails=1
</Proxy>
__EOF__
# a2enmod status
# a2enmod proxy
# a2enmod proxy_balancer
# a2enmod proxy_http
# a2enmod proxy_hcheck
# a2enmod lbmethod_byrequests
# a2enconf balancers
# a2ensite httpd-hcheck-initerr
# python3 -m http.server --bind 127.0.0.1 8080 &
# PYTHON_PID=$!
# systemctl restart apache2
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# kill -9 $PYTHON_PID
# curl -s localhost -H 'host: myapp.example.com' -o /dev/null
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# python3 -m http.server --bind 127.0.0.1 8080 &
# sleep 10
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status

Example failed output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 203609 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:24:05] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:24:10] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Err

Example of (expected) successful output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 202190 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:23:12] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:23:17] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Ok

Upstream bug: https://bz.apache.org/bugzilla/show_bug.cgi?id=66302
Upstream fix: https://svn.apache.org/viewvc?view=revision&revision=1906496

We would like to see this fix backported to Jammy.
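Note: the [Test Plan] recorded above does not spell out how to obtain the fixed apache2 build for SRU verification. A minimal sketch of the usual workflow is shown below; the -proposed pocket step is an assumption, not part of the recorded test plan, and is shown for jammy only (substitute kinetic in the kinetic-test container).

#!/bin/sh
# Sketch (assumed verification step, not part of the recorded test plan):
# enable the -proposed pocket inside the jammy-test container and install
# apache2 from it before re-running the test plan commands.
echo "deb http://archive.ubuntu.com/ubuntu jammy-proposed main universe" \
    > /etc/apt/sources.list.d/jammy-proposed.list
apt update
apt install -y -t jammy-proposed apache2
apt policy apache2   # the installed version should now come from jammy-proposed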
2023-06-21 10:12:03 Michał Małoszewski description

Old value:

[Impact]

The mod_proxy_hcheck module provides dynamic health checking of workers. The error described in the original report below is encountered in jammy and kinetic when enabling mod_proxy_hcheck on some Apache2 nodes, which causes unusual behavior and is simply confusing. This is caused by the lack of a defined macro (PROXY_WORKER_IS_ERROR) and the lack of an operation on it in the source code. The fix adds the missing part.

[Test Plan]

Make a container for testing:

$ lxc launch ubuntu-daily:jammy jammy-test
$ lxc shell jammy-test

Do the same two steps from above for kinetic:

$ lxc launch ubuntu-daily:kinetic kinetic-test
$ lxc shell kinetic-test

Run the commands below inside both the jammy and kinetic containers:

# apt update && apt dist-upgrade -y
# apt install -y apache2 python3
# cat > /etc/apache2/sites-available/httpd-hcheck-initerr.conf << '__EOF__'
<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host2.example.com
    DocumentRoot "/var/www/html"
    ServerName myapp.example.com
    ErrorLog "${APACHE_LOG_DIR}/myapp.example.com-error_log"
    CustomLog "${APACHE_LOG_DIR}/myapp.example.com-access_log" common
    ProxyPass / balancer://myapp stickysession=JSESSIONID
</VirtualHost>
__EOF__
# cat > /etc/apache2/conf-available/balancers.conf << '__EOF__'
<Proxy balancer://myapp>
    BalancerMember http://127.0.0.1:8080/ route=app-route hcmethod=GET hcinterval=5 hcpasses=1 hcfails=1
</Proxy>
__EOF__
# a2enmod status
# a2enmod proxy
# a2enmod proxy_balancer
# a2enmod proxy_http
# a2enmod proxy_hcheck
# a2enmod lbmethod_byrequests
# a2enconf balancers
# a2ensite httpd-hcheck-initerr
# python3 -m http.server --bind 127.0.0.1 8080 &
# PYTHON_PID=$!
# systemctl restart apache2
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# kill -9 $PYTHON_PID
# curl -s localhost -H 'host: myapp.example.com' -o /dev/null
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# python3 -m http.server --bind 127.0.0.1 8080 &
# sleep 10
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status

Example failed output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 203609 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:24:05] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:24:10] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Err

Example of (expected) successful output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 202190 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:23:12] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:23:17] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Ok

[Where problems could occur]

The patch itself modifies the code of mod_proxy_hcheck, so any new bugs involving that module would be suspect.
The patch changes the state of the workers, so issues cropping up that seem related to workers and their state could be suspect.
Finally, since the patch modifies C code, issues typical of C code (segfaults, memory leaks, …) would be possible; however, since it moves a chunk of code unmodified, this seems unlikely.

----------------------------------------------------------------------

[Original Report]

While we were in the process of enabling mod_proxy_hcheck on some of our apache2 nodes we encountered an unusual behavior: sometimes, after rebooting a backend, its worker status remains marked as "Init Err" (balancer manager) until another request is made to the backend, no matter how many health checks complete successfully.

The following list shows the sequence of events leading to the problem:

1. Watchdog triggers health check, request is successful; worker status is "Init Ok"
2. HTTP request to apache2 with unreachable backend (rebooting); status becomes "Init Err"
3. Watchdog triggers another health check, request is again successful because the backend recovered; worker status remains "Init Err"
4. same as 3
5. same as 4

The only way for the worker status to recover is to wait for "hcfails" unsuccessful health checks and then again for "hcpasses" requests to be completed, or just wait for legitimate traffic to retry the failed worker, which may not happen for a long time for rarely used applications. This was surprising to us, since we were expecting the worker status to be recovered after "hcpasses" successful health checks; however, this doesn't seem to happen when the error status is triggered by ordinary traffic to the backend (i.e. not health checks).

[Test Case]

# apt update && apt dist-upgrade -y
# apt install -y apache2 python3
# cat > /etc/apache2/sites-available/httpd-hcheck-initerr.conf << '__EOF__'
<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host2.example.com
    DocumentRoot "/var/www/html"
    ServerName myapp.example.com
    ErrorLog "${APACHE_LOG_DIR}/myapp.example.com-error_log"
    CustomLog "${APACHE_LOG_DIR}/myapp.example.com-access_log" common
    ProxyPass / balancer://myapp stickysession=JSESSIONID
</VirtualHost>
__EOF__
# cat > /etc/apache2/conf-available/balancers.conf << '__EOF__'
<Proxy balancer://myapp>
    BalancerMember http://127.0.0.1:8080/ route=app-route hcmethod=GET hcinterval=5 hcpasses=1 hcfails=1
</Proxy>
__EOF__
# a2enmod status
# a2enmod proxy
# a2enmod proxy_balancer
# a2enmod proxy_http
# a2enmod proxy_hcheck
# a2enmod lbmethod_byrequests
# a2enconf balancers
# a2ensite httpd-hcheck-initerr
# python3 -m http.server --bind 127.0.0.1 8080 &
# PYTHON_PID=$!
# systemctl restart apache2
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# kill -9 $PYTHON_PID
# curl -s localhost -H 'host: myapp.example.com' -o /dev/null
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# python3 -m http.server --bind 127.0.0.1 8080 &
# sleep 10
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status

Example failed output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 203609 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:24:05] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:24:10] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Err

Example of (expected) successful output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 202190 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:23:12] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:23:17] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Ok

Upstream bug: https://bz.apache.org/bugzilla/show_bug.cgi?id=66302
Upstream fix: https://svn.apache.org/viewvc?view=revision&revision=1906496

We would like to see this fix backported to Jammy.

New value:

[Impact]

The mod_proxy_hcheck module provides dynamic health checking of workers. The error described in the original report below is encountered in jammy and kinetic when enabling mod_proxy_hcheck on some Apache2 nodes, which causes unusual behavior and is simply confusing. This is caused by the lack of a defined macro (PROXY_WORKER_IS_ERROR) and the lack of an operation on it in the source code. The fix adds the missing part.

[Test Plan]

Make a container for testing:

$ lxc launch ubuntu-daily:jammy jammy-test
$ lxc shell jammy-test

Do the same two steps from above for kinetic:

$ lxc launch ubuntu-daily:kinetic kinetic-test
$ lxc shell kinetic-test

Run the commands below inside both the jammy and kinetic containers:

# apt update && apt dist-upgrade -y
# apt install -y apache2 python3
# cat > /etc/apache2/sites-available/httpd-hcheck-initerr.conf << '__EOF__'
<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host2.example.com
    DocumentRoot "/var/www/html"
    ServerName myapp.example.com
    ErrorLog "${APACHE_LOG_DIR}/myapp.example.com-error_log"
    CustomLog "${APACHE_LOG_DIR}/myapp.example.com-access_log" common
    ProxyPass / balancer://myapp stickysession=JSESSIONID
</VirtualHost>
__EOF__
# cat > /etc/apache2/conf-available/balancers.conf << '__EOF__'
<Proxy balancer://myapp>
    BalancerMember http://127.0.0.1:8080/ route=app-route hcmethod=GET hcinterval=5 hcpasses=1 hcfails=1
</Proxy>
__EOF__
# a2enmod status
# a2enmod proxy
# a2enmod proxy_balancer
# a2enmod proxy_http
# a2enmod proxy_hcheck
# a2enmod lbmethod_byrequests
# a2enconf balancers
# a2ensite httpd-hcheck-initerr
# python3 -m http.server --bind 127.0.0.1 8080 &
# PYTHON_PID=$!
# systemctl restart apache2
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# kill -9 $PYTHON_PID
# curl -s localhost -H 'host: myapp.example.com' -o /dev/null
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# python3 -m http.server --bind 127.0.0.1 8080 &
# sleep 10
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status

Example failed output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 203609 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:24:05] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:24:10] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Err

Example of (expected) successful output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 202190 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:23:12] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:23:17] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Ok

[Where problems could occur]

* The patch itself modifies the code of mod_proxy_hcheck, so any new bugs involving that module would be suspect.
* The patch changes the state of the workers, so issues cropping up that seem related to workers and their state could be suspect.
* Finally, since the patch modifies C code, issues typical of C code (segfaults, memory leaks, …) would be possible; however, since it moves a chunk of code unmodified, this seems unlikely.
* There may be a problem if a possible future fix introduces a discrepancy between the defined MODULE_MAGIC_NUMBER_MINOR value and the one describing this change; simply put, an issue might arise from the MODULE_MAGIC_NUMBER_MINOR values.

----------------------------------------------------------------------

[Original Report]

While we were in the process of enabling mod_proxy_hcheck on some of our apache2 nodes we encountered an unusual behavior: sometimes, after rebooting a backend, its worker status remains marked as "Init Err" (balancer manager) until another request is made to the backend, no matter how many health checks complete successfully.

The following list shows the sequence of events leading to the problem:

1. Watchdog triggers health check, request is successful; worker status is "Init Ok"
2. HTTP request to apache2 with unreachable backend (rebooting); status becomes "Init Err"
3. Watchdog triggers another health check, request is again successful because the backend recovered; worker status remains "Init Err"
4. same as 3
5. same as 4

The only way for the worker status to recover is to wait for "hcfails" unsuccessful health checks and then again for "hcpasses" requests to be completed, or just wait for legitimate traffic to retry the failed worker, which may not happen for a long time for rarely used applications. This was surprising to us, since we were expecting the worker status to be recovered after "hcpasses" successful health checks; however, this doesn't seem to happen when the error status is triggered by ordinary traffic to the backend (i.e. not health checks).

[Test Case]

# apt update && apt dist-upgrade -y
# apt install -y apache2 python3
# cat > /etc/apache2/sites-available/httpd-hcheck-initerr.conf << '__EOF__'
<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host2.example.com
    DocumentRoot "/var/www/html"
    ServerName myapp.example.com
    ErrorLog "${APACHE_LOG_DIR}/myapp.example.com-error_log"
    CustomLog "${APACHE_LOG_DIR}/myapp.example.com-access_log" common
    ProxyPass / balancer://myapp stickysession=JSESSIONID
</VirtualHost>
__EOF__
# cat > /etc/apache2/conf-available/balancers.conf << '__EOF__'
<Proxy balancer://myapp>
    BalancerMember http://127.0.0.1:8080/ route=app-route hcmethod=GET hcinterval=5 hcpasses=1 hcfails=1
</Proxy>
__EOF__
# a2enmod status
# a2enmod proxy
# a2enmod proxy_balancer
# a2enmod proxy_http
# a2enmod proxy_hcheck
# a2enmod lbmethod_byrequests
# a2enconf balancers
# a2ensite httpd-hcheck-initerr
# python3 -m http.server --bind 127.0.0.1 8080 &
# PYTHON_PID=$!
# systemctl restart apache2
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# kill -9 $PYTHON_PID
# curl -s localhost -H 'host: myapp.example.com' -o /dev/null
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
# python3 -m http.server --bind 127.0.0.1 8080 &
# sleep 10
# curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status

Example failed output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 203609 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:24:05] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:24:10] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Err

Example of (expected) successful output:

Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
ProxyBalancer[0]Worker[0]Status: Init Ok
scripts: line 6: 202190 Killed python3 -m http.server --bind 127.0.0.1 8080
ProxyBalancer[0]Worker[0]Status: Init Err
Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
127.0.0.1 - - [18/Jan/2023 12:23:12] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [18/Jan/2023 12:23:17] "GET / HTTP/1.0" 200 -
ProxyBalancer[0]Worker[0]Status: Init Ok

Upstream bug: https://bz.apache.org/bugzilla/show_bug.cgi?id=66302
Upstream fix: https://svn.apache.org/viewvc?view=revision&revision=1906496

We would like to see this fix backported to Jammy.
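Note: the pass/fail criterion of the test plan above is simply whether the worker status returns to "Init Ok" once the backend is back and a health-check interval has elapsed. The script below is only a sketch of how that final check could be automated, built from the commands recorded in the description (the helper name and the PASS/FAIL messages are illustrative; the 10-second wait mirrors the recorded sleep and exceeds hcinterval=5).

#!/bin/sh
# Sketch: automate the last verification step of the [Test Plan] above.
# Assumes apache2 is already configured as in the test plan and that the
# backend was killed and probed once, so the worker currently shows "Init Err".
worker_status() {
    curl -s localhost/server-status?auto | grep ProxyBalancer | grep Status
}

python3 -m http.server --bind 127.0.0.1 8080 &   # bring the backend back up
sleep 10                                         # longer than hcinterval=5, so a health check runs

if worker_status | grep -q 'Init Ok'; then
    echo "PASS: worker recovered to Init Ok via health check"
else
    echo "FAIL: worker did not recover: $(worker_status)"
fi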
2023-06-22 03:44:01 Ubuntu Archive Robot bug added subscriber Bryce Harrington
2023-07-12 17:38:59 Steve Langasek bug added subscriber Steve Langasek
2023-07-19 15:08:26 Michał Małoszewski tags patch-accepted-upstream server-todo verification-needed verification-needed-jammy verification-needed-kinetic patch-accepted-upstream server-todo verification-done verification-done-jammy verification-done-kinetic
2023-08-02 11:36:02 Robie Basak removed subscriber Ubuntu Stable Release Updates Team
2023-08-02 11:36:13 Launchpad Janitor apache2 (Ubuntu Jammy): status Fix Committed Fix Released
2023-08-08 09:42:04 Christian Ehrhardt  apache2 (Ubuntu Kinetic): assignee Michał Małoszewski (michal-maloszewski99)
2023-08-08 09:48:29 Christian Ehrhardt  apache2 (Ubuntu Kinetic): status Fix Committed Won't Fix