Details:
Activity was started at 10:30 PM.
The active configuration was backed up (as haproxy.cfg.XXXXX, where XXXXX is the tracker ID).
The new code (first block) was added to the frontend section of the configuration file, followed by a configuration file check at around 10:31 PM.
The configuration check reported that the configuration was invalid. A brief analysis revealed that the general-purpose counter and store method were
introduced with HAProxy 1.5 and are not supported by the installed version.
The configuration change was rolled back, leaving the configuration unchanged. This was followed by the PassengerMaxInstances activity.
-
11:35 PM: The reason for the rollout failure was identified: a different version of HAProxy (1.5) was used when writing and testing the configuration code.
#frontend http-ingress
stick-table type ip size 1m expire 10s store gpc0,http_err_rate(10s)
tcp-request connection track-sc1 src
tcp-request connection reject if { src_get_gpc0 gt 0 }
#backend learnexa
acl abuse src_http_err_rate(ft_web) ge 10
acl flag_abuser src_inc_gpc0(ft_web)
tcp-request content reject if abuse flag_abuser
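A note on table names, as a hedged sketch rather than the configuration that was actually deployed: in HAProxy a stick table is referenced by the name of the proxy section that declares it. Assuming the table lives in the `http-ingress` frontend, the backend ACLs would name `http-ingress` instead of `ft_web`. The `gt 0` comparison on `src_inc_gpc0` is also an assumption, added to make the ACL test explicit. This requires HAProxy 1.5 or later.

```
frontend http-ingress
    # Per-client-IP table: general purpose counter plus HTTP error rate over 10s
    stick-table type ip size 1m expire 10s store gpc0,http_err_rate(10s)
    # Track every incoming connection by source address
    tcp-request connection track-sc1 src
    # Drop connections from sources already flagged as abusers
    tcp-request connection reject if { src_get_gpc0 gt 0 }

backend learnexa
    # 10 or more HTTP errors within the 10s window counts as abuse
    acl abuse src_http_err_rate(http-ingress) ge 10
    # Incrementing gpc0 flags the source (assumption: explicit "gt 0" test)
    acl flag_abuser src_inc_gpc0(http-ingress) gt 0
    tcp-request content reject if abuse flag_abuser
```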
Tests failed at about 20k connections (expected, as this is the configured limit).
#global
maxconn 35000
Tests conducted so far show that response times stay in the 4.13 s range with about 60 concurrent clients at 30 requests/second.
Production average for the past month is 5.
This change will regulate connections via a FIFO queue at the HAProxy level (ensuring a maximum of 31 per server).
There will be no timeout or other interruption for the user.
maxconn 31
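As a rough sketch of the queueing described above: HAProxy applies a per-server limit when `maxconn` is set on each `server` line, and requests beyond that limit wait in the backend queue until a slot frees up. The server names and `<IPAddress>` placeholders below are hypothetical, and the `timeout queue` line with its 30s value is an assumption, included only to show where the maximum wait in the queue would be bounded.

```
backend learnexa
    # Assumption: upper bound on how long a request may wait for a free slot.
    # When unset, HAProxy uses the value of "timeout connect" instead.
    timeout queue 30s
    # Hypothetical servers: at most 31 concurrent connections each;
    # excess requests are queued rather than rejected.
    server app1 <IPAddress>:80 maxconn 31
    server app2 <IPAddress>:80 maxconn 31
```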
Increase timeouts to eliminate the 408 response codes that result from HAProxy being unable to communicate with the backend servers
in a timely fashion.
#under global
timeout client 50000ms
#under frontend http-ingress
timeout http-request 15s
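One placement detail worth checking before rollout: HAProxy accepts `timeout client` in `defaults`, `frontend`, and `listen` sections, not in `global`. A minimal sketch of the same two settings, assuming the client timeout is meant to apply to all frontends and is therefore set in `defaults`:

```
defaults
    # Maximum inactivity time on the client side (assumed placement)
    timeout client 50000ms

frontend http-ingress
    # Maximum time allowed for a client to send a complete HTTP request
    timeout http-request 15s
```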
frontend http-ingress
bind 10.166.152.16:80
...
frontend chat-ingress
bind 10.166.152.16:80
…
backend learnexa
source 10.166.152.240
frontend http-ingress
#bind 10.166.152.14:80
frontend chat-ingress
#bind 10.166.152.14:80
backend learnexa_bkp
balance roundrobin
cookie SERVERID insert indirect
server prodapp01 10.166.152.11:80 cookie app1 maxconn 64 inter 8000
server prodapp02 10.166.152.19:80 cookie app2 maxconn 64 inter 8000
option httpclose
frontend http-ingress
acl prm_is_dead nbsrv(learnexa) lt 1
...
use_backend learnexa_bkp if prm_is_dead
#similar for chat
Challenge 1: Reconfigure health checks without relying on Opsource firewall.
Challenge 2: Reconfigure health checks without any deploy scenario changes.
*Needs to be tested*
#2:
Deploy script blocks 10.166.152.240 through .250.
If we set a source IP for the HAProxy probe in this range, we should be able to retain the "test before the site goes live" functionality (the last step of deployment) via
the machine IP (assigned by Opsource during instance creation).
source <IPAddress>
#1: