===== Haproxy configuration changelog =====

==== Upgrade to haproxy 1.5 (stable) ====

  - Tested 100% compatibility with the current configuration from 1.4.
  - Tested 100% compatibility plus WebSocket support for the Chat role.
  - Native SSL support.
  - WebSocket support.
  - Feature: stick-table counters allow tracking activity per input, which can be used to enable security and performance features such as rate control, advanced DDoS protection, abuser policing, scan policing, etc.
  - PROXY protocol: Apache does not support it yet, but when it does we can switch to highly efficient handovers between the load balancer and the app servers.
  - 153 bug fixes.
  - Pretty stable: four years of coding and testing on the core code.
  - Better logging.

==== 404s generated from scanners, botnets, etc. ====

=== Try #1: July 7th, 2014 ===

=== Incomplete: difference between test server and production. User error. Need to revisit. ===

Details: \\
Activity was started at 10:30 PM. \\
The active configuration was backed up (conforming to haproxy.cfg.XXXXX, XXXXX being the tracker id). \\
The new code (first block below) was added to the frontend section of the configuration file, followed by a configuration file check around 10:31 PM. \\
The configuration check complained that the configuration was invalid; a brief analysis revealed that the general-purpose counter and store method were introduced with 1.5. \\
The configuration change was rolled back, resulting in no change, followed by PassengerMaxInstances activity. \\
11:35 PM: The reason for this rollout failure was identified. A different version of HaProxy (1.5) was used when writing and testing the configuration code.
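In hindsight, a pre-rollout guard comparing the installed HaProxy version against the minimum that supports the new stick-table keywords would have caught this before editing the live configuration. A minimal sketch, assuming a `version_ge` helper and a hard-coded `installed` value for illustration (in a real rollout that value would be parsed from `haproxy -v`, and the candidate file would then be validated with `haproxy -c -f <file>`):

```shell
#!/bin/sh
# Succeed if version $1 >= version $2 (relies on version-aware `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="1.5"   # gpc0 store and tracking counters appeared in 1.5
installed="1.4"  # illustrative; in practice, parse the output of `haproxy -v`

if version_ge "$installed" "$required"; then
  echo "ok: haproxy $installed supports the new keywords"
else
  echo "abort: haproxy $installed is older than $required"
  # prints: abort: haproxy 1.4 is older than 1.5
fi
```

With the production value of 1.4 this aborts, which is exactly the mismatch that caused the Try #1 rollback.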
  #frontend http-ingress
    stick-table type ip size 1m expire 10s store gpc0,http_err_rate(10s)
    tcp-request connection track-sc1 src
    tcp-request connection reject if { src_get_gpc0 gt 0 }

  #backend learnexa
    acl abuse src_http_err_rate(http-ingress) ge 10
    acl flag_abuser src_inc_gpc0(http-ingress)
    tcp-request content reject if abuse flag_abuser

==== Maxconn changes *COMPLETE June 11 2014* ====

  * Increase the overall maximum connections. Tests failed at about 20k connections (expected, as that is the configured limit).

  #global
  maxconn 35000

  * Decrease per-server maxconn. Tests conducted so far show that response times stay around the 4.13s range with about 60 concurrent clients at 30 requests/second. \\ The production average for the past month is 5. \\ This change will regulate connections via a FIFO queue at the HaProxy level (ensuring a maximum of 31 per server). \\ There will be no timeout or any interruption for the user.

  maxconn 31

==== 408s *COMPLETE - MAY 25 2014* ====

Increase timeouts to eliminate the 408 response codes that result from Haproxy being unable to communicate with backend servers \\
in a timely fashion.

  #under global
  timeout client 50000ms

  #under frontend http-ingress
  timeout http-request 15s

==== loadbalancer machine crash recovery *COMPLETE - MAY 9 2014* ====

  - Add and support 10.166.152.16:

  frontend http-ingress
    bind 10.166.152.16:80
    ...
  frontend chat-ingress
    bind 10.166.152.16:80
    ...
  backend learnexa
    source 10.166.152.240

  - Remove 10.166.152.14 (remove the commented code):

  frontend http-ingress
    #bind 10.166.152.14:80
  frontend chat-ingress
    #bind 10.166.152.14:80

  - Add a redundant block:

  backend learnexa_bkp
    balance roundrobin
    cookie SERVERID insert indirect
    server prodapp01 10.166.152.11:80 cookie app1 maxconn 64 inter 8000
    server prodapp02 10.166.152.19:80 cookie app2 maxconn 64 inter 8000
    option httpclose

  frontend http-ingress
    acl prm_is_dead nbsrv(learnexa) lt 1
    ...
    use_backend learnexa_bkp if prm_is_dead
    #similar for chat

==== Old code ====

Challenge 1: Reconfigure health checks without relying on the Opsource firewall. \\
Challenge 2: Reconfigure health checks without any deploy scenario changes. *Needs to be tested*

#2: \\
The deploy script blocks 10.166.152.240 through .250. \\
If we set a source IP for the haproxy probe within this range, we should be able to retain the "test before the site goes live" functionality (the last step of deployment) via the machine IP (assigned by Opsource during instance creation).

source #1:
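Approach #2 could be sketched roughly as below. This is an illustration under assumptions, not the deployed config: the `option httpchk` line, the probe URI, and the `check inter 8000` server line are not from the original, and only the `source 10.166.152.240` address comes from the existing backend block.

```
backend learnexa
  # Health-check probes leave from an address inside the range the deploy
  # script already controls (10.166.152.240 is within the blocked .240-.250),
  # so the firewall can distinguish LB probes from ordinary user traffic.
  source 10.166.152.240
  # Illustrative HTTP health check; the URI is an assumption.
  option httpchk GET /
  server prodapp01 10.166.152.11:80 check inter 8000
```

If this works, the "test before the site goes live" step keeps functioning through the machine IP without any deploy scenario changes, which is the stated goal of challenge 2.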