Table of Contents

HAProxy configuration changelog

Upgrade to HAProxy 1.5 (stable)

  1. Tested 100% compatibility with the current configuration from 1.4.
  2. Tested 100% compatibility plus WebSocket support for the Chat role.
  3. Native SSL support.
  4. WebSocket support.
  5. Feature: stick-table counters allow tracking activity per input, which can be used to enable security and performance features such as rate control, advanced DDoS protection, abuser policing, scan policing, etc.
  6. PROXY protocol: Apache does not support it yet, but when it does we can switch to highly efficient handovers between the load balancer and app servers.
  7. 153 bug fixes.
  8. Pretty stable: 4 years of coding and testing on the core code.
  9. Better logging.
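For item 3, a minimal sketch of what native SSL termination could look like on this load balancer. The certificate path is a hypothetical placeholder (HAProxy 1.5 expects a combined cert + key PEM); the IP is the one added in the crash-recovery section below:

```haproxy
frontend https-ingress
    # 1.5 native SSL termination; the PEM path is a hypothetical
    # combined certificate + key bundle, not an actual deployed file
    bind 10.166.152.16:443 ssl crt /etc/haproxy/certs/site.pem
    default_backend learnexa
```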

404s generated from scanners, botnets, etc.

Try #1: July 7th, 2014.

Incomplete: difference between the test server and production (user error). Need to revisit.

Details:
Activity was started at 10:30 PM.
The active configuration was backed up (as haproxy.cfg.XXXXX, XXXXX being the tracker id).
The new code (first block) was added to the frontend section of the configuration file, followed by a configuration check around 10:31 PM.
The check reported that the configuration was invalid; a brief analysis revealed that the general-purpose counter (gpc0) and the store method were introduced in 1.5.
The configuration change was rolled back, resulting in no change, followed by the PassengerMaxInstances activity.
- 11:35 PM: The reason for this rollout failure was identified: a different version of HAProxy (1.5) was used when writing and testing the configuration code, while production still runs 1.4.

#frontend http-ingress
  # per-source-IP table: general-purpose counter + HTTP error rate over a 10s window
  stick-table type ip size 1m expire 10s store gpc0,http_err_rate(10s)
  # track each connection's source IP in sticky counter 1
  tcp-request connection track-sc1 src
  # drop connections from IPs already flagged as abusers (gpc0 > 0)
  tcp-request connection reject if { src_get_gpc0 gt 0 }

#backend learnexa
  # abuse: 10 or more request errors within the 10s window
  # note: the table argument must name the proxy that declares the stick-table
  # (http-ingress above), unless a separate ft_web proxy is defined elsewhere
  acl abuse src_http_err_rate(ft_web) ge 10
  # flag_abuser: evaluating this ACL increments gpc0 for the source IP
  acl flag_abuser src_inc_gpc0(ft_web)
  tcp-request content reject if abuse flag_abuser
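Since the Try #1 failure came down to testing against 1.5 while production runs 1.4, a deploy-time version gate (alongside the usual `haproxy -c -f <cfg>` syntax check) would catch this class of error before rollout. A minimal sketch; `version_ge` is a helper defined here, and the hardcoded `running=` value stands in for what `haproxy -v` would report:

```shell
# Abort a rollout when the running HAProxy is older than the config requires.
version_ge() {
    # true if $1 >= $2, comparing dotted numeric versions
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

required="1.5.0"    # this config uses gpc0 / http_err_rate, which need 1.5+
running="1.4.24"    # in practice: haproxy -v | awk 'NR==1 {print $3}'

if version_ge "$running" "$required"; then
    echo "version ok: $running"
else
    echo "abort: running $running but config needs $required"
fi
```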

Maxconn changes *COMPLETE June 11 2014*

Tests failed at about 20k conns (normal as it is the configured limit).

#global
maxconn 35000

Tests conducted so far show that response times stay in the 4.13s range with about 60 concurrent clients at 30 reqs/second.
Production average for the past month is 5.
This change will regulate connections via a FIFO queue at the HAProxy level (ensuring a maximum of 31 per server).
There will be no timeout or any interruption for the user.

#on each server line in the backend
maxconn 31
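As a sanity check on these numbers (a sketch; the 2-server count is taken from the redundant-block section of this doc and may differ in production):

```python
# Rough capacity arithmetic for the maxconn changes above.
servers = 2                  # prodapp01 + prodapp02
per_server_maxconn = 31      # "maxconn 31" on each server line
global_maxconn = 35000       # global cap from the #global section

# Requests actually in flight on the app tier at any moment:
in_flight_cap = servers * per_server_maxconn
# Everything beyond that waits in HAProxy's FIFO queue, up to the global cap:
queue_cap = global_maxconn - in_flight_cap

print(in_flight_cap)  # 62
print(queue_cap)      # 34938
```

Excess requests queue at the load balancer instead of piling onto the app servers, which is why response times stay bounded rather than degrading under overload.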

408s *COMPLETE - MAY 25 2014*

Increase timeouts to eliminate 408 response codes, which HAProxy returns when it does not receive a complete
request from the client in a timely fashion.

#under defaults
timeout client 50000ms
#under frontend http-ingress
timeout http-request 15s
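The two directives in context; `timeout client` belongs in `defaults` (or a frontend), and it is usually paired with connect/server timeouts. The connect and server values below are illustrative assumptions, not part of this change:

```haproxy
defaults
    timeout client       50000ms   # give slow clients up to 50s on the socket
    timeout connect      5s        # assumed value, not from this changelog
    timeout server       50s       # assumed value, not from this changelog

frontend http-ingress
    timeout http-request 15s       # full request headers must arrive within 15s
```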

Loadbalancer machine crash recovery *COMPLETE - MAY 9 2014*

  1. Add and support 10.166.152.16
frontend http-ingress
  bind 10.166.152.16:80
  ...
frontend chat-ingress
  bind 10.166.152.16:80
  …
backend learnexa
  source 10.166.152.240
  2. Remove 10.166.152.14 (remove commented code)
frontend http-ingress
  #bind 10.166.152.14:80
frontend chat-ingress
  #bind 10.166.152.14:80
  3. Add redundant block
backend learnexa_bkp
        balance roundrobin
        cookie SERVERID insert indirect
        server prodapp01 10.166.152.11:80 cookie app1 maxconn 64  inter 8000
        server prodapp02 10.166.152.19:80 cookie app2 maxconn 64  inter 8000
        option httpclose
frontend http-ingress
  acl prm_is_dead nbsrv(learnexa) lt 1
  ...
  use_backend learnexa_bkp if prm_is_dead
#similar for chat
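One caveat worth noting: `nbsrv(learnexa)` only drops below 1 when health checks mark the primary servers DOWN, so the primary backend needs `check` on each server line (`inter 8000` has no effect without it). A sketch, assuming the primary backend mirrors the backup block and a root-path probe:

```haproxy
backend learnexa
    balance roundrobin
    option httpchk GET /          # assumed probe URL, not from this changelog
    # 'check' enables health probes; 'inter 8000' then means one probe every 8s
    server prodapp01 10.166.152.11:80 check inter 8000 maxconn 31
    server prodapp02 10.166.152.19:80 check inter 8000 maxconn 31
```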

old code

Challenge 1: Reconfigure health checks without relying on the Opsource firewall.
Challenge 2: Reconfigure health checks without any deploy-scenario changes.

*Needs to be tested*
#2:
The deploy script blocks 10.166.152.240 through .250.
If we set a source IP for the HAProxy probe in this range, we should be able to retain the "test before the site goes live" functionality (the last step of deployment) via the machine IP assigned by Opsource during instance creation.

source <IPAddress>
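Putting #2 together, a sketch of pinning the probe source as described (the source IP appears earlier in this doc; the probe URL and `check` parameters are assumptions):

```haproxy
backend learnexa
    # health probes originate from an IP inside the deploy script's range
    source 10.166.152.240
    option httpchk GET /           # assumed probe URL
    server prodapp01 10.166.152.11:80 check inter 8000
```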

#1: