===== Notice =====
This page was rewritten on January 23, 2014 to reflect recent developments.
Use the revisions feature of DokuWiki to browse older versions. \\
This page contains highlights and task references to the tracker. \\
The content will be updated with actual code where necessary when the task is complete. \\
====Start Date: Feb 06, 2014====
Trial #1: Feb 06, 2014.
Result: Functional but rolled back due to certain concerns.
Scheduled: Feb 11, 2014.
Cancelled due to [[known_issues_and_fixes|gluster issues]] on prodapp02. \\
Scheduled: Feb 13, 2014.
Successfully rolled out HaProxy. prodapp02 had to be removed from the pool due to:
- [[https://tracker.exphosted.com/view.php?id=4674|Chat functionality issues. Click here to look up the issue in the tracker]]. This chat issue has been fixed in rev. 4685 and will be available in production when the fix is shipped. \\
- [[https://tracker.exphosted.com/view.php?id=4675|Styles not synced between app servers in the pool. Click here to look up the issue in the tracker]] \\
Scheduled: Feb 19, 2014. Successfully reintroduced prodapp02 into the pool.
==== CURRENT STATUS: LB roll out phase implemented, working on follow up tasks. ====
==== Load Balancer Details ====
HaProxy v. 1.4 \\ (SSL support not required currently) \\
==== Bringing Own domain Sites Up ====
Recommended method:
Add an A record pointing to the IP address of www.learnexa.com \\
Alternatively (slower): \\
Create a CNAME record pointing to www.learnexa.com \\
==== Migrating learnexa to a load balanced environment: ====
This document details the steps to upgrade the existing single app server setup to a load-balanced infrastructure with multiple app servers.
==== Pre-flight tasks:* COMPLETED * ====
* [[https://tracker.exphosted.com/view.php?id=4571|Verify LAN connectivity for already deployed components]]:
- Check and modify, if necessary, MySQL Master to allow remote connections from prodapp02.
- Check and modify, if necessary, MySQL Slave to allow remote connections from prodapp02.
- Check and modify, if necessary, Memcached to allow remote connections from prodapp02.
#double check the username, password and database name in config/database.yml.
grant all on cb_production.* to produser@'localhost' identified by 'prodpswd';
grant all on cb_production.* to produser@'websrvr_IP_or_NAME' identified by 'prodpswd';
* [[https://tracker.exphosted.com/view.php?id=4572|Prepare prodapp02]]:
** Refer to [[setup_an_app_server|Setting up an app server]] for the following configuration tasks. ** \\
- Remove prodapp02 from www1 pool.
- Deploy latest build in production environment mode on prodapp02.
- Configure prodapp02 to work with production components:
* Shared folders
* DB
* Memcached
* Juggernaut
* BBB
* Solr
* Red5 (Recorder/Streamer)
* wepay
* production.rb
* God related configuration.
* [[https://tracker.exphosted.com/view.php?id=4573|prodapp01 Changes]]:
- This task is required for the maintenance scripts to work.
- Edit the sudoers file and authorize the expprodl user to run the iptables binary. \\ [[setup_an_app_server#prepare_system| Refer to the sudoers part in the "Prepare System" section]]
- Update [[setup_an_app_server#install_microsoft_core_fonts|CoreFonts]] and [[setup_an_app_server#install_jre|JRE]] version.
- Clear default "shipped" iptables rules.
#clear iptables
#On the interface that opens, deselect all the options.
system-network-config-tui
* [[https://tracker.exphosted.com/view.php?id=4574|Add Load Balancer Role to websvr]]:
- [[setup_other_roles?#load_balancer_role|Install and configure HaProxy]] on the frontend server:
- An additional "load balancer" role will be added to the frontend "websvr" server. This limits the changes needed to ARP, DNS and NAT, and reduces downtime, if any.
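The LAN connectivity checks from the first pre-flight task can be scripted and run from prodapp02. The following is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available. The helper names and the host names in the usage comment are placeholders, and the ports are the MySQL and Memcached defaults (3306 and 11211), which may differ from the production configuration.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and host names are placeholders, not production values.

# port_open HOST PORT: succeeds if a TCP connection opens within 2 seconds.
port_open() {
  local host=$1 port=$2
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# check_service LABEL HOST PORT: prints OK or FAIL; returns non-zero on FAIL.
check_service() {
  local label=$1 host=$2 port=$3
  if port_open "$host" "$port"; then
    echo "OK   $label ($host:$port)"
  else
    echo "FAIL $label ($host:$port)"
    return 1
  fi
}

# Example usage (hypothetical host names, default ports):
#   check_service "MySQL master" mysql-master.example 3306
#   check_service "MySQL slave"  mysql-slave.example  3306
#   check_service "Memcached"    memcached.example    11211
```

A FAIL here only shows that the port is unreachable from the host running the script; the MySQL grants still have to be verified separately with the credentials from config/database.yml.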
=== Externalize styles folder ===
* Create a new brick on 10.166.152.13 for styles.
mkdir /data00/styles
gluster volume create styles transport tcp SHARINGSERVERIP:/data00/styles
gluster volume start styles
* Mount the styles volume on app server 1 (prodapp01).
mkdir /deploy/crossbow/shared/system/styles
mount.glusterfs 10.166.152.13:/styles /deploy/crossbow/shared/system/styles
* Copy existing contents of /deploy/crossbow/current/public/styles folder over to the mounted directory
cp -r /deploy/crossbow/current/public/styles/* /deploy/crossbow/shared/system/styles/
* Verify contents, paths, permissions.
# Verify the directory size.
du -hs .
# Verify that everything is owned by expprodl.expprodl.
ls -al
* Link the /deploy/crossbow/current/public/styles folder to the mounted directory
cd /deploy/crossbow/current/public
mv styles ~/styles.tracker-4675
ln -nfs /deploy/crossbow/shared/system/styles styles
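Once the link is in place, a quick check can confirm that public/styles resolves to the shared directory. This is a minimal sketch, assuming bash and readlink. The helper name styles_link_ok is hypothetical, and the paths in the usage comment are the ones from the steps above.

```shell
#!/usr/bin/env bash
# styles_link_ok LINK TARGET: succeeds only if LINK is a symlink whose
# immediate target is exactly TARGET.
styles_link_ok() {
  local link=$1 target=$2
  [ -L "$link" ] && [ "$(readlink "$link")" = "$target" ]
}

# Example usage (paths from the steps above):
#   styles_link_ok /deploy/crossbow/current/public/styles \
#                  /deploy/crossbow/shared/system/styles && echo "styles link OK"
```

The check compares the link's immediate target only. It does not confirm that the gluster volume is actually mounted at the target path.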
====Ship checklist *COMPLETED* :====
(rev 0.2, revised on Feb 14, 2014)
- Client notification, if required.
- Verify that all pre-flight tasks are completed.
- Open [[https://docs.google.com/a/expertus.com/spreadsheet/ccc?key=0Atv0MG6V7MbDdHQ0V3pRX1pUdEQzcHcycnB0bVNHY0E&usp=sharing|this link]] and go to Sheet 2, "Shiplist 0.2".
- Add prodapp02's IP address to deploy.rb on the build box.
- Enable monitoring in Zabbix.
====Rollback checklist:====
Stop HaProxy service (on websvr). \\
This task stops the HaProxy service and unbinds all the IP addresses. \\
service haproxy stop
Stop iptables service, if running (on app server) \\
This task is to ensure any modifications made to prodapp01 are reverted. \\
service iptables stop
Start Apache (on websvr). \\
This task re-binds the necessary IP addresses and restores the production environment to its previous state. \\
/opt/apache2/bin/apachectl start
Enable monitoring in Zabbix. \\
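The three rollback commands above can be wrapped in small helpers so that the sequence can be rehearsed without touching the services. This is a sketch, not part of the deployed tooling: the function names and the DRY_RUN variable are hypothetical, while the commands themselves are the ones listed in the checklist.

```shell
#!/usr/bin/env bash
# run CMD...: executes the command, or only prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1 (on websvr): stop HaProxy, which unbinds the IP addresses.
rollback_stop_haproxy() { run service haproxy stop; }

# Step 2 (on the app server): stop iptables to revert the prodapp01 changes.
rollback_stop_iptables() { run service iptables stop; }

# Step 3 (on websvr): start Apache, which re-binds the IP addresses.
rollback_start_apache() { run /opt/apache2/bin/apachectl start; }

# Rehearsal example:
#   DRY_RUN=1 rollback_stop_haproxy
```

Keeping one function per step reflects that the steps run on different hosts; each function is meant to be called on the server named in its comment.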
===== Followup tasks: =====
These tasks need to be performed after the LB implementation. \\
=== Mount Monitoring *COMPLETED* ===
Task: [[https://tracker.exphosted.com/view.php?id=4667|Monitor gluster mounts]]
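The tracker task holds the actual implementation. As an illustration only, one possible check is to look for the mount point in the kernel mount table with the file system type fuse.glusterfs, which is how GlusterFS FUSE mounts are listed in /proc/mounts. The function name and the optional second argument are hypothetical.

```shell
#!/usr/bin/env bash
# gluster_mounted MOUNTPOINT [TABLE]: succeeds if MOUNTPOINT is listed in
# TABLE (default /proc/mounts) with file system type fuse.glusterfs.
gluster_mounted() {
  local mountpoint=$1 table=${2:-/proc/mounts}
  awk -v mp="$mountpoint" \
    '$2 == mp && $3 == "fuse.glusterfs" { found = 1 } END { exit !found }' \
    "$table"
}

# Example usage:
#   gluster_mounted /deploy/crossbow/shared/system/styles || echo "styles not mounted"
```

The exit status could be fed into a monitoring item, for example through a Zabbix user parameter, although how the production check is wired up is not described on this page.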
=== Release IP Address: ===
Task: [[https://tracker.exphosted.com/view.php?id=4575|Release IP Address]]
We do not support HTTPS / SSL yet. As such, there is no need to allocate a unique IP address for each client.
==Notification period:==
- Set a deadline.
- Notify the clients of the deadline for switching their DNS entries.
- A week prior to the deadline, verify which clients are still using the old IP address and send an additional notification.
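The verification in the last step can be partly automated. The sketch below filters a list of "domain ip" pairs and prints the domains that still resolve to an old address. The function name, the input file name and the addresses in the usage comment are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# still_on_old_ip OLD_IP...: reads "domain ip" pairs on stdin and prints the
# domains whose IP matches one of the OLD_IP arguments.
still_on_old_ip() {
  local domain ip old
  while read -r domain ip; do
    for old in "$@"; do
      if [ "$ip" = "$old" ]; then
        echo "$domain"
        break
      fi
    done
  done
}

# Example usage (placeholder file name and addresses): resolve each client
# domain with dig, then filter against the old IP addresses.
#   while read -r d; do
#     echo "$d $(dig +short A "$d" | tail -n 1)"
#   done < client_domains.txt | still_on_old_ip 192.0.2.10 192.0.2.11
```

Separating the lookup from the filter keeps the filter usable with any source of resolved addresses, such as a saved DNS report.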
==Migration checklist:==
- Unbind the IP addresses from HaProxy.
- Release private IP addresses.
- Remove the NAT mappings for the private IP addresses.
- Review the ACLs and remove any entries dealing with the IP addresses (public and private).
===Apache on websvr:===
Task: [[https://tracker.exphosted.com/view.php?id=4576|Apache on websvr]]
- Archive the configuration files.
- Uninstall Apache.
===Memcached HA:===
Task: [[https://tracker.exphosted.com/view.php?id=4577|Memcached HA]]
- Add the memcached role to prodapp02.
- Reconfigure crossbow on both app servers so that Dalli stores data on both instances and is failover-compatible.
===Remove public IP addresses for any app server.===
Task: [[https://tracker.exphosted.com/view.php?id=0004409|Remove public IP addresses for any app server]]
===Upgrade prodapp01:===
Task: [[https://tracker.exphosted.com/view.php?id=4578|Upgrade prodapp01]]
* Memcached dependency: this can only proceed after Memcached HA is completed. \\
- Disable prodapp01 in the load balancer.
- Upgrade VMWare Tools.
- Upgrade system software and libraries.
===Issues to resolve:===
**Q:** How do we make the transition smoother for existing BYOD customers? Can we allow them to keep their current IP until they are ready to switch, or do we have to get them to use the new IP right away? \\
**Answer:**
Task [[https://tracker.exphosted.com/view.php?id=4575|Release IP Address]] deals with this. \\ At this point, we are not modifying any public endpoints (i.e. the current IPs), so yes, they will retain their existing IPs. \\ The procedure in the "Release IP Address" section above details a plan that includes notifications, deadlines and IP recycling procedures for existing customers. \\
**Q:** What happens if we hit issues along the way? Rollback needs to be more elaborate. \\
**Answer:** The procedure to introduce the LB into the existing environment has been kept deliberately simple to minimize moving parts and to accommodate any unexpected failures as quickly as possible. \\ Additional information detailing what happens behind the scenes during the execution of each rollback step has been added for clarification. \\
**Q:** Put actual code in now, before this exercise is started. \\
**Answer:** All critical code exists in the setup_app_server wiki page. \\ References to the relevant sections are to be added on Monday, 3rd. \\
**Q:** We need to put together a schedule with actual dates. \\
**Answer:**
The final date will be either 6th Feb 2014 or 7th Feb 2014 (all times are in PST): \\
4th Feb : Client and stakeholder notifications. \\
5th Feb : Final prep and audit before GOLIVE. \\
6th Feb : \\
@10:00 PM : Team meet and greet. \\
@10:05 PM : Put the app servers in maintenance mode. \\
@10:07 PM : Stop Apache. \\
@10:07 PM : Start HaProxy. \\
@10:07 PM : Ensure HaProxy was able to successfully bind on the IP addresses. \\
@10:08 PM : Test public endpoints (PASS: maintenance page, FAIL: anything else). \\
@10:10 PM : A quick test of learnexa.com (using hosts files) to verify that all major functionality works. \\
@10:18 PM : Take the app servers out of maintenance mode. \\
@10:19 PM : QA does an elaborate functionality test on Crossbow using public endpoints. \\
@10:xx PM : Wrapup. \\
This schedule can also include the WARSAW deployment. In case of any failures, the configuration is to be reverted using the rollback procedure. Also, since we are introducing an additional app server, any failures attributed to the new app server will be addressed by taking that app server out of rotation.
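The 10:07 PM bind check in the schedule above could be scripted by comparing the listening sockets on websvr against the list of expected addresses. This is a sketch under assumptions: the function name is hypothetical, the addresses in the usage comment are placeholders, and port 80 is assumed because SSL is not in use.

```shell
#!/usr/bin/env bash
# missing_binds ADDR:PORT...: reads `netstat -ltn` or `ss -ltn` output on
# stdin and prints every expected ADDR:PORT that has no LISTEN entry.
missing_binds() {
  local listing addr
  listing=$(cat)
  for addr in "$@"; do
    printf '%s\n' "$listing" |
      awk -v a="$addr" '/LISTEN/ { for (i = 1; i <= NF; i++) if ($i == a) found = 1 }
                        END { exit !found }' || echo "$addr"
  done
}

# Example usage (placeholder addresses):
#   netstat -ltn | missing_binds 192.0.2.10:80 192.0.2.11:80
```

Empty output means every expected address is bound. Any printed address is a reason to stop and follow the rollback procedure.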