====== Monitoring of App Servers ======

==== Zabbix ====
Needs a dedicated machine with fast I/O to avoid I/O hiccups in the future. \\
Uses the pull method to collect metrics. \\
Nodes have to be added manually. Extensive handlers, including notification. \\
Custom configuration and partial application monitoring (via system metric aggregation). \\

==== Sensu ====
Better than Zabbix: the message broker eliminates heavy I/O, and scaling is easy by adding nodes. \\
Custom configuration and partial application monitoring (via system metric aggregation). \\
Extensive handlers, including notification. Better dashboard with automatic node handling based on role, Graphite support, etc. \\

==== New Relic RPM Lite ====

== Features ==
Apdex Score \\
Browser Histograms \\
Worst Transactions by User Dissatisfaction \\
Real User Monitoring Overview \\
Application Response Times \\
Application Histograms \\
Most Time-Consuming Transactions \\
Integration with Collab Tools \\
Time Spent in DB Calls \\
Server Resources Monitoring \\
Analysis of CPU, Disk, Memory, Network \\
Summary Server Process Metrics \\
Incident & Availability Monitoring and Alerting \\
Error Detection, Alerting and Analysis \\
App Speed Index \\
Data Obfuscation \\
Weekly email reports: an overview of the app's performance for the past week. \\
Metric Data: 24 hrs \\
Sample Data: 7 days \\
Data Aggregation: 2 hrs \\
Permanent Storage: None \\

== Implementation and Network Considerations ==
Outbound JSON objects. \\
No firewall reconfiguration. \\
Setup: Install a gem. \\
Write a script to grab a snapshot of the metrics at 23:55 every day. \\

== Thoughts ==
Pretty good initial and future-proof ($$) solution. As we grow, we will likely need historical data to compare against, in a more consistent (SLA) and dashboard-style fashion.
With the recent addition of advanced notification mechanics, this is currently the only solution that provides both application-level monitoring and resource / service availability monitoring. \\

== Discussion ==
Q: There is no indication that New Relic sends alerts like Zabbix does, nor that it sends messages to mobile. How will that be handled? \\
A: New Relic will monitor the system via a system agent (in addition to app monitoring via newrelic rpm). \\
Alerts and notifications are triggered by these agents. \\
Alerts can be sent to any email address, so we can configure mobile alerts just as we have for Zabbix (e.g. 201556xxxx@txt.att.net). \\
In addition to email, alerts are integrated with PagerDuty, Campfire, HipChat, Webhook (any web endpoint), and the New Relic iOS and Android apps. \\
Next steps: \\
We could technically take the metric export and import it into Sensu / Zabbix. If configured properly, the monitoring frameworks can be used as retention stores with access to historical information. \\

== Implementation checklist *COMPLETE* ==
1) Take the app server out of the pool. \\
2) Stop the Zabbix agent. \\
3) Install the newrelic gem and agent. \\
4) Add the config file under shared. \\
5) Restart Passenger. \\
6) Monitor until the next release for any potential issues. \\

== Rollback ==
0) Ensure no sessions are being served by the app server. \\
1) Disable New Relic monitoring in the config. \\
2) Restart Passenger. \\
3) Start Zabbix. \\
4) Remove the newrelic gem. \\
5) Repeat for the other app server. \\

==== Installing & Enabling New Relic monitoring agent on production only ====
1. Install newrelic_rpm as a plugin in any environment.
<code>
script/plugin install git://github.com/newrelic/rpm.git
# now rename rpm to newrelic_rpm
mv rpm newrelic_rpm
</code>
2. Copy this to /deploy/crossbow/shared/plugins. \\
3. Copy the newrelic.yml to /deploy/shared/config and edit it as needed. \\
4. Add the following symlinks.
<code>
cd /deploy/crossbow/current/config
ln -nfs /deploy/shared/config/newrelic.yml newrelic.yml
cd /deploy/crossbow/current/vendor/plugins
ln -nfs /deploy/shared/config/plugins/newrelic_rpm newrelic_rpm
</code>
5. Restart Apache. \\
6. Go to /deploy/crossbow/current/log and watch newrelic_agent.log to make sure the agent is reporting. Data should appear on the New Relic website within 2-5 minutes.

==== Deploy script changes needed on production ====
Open /home/expprodl/crossbow/config/deploy.rb. Under the symlink tasks, add the following two lines.
<code>
namespace(:customs) do
  task :symlink, :roles => :app do
    # -----
    # -----
    # Add the following line
    run "ln -nfs #{shared_path}/plugins/newrelic_rpm #{release_path}/vendor/plugins/newrelic_rpm"
  end

  # Under task :dblink
  task :dblink, :roles => :app do
    # -----
    # -----
    # Add the following line
    run "ln -nfs #{shared_path}/config/newrelic.yml #{release_path}/config/newrelic.yml"
  end
end
</code>
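
After a deploy it may help to confirm that both symlinks were actually created and point where intended. The following is a minimal sketch of such a check, not part of the existing deploy script. The method name ''broken_symlinks'' is hypothetical, and the example paths are the ones from the manual symlink steps on this page; adjust them if the shared directory layout differs.

```ruby
# Hypothetical post-deploy check (an assumption, not part of deploy.rb):
# report symlinks that are missing or point at an unexpected target.

# links: Hash of { link_path => expected_target }
# Returns an Array of link paths that are missing, are not symlinks,
# or resolve to something other than the expected target.
def broken_symlinks(links)
  links.reject do |link, target|
    File.symlink?(link) && File.readlink(link) == target
  end.keys
end

if __FILE__ == $0
  # Paths taken from the manual symlink steps; treat them as examples.
  expected = {
    "/deploy/crossbow/current/config/newrelic.yml" =>
      "/deploy/shared/config/newrelic.yml",
    "/deploy/crossbow/current/vendor/plugins/newrelic_rpm" =>
      "/deploy/shared/config/plugins/newrelic_rpm"
  }

  broken = broken_symlinks(expected)
  if broken.empty?
    puts "All New Relic symlinks are in place."
  else
    broken.each { |path| puts "Missing or wrong symlink: #{path}" }
  end
end
```

The check compares the stored link target as a string, so it expects the links to be created with absolute paths, as the ''ln -nfs'' commands on this page do.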