Current stress testing results on prodcollab01 - https://wiki.exphosted.com/doku.php/stress_test
As mentioned in the stress testing results, the maximum number of possible connections on the machine is 109. So around 100 users are supported across all live events simultaneously. This can be split as about 4 parallel sessions of 25 users each.
Scaling up live events needs one of the following -
1. Increase the limit of simultaneous users on a single live event server - This is not being done now, as it requires redesigning BBB's current architecture.
2. Increase the limit of simultaneous users by having multiple live event servers with load balancing - This is discussed below and is the aim of this task as of now.
As BBB has multiple components, it is important for us to decide the parameters on which we load balance. We can't do simple HTTP or RTMP based service load balancing: although the majority of the BBB functionality is red5 server based, there are other components (such as redis) which can't be scaled this way.
Also, as meetings are held in memory, once a meeting is created on a given BBB server, all the attendees join that same server, and the assets and recordings of the meeting are saved on the same machine. We do not need any asset sharing. Hence, the only thing we need to do is determine which meeting server to select while creating the meeting. This means we need to know which server has the least load before making the create meeting API call.
1) Scaling by load balancing outside the BBB server: We can have multiple BBB servers set up independently. The Learnexa app can decide which BBB server is least loaded and route new meeting sessions to it. A separate gem is written which becomes part of the Learnexa app and uses getMeetings API calls to see how many meetings are running and how many users are connected on each server. It will create a new meeting on the least loaded server, in a round robin manner.
LB Metric: number of users on a machine.
Conclusion: Easier and faster to implement, but we may not take this approach as it is not the job of Learnexa or any client app to decide the load on the BBB server.
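The selection step described in approach 1 can be sketched roughly as follows. This is only an illustration in Python, not the gem itself. The helper names (get_meetings_url, count_users, least_loaded) are hypothetical. The checksum follows the documented BBB API scheme (SHA-1 of call name + query string + shared secret). It is an assumption that each meeting in the getMeetings response carries a participantCount element; BBB versions that omit it would need a getMeetingInfo call per meeting instead.

```python
import hashlib
import xml.etree.ElementTree as ET


def get_meetings_url(base_url, shared_secret):
    """Build a signed getMeetings URL for one BBB server.

    base_url is assumed to look like http://host/bigbluebutton.
    """
    call, query = "getMeetings", ""
    # BBB checksum: SHA-1 over call name + query string + shared secret.
    digest = hashlib.sha1((call + query + shared_secret).encode("utf-8")).hexdigest()
    return f"{base_url}/api/{call}?checksum={digest}"


def count_users(get_meetings_xml):
    """Sum participantCount over every meeting in a getMeetings response.

    Assumption: the response includes participantCount per meeting.
    Meetings without that element are counted as 0 users.
    """
    root = ET.fromstring(get_meetings_xml)
    return sum(int(m.findtext("participantCount", default="0"))
               for m in root.iter("meeting"))


def least_loaded(users_by_server):
    """Return the server with the fewest users (LB metric: users on a machine).

    Ties go to the server listed first.
    """
    return min(users_by_server, key=users_by_server.get)
```

The caller would fetch each server's getMeetings URL, pass the XML to count_users, and send the create meeting call to whatever least_loaded returns.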
2) Third party BBB hosting: There are some BBB hosting providers with their own custom load balanced implementations. A popular one, and the best priced, is www.hostbbb.com
Conclusion: We may not take this approach as 100+ concurrent users is priced at > $575 which might not be cost effective.
3) Load balancing like mconf : Ref - https://github.com/mconf/wiki/wiki/Mconf-Scalability
Have a load balancer app deployed on a separate machine which balances the BBB servers based on either CPU load or the users/meetings currently live.
LB Metric: CPU usage. But this can be tweaked to number of users/meetings.
Conclusion: We can try this as it is architecturally more appropriate: a separate machine decides which BBB server to hit. The only consideration is that it will take time to implement and maintain this load balancer. We need to make sure this is a lightweight app. If possible, we can also use an existing application which acts as a load balancer and suits this purpose.
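To make the metric choice in approach 3 concrete, here is a minimal sketch of a balancer core where the LB metric is switchable. It is not mconf's implementation. ServerStatus, METRICS and pick_server are hypothetical names, and how the balancer collects CPU usage and user counts from each BBB server is left out.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    """Snapshot of one BBB server as seen by the balancer (hypothetical shape)."""
    name: str
    cpu_percent: float
    users: int
    meetings: int


# Each LB metric maps a status snapshot to a number; lower means less loaded.
METRICS = {
    "cpu": lambda s: s.cpu_percent,
    "users": lambda s: s.users,
    "meetings": lambda s: s.meetings,
}


def pick_server(statuses, metric="cpu"):
    """Return the name of the least loaded server under the chosen metric."""
    return min(statuses, key=METRICS[metric]).name


if __name__ == "__main__":
    snapshot = [
        ServerStatus("bbb1", cpu_percent=80.0, users=10, meetings=1),
        ServerStatus("bbb2", cpu_percent=20.0, users=40, meetings=3),
    ]
    # The two metrics can disagree on the same snapshot.
    print(pick_server(snapshot, "cpu"))
    print(pick_server(snapshot, "users"))
```

Keeping the metric as a lookup keeps the app lightweight and lets the CPU vs users/meetings decision be changed by configuration rather than by code.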
- Should we balance the load based on CPU usage, or on the number of users/meetings?
- When all the meeting servers are occupied and a request comes in to create a new meeting, should we keep waiting until a server is freed, or inform the user that no meeting slots are free and ask them to retry?
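The two options in the last question can be compared with a small sketch. It assumes the figures from the stress test above (about 100 users per server, 25 users per session). NoFreeSlot, server_with_room and place_meeting are hypothetical names, and fetch_load stands for whatever call returns the current user count per server.

```python
import time

USERS_PER_SERVER = 100  # approximate limit from the stress test
USERS_PER_SESSION = 25  # planned session size


class NoFreeSlot(Exception):
    """Raised when no server can take another session."""


def server_with_room(users_by_server):
    """Return the least loaded server that can fit one more session, else None."""
    free = {name: users for name, users in users_by_server.items()
            if users + USERS_PER_SESSION <= USERS_PER_SERVER}
    return min(free, key=free.get) if free else None


def place_meeting(fetch_load, wait_seconds=0, poll_seconds=5, sleep=time.sleep):
    """Pick a server for a new meeting.

    wait_seconds=0 rejects immediately (the "ask the user to retry" option);
    a positive value keeps polling until a server frees up or the time runs out.
    """
    waited = 0
    while True:
        server = server_with_room(fetch_load())
        if server is not None:
            return server
        if waited >= wait_seconds:
            raise NoFreeSlot("no meeting slots are free, please retry later")
        sleep(poll_seconds)
        waited += poll_seconds
```

A bounded wait covers both answers: with wait_seconds=0 the user is told right away, while a small positive value absorbs short bursts without blocking indefinitely.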