The rfswarm agent distribution is poor

When the robots are distributed among the agents, they are not very evenly distributed.

The following environment and scenario exist.

  • There are 100 robots allocated
  • There are 4 agents used
  • Everything is on one domain
  • The test case takes about 1.5 minutes to run
  • Tried different ramp ups (like 30, 40, 100 seconds)

When running, the robots got distributed across the agents as follows.

  • 30 seconds: 40, 10, 38, 12
  • 40 seconds: 10, 10, 37, 43
  • 40 seconds: 10, 45, 17, 28
  • 50 seconds: 10, 10, 47, 33
  • 100 seconds: 10, 10, 10, 10 (only 40)

Shouldn’t they get distributed more evenly, like the below?

  • 25, 25, 25, 25
  • 20, 30, 20, 30

I did find the document below, which sounds to me like this should be okay.

  • rfswarm/Overview.md at master · damies13/rfswarm · GitHub
  • Agents: A web application being tested using SeleniumLibrary. My initial tests indicate that with headlessfirefox, a mid range desktop PC should be able to support around 50 virtual users. Obviously this will vary depending on the amount of think time you include and how JavaScript heavy your application is. A few short (~5 minutes) runs with 10, 30 & 50 users on one agent should give you a feel for what your agent machines are capable of.

Hi Mark,

I’d really like to understand better what you’re experiencing, but I suspect rfswarm is working as designed. It’s pretty rare in a production system to go from 0 users to all users in 30-100 seconds, so rfswarm wasn’t really designed for fast ramp-ups like that. Perhaps I need to adjust the code to handle this scenario.

On whether the robots should be distributed more evenly: not necessarily, this really depends on your agents. Refer to the documentation on Agent Assignment. If your agents are all the same then yes, it should distribute more evenly, but if they are different then this could be expected.

Something to note is that even if the agents have the “same” hardware configuration, shared resources (for example on VMs, or other apps running on the agents) might cause an agent to report a difference in resource usage, and the least utilised agent is always selected.

Another thing that can impact the distribution of robots is whether all the robots have connected before you start the test, and the polling interval (~10 sec before the agent knows the test has started and ~2 sec after the agent knows the test has started).
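As a rough illustration of why the polling interval matters with short ramp-ups, here is a small Python sketch. It assumes a simple linear ramp-up, and `robots_due` is a hypothetical helper written for this example, not part of rfswarm. It only shows how many robots would already be due to start by the time an agent polls.

```python
def robots_due(elapsed_s: float, total_robots: int, ramp_up_s: float) -> int:
    """Robots that should already have started after elapsed_s seconds.

    Assumption: a linear ramp-up, which may not match rfswarm's real scheduling.
    """
    if ramp_up_s <= 0:
        return total_robots
    return min(total_robots, int(total_robots * elapsed_s / ramp_up_s))


# 100 robots, as in the scenario above, with the ramp-ups that were tried.
for ramp_up in (30, 40, 100):
    first_poll = robots_due(10, 100, ramp_up)  # ~10 sec poll before the agent knows the test started
    later_poll = robots_due(2, 100, ramp_up)   # ~2 sec poll once the test is running
    print(f"ramp-up {ramp_up}s: {first_poll} due within 10s, {later_poll} due per 2s poll")
```

With a 30 second ramp-up, about a third of the robots are already due within one 10 second poll, so whichever agent polls first could pick up a large batch.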

I would like to understand whether this is the robot distribution after ramp-up finished, each on a different test, or at different points during a 100 sec ramp-up?

There is a protection mechanism after the first 1 agents are loaded where robots don’t get assigned to agents in a critical state (over 95% resource utilisation of CPU, memory or network IO). So if the least utilised agent is in a critical state, no more robots get assigned. This might explain the only 40 robots, but if your agents are over 95% resource utilisation with 10 robots then that would explain the other strange behaviour you’re seeing too.
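The selection rule described above can be sketched roughly as follows. This is a simplified illustration and not rfswarm's actual code: the `Agent` class, the `pick_agent` function, and the choice to treat an agent's load as its busiest resource are all assumptions made for the example. Only the 95% threshold and the "least utilised agent is always selected" rule come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional

CRITICAL_THRESHOLD = 95.0  # percent; "over 95%" counts as critical


@dataclass
class Agent:
    # Hypothetical structure; resource values are percentages.
    name: str
    cpu: float
    memory: float
    network: float

    @property
    def load(self) -> float:
        # Assumption: an agent's load is its busiest resource.
        return max(self.cpu, self.memory, self.network)


def pick_agent(agents: List[Agent]) -> Optional[Agent]:
    """Return the least utilised agent, or None if that agent is critical."""
    candidate = min(agents, key=lambda a: a.load)
    if candidate.load > CRITICAL_THRESHOLD:
        return None  # no agent can take another robot right now
    return candidate


overloaded = [
    Agent("agent-1", cpu=99.0, memory=70.0, network=5.0),
    Agent("agent-2", cpu=97.0, memory=65.0, network=4.0),
    Agent("agent-3", cpu=98.0, memory=72.0, network=6.0),
    Agent("agent-4", cpu=96.0, memory=68.0, network=3.0),
]
print(pick_agent(overloaded))  # None: every agent is over the threshold

mixed = overloaded[:3] + [Agent("agent-4", cpu=80.0, memory=68.0, network=3.0)]
print(pick_agent(mixed).name)  # agent-4 is the only one below the threshold
```

Under this sketch, four overloaded agents stop receiving robots entirely, and an agent that dips below the threshold for a moment receives every new robot until it is critical again, which would produce lopsided counts.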

Dave.

Hi Dave.

I chose 100 seconds (1 second per robot) as the ramp-up time due to your last comment. I thought that was standard. It didn’t work very well, so I tried a few other ramp-up times. I guess I should have gone higher as well. It just seemed to work better lower.

The 4 agents exist on 4 of the same VMs. They are identical. They are all most definitely overloaded. Now that I read your comments, I see why the distribution is uneven.

This answers my question. Thank you very much Dave.

Mark
