Compatibility of Pabot >= 1.3 with DataDriver >=0.4.0

Since Pabot 1.3 changed the method of parsing Robot Framework test suites, it lost compatibility with DataDriver and the option --testlevelsplit.

DataDriver 0.4.0 with support of lists and dictionaries is not yet released. I plan to implement a fix in this version before releasing.

In this thread we discuss possible fixes:

@mkorpela can you briefly explain how Pabot does the parsing now? Does it use Robot Framework's new parser?

Parsing now uses TestSuiteBuilder.
Previously, library listener methods were executed (especially this) while using DryRun; now they are not.

I suggest that we try to fix this by introducing a new keyword to PabotLib: Add Suite To Execution Queue [suitename, variables=None].
This would allow DataDriver to call this method on the first run and populate the execution queue with items. For example:

pabotlib.add_suite_to_execution_queue(
      "my suite name",
      variables=["DYNAMICTEST:my suite name.test case to be executed"])

This could also work without the --testlevelsplit option.

One thing to consider would be if a new item should be added to top of the execution queue or to bottom. Default would in my opinion be bottom (FIFO - first in, first out) but I see that there might be reasons for this to actually work the other way (LIFO - last in, first out).
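
The two orderings can be illustrated with a minimal sketch. This is not Pabot's actual queue implementation; the `enqueue` helper and the suite names are hypothetical and only show where a newly added item would end up relative to the items already waiting.

```python
from collections import deque


def enqueue(queue, item, lifo=False):
    """Add item to the bottom (FIFO, default) or the top (LIFO) of the queue."""
    if lifo:
        queue.appendleft(item)
    else:
        queue.append(item)


def drain(queue):
    """Take items from the top of the queue until it is empty."""
    order = []
    while queue:
        order.append(queue.popleft())
    return order


# FIFO: the dynamically added suite runs after everything already queued.
fifo_queue = deque(["suite1", "suite3"])
enqueue(fifo_queue, "suite2 (dynamic)")
fifo_order = drain(fifo_queue)

# LIFO: the dynamically added suite jumps ahead of the waiting items.
lifo_queue = deque(["suite1", "suite3"])
enqueue(lifo_queue, "suite2 (dynamic)", lifo=True)
lifo_order = drain(lifo_queue)

print(fifo_order)
print(lifo_order)
```

With FIFO the dynamic items only start once the statically parsed items are taken; with LIFO they would start as soon as a process becomes free.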

I added a prototype implementation of pabot side changes to:
feature/add_suite_to_execution_queue

Example suite.robot (executes itself twice), run with pabot --pabotlib suite.robot:

*** Settings ***
Library  pabot.PabotLib

*** Variables ***
${MYVAR}  hi
@{VARIABLES}  MYVAR:hello

*** Test Cases ***
My Test
   Run Keyword If  '${MYVAR}' != 'hello'  Add Suite To Execution Queue  Suite  ${VARIABLES}
   Log  something

Let me try to understand it.

Let's assume we have three suites and each suite has tests; suite2 is data-driven:

suite1.robot
    test1.1
    test1.2
suite2.robot
    templatetest (dd with 10 testcases in file)
suite3.robot
    test3.1
    test3.2

Now I call pabot --testlevelsplit --processes 2 .

What would the Pabot queue look like?
And when start_suite(suite) of the DataDriver listener gets called, Pabot assumes that this suite has one test called templatetest, right?
But with DataDriver, templatetest is deleted and test2.1 through test2.10 are added to the suite.

I could also add 2 times 5 tests instead of 10 times one test. I don't know if this is possible.
But if DataDriver knows that Pabot has at most 2 processes, it would perform much better to execute all 10 test cases in two parts instead of ten.
DataDriver can already receive DYNAMICTESTS as a list of test names or as a pipe-separated string.
So suppose I add two suites to the queue like this:

Suite2  DYNAMICTESTS:Suite2.Test2.1|Suite2.Test2.2|Suite2.Test2.3|Suite2.Test2.4|Suite2.Test2.5
Suite2  DYNAMICTESTS:Suite2.Test2.6|Suite2.Test2.7|Suite2.Test2.8|Suite2.Test2.9|Suite2.Test2.10
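
A minimal sketch of how such variable strings could be built for any number of processes. The helper names `split_into_chunks` and `dynamictests_variables` are hypothetical and not part of DataDriver; only the pipe-separated DYNAMICTESTS format is taken from the lines above.

```python
def split_into_chunks(test_names, processes):
    """Split test_names into at most `processes` contiguous, near-equal chunks."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    base, extra = divmod(len(test_names), processes)
    chunks, start = [], 0
    for index in range(processes):
        # The first `extra` chunks take one additional test each.
        size = base + (1 if index < extra else 0)
        if size == 0:
            break  # fewer tests than processes: no empty chunks
        chunks.append(test_names[start:start + size])
        start += size
    return chunks


def dynamictests_variables(test_names, processes):
    """Build one pipe-separated DYNAMICTESTS variable string per chunk."""
    return ["DYNAMICTESTS:" + "|".join(chunk)
            for chunk in split_into_chunks(test_names, processes)]


tests = [f"Suite2.Test2.{n}" for n in range(1, 11)]
variables = dynamictests_variables(tests, processes=2)
for variable in variables:
    print("Suite2 ", variable)
```

If the number of tests is not divisible by the number of processes, the first chunks are one test longer, so 10 tests over 4 processes would give chunks of 3, 3, 2 and 2.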

What happens to the test suite that is currently being executed?
What would happen if I delete the test after putting both suites into the queue?
And how does DataDriver know that this is the “first” call of this suite?

Can you explain a bit more what the Pabot queue looks like at which point?

cheers

Queue at beginning would be:

--test suite1.test1.1
--test suite1.test1.2
--test suite2.templatetest
--test suite3.test3.1
--test suite3.test3.2

After templatetest

# OK --test suite1.test1.1
# OK --test suite1.test1.2
# OK --test suite2.templatetest
--test suite3.test3.1
--test suite3.test3.2
--- SYSTEM WILL WAIT HERE ---
--suite suite2 --variable DYNAMICTESTS:Suite2.Test2.1|Suite2.Test2.2|Suite2.Test2.3|Suite2.Test2.4|Suite2.Test2.5
--suite suite2 --variable DYNAMICTESTS:Suite2.Test2.6|Suite2.Test2.7|Suite2.Test2.8|Suite2.Test2.9|Suite2.Test2.10

This we need to figure out.

I think we can figure this one out. There is a bunch of variables visible.

@René

Now Pabot master branch contains a draft with the keywords I expect this operation to need:
Add Suite to Execution Queue [Suite Name] @[LIST:OF:VARIABLES]
and
Ignore Execution

Ignore Execution will stop the current execution and ignore outputs from it.

With these you can have a phase during runtime that adds your dynamic tests and after that ignores itself.
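
A rough sketch of that phase, written against a stand-in object so it runs without Pabot. `FakePabotLib` and `populate_queue` are hypothetical. The method names mirror the two keywords, but whether PabotLib exposes them under exactly these Python names is an assumption, as is detecting the first run by DYNAMICTESTS being unset.

```python
class FakePabotLib:
    """Hypothetical stand-in for pabot.PabotLib (not the real class)."""

    def __init__(self):
        self.queue = []
        self.ignored = False

    def add_suite_to_execution_queue(self, suitename, variables=None):
        self.queue.append((suitename, list(variables or [])))

    def ignore_execution(self):
        self.ignored = True


def populate_queue(pabotlib, suite_name, variables, dynamic_tests_var=None):
    """Queue the dynamic runs on the first call, then ignore the current run.

    dynamic_tests_var is the DYNAMICTESTS value seen by the current run.
    Assumption: it is unset (None) on the first run and set on later runs.
    Returns True if this run populated the queue and should be ignored.
    """
    if dynamic_tests_var:
        return False  # later run: just execute the selected tests
    for variable in variables:
        # One queue item per variable string, e.g. one per chunk of tests.
        pabotlib.add_suite_to_execution_queue(suite_name, [variable])
    pabotlib.ignore_execution()
    return True


chunks = ["DYNAMICTESTS:Suite2.Test2.1|Suite2.Test2.2",
          "DYNAMICTESTS:Suite2.Test2.3|Suite2.Test2.4"]

first_run = FakePabotLib()
was_first = populate_queue(first_run, "Suite2", chunks)

later_run = FakePabotLib()
was_later_first = populate_queue(later_run, "Suite2", chunks,
                                 dynamic_tests_var="Suite2.Test2.1|Suite2.Test2.2")
```

The first run adds nothing to the report because it ignores itself; only the queued runs produce results.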

Good!

I will not make it very soon. I am totally busy with multiple things right now.

But once i have time, i will try it out!

Cheers

Ok.

I’ll release these silently. Let me know how they work for you.
Now available on PyPI in version 1.6.1.
Docs missing on purpose for now.

I just wanted to open a thread asking how I am using pabot and datadriver wrong, because testlevelsplit is not working. :) Looking forward to the fix!

OH NO! Do you need the fix?
Now i have to make it…

No, I am working on a few principle concepts for RPA. I do not need it with high priority. I would like to show that parallel workflows are possible with DataDriver and Pabot.

testlevelsplit is an awesome feature! I’d love to see it work again. Until then: which versions do I need to use in order to have it working? pabot 1.2.* and datadriver 0.3.*?

OH NO! Do you need the fix?

Yes, i do.

I am stuck between bugs and versions. I could use an old pabot version together with datadriver where testlevelsplit still works, but then there is a bug constraining remote libraries which is only fixed in more recent pabot versions. It would be awesome if both libs were compatible again.

I will now start to investigate.

@mkorpela

Fetch process count
How do I get the information from Pabot about how many processes are active?
If DataDriver has 100 test cases and Pabot has 4 processes, I want to add 4 suites with 25 test cases each to the queue! I think that this is a really important performance advantage!

Is PabotLib needed?
How would I need to call Pabot? With --pabotlib? Do I have to import PabotLib into any suite, or can DataDriver just try/except-import the class?

For --testlevelsplit: wouldn’t it still require only 1 suite with 100 test cases and pabot does the assignment which test cases goes to what process?

No, Pabot cannot split a suite once it is in the queue.

Pabot parses all suites before execution and adds every single test case to the queue as a suite with one test.

Because each start of Robot and each writing of outputs costs time, I do not want to add 100 suites with one test each to the queue.

If we have 4 processes, I want to add 4 suites with 25 test cases each.

But nobody knows how long each test case is going to run.

Likely 3 suites/processes can be finished while 1 suite is still busy with 10 test cases in its queue.

In that case performance gain is lost.

And datadriver would have to correct the report, because there should be only 1 suite, although 4 had been sent to pabot. Maybe datadriver needs to name those intermediate suites differently in case RF throws warnings when processing suites with the same name.

One might argue that datadriver could use the suites file from pabot’s last run to find the optimal distribution of test cases among intermediate suites. That only works under laboratory conditions of test automation (and even there processing time may vary). In RPA every run has a new set of data, which makes it impossible to predict the best distribution of tasks.

In summary, since creating intermediate suites would likely cause quite a lot of effort on the datadriver development side and bring only a little extra performance, it might be worth starting with 1 suite and seeing how bad it really is.

I think it is very unlikely that data-driven test cases that have the exact same sequence are so different in execution time.
There may be a bit of jitter depending on the data.

I think you are making it a bit too complex.
Pabot is the one that merges everything.
If Pabot finds 100 suites with one test case each and the suites are the same, rebot will merge them into one suite with 100 test cases in the report.

I am not planning to put all data-driven cases in one suite, but to divide the total number of test cases of a suite by the number of processes.

I could add an option pabot_single_tests to DataDriver. But I would only do this if it is really needed.

We can do some performance testing.
But I would assume that the overhead of starting Robot 100 times instead of 4 times is so huge that it would make a difference even if the last process just runs the last quarter…

And fewer processes are always a performance issue.
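
Until real measurements exist, the trade-off can be estimated with a small simulation. This is only a sketch: the 2 s startup overhead and the test durations are made-up numbers, and `makespan` is a hypothetical helper that models processes pulling queue items in order, not Pabot's scheduler.

```python
import heapq


def makespan(items, processes, startup_overhead):
    """Wall-clock time when `processes` workers pull queue items in order.

    Each item is a list of test durations executed by one Robot start,
    which costs `startup_overhead` once per item.
    """
    workers = [0.0] * processes  # time at which each worker becomes free
    heapq.heapify(workers)
    for durations in items:
        free_at = heapq.heappop(workers)  # next free worker takes the item
        heapq.heappush(workers, free_at + startup_overhead + sum(durations))
    return max(workers)


def compare(durations, processes=4, overhead=2.0):
    """Return (time with one test per item, time with one chunk per process)."""
    single = [[d] for d in durations]
    size = len(durations) // processes
    chunked = [durations[i:i + size] for i in range(0, len(durations), size)]
    return (makespan(single, processes, overhead),
            makespan(chunked, processes, overhead))


# Assumed case 1: 100 tests of 1 s each (uniform durations).
uniform_single, uniform_chunked = compare([1.0] * 100)

# Assumed case 2: the last 25 tests are four times slower than the rest.
skewed_single, skewed_chunked = compare([1.0] * 75 + [4.0] * 25)

print(uniform_single, uniform_chunked)
print(skewed_single, skewed_chunked)
```

Under these assumptions the chunked queue is far faster when durations are uniform, while a strongly skewed distribution lets the single-test queue catch up, because one chunk ends up holding all the slow tests.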