[CI] Use pytest-xdist for AlterOpLayout tests

Background

AlterOpLayout (also Legalize) tests register different packed functions in each test for unit testing. The registry can only update an attribute if the plevel supplied to it is larger than the currently registered function. This causes slightly awkward tests shown by

Problem

Pytest runs all the tests in one process, which means all the tests share the registry. This means that a registered function in one test can be unknowingly/mistakingly called for some other test (in some other file) as well. The reason that we don’t see any failure today is because today pytest runs serially, and by luck, the tests that need the original alter_op_layout function run before the tests that overwrite the registered function.

Also, this problem will block our move to pytest-parallel/xdist (if we decide to do that).

Proposed Solution

I tried to use pytest --forked which runs each test in a separate process ensuring that the registry is not re-used across tests. Using this I was able to run all the tests in the file, even when the plevel was same.

Is this the right approach? I hit this problem while working on https://github.com/apache/incubator-tvm/pull/4351 @tqchen @zhiics @yzhliu @haichen

will it eat too much resource in CI?

Will fire a run w and w/o on my machine and report the time difference. For single file, there was no visible difference.

I ran ./tests/scripts/task_python_integration.sh and just ran tests/python/relay

With forked

real	20m59.901s
user	21m5.000s
sys  	0m51.096s

W/o forked (or original)

real 21m55.788s
user 22m8.824s
sys 0m45.872s

I don’t think the forked causes any real issues.

In fact I think it reveals a problem that we need to resolve. We should avoid making assumption of order of registrations in general

As a possible way to get around it. Perhaps we can add a remove attr trigger that allows us to temporary remove an attribute so we can have a RAII pattern in tests

@tqchen Thanks for the suggestion, but can you please elaborate it a little bit more :slight_smile: Sorry, I am learning about good software practices, so I might be dragging this back-and-forth.

Are you suggesting to “remove_attr --> update_attr --> runt_test --> rewrite_original_vale_attr”? And this whole thing should be done in an atomic manner, such that any other test running in parallel should not be able to able see this temporary update?

If the original test code was order dependent(depending on the order of things being called) Then it is something that we need to fix, so set_attr, run_test, recover_attr makes sense in this case.

It is harder to do things in a thread-safe manner, because certain parts of the system are designed to be not thread-safe for efficiency or simplicity concerns. So having it does not matter to have the atomic property.

In summary, we would like to make test-case order independent, but not necessarily thread-safe, if the component to be tested was not originally designed to be so

Thanks! Yes, the tests are order dependent today.

I have added a reset_attr api to make tests order-independent. The flow is

save_old_attrs --> reset_attr --> set_attr --> run_test --> reset_attr --> recover_attrs

Working PR - https://github.com/apache/incubator-tvm/pull/4357