Response to Github Author Attribution Bug

On Mar 5th 2020, github changed the the squash merge behavior, which caused three commit’s author being wrongly attributed. This problem has been fixed as of now by github so new PRs won’t be affected.

This is an RFC about possible action items wrt to this issue. In order to amend the authors in these commits, we will need to generate new commits that are in parallel of the current chain. Essentially it means we will need to regenerate all these commits which leads to a different(parallel) chain, and force push to master and rewrite the commit history.

The divergent chain will be from the Mar 5th to the lastest

Force push to the master can bring additional problems (we will first need to ask ASF to remove the branch protection in the master, and run the rewrite, then finally ask everyone in the community to resync to the master, normal sync back from master to forks will break because of the diverged chain). Never-the-less, one can also argue that we should do the fix by force push.

Regardless of the choice, we believe that it is important to make the community aware of this issue. Importantly, we evaluate contributions in a comprehensive manner rather than directly counting the loc, so hopefully the few wrong commit stats won’t affect the overall recognition of everyone’s controbutions

See contexts

Please share your opinions by participating the poll

  • Keep the current commit history
  • Use force push to rewrite the history

0 voters

Given the impact of the history rewrite(the longer it takes the more we need to rewrite), we will leave the thread open for a few days.

Proposed Solution

  • Create a commit that undo the contributed PR (this can be done by the original merger, and will create delete statistics(which could be fine, given the main problem is author attribution) for that merger)
  • Then the original author can send the contribution back as a separate PR before we merge it.
1 Like

To be clear, it is 5 (not 3) commits, including (which seems to be) the first time contribution https://github.com/apache/incubator-tvm/pull/4938

After thinking a bit more, perhaps we can get a middle ground. By reverting the affected PRs, and reapply them on top of the master. It is not exactly the same as rewriting the history, instead, we

  • Create a commit that undo the contributed PR (this can be done by the original merger, and will create delete statistics(which could be fine, given the main problem is author attribution) for that merger)
  • Then the original author can send the contribution back as a separate PR before we merge it.

Note that we can exeucte this on a per PR basis and it aligns with the “keep the current commit history choice”.

Would love to see how everyone thinks about this solution

2 Likes

As long as If it doesn’t cause breaking issues, I think it is quite worth doing :slight_smile:

I was thinking of the same solution on my way home. I think it worth doing so. If it is 5 commits, maybe we can revert by clicking “revert” button in the PRs sequentially.

I think this is a solution with minimal affect which keeps author attribution.

I would have thought it was also important for us as a project to stay compliant with respect to the provenance bit of the FAQ. http://www.apache.org/foundation/license-faq.html#provenance

Reverting alongside with documenting the reason for the revert in the commit message and then reapply seems reasonable to me as long as the authors and the committers agree to do this in a timely fashion.

regards Ramana

Do we have an conclusion on this? Me and @janimesh would look like to do the undo/redo for my PR.

The undo PR is already up https://github.com/apache/incubator-tvm/pull/5013

OK, seems we have reached concensus. Let us go for the undo/redo approach. Iet us prioritize to merge these changes asap. Thanks everyone

1 Like

Here is the github issue for the tasks https://github.com/apache/incubator-tvm/issues/5015