Part of my role at 2toLead is to help set guidance around source management best practices, both internally and for our customers. One of the questions I often get is: should I fork this repository or do something else with it?
It’s hard to get clear and simple guidance on the web so I thought I’d take a stab at it.
As we’re using Visual Studio Team Services, my examples and screenshots will be based on it, but the guidance applies to any Git service, such as GitHub or an internal Git server.
The options you get
From my point of view, when you have an existing repository somewhere, you have the following options to move/copy it depending on your scenario:
- Making a manual copy: manually copying files over, running a git init or something similar and starting anew. This should almost always be avoided. You’ll lose the history, the ability to merge with the source and so on. Even if it looks super easy, you’ll regret it later.
- Using the import option: in VSTS you have the option, after creating a new repository, to import the content (and history, and branches…) from another repository. This is great for simple migration scenarios, as it’ll keep the history and other artifacts that will be useful later. Use it if the source is not in the same service (e.g. GitHub => VSTS) or if you’re planning to delete the source right after.
- Forking a repo (from the source): this is probably the best option if you’re planning to keep both the source and the new repository alive. It will allow you to easily port your commits from one repository to the other (via pull requests). This choice should probably be your default go-to.
Example of authoring a PR across repos.
Example of importing a repo
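If your Git service has no import button, the import option can be approximated from the command line with a mirror clone followed by a mirror push. The sketch below is only an illustration: it builds throwaway local repositories (Source.git, NewRepo.git and staging.git are hypothetical stand-ins, not real VSTS repositories) so it can run anywhere, and then checks that the branches match on both sides.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: all repository names below are hypothetical stand-ins.
work=$(mktemp -d)
cd "$work"
# Run git with a fixed identity so commits work on a clean machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# Build a small "source" repository with two branches.
git init -q seed
(
    cd seed
    echo "hello" > readme.txt
    g add readme.txt
    g commit -q -m "initial commit"
    g branch feature
)
git clone -q --bare seed Source.git      # stands in for the existing repository
git init -q --bare NewRepo.git           # stands in for the empty target repository

# The import itself: mirror-clone every ref, then mirror-push it to the target.
git clone -q --mirror "$work/Source.git" staging.git
git -C staging.git push -q --mirror "$work/NewRepo.git"

# List branch names and commit ids so both sides can be compared.
list_branches() {
    git --git-dir="$1" for-each-ref --format='%(refname) %(objectname)' refs/heads
}
source_refs=$(list_branches "$work/Source.git")
target_refs=$(list_branches "$work/NewRepo.git")
echo "$target_refs"
```

With real repositories, the two local paths would be replaced by the clone URLs of the source and of the freshly created, empty target. Keep in mind that a mirror push overwrites whatever is in the target, so it only makes sense on a new repository.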
Getting out from the mess
Now let’s say you landed on this article because the situation already got out of control. You have the “same” code base spread over multiple repositories, not necessarily created the recommended way. How do you fix that?
Before we begin, let me say that this operation can be error prone, can make you lose work, and will cause a “service interruption” (even a short one) for your developers. This solution is provided with no warranty whatsoever. Also, make sure every developer accessing the repo has committed and pushed all changes before starting anything.
You are facing two main cases:
- Your repositories share a common commit tree at some point (they have been forked or imported). Git is going to “understand” some of what happened and will be able to help us.
- Your repositories don’t share a common tree (manual copy of files). You are going to have to “replay” the changes on the new fork manually, which is super error prone.
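If you’re not sure which of the two cases you are in, git can tell you. The sketch below is one way to check, using git merge-base after fetching the other repository. The three repositories (source, imported and copied) and the shares_history helper are hypothetical names created only for this demo.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: every repository here is a hypothetical stand-in.
work=$(mktemp -d)
cd "$work"
# Run git with a fixed identity so commits work on a clean machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# "source" stands in for the original repository.
git init -q source
(cd source && echo one > file.txt && g add file.txt && g commit -q -m "initial")

# "imported" keeps the history, as a fork or an import would.
git clone -q source imported
(cd imported && echo two >> file.txt && g commit -q -am "change after import")

# "copied" is a manual copy: same files, brand new history.
mkdir copied
cp source/file.txt copied/
(cd copied && git init -q && g add file.txt && g commit -q -m "manual copy")

# Succeeds when the two repositories have at least one commit in common.
shares_history() {
    # $1 = local repository, $2 = path or URL of the other repository
    (
        cd "$1" &&
        git fetch -q "$2" HEAD &&
        git merge-base HEAD FETCH_HEAD >/dev/null
    )
}

if shares_history imported "$work/source"; then imported_result=common; else imported_result=unrelated; fi
if shares_history copied "$work/source"; then copied_result=common; else copied_result=unrelated; fi

echo "imported vs source: $imported_result"
echo "copied vs source:   $copied_result"
```

Against real repositories, the second argument would be the clone URL of the other repository. The fetch only downloads objects and updates FETCH_HEAD; it doesn’t touch your branches.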
Common tree scenario
Let’s say I have the current structure.
The second repository, ImportedRepo, is an import of the source one, RepoSource. The source didn’t get any updates since the import, but the import did. Now I want to be able to propagate changes from ImportedRepo to the source, without having to handle merges and multiple remotes locally.
First, fork the RepoSource repo into ProjectB/ForkedRepo.
Then clone the ForkedRepo locally. After that run the following commands.
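The exact commands depend on your setup, so take the following as a minimal sketch and not as a definitive procedure. It assumes the idea is to add ImportedRepo as a temporary second remote (named imported here, a hypothetical name) and to push each of its branches to the fork. To stay runnable anywhere, the repositories are local stand-ins created in a temporary folder.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
# Run git with a fixed identity so commits work on a clean machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# --- Stand-in setup (hypothetical local repositories instead of real URLs) ---
git init -q seed
(cd seed && echo v1 > app.txt && g add app.txt && g commit -q -m "initial")
git clone -q --bare seed RepoSource.git
git clone -q --bare RepoSource.git ImportedRepo.git     # the import keeps history
git clone -q ImportedRepo.git dev
(cd dev && echo v2 >> app.txt && g commit -q -am "work done after the import" && g push -q origin HEAD)
git clone -q --bare RepoSource.git ForkedRepo.git       # the new fork

# --- The actual step, run from a fresh clone of the fork ---
git clone -q ForkedRepo.git local
cd local
git remote add imported "$work/ImportedRepo.git"
git fetch -q imported

# Push each imported branch to the fork under the same name.
# No --force: git rejects the push if it is not a fast-forward.
git for-each-ref --format='%(refname:strip=3)' refs/remotes/imported |
while read -r branch; do
    if [ "$branch" != "HEAD" ]; then
        git push -q origin "refs/remotes/imported/$branch:refs/heads/$branch"
    fi
done

# The temporary remote is no longer needed.
git remote remove imported

fork_head=$(git --git-dir="$work/ForkedRepo.git" rev-parse HEAD)
imported_head=$(git --git-dir="$work/ImportedRepo.git" rev-parse HEAD)
echo "fork:     $fork_head"
echo "imported: $imported_head"
```

Because the source didn’t change after the import, every push in this scenario is a plain fast-forward. If a push gets rejected, the source did receive commits and you’ll have to merge that branch by hand first.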
Make sure the branch policies, build definitions and release definitions are set up and up to date. You can even run a diff tool on your local machine, branch by branch, between the two repository folders, and you’re good to go!
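Any diff tool will do for that last check. As one possible sketch, assuming a diff that supports -r and -x (GNU and BSD diff both do), the folder comparison could look like this. The ImportedRepo and ForkedRepo folders and the same_content helper are hypothetical and created only for the demo.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
# Run git with a fixed identity so commits work on a clean machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# Two local clones standing in for the two repository folders.
git init -q seed
(cd seed && echo v1 > app.txt && g add app.txt && g commit -q -m "initial")
git clone -q seed ImportedRepo
git clone -q seed ForkedRepo

# Succeeds when both folders hold the same files, ignoring git metadata.
same_content() {
    diff -r -x .git "$1" "$2" >/dev/null
}

if same_content ImportedRepo ForkedRepo; then before=identical; else before=different; fi

# Simulate a change that only made it to one side.
echo "extra" >> ImportedRepo/app.txt
if same_content ImportedRepo ForkedRepo; then after=identical; else after=different; fi

echo "before the extra change: $before"
echo "after the extra change:  $after"
```

To compare branch by branch, check out the same branch in both folders before each run.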
For the other developers on the team, simply run this set of commands to re-map to the new forked repository.
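As a sketch of what that re-mapping can look like, assuming each developer’s clone uses the usual remote name origin: point origin to the fork, then fetch. The repositories below are hypothetical local stand-ins so the example can run anywhere.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
# Run git with a fixed identity so commits work on a clean machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# Stand-ins for the old repository, the new fork and a developer's clone.
git init -q seed
(cd seed && echo v1 > app.txt && g add app.txt && g commit -q -m "initial")
git clone -q --bare seed ImportedRepo.git
git clone -q --bare seed ForkedRepo.git
git clone -q ImportedRepo.git dev-clone     # existing clone of the old repository

# The re-mapping itself, run inside the developer's existing clone.
cd dev-clone
git remote set-url origin "$work/ForkedRepo.git"
git fetch -q --prune origin                 # refresh remote branches, drop stale ones

new_url=$(git remote get-url origin)
echo "origin now points to: $new_url"
```

With a real repository, the path passed to git remote set-url would be the clone URL of the forked repository. Local branches and uncommitted files are left untouched by these commands.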
My ask for the VSTS product team
Please make it easier to move Git repositories between team projects while keeping the fork link and everything.
I hope this post helped bring a bit of clarity on the best practices, and that it helped some of you fix the situation.