Facebook’s Git repository is a single 54GB animal that builds Facebook. It’s not that big - they haven’t seen the MS Windows one! Actually, I take that back - anecdotal evidence suggests that a large number of ex-MSFT engineers are now working at Facebook. I’ve just counted in my head a dozen people I know, and at least half have experienced the Windows codebase firsthand. Before I roll out my argument, I have to say that many of them are my good friends and awesome people. And let me also preface my post with the fact that there are many advantages of having a single repository for those that contribute to it, and I will not discuss those here. Instead, I am going to try to offer a perspective from my rather unique position.
Why me?
In 2000 I started a build environment project inside Microsoft called CoreXT. I managed it for a while until a large community had formed around it inside Microsoft. I left in 2004, but CoreXT continued to be a vibrant and active project, and, from what I hear, is very much alive and evolving almost 15 years later. CoreXT was a fork of the Windows build environment, born from the desire of hundreds of non-Windows groups to break free from depending on the heavy hand of a single giant repo and its infrastructure. Those groups included such massive codebases as Microsoft SQL Server and the entire MSN division, which had 54 products in my time. People continue working on CoreXT and putting CoreXT on their resumes. I want to think that everyone but Windows uses it now, but I also know that some organizations have rolled out their own versions of things, including the Developer Division, the team that ships Visual Studio. They dedicate a significant amount of effort to cloning, forking and otherwise patching Windows code, and even ship their own C runtime, which has been a source of headaches for two generations of developers. The Windows organization finds that being centralized is efficient, but they rarely discuss the cost to other dependent organizations.
Facebook’s central repository is the new Windows. The team that supports it is the new Microsoft Build Lab.
How bad is it? Let’s draw a parallel between a central repository and a network of experts. I used to work on some security-related code for Microsoft Billing. I would walk to building 41 and consult the world’s best experts on cryptography and key management. I could open the source code for the Windows SSPI and dig through it when my SSPI provider wasn’t cooperating and kept returning E_INVALIDARG. I cannot do this now, even if I wanted to. When I left, it felt like a huge void. I had to reinvent a lot of wheels in every single one of my subsequent jobs, but I have learned my lesson since. My current knowledge network consists exclusively of open-source contributors and their code, Q&A sites, Twitter, and personal connections. In such a network, knowledge isn’t hidden and is always evolving. People move between cities and companies, and even die, but their work is preserved for the public and projects get taken over by new contributors all the time. It’s a living, collective consciousness, a naturally self-selecting organism. It’s a powerful form of freedom.
If you start centralizing your development, you’re killing any type of collaboration with the outside world and discouraging such collaboration between your own teams. You’re improving efficiency by creating shorter paths, but you also create lock-in and leave no room for healthy competition among multiple solutions to the same problem. You’re probably making your own workflow more efficient, but you’re subtracting from everybody else’s. Eventually, as you grow big enough, you start subtracting from your entire organization’s capacity.
Kill your giant central repo. Distribute everything. Small is nimble. Small is creative. Fast forward.