A couple of moments ago, I finished reading the article by Rob O'Leary about the pervasive data collection done by Visual Studio Code. Now that I'm no longer an employee at Gitpod, I'm finally able to author a blog post freely about something that has been troubling me for quite some time...
Whilst Visual Studio Code is "open-source" (as per the OSD), the value-add which transforms the editor into anything of value ("what people actually refer to when they talk about using VSCode") is far from open and full of intentionally designed minefields that often make using Visual Studio Code in any way other than what Microsoft desires legally risky...
In this blog post, we explore the ecosystem of open-source forks, revisit the story so far of how Microsoft has been transforming from products to services, go deep into why the Visual Studio Code ecosystem is designed to fracture and the legal implications of this design, then discuss the future problems the software development ecosystem faces if our industry continues on its current path...
By the end of this blog post, I hope more folks understand that using anything other than the official distribution of Visual Studio Code provided by Microsoft (or GitHub via Codespaces) makes it easy to expose yourself or your company to legal risks similar to incorrectly using Docker Desktop or the Oracle JDK...
visual studio code is now seven years old
Visual Studio Code was released 7 years ago and is fast becoming the de facto standard editor that people use when doing software development. Sure, there's also the JetBrains product suite, Emacs, Neovim, Xcode and Visual Studio [for Windows and Mac], but VSCode is likely installed on your computer right now.
The source code has been released by Microsoft under the open-source MIT license, but the product available for download (Visual Studio Code) is licensed under a proprietary license. This small distinction matters a lot and is the primary mechanism that Microsoft uses to fork open-source communities.
This comment from a Visual Studio Code maintainer explains the process of how Microsoft generates its builds:
When we [Microsoft] build Visual Studio Code, we clone the vscode repository, lay down a customized product.json that has Microsoft specific functionality (telemetry, gallery, logo, etc.), and then produce a build that we release under our license.
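The process the maintainer describes amounts to overlaying a vendor-specific product.json on top of the one in the MIT repository. Below is a minimal sketch of that idea, not Microsoft's actual build tooling: the key names, URLs and the `apply_overlay` helper are all assumptions made up for illustration.

```python
import copy
import json

# Hypothetical stand-in for the product.json in the MIT-licensed repository.
# Key names are illustrative and not copied from the real file.
OSS_PRODUCT = {
    "nameShort": "Code - OSS",
    "licenseName": "MIT",
}

# Hypothetical vendor overlay covering the kinds of things the maintainer
# lists (telemetry, gallery, branding). All values are placeholders.
VENDOR_OVERLAY = {
    "nameShort": "Vendor Code",
    "licenseName": "Proprietary",
    "extensionsGallery": {"serviceUrl": "https://gallery.example.invalid/api"},
    "telemetryEndpoint": "https://telemetry.example.invalid/collect",
}


def apply_overlay(base: dict, overlay: dict) -> dict:
    """Return a new dict in which overlay keys replace or extend base keys."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# A build with no overlay keeps only what the MIT repository ships.
clean_build = apply_overlay(OSS_PRODUCT, {})
# A build with the overlay gains the gallery, telemetry and branding entries.
vendor_build = apply_overlay(OSS_PRODUCT, VENDOR_OVERLAY)

if __name__ == "__main__":
    print(json.dumps(clean_build, indent=2))
    print(json.dumps(vendor_build, indent=2))
```

The point of the sketch is that the same source tree yields two different products depending on which configuration is laid down before the build.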
In the broader community, there are two leading distributions based on the MIT source code: vscodium & openvscodeserver.
vscodium is an oss desktop distribution
Members of the free software community became concerned by the usage of the proprietary license and launched the VSCodium project as a community-driven, freely-licensed desktop distribution of Visual Studio Code in binary form. The project automatically follows the upstream open-source (MIT) project and generates binary builds without the telemetry found in the official releases.
The VSCodium project follows the same process outlined by the Visual Studio Code maintainer:
When you [VSCodium] clone and build from the vscode repo, none of these endpoints are configured in the default product.json. Therefore, you generate a “clean” build, without the Microsoft customizations, which is by default licensed under the MIT license
Rob is correct with the following statement from his blog post on telemetry:
However, VSCodium can’t shut out all the data collection as it is the same codebase. And since extensions act independently with regard to data collection, you still need to be mindful of what extensions you install.
VSCodium does an extremely good job at disabling data collection, but because it is not licensed by Microsoft under the proprietary license, VSCodium is not able to connect to the Microsoft Visual Studio Code Marketplace and suffers from the "ecosystem fracture by design" problem...
openvscodeserver is an oss server distribution
OpenVSCodeServer is similar to VSCodium in that it is also not allowed to connect to the Microsoft Visual Studio Code Marketplace and suffers from the same "ecosystem fracture by design" problem. The project is a company-driven, freely-licensed server distribution of Visual Studio Code in binary form that is the backbone of Gitpod. The project is primarily maintained by four Gitpod employees (Anton / Filip / Jean Pierre / Huiwen) and automatically follows the upstream open-source (MIT) project. The distribution has some minor overlay customisations in the gp-code/main branch and also does not have the telemetry found in the official releases.
IDEs that are not subscriptions are a dying breed
Circa 9 years ago, Microsoft started an internal transformation in how they delivered software to customers. Instead of directly (ie. in-house) employing quality assurance teams who were dedicated to testing software builds, Microsoft switched to a model based on sprint-based development work and rolling releases with feedback from telemetry data that is gathered from Insider Builds of Microsoft's software.
At the same time, wider organisational changes took place in the form of functional restructures, which transitioned Microsoft into becoming a services company. Their Azure cloud-computing offering during this period has grown into a legitimate challenger to Amazon's cloud-computing offering called AWS.
The biggest side effect of this change for consumers was that instead of delivering installable products that could be run on-premises, Microsoft, in true Microsoft form of moving ever so slowly and doing it over a generation of people so as not to spook them, has been transitioning their customer base into consumers of services offered by Microsoft.
This same transition has been happening across the developer tooling space as a whole. IDEs that are not subscriptions are a dying breed unless you make a ton of money from something else (ie. Apple and the AppStore, which funds the development of Xcode).
So why am I bringing this all up? It's because Visual Studio Code is a ramp to move the developer tooling ecosystem towards an end-to-end consumable services model of software development tools, and GitHub Codespaces is a white label of an existing service called Visual Studio Online aka Microsoft Dev Box aka Microsoft Azure DevTest Labs.
GitHub is a white label for existing Microsoft tech
Microsoft acquired GitHub circa 2018 and in 2019 Microsoft released the Visual Studio Online product that included a component for hosting your own "codespace" locally. Since then, everything has moved to GitHub, including the team that made Codespaces, and that component is used on the servers GitHub deploys to. Thus GitHub Codespaces is a devdiv project that now belongs to GitHub.
There have been numerous restructures within GitHub, but the most notable is the one that occurred on the day that Nat Friedman retired as the CEO of GitHub. On that day, when everyone was focusing on Nat's retirement, Scott Gu sent this email internally within Microsoft announcing the restructuring of how GitHub reports to Microsoft...
Julia Liuson promoted to President of the Microsoft Developer Division, which now includes GitHub
The mission of the Microsoft Developer Division is to earn the trust and love of developers across all languages and platforms and make them successful as they build the applications of the future. DevDiv today includes our developer tools and services including Visual Studio, Visual Studio Code, .NET and C#, TypeScript, and the OpenJDK. Our Azure Developer SDKs, as well as our Azure Application Development PaaS and Serverless offerings (including our Azure App Services, Functions, Logic Apps, API Management, Dapr, Redis Cache, Spring Cloud services, etc.) are also part of this organization.
I'm very pleased to announce the promotion of Julia Liuson to President, Microsoft Developer Division. As part of today's changes, Thomas Dohmke, CEO of GitHub, will report to Julia going forward, as will Julia's existing DevDiv reports.
Julia has been instrumental in Microsoft's adoption of open source, and in the transformation of Microsoft's developer strategy. As the leader of DevDiv, she helped guide the open sourcing of .NET (which now runs on every major OS platform), as well as the creation and open sourcing of Visual Studio Code (now the most popular development tool in the world). She initiated our deep engagement with the Python community, including hiring Python creator Guido van Rossum to Microsoft, and her team now delivers the most widely used Python developer tooling in the world (with VS and VS Code) as well as delivers runtime performance for the broader Python community. She started the OpenJDK effort at Microsoft, which is now used broadly to run Java workloads on Azure. And over the last 9 months she has led our Azure Application Developer PaaS and Serverless offerings and has helped drive to make these services great for developers using all languages, tools, and platforms.
Under Julia's leadership, the Developer Division team has undergone a significant cultural transformation and is guided by consistent cultural values: diversity & inclusion, customer obsession, data-driven, and quality-driven. The pervasiveness of these culture attributes is evident through the success of products like Visual Studio and Visual Studio Code, which have experienced more than 16x usage growth since 2014 (and are now used by the majority of developers in the world).
I deeply admire how committed Julia is to team culture, mentorship, and how she helps generate opportunities for others to succeed. She is an avid supporter of MakeCode (which is also built by her team) as an investment to help kids learn programming and pursue computer science in early education. Julia was one of the first women at Microsoft to be promoted to Corporate Vice President of Engineering leading development teams at Microsoft, and she is a mentor and sponsor to women and men across Microsoft today. She received the Asian American Executive of the Year award in 2013 and was inducted into the Women in Technology International Hall of Fame in 2019.
Please join me in thanking Julia for the outstanding leadership she provides, and in congratulating her on the well-deserved promotion and expanded remit.
Instead of GitHub reporting directly to Scott Guthrie, GitHub now reports to the person who looks after numerous products within the developer division. Julia Liuson was an interesting choice because, weeks before the promotion, she was the person who implemented last-minute changes that fractured the .NET community...
Sources at Microsoft, speaking on condition of anonymity, told The Verge that the last-minute change was made by Julia Liuson, the head of Microsoft’s developer division, and was a business-focused move.
an ecosystem that is designed to fracture
I used to think GitHub Codespaces would help popularise Gitpod but now realize it is the other way around. Gitpod is currently permitted to exist in the Visual Studio Code ecosystem to popularise GitHub Codespaces, and Microsoft can step in at any moment to create legal crises that strategically divide the market from a business perspective because, like Apple and their AppStore, it is their ecosystem and they are in absolute control.
Meanwhile, from a product perspective, people will try out Gitpod and, unfortunately, experience papercuts wherever the product falls short of how users expect Visual Studio Code to function. The developer experience of Gitpod can never match the seamless developer experience of Visual Studio Code or GitHub Codespaces because the Visual Studio Code open-source source code is a Venus flytrap that is designed to lure people in and fracture...
but it would be wrong to single out just Gitpod here. Any company (Gitpod, GitLab, Datacoves, OpenBB, Foam, et al) that adopts the Visual Studio Code open-source source code and attempts to compete with Microsoft or GitHub will face the problems outlined above and will be unable to legally offer services for the following programming languages using the functionality that Visual Studio Code users expect and have become accustomed to unless they develop their own tooling (which, as of this blog post, none have done):
- Microsoft .NET C# (fsharp is completely open and does not have these issues)
- Python (general purpose and data science markets)
- Project Jupyter (as in nearly the entirety of the data science market)
- C or C++ (general purpose, enterprise and industrial hardware markets)
- and I suspect 🔜 Java (general purpose, enterprise and data science) will be next once the Microsoft tooling catches up with the tooling offered by Red Hat.
Microsoft can easily fork open-source communities by changing towards proprietary defaults ("strategically divide the market") as Microsoft has already done twice so far. The way Microsoft forks open-source communities is by releasing Visual Studio Code extension updates that make their proprietary offering the default once they have managed to capture enough adoption...
While the company isn't forcing users to switch to its new proprietary language server -- pointing to an open source alternative -- it's the new default.
The move affects millions of developers, as the Python extension is by far the most popular tool in the VS Code Marketplace, having been downloaded nearly 50 million times, about twice as much as the next most popular extension: Jupyter.
The "switching of defaults" that occurred in the Python community is taking place right now in the .NET community...
Even if Gitpod, GitLab, Datacoves, OpenBB, Foam, et al were to develop "Open.NET" or similar tooling alternatives to the proprietary extension offerings created by Microsoft to enforce their commercial strategy, users will experience friction in the form of having to wire in different product-specific configurations on a per-platform basis and then dealing with the headaches of user support/training related to topics of how the "official" ms-dotnettools.csharp functions vs the open alternative (if it is ever built) and topics of feature/configuration disparity.
1. desktop configuration
"extensions": [ "ms-dotnettools.csharp" ],
2. Gitpod web configuration
- // some open tooling that doesn't exist yet
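The per-platform split above can be sketched as a small lookup. This is only an illustration under stated assumptions: the platform labels and the `extensions_for` and `render_config` helpers are hypothetical, and the open alternative is left as `None` because, as noted above, no such tooling exists yet.

```python
import json

# The official extension ID used in the configuration above.
OFFICIAL_CSHARP = "ms-dotnettools.csharp"
# Hypothetical placeholder: there is no open alternative to point at yet.
OPEN_CSHARP_ALTERNATIVE = None

# Hypothetical labels for builds distributed by Microsoft or GitHub.
OFFICIAL_PLATFORMS = ("desktop", "codespaces")


def extensions_for(platform: str) -> list:
    """Pick the C# extension a team could list for the given platform."""
    if platform in OFFICIAL_PLATFORMS:
        return [OFFICIAL_CSHARP]
    # Any other label is treated as a build of the MIT source.
    return [OPEN_CSHARP_ALTERNATIVE] if OPEN_CSHARP_ALTERNATIVE else []


def render_config(platform: str) -> str:
    """Render the 'extensions' fragment for one platform as JSON text."""
    return json.dumps({"extensions": extensions_for(platform)})


if __name__ == "__main__":
    for name in ("desktop", "codespaces", "gitpod"):
        print(name, render_config(name))
```

Every team that targets more than one platform ends up maintaining some version of this branching, whether in code or in duplicated configuration files.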
Whereas if a user stays within the official ecosystem created by Microsoft via the desktop edition of Visual Studio Code produced by Microsoft, then the same configuration that works on the desktop edition just works when someone or their team goes to try out or adopt GitHub Codespaces.
vs a single source of truth configuration that just works
"extensions": [ "ms-dotnettools.csharp" ],
If Gitpod, GitLab, Datacoves, OpenBB, Foam, et al were to attempt to bypass these restrictions by offering ms-dotnettools.csharp post move to LSP Host via their service, then they would receive a very nasty legal email from Microsoft's lawyers.
The same is true for customers of these competing cloud development environments: if they were to manually install these extensions into these platforms, they would be in breach, as the license of these extensions is very clear that they are only licensed for installation in official builds distributed by Microsoft:
You may install and use any number of copies of the software only with Microsoft Visual Studio, Visual Studio for Mac, Visual Studio Code, Azure DevOps, Team Foundation Server, and successor Microsoft products and services to develop and test your applications - https://marketplace.visualstudio.com/items/ms-vscode.cpptools/license
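To make the exposure concrete, here is a rough sketch of the kind of check a team could run over its workspace configuration. Everything in it is an assumption for illustration: the restricted set is hand-written from the two extensions discussed in this post and is not an authoritative list, the build labels are hypothetical, and each extension's own license remains the only real source of truth.

```python
# Hand-maintained for illustration only: the extensions this post discusses
# as being licensed for use with Microsoft products. Not an authoritative list.
RESTRICTED_TO_MICROSOFT_PRODUCTS = {
    "ms-vscode.cpptools",
    "ms-dotnettools.csharp",
}

# Hypothetical labels for the official builds described in this post.
OFFICIAL_BUILDS = {"visual-studio-code", "github-codespaces"}


def flag_license_risks(build: str, extensions: list) -> list:
    """Return the listed extensions that the given build should not run."""
    if build in OFFICIAL_BUILDS:
        return []
    return sorted(
        ext for ext in extensions
        if ext.lower() in RESTRICTED_TO_MICROSOFT_PRODUCTS
    )


if __name__ == "__main__":
    wanted = ["ms-vscode.cpptools", "redhat.java"]
    print(flag_license_risks("visual-studio-code", wanted))
    print(flag_license_risks("vscodium", wanted))
```

A check like this only surfaces the problem; it does not resolve it, because there is nothing open to swap the flagged extensions for.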
okay, so how do we fix this?
The future of software development tooling that is being built is closed as fuck, and people seem to be okay with it because select components meet the OSI definition whilst missing the bigger picture that the compositional graph of components does not.
Open-source was created as a financial weapon to destroy proprietary on-prem software and to ensure file formats (eg. msword doc vs msword docx) remained open for mixing between different pieces of on-prem software. Open-source as a financial weapon is also why making money from open-source is so god damn hard.
Maybe we need a new movement (or to revisit past ideas from the 70s) that focuses on ensuring openness and the freedoms of computing (😉) to combat proprietary SaaS offerings? idk.
When I see people arguing over open-source vs non-opensource in 2022, I feel like people are completely missing the bigger and more pressing issues at hand, like how the Visual Studio Code ecosystem has been designed...
The fracturing ecosystem problem is one of the reasons I created Gitpod's Open-Source Sustainability Fund. Paying for resources that are being consumed broadens the list of people who can do open-source. Additionally, money enables open-source maintainers to buy services and outsource activities that do not bring them joy.
In the very short period of time (1.5yrs) I was at Gitpod, over $32,000 USD was distributed to maintainers of language server tooling in the open-source community.
I hope other companies follow Gitpod's lead and support the high achievers that our digital society is built upon by enabling them to become independent artists who build truly open ecosystems.
Gitpod has already partially resolved the marketplace problem for the Visual Studio Code open-source ecosystem by creating the OpenVSX project and gifting it to the Eclipse Foundation, but the biggest challenge for Gitpod, GitLab, Datacoves, OpenBB, Foam, et al lies ahead: developing open language tooling for each community where Microsoft has forked the communities over to proprietary language servers...
Thanks for reading; please discuss on the tweet below 👇 🧡
edit: 31st of August 2022 - Green0Photon from /r/programming has a good summary:
In short, this is what Microsoft did:
- Created VSCode and made it the best and open-source IDE that everyone would jump to first.
- Make a proprietary free distribution of it, along with proprietary free extensions for the various languages.
- Make those extensions the best version possible and slow down focus on open source ones, often deprecating them.
- Now you have to use the closed form of VSCode to have the best experience by quite a bit.
- Everyone else using VSCode as a platform can't keep up because Microsoft fractured their community -- and your VSCode product is now just an ad for a similar Microsoft product that doesn't have all the papercuts.
edit: 16th December 2022
GitLab has launched their own offering based on VSCode (MIT), which suffers from the same problems as everyone else who uses VSCode (MIT): it is designed in a way that makes your product a walking, talking advertisement for GitHub Codespaces...
edit: 13th October 2023
Google's soon-to-be-launched "Project IDX" offering is based on VSCode (MIT), which suffers from the problems above. If you want to develop .NET and Python and expect that the Visual Studio Code LSPs will work there - forget it. Not possible, not legal.