Microsoft/GitHub: Copilot – Open Source / Piracy

Legal Opinion

Software Freedom Conservancy

On the filing of the Class Action Law Suit over GitHub’s Copilot

Link

November 4, 2022

Many of you are inquiring about a lawsuit filed yesterday afternoon by two “J. Doe” Plaintiffs regarding the serious and ongoing GitHub Copilot problem, which we have been working on for the last 18 months. This matter is dire and important, but it includes many complex issues that intersect FOSS license compliance with moral questions of software freedom and the future of machine learning in human endeavor. Complex issues need careful, diligent, and community-oriented consideration and response.

The attorneys in this newly filed case — Matthew Butterick and the Joseph Saveri Law Firm — did reach out to us, and we’ve been in discussions with them as to the key issues of copyleft policy and concerns about problematic interpretations of copyleft that are inherent in this type of novel litigation. These attorneys expressed to us that they had Plaintiffs who wanted to move very quickly, and we certainly understand their frustration.

We pointed these attorneys to our Principles of Community-Oriented GPL Enforcement, which we co-drafted with the Free Software Foundation and which has been endorsed by the Linux Netfilter Team and many others. One of those principles is particularly relevant in this situation: Community-oriented enforcement must never prioritize financial gain.

Now that the lawsuit has been filed, we call on the two J. Doe Plaintiffs (and any Plaintiffs whom these attorneys recruit hereafter) to also endorse these principles. We do share your frustration and anger at Microsoft’s GitHub’s continued infringement, and at Microsoft’s and GitHub’s refusal to work with the community regarding their aggressive anti-FOSS activity and unprecedented license violations. However, FOSS licensing is not primarily about business models or financial recovery. GitHub’s actions with Copilot are offensive primarily because they seek to undermine the system of copyleft that is specifically designed to assure that users, developers, and consumers all have equal rights. We wrote the Principles to help guide ourselves and others through the complex and thorny policy issues that always come up in FOSS licensing violations. None of us knows for certain how this case will proceed, but we ask the Plaintiffs now to stand up for the principles of FOSS licensing ahead of time, as we (and the other aforementioned organizations) have done.

We do note that this action is a class action. That means the lawyers here are seeking to bring this action not just for the two J. Does; their filing of this suit is a request for all of us to trust them to bring this action for everyone. Given that nearly every line of FOSS ever written is likely in the Copilot training set, it’s quite likely that nearly everyone reading this message will find themselves to be part of the class if and when the Court certifies the class. As such, every one of you, perhaps in the far future or perhaps very soon, will have to make a decision about whether or not to join this action. We, too, at SFC are making that decision right now.

For the avoidance of any doubt, we are not commenting on the legal claims in the case; we know there are many strong claims to pursue on this matter. However, as a public charity (focused on software freedoms and rights in the public good), we must carefully consider the policy implications of this suit, and fully explore potential unintended consequences of a victory (or defeat) by the Plaintiffs. A victory in this particular case may ultimately be a loss for FOSS licensing (for example) if the remedies fail to correct the bad behavior.

Please do subscribe to our blog and news feed to be sure you see future announcements. If you’d like to have interactive community discussion about the principles of copyleft enforcement, please also join the discussion on our principles-discuss mailing list. If you’d like to have an interactive discussion about the moral and ethical issues and concerns about AI-Assisted Software Programming Systems (such as Copilot), please join our ai-assist mailing list.

Finally, we take this opportunity to again ask Microsoft’s GitHub to start respecting FOSS licenses (copyleft in particular), cooperate with the community, and retract their incorrect claim that their behavior is “fair use”.

Stories

Bleeping Computer

Bill Toulas

Microsoft sued for open-source piracy through GitHub Copilot

Link

Programmer and lawyer Matthew Butterick has sued Microsoft, GitHub, and OpenAI, alleging that GitHub’s Copilot violates the terms of open-source licenses and infringes the rights of programmers.

GitHub Copilot, released in June 2022, is an AI-based programming aid that uses OpenAI Codex to generate source code and function suggestions in real time in Visual Studio.

The tool was trained with machine learning using billions of lines of code from public repositories and can transform natural language into code snippets across dozens of programming languages.
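For readers who have not used it, the interaction works roughly like this: the developer types a comment or a function signature, and Copilot proposes a completion inline that can be accepted with a keystroke. The snippet below is an invented illustration of that workflow, not an actual Copilot output:

    # The developer types a signature and docstring like this...
    def is_palindrome(text: str) -> bool:
        """Return True if text reads the same forwards and backwards."""
        # ...and the assistant suggests a body inline. A plausible
        # suggestion might look like the following two lines:
        normalized = "".join(ch.lower() for ch in text if ch.isalnum())
        return normalized == normalized[::-1]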

Clipping authors out

While Copilot can speed up the process of writing code and ease software development, its use of public open-source code has led experts to worry that it violates the attribution requirements and other conditions of the underlying licenses.

Open-source licenses, like the GPL, Apache, and MIT licenses, require attribution of the author’s name and preservation of the relevant copyright and license notices.

However, Copilot strips this information out; even when the suggested snippets are longer than 150 characters and taken directly from the training set, no attribution is given.
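To make the condition concrete: the MIT license, for instance, requires that “the above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.” A generic sketch of the kind of file header that gets dropped when a snippet is suggested bare (the author name is invented):

    # Copyright (c) 2022 Jane Author
    #
    # Permission is hereby granted, free of charge, to any person obtaining
    # a copy of this software... (the standard MIT permission text continues)
    #
    # The above copyright notice and this permission notice shall be
    # included in all copies or substantial portions of the Software.

    def some_function():
        ...  # the code itself, which Copilot may reproduce without the header above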

Some programmers have gone as far as to call this “open-source laundering,” and the legal implications of the approach became apparent soon after the tool’s launch.

“It appears Microsoft is profiting from others’ work by disregarding the conditions of the underlying open-source licenses and other legal requirements,” comments the Joseph Saveri Law Firm, which is representing Butterick in the litigation.

To make matters worse, people have reported cases of Copilot leaking secrets, such as API keys, that were published on public repositories by mistake and were thus included in the training set.
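The failure mode is easy to picture: a credential hard-coded into a file and pushed to a public repository ends up in the training data, and the model can later reproduce it verbatim as a suggestion. A made-up sketch of the kind of code involved (the key and endpoint are placeholders, not real values):

    import requests

    # Hypothetical: a secret committed to a public repository by mistake.
    # Once in the training data, a model may later emit it verbatim.
    API_KEY = "sk-EXAMPLE-0000000000000000"  # placeholder, not a real key

    response = requests.get(
        "https://api.example.com/v1/data",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )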

Apart from the license violations, Butterick also alleges that the tool violates the following:

  1. GitHub’s terms of service and privacy policies
  2. DMCA 1202, which forbids the removal of copyright-management information
  3. California Consumer Privacy Act
  4. Other laws giving rise to related legal claims

The complaint was submitted to the U.S. District Court for the Northern District of California, seeking statutory damages of $9,000,000,000.

“Each time Copilot provides an unlawful Output it violates Section 1202 three times (distributing the Licensed Materials without: (1) attribution, (2) copyright notice, and (3) License Terms),” reads the complaint.

“So, if each user receives just one Output that violates Section 1202 throughout their time using Copilot (up to fifteen months for the earliest adopters), then GitHub and OpenAI have violated the DMCA 3,600,000 times. At minimum statutory damages of $2500 per violation, that translates to $9,000,000,000.”
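The arithmetic implied by that passage can be reconstructed as follows; note that the user count of 1,200,000 is an inference from the quoted figures (3,600,000 violations at three per Output), not a number stated in this article:

    # Reconstructing the complaint's damages arithmetic, as quoted above.
    users = 1_200_000              # inferred: 3,600,000 violations / 3 per Output
    violations_per_output = 3      # no attribution, no copyright notice, no License Terms
    outputs_per_user = 1           # "just one Output" per user, per the quote
    min_statutory_damages = 2_500  # minimum per Section 1202 violation, in USD

    total_violations = users * outputs_per_user * violations_per_output
    total_damages = total_violations * min_statutory_damages

    print(f"{total_violations:,}")   # 3,600,000
    print(f"${total_damages:,}")     # $9,000,000,000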


Harming open-source

Butterick also touched on another subject in a blog post earlier in October, discussing the damage that Copilot could bring to open-source communities.

The programmer argued that the incentive for open-source contributions and collaboration is essentially removed by offering people code snippets and never telling them who created the code they are using.

“Microsoft is creating a new walled garden that will inhibit programmers from discovering traditional open-source communities,” writes Butterick.

“Over time, this process will starve these communities. User attention and engagement will be shifted […] away from the open-source projects themselves—away from their source repos, their issue trackers, their mailing lists, their discussion boards.”

Butterick fears that, given enough time, Copilot will cause open-source communities to decline and, by extension, the quality of the code in the training data to diminish.

Tweets

Chris Green (Parody) – @ChrisGr93091552

Explored GitHub Copilot: the suggestions break line boundaries and add nonsensical lines.

Summary

We live in a time when office rumors hold sway.

We live in a time when short-term gains receive the nod.

We live in a time when Instagram pics, tweets, and songs about how well I am living tease us.

I get it; we all need distractions.

Referenced Work

  1. Software Freedom Conservancy
    • On the filing of the Class Action Law Suit over GitHub’s Copilot
      Link
  2. Bleeping Computer (Bill Toulas)
    • Microsoft sued for open-source piracy through GitHub Copilot
      Link
  3. The Register
    • OpenAI, Microsoft, GitHub hit with lawsuit over Copilot
      Link
