
Copilot Lawyers Checking Claims Against Other AI Companies

The attorneys who filed the GitHub Copilot lawsuit say they receive messages every day from creators concerned about how their work is being used by AI companies, according to lead attorney Matthew Butterick.

The New Stack asked whether the attorneys plan to add other names to the lawsuit against GitHub, Microsoft (as GitHub’s owner), and OpenAI for using open source code from GitHub to train OpenAI Codex, which powers Copilot, a code generation tool for programmers.

“We are investigating all of these allegations,” Butterick told The New Stack via email last week. “It is appalling that investors and profit-seeking AI companies are already adopting a strategy of massive intellectual property infringement. It will not work. There will be many more cases challenging these practices.”

OpenAI plans to offer its Codex models through an API and maintains a private beta waitlist for companies that want to build offerings beyond Copilot. There are also other AI-based code completion tools, including AWS’s CodeWhisperer, whose FAQ notes that its ML models are “trained on various data sources, including Amazon and open source code.” Amazon did not return an email request for comment about CodeWhisperer. Similarly, Visual Studio’s AI-assisted development tool, Visual Studio IntelliCode, makes recommendations “based on thousands of open-source projects on GitHub, each with over 100 stars.”
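For context, Codex was reachable during the private beta through OpenAI’s standard completions API. The snippet below is a minimal sketch of what such a call looked like with the pre-1.0 openai Python package; the model name, prompt, and parameters are illustrative assumptions, not details taken from the lawsuit or any company’s documentation.

    import openai

    openai.api_key = "sk-..."  # placeholder key; the Codex models sat behind a private beta

    # Ask a Codex model to complete a short Python function from a comment prompt.
    response = openai.Completion.create(
        model="code-davinci-002",   # Codex model name used during the beta period
        prompt="# Return the nth Fibonacci number\ndef fib(n):",
        max_tokens=64,
        temperature=0,
    )

    print(response["choices"][0]["text"])

Products built on top of Codex, from editor plugins to terminal tools, essentially wrap this kind of completion call in their own interfaces.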

We asked Butterick – who is also a programmer – if there was a way to train AI without violating open source licenses.

“Of course – read the license and do what it says!” Butterick said. “AI companies can do this – they just prefer not to, because it cuts into their profit margins. More broadly, AI companies will need to involve creators in the process to make it fair.”
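To make that point concrete, license-aware data collection is largely a filtering and bookkeeping problem. The sketch below is hypothetical: the corpus format, field names, and allow-list are assumptions for illustration, not a description of how any AI company actually assembles its training data.

    # Hypothetical sketch: keep only files from repositories whose license is on an
    # explicit allow-list, and record attribution so it can be preserved downstream.
    ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}  # example allow-list

    def filter_training_corpus(corpus):
        """corpus: iterable of dicts such as {"path": ..., "license": ..., "author": ...}."""
        kept, attributions = [], []
        for item in corpus:
            if item.get("license") in ALLOWED_LICENSES:
                kept.append(item)
                # Even permissive licenses generally require keeping copyright and
                # license notices, so attribution is tracked alongside the code.
                attributions.append((item["path"], item.get("author"), item["license"]))
        return kept, attributions

Even this toy version shows that the hard part is honoring conditions like attribution after filtering, which is the kind of obligation the lawsuit alleges was skipped.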

An OpenAI blog post identifies other AI-based offerings that leverage OpenAI Codex but are not mentioned in the lawsuit:

  • Pygma uses Codex to transform Figma designs into different front-end frameworks and match the developer’s coding style and preferences. “Codex enables Pygma to help developers instantly perform tasks that could have previously taken hours,” OpenAI wrote in its March 4 blog post.
  • Replit uses Codex “to describe what a selection of code does in simple language so that anyone can get a quality explanation and learning tools,” including allowing users to highlight selections of code and get an explanation of its features. Replit also recently started offering an AI code completion service called Ghostwriter, which it says uses “large language models trained on publicly available code and tuned by Replit.” It is not clear if Ghostwriter uses OpenAI Codex.
  • Warp uses Codex to allow users to run a natural language query directly from the terminal to search for a terminal command (a rough sketch of this kind of lookup follows this list).
  • Machinet helps professional Java developers write quality code by using Codex to generate smart unit test templates.
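The natural-language-to-command pattern mentioned above is a thin layer over the same completions endpoint. The following is a rough, hypothetical sketch: the prompt format, model name, and stop sequence are illustrative assumptions, not taken from any vendor’s implementation.

    import openai

    def suggest_command(request):
        """Hypothetical helper: ask a Codex model to turn an English request into one shell command."""
        prompt = (
            "# Translate the request into a single bash command.\n"
            f"# Request: {request}\n"
            "# Command:\n"
        )
        response = openai.Completion.create(
            model="code-davinci-002",  # illustrative Codex-era model name
            prompt=prompt,
            max_tokens=40,
            temperature=0,
            stop=["\n"],
        )
        return response["choices"][0]["text"].strip()

    # Example: suggest_command("find all files larger than 100 MB") might return
    # something like: find . -type f -size +100M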

Music streaming has evolved

Butterick compared the lawsuit, filed Thursday, Nov. 3, to the evolution of music streaming services.

“With music streaming, we started with Napster, which was obviously illegal, and then we moved on to licensed services like Spotify and Apple Music,” he said. “This evolution is also going to happen with AI systems.”

We also asked what would happen to Copilot if it is found to have violated open source licenses by training on GitHub code.

“It depends on the defendants,” Butterick said. “But we need to be far more concerned about the massive violation of creators’ rights than about ‘saving’ a wealthy corporation’s AI product. Copilot is a parasite and an existential threat to open source. Although given Microsoft’s longstanding competitive antagonism to open source, perhaps we shouldn’t be surprised.”

Butterick reactivated his California bar membership in June to join class action litigators Joseph Saveri, Cadio Zirpoli and Travis Manfredi of the Joseph Saveri Law Firm in the federal case. The 52-page complaint, plus an appendix and an exhibit, was posted online by the attorneys and lists two anonymous plaintiffs, one from California and the other from Illinois.

What Butterick wants developers to know

Butterick wanted developers to know that the legal team is interested in hearing from “all open source stakeholders.”

“All of us working on the case are optimistic about the future of AI. But AI has to be fair and ethical for everyone,” he said.

Some open source stakeholders may disagree with this view of what is fair play with open source – or any – code. For example, Florin Pop, a frontend developer, recently asked on Twitter whether it was okay to copy code. The majority of those who responded drew an ethical distinction, saying it was okay as long as developers used the code to learn how it works, rather than just cutting and pasting it. Others added that the license still matters and should be considered when copying code.

Remy Sanchez (@Xowap) of Madrid, Spain, the CTO of digital company With, tweeted specifically about the US legal action.

“This Copilot lawsuit doesn’t make much sense to me,” Sanchez said in a tweet. “The code does not have much value in itself. Good code is as boring as it gets. What matters is the execution, the purpose of what you do.”

Ultimately, what matters may be up to a jury, not the developers, to decide.
