Ollama violating llama.cpp license for over a year
https://github.com/ollama/ollama/blob/main/llama/llama.cpp/L... says:
"The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."
The issue submitter claims:
The terms of the MIT license require that it distribute the copyright notice in both source and binary form.
But: a) that doesn't seem to be in the license text as far as I can see; b) I see no evidence that upstream arranged to ship any notice in their binaries, so I don't see how it's reasonable to expect downstreams to do it; and c) in the distribution world (Debian, etc.) that takes great care about license compliance, patching upstreams to include copyright notices in binaries isn't a thing. It's not the norm, and this is considered acceptable in our ecosystem.
Maybe I'm missing something, but the issue linked does not make the case that there's anything unacceptable going on here.
b) I see no evidence that upstream arranged to ship any notice in their binaries, so I don't see how it's reasonable to expect downstreams to do it
Downstream is not in compliance. The fact that upstream has made that compliance hard/impossible is not relevant to the fact that downstream is infringing.
There is a whole industry of tools around license compliance (FossID, FOSSA, Black Duck, Snyk), as well as open source projects (FOSSology, ScanCode, OSS Review Toolkit).
Re: Debian, they have copyright files in their packages that are manually curated by Debian Developers and should include all those license texts and copyright notices.
I was concerned about the implication (or so I thought) that a binary executable should provide the required documentation (e.g. via --version or similar). You are thinking about the text being included as part of a binary redistribution. That did not occur to me, because to me, GitHub issues refer to sources, not binary redistributions.
But of course GitHub does have a Releases page. If those binary redistributions do not contain the license text, then I accept that's something that Debian does do, and is the norm in our ecosystem.
But as other commenters have said, it's not completely clear that this is actually a violation of the license, since https://github.com/ollama/ollama/releases/tag/v0.7.0 for example bundles both source and binary downloads and the bundle does contain the license text via the source file download. Certainly anyone who downloads the binary from the maintainer via GitHub does have the required notice made available to them.
Maybe they’re getting a legal opinion. Maybe they’re leaving it open while they talk business to business. Maybe the right person to address the issue is on vacation.
Lots of people and companies choose not to engage in public battles. I don’t think that should be read as a sign of guilt (or innocence).
A README is often included with binaries. That’s a good place to include any licensing information.
in the distribution world (Debian, etc) that takes great care about license compliance, patching upstreams to include copyright notices in binaries isn't a thing
On Debian, you will find the llama.cpp copyright notice in /usr/share/doc/llama.cpp/copyright if you have installed the llama.cpp binary package.
The terms of the MIT license require that it distribute the copyright notice in both source and binary form.
No, MIT does not require that. The license says:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
The exact meaning of this sentence has never been challenged and never been ruled upon. Considering ollama's README has a link to llama.cpp's project page (which includes the license), I'd say the requirement has been satisfied.
It is certainly possible for a new Ollama user to not notice the acknowledgement.
[0] https://github.com/ggml-org/llama.cpp/blob/master/LICENSE#L3...
ollama doesn't include it.
I see it here? https://github.com/ollama/ollama/blob/main/llama/llama.cpp/L...
llama.cpp: https://github.com/ggml-org/llama.cpp/blob/master/LICENSE
ollama: https://github.com/ollama/ollama/blob/main/LICENSE
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
The copyright notice is the bit at the top that identifies who owns the copyright to the code. You can use MIT code alongside any license you'd like as long as you attribute the MIT portions properly.
That said, this is a requirement that almost no one follows in non-source distributions and almost no one makes a stink about, so I suspect that the main reason why this is being brought up specifically is because a lot of people have beef with Ollama for not even giving any kind of public credit to llama.cpp for being the beating heart of their system.
Had they been less weird about giving credit in the normal, just-being-polite way I don't think anyone would have noticed that technically the license requires them to give a particular kind of attribution.
Many, many projects on GitHub don’t do it and are not license compliant.
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
Is your argument that a binary is not a copy or substantial portion of the Software?
If you distribute a binary, I think it's pretty obvious that either the binary itself should include the notice and the license, or the archive with the binary should include it. Ex: most packaging systems include the licenses in the docs directory that nobody looks at, which is probably sufficient.
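For what it's worth, the "archive with the binary should include it" case is easy to check mechanically. Here is a minimal sketch, assuming a tar-based release archive; the `license_members` helper and the `LICENSE_NAMES` set are hypothetical names made up for illustration, not part of any project's tooling, and finding a file by name says nothing about whether its contents satisfy the license.

```python
import tarfile
from pathlib import PurePosixPath
from typing import List

# Hypothetical list of base names treated as license or notice files.
LICENSE_NAMES = {"license", "license.txt", "license.md", "copying", "notice", "copyright"}


def license_members(archive_path: str) -> List[str]:
    """Return archive members whose base name looks like a license or notice file."""
    with tarfile.open(archive_path) as archive:
        return [
            name
            for name in archive.getnames()
            if PurePosixPath(name).name.lower() in LICENSE_NAMES
        ]
```

An empty result for a binary release archive would be the situation people in this thread are arguing about.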
This issue seems to be the typical case of someone being bothered for someone else, because it implies there's no "recognition of source material" when there's quite a bit of symbiosis between the projects.
Not sure I'd say there is "symbiosis" between ModelScope and llama.cpp just because you can download models from there via llama.cpp, just like you wouldn't say there is symbiosis between LM Studio and Hugging Face, or, for an even more fun example, YouTube <> youtube-dl/yt-dlp.
Building on llama is perfectly valid and they're adding value on ease of use here. Just give the llama team appropriately prominent and clearly worded credit for their contributions and call it a day.
Ignoring the issue is just another way of saying "catch me if you can", and even then open source lawsuits are rather toothless anyway, so the company clearly expects there to be zero consequences.
You know, the tool that very famously had a massive rug pull once it gained marketshare https://www.servethehome.com/docker-abruptly-starts-charging...
First here, we understand that Docker needs to generate revenue. Creating a foundational technology and not having revenue to grow the business is hard. At the same time, the notice period is what one may consider short.
If the money was starting to run dry, with everyone using the tech (and Docker Hub in particular) but not really giving them any money for it, then something was bound to change.
It's cool that there are other alternatives to Docker Hub though and projects like Podman. I feel like with a bigger grace period, the Docker pricing changes wouldn't have been a big deal.
Do any of you guys remember having a third_party_licenses folder after downloading a binary release from github/sourceforge? I think many popular tools would be out of compliance if this were checked...
"Thank you to the GGML team for the tensor library that powers Ollama’s inference – accessing GGML directly from Go has given a portable way to design custom inference graphs and tackle harder model architectures not available before in Ollama."
I think Ollama can improve the TL;DR and add more llama.cpp attribution to their README. I don't understand why there has been no reply from the Ollama maintainers for so long.
Why does anyone in GenAI care about copyright, licenses, etc? (besides being nice and getting the community to like you, which should matter for Ollama)
This whole field is built off piracy at a scale never before seen. Aaron Swartz blushes when he thinks about what Llama and other projects pulled off without anyone getting arrested. Why should I care when one piracy project messes with another?
The whole field is basically a celebration of copyright abolitionism and the creation of "dual power" à la 1917 Russia where copyright doesn't matter. Have some consistency and stop caring about this stuff.
Sending patches to ollama is less worthwhile than watching paint dry.
This has so far turned out to be the case: they ended up implementing it. And distastefully, too, exactly like the other 20 PRs would have implemented it: at the API level, that is.
My fork specifically introduces parsing of GBNF code blocks (Markdown ```gbnf) from the system prompt, so that any of the existing clients are supported out of the box without any effort on the maintainer's part.
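To illustrate the idea (this is not the fork's actual code, just a rough sketch of how pulling a fenced gbnf block out of a system prompt could work), here is a small Python version. The `split_grammar` name and the choice to use only the first block are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Matches a Markdown fenced block whose info string is "gbnf" and captures its body.
GBNF_BLOCK = re.compile(r"```gbnf[ \t]*\n(.*?)\n?```", re.DOTALL)


def split_grammar(system_prompt: str) -> Tuple[str, Optional[str]]:
    """Return (prompt without the first gbnf block, grammar text or None)."""
    match = GBNF_BLOCK.search(system_prompt)
    if match is None:
        # No grammar embedded: pass the prompt through unchanged.
        return system_prompt, None
    grammar = match.group(1).strip()
    remaining = (system_prompt[: match.start()] + system_prompt[match.end():]).strip()
    return remaining, grammar
```

The appeal of this approach is that a client only has to send an ordinary system prompt; whatever sits behind the API can strip the block and hand the grammar to the sampler.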