Why did containers happen?
Cgroups and namespaces were added to Linux in an attempt to add security to a design (UNIX) which has a fundamentally poor approach to security (shared global namespace, users, etc.).
It's really not going all that well, and I hope something like SEL4 can replace Linux for cloud server workloads eventually. Most applications use almost none of the Linux kernel's features. We could have very secure, high performance web servers, which get capabilities to the network stack as initial arguments, and don't have access to anything more.
Drivers for virtual devices are simple, we don't need Linux's vast driver support for cloud VMs. We essentially need a virtual ethernet device driver for SEL4, a network stack that runs on SEL4, and a simple init process that loads the network stack with capabilities for the network device, and loads the application with a capability to the network stack. Make building an image for that as easy as compiling a binary, and you could eliminate maybe 10s of millions of lines of complexity from the deployment of most server applications. No Linux, no docker.
Because SEL4 is actually well designed, you can run a sub kernel as a process on SEL4 relatively easily. Tada, now you can get rid of K8s too.
As for SEL4 - it is so elegant because it leaves all the difficult problems to the upper layer (coincidentally making them much more difficult).
Containers and namespaces are not about security
True. Yet containers, or more precisely the immutable images endemic to container systems, directly address the hardest part of application security: the supply chain. Between the low effort and risk entailed when revising images to address endlessly emerging vulnerabilities, and enabling systematized auditing of immutable images, container images provide invaluable tools for security processes.
I know about Nix and other such approaches. I also know these are more fragile than the deeply self-contained nature of containers and their images. That's why containers and their image paradigm have won, despite all the well-meaning and admirable alternatives.
A bypassable security mechanism is worse than useless
Also true. Yet this is orthogonal to the issues of supply chain management. If tomorrow, all the problems of escapable containers were somehow solved, whether by virtual machines on flawless hypervisors, or formally verified microkernels, or any other conceivable isolation mechanism, one would still need some means to manage the "content" of disparate applications, and container systems and the image paradigm would still be applicable.
I also know these are more fragile than the deeply self-contained nature of containers and their images
Not really. People only use Nix because it doesn't randomly break, bitrot or require arcane system setup.
Unlike containers. You really need k8s or something like it to mould Docker containers into something manageable.
People only use Nix because it doesn't randomly break, bitrot or require arcane system setup.
I'll stipulate this, despite knowing and appreciating the much greater value Nix has.
Then, the problem that Nix solves isn't something container users care about. At scale, the bare metal OS hosting containers is among the least of one's problems: typically a host image is some actively maintained, rigorously tested artifact provided by one of a couple different reliable sources. Ideally container users are indifferent to it, and they experience few if any surprises using them, including taking frequent updates to close vulnerabilities.
Unlike containers.
Containers randomly break or bitrot? I've never encountered that view. They don't do this as far as I'm aware. Container images incorporate layer hashing that ensures integrity: they do not "bitrot." Image immutability delivers highly consistent behavior, as opposed to "randomly break." The self-contained nature of containers delivers high portability, despite differences in "system setup." I fail to find any agreement with these claims. Today, people think nothing of developing images using one set of tools (Docker or what have you) and running these images using entirely distinct runtimes (containerd, cloud service runtimes, etc.). This is taken entirely for granted, and it works well.
Arcane system setup.
I don't know what is meant by "system setup" here, and "arcane" is subjective. What I do know is that the popular container systems are successfully and routinely used by neophytes, and that this doesn't happen when the "system setup" is too demanding and arcane. The other certainty I have is that whatever cost there is in acquiring the rather minimal knowledge needed to operate containers is vastly smaller than achieving the same ends without containers: the moment a system involves more than 2-3 runtime components, containers start paying off versus running the same components natively.
Containers randomly break or bitrot?
All the fucking time. Maybe it's possible to control your supply chain properly with containers, but nobody actually does that. 99% of the time they're pulling in some random "latest image" and applying bespoke shell commands on top.
I don't know what is meant by "system setup" here, and "arcane" is subjective.
Clearly you've never debugged container network problems before.
but nobody actually does that
They do. I assure you.
they're pulling in some random "latest image"
Hardly random. Vendoring validated images from designated publishers into secured private repos is the first step on the supply chain road.
Clearly you've never debugged container network problems before.
Configuring Traefik ingress to forward TCP connections to pods was literally the last thing I did yesterday. At one time or another I've debugged all the container network problems for every widely used protocol in existence, and a number of not so common ones.
first step on the supply chain road
99 percent of Docker container users aren't on the supply chain road. They just want to "docker pull", #yolo.
Configuring Traefik ingress to forward TCP connections to pods was literally the last thing I did yesterday
Docker does crazy insane modifications to your system settings behind the scenes. (Of which turning off the system firewall is the least crazy.)
Have fun when the magic Docker IP addresses happen to conflict with your corporate LAN.
Containers is "run these random shell commands I copy pasted from the internet on top of this random OS image I pulled from the internet, #yolo".
People copy and paste nix code all the damn time because it's downright unparseable and inscrutable to the majority of users. Just import <module>, set some attrs and hit build. #yolo
You see the difference?
As for SEL4 - it is so elegant because it leaves all the difficult problems to the upper layer (coincidentally making them much more difficult).
I completely buy this as an explanation for why SEL4 hasn't taken off for user environments (and probably never will). But there's just not that much to do to connect a server application to the network, where it can access all of its resources. I think a better explanation for the lack of server-side adoption is poor marketing, a lack of good documentation, and no company selling support for it as a best practice.
Using sel4 on a server requires complex software development to produce an operating environment in which you can actually do anything.
I’m not speaking ill of sel4; I’m a huge fan, and things like its take-grant capability model are extremely interesting and valuable contributions.
It’s just not a usable standalone operating system. It’s a tool kit for purpose-built appliances, or something that you could, with an enormous amount of effort, build a complete operating system on top of.
I'd love to work on this. It'd be a fun problem!
Are there any projects like that going on? It feels like an obvious thing.
There is work within major consumer product companies building such things (either with sel4, or things based on sel4's ideas), and there's Genode on seL4.
But there's just not that much to do to connect a server application to the network, where it can access all of its resources.
If you only care to run stateless stuff that never writes anything (or at least never reads what it wrote) - it's comparatively easy. You still gotta deal with the thousand drivers - even on the server there is a lot of quirky stuff. But then you gotta run the database somewhere. And once you run a database you get all the problems Linus warned about. So you gotta run the database on a separate Linux box (at that point - what do you win vs. using Linux for everything?) or develop a new database tailored for SeL4 (and that's quite a bit more complex than an OS kernel). An elegant solution that only solves a narrow set of cases stands no chance against a crude solution that solves every case.
Also, with the current sexy containerized stacks it's easy to forget, but having the same kind of environment on the programmer's workbench and on the server was once Unix's main selling point. It's kinda expensive to support a separate abstraction stack for a single purpose.
Containers and namespaces are not about security
People keep saying that, but I do not get it. If an attack that would work without a container fails from inside a container (e.g. because it cannot read or write a particular file), that is better security.
A bypassable security mechanism is worse than useless.
It needs the bypass to exist, and it needs an extra step to actually bypass it.
Any security mechanism (short of air gaps) might have a bypass.
even if a malicious program is still able to detect it's not the true root.
Also true for security unless it can read or write to the true root.
I use containers as an extra security measure. i.e. as a way of reducing the chance that a compromise of one process will lead to a compromise of the rest of the system.
That said, I would guess that providers of container hosting must be fairly confident that they can keep them secure. I do not know what extra precautions they take though.
All of the hassle of installing things was in the Dockerfile, and it ran in containers, so it was more reliable.
I think the important innovation of Docker was the image. It let people deploy a consistent version of their software or download outside software.
What did it let people do that they couldn't already do with static linking?
It let people deploy a consistent version of their software
- things like "a network port" can also be a dependency, but can't be "linked". And so on for all sorts of software that expects particular files to be in particular places, or requires deploying multiple communicating executables
- Linux requires that you be root to open a port below 1024, a security disaster
- some dependencies really do not like being statically linked (this includes glibc, the GNU C library!), for things like nsswitch
Oh, and the layer caching made iterative development with _very_ rapid cycles possible. That lowered the barrier to entry and made it easier for everyone to get going.
But back to Dockerfiles. The configuration language used made it possible for anyone[tm] to build a container image, to ship a container image and to run the container. Fire-and-forget style. (Operating the things in practice and at any scale was left as an exercise for the reader.)
And because Anyone[tm] could do it, pretty much anyone did. For good and ill alike.
Trying to get the versions of software you needed to use all running on the same server was an exercise in fiddling.
For me, it was avoiding dependencies and making it easier to deploy programs (not services) to different servers w/o needing to install dependencies.
I seem to remember a meetup in SF around 2013 where Docker (was it still dotCloud back then?) was describing a primary use-case was easier deployment of services.
I'm sure for someone else, it was deployment/coordination of related services.
No more handmade scripts (or worse, fully manual operations) - just stupid simple Dockerfile scripts that any employee would be able to understand and groups can organize around.
docker-compose tying services into their own subnet was really a cool thing though
edit: came back in to add a reference to LXC, it's been probably 2 decades since I've thought about that.
Was it always so hard to build the software you needed on a single system?
Ironically one of the arguments for dynamic linking is memory efficiency and small exec size (the other is around ease of centrally updating - say, if you needed to eliminate a security bug).
You could see that history repeat itself in Python - "pip install something" is way easier to do that messing with virtualenvs, and even works pretty well as long as number of package is small, so it was a recommendation for a long time. Over time, as number of Python apps on same PC grew, and as the libraries gained incompatible versions, people realized it's a much better idea to keep all things isolated in its own virtualenv, and now there are tools (like "uv" and "pipx") which make it trivial to do.
But there are no default "virtualenvs" for regular OS. Containers get closest. nix tries hard, but it is facing uphill battle - it goes very much "against the grain" of *nix systems, so every build script of every used app needs to be updated to work with it. Docker is just so much easier to use.
Golang has no dynamic code loading, so a lot of times it can be used without containers. But there is still global state (/etc/pki, /etc/timezone, mime.types , /usr/share/, random Linux tools the app might call on, etc...) so some people still package it in docker.
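A rough sketch of that pattern, for the curious: a statically linked Go binary in an otherwise empty image, with just those "global state" files copied alongside it. The module layout and build command are assumptions, the paths are the usual Debian ones, and tzdata may need an explicit install on some bases:

    FROM golang:1.22 AS build
    WORKDIR /src
    COPY . .                                  # assumes a Go module at the context root
    RUN CGO_ENABLED=0 go build -o /app-bin .

    FROM scratch
    # The binary is self-contained, but the global state still has to come from somewhere:
    COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
    COPY --from=build /usr/share/zoneinfo /usr/share/zoneinfo
    COPY --from=build /app-bin /app-bin
    ENTRYPOINT ["/app-bin"]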
You’re talking about the needs it solves, but I think others were talking about the developments that made it possible.
My understanding is that Docker brought features to the server and desktop (dependency management, similarity of dev machine and production, etc), by building on top of namespacing capabilities of Linux with a usability layer on top.
Docker couldn’t have existed until those features were in place and once they existed it was an inevitability for them to be leveraged.
There were many use cases that rapidly emerged, but this eclipsed the rest.
Docker Hub then made it incredibly easy to find and distribute base images.
Google also made it “cool” by going big with it.
Namespaces were not an attempt to add security, but just grew out of work to make interfaces more flexible, like bind mounts. And Unix security is fundamentally good, not having namespaces isn't much of a point against it in the first place, but now it does have them.
And it's going pretty well indeed. All applications use many kernel features, and we do have very secure high performance web and other servers.
L4 systems have been around for as long as Linux, and SEL4 in particular for 2 decades. They haven't moved the needle much so I'd say it's not really going all that well for them so far. SEL4 is a great project that has done some important things don't get me wrong, but it doesn't seem to be a unix replacement poised for a coup.
Unix security is fundamentally good
L. Ron Hubbard is fundamentally good!
I kid, but seriously, good how? Because it ensures cybersecurity engineers will always have a job?
seL4 is not the final answer, but something close to it absolutely will be. Capability-based security is an irreducible concept at a mathematical level, meaning you can’t do better than it, at best you can match it, and its certainly not matched by anything else we’ve discovered in this space.
good how?
Good because it is simple both in terms of understanding it and implementing it, and sufficient in a lot of cases.
seL4 is not the final answer, but something close to it absolutely will be. Capability-based security is an irreducible concept at a mathematical level, meaning you can’t do better than it, at best you can match it, and its certainly not matched by anything else we’ve discovered in this space.
Security is not pure math though, it's systems and people and systems of people.
Because SEL4 is actually well designed, you can run a sub kernel as a process on SEL4 relatively easily. Tada, now you can get rid of K8s too.
k8s is about managing clusters of machines as if they were a single resource. Hence the name "borg" of its predecessor.
AFAIK, this isn't a use case handled by SEL4?
If you are already running SEL4 and you want to spawn an application that is totally isolated, or even an entire sub-kernel it's not different than spawning a process on UNIX. There is no need for the containerization plugins on SEL4. Additionally the isolation for the storage and networking plugins would be much better on SEL4, and wouldn't even really require additional specialized code. A reasonable init system would be all you need to wire up isolated components that provide storage and networking.
Kubernetes is seen as this complicated and impressive piece of software, but it's only impressive given the complexity of the APIs it is built on. Providing K8s functionality on top of SEL4 would be trivial in comparison.
Kubernetes is seen as this complicated and impressive piece of software, but it's only impressive given the complexity of the APIs it is built on.
There are other reasons it's impressive. Its API and core design is incredibly well-designed and general, something many other projects could and should learn from.
But the fact that it's impressive because of the complexity of the APIs it's built on is certainly a big part of its value. It means you can use a common declarative definition to define and deploy entire distributed systems, across large clusters, handling everything from ingress via load balancers to scaling and dynamic provisioning at the node level. It's essentially a high-level abstraction for entire data centers.
seL4 overlaps with that in a pretty minimal way. Would it be better as underlying infrastructure than the Linux kernel? Perhaps, but "providing K8s functionality on top of SEL4" would require reimplementing much of what Linux and various systems on top of it currently provide. Hardly "trivial in comparison".
Containerization is after all, as you mentioned, a plugin. As is network behavior. These are things that k8s doesn't have a strong opinion on beyond compliance with the required interface. You can switch container plugin and barely notice the difference. The job of k8s is to have control loops that manage fleets of resources.
That's why containers are called "containers". They're for shipping services around like containers on boats. Isolation, especially security isolation, isn't (or at least wasn't originally) the main idea.
You manage a fleet of machines and a fleet of apps. k8s is what orchestrates that. SEL4 is a microkernel -- it runs on a single machine. From the point of view of k8s, a single machine is disposable. From the point of view of SEL4, the machine is its whole world.
So while I see your point that SEL4 could be used on k8s nodes, it performs a very different function than k8s.
As others mentioned containers aren’t about security either, I think you’re rather missing the whole purpose of the cloud native ecosystem here.
I think the whole thing has been levels of abstraction around a runtime environment.
in the beginning we had the filesystem. We had /usr/bin, /usr/local/bin, etc.
then chroot where we could run an environment
then your chgroups/namespaces
then docker build and docker run
then swarm/k8s/etc
I think there was a parallel evolution around administration, like configure/make, then apt/yum/pacman, then ansible/puppet/chef and then finally dockerfile/yaml
As for the

    RUN foo && \
        bar && \
        baz

thing, I completely agree. I've always wondered if there could be something like:

    LAYER
    RUN foo
    RUN bar
    RUN baz
    LAYER

to accomplish something similar, or maybe:

    RUN foo
    AND bar
    AND baz
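For what it's worth, newer BuildKit Dockerfile syntax gets fairly close to that wish: a heredoc lets one RUN (one layer) hold several plain commands without the `&& \` chaining. A small sketch, with `foo`/`bar`/`baz` standing in for real commands:

    # syntax=docker/dockerfile:1
    FROM debian:bookworm-slim
    RUN <<EOF
    set -e     # fail fast, as the && chain would
    foo
    bar
    baz
    EOF

It is still one layer per RUN, so the caching behaviour matches the `&&` version; it's just easier to read.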
For anyone doing deployments in managed languages, regardless of whether they are AOT compiled or use a JIT, the underlying operating system is mostly irrelevant, with the exception of some corner cases regarding performance tweaks and such.
Even if those type 1 hypervisors happen to depend on Linux kernel for their implementation, it is pretty much transparent when using something like Vercel, or Lambda.
Drivers for virtual devices are simple, we don't need Linux's vast driver support for cloud VMs. We essentially need a virtual ethernet device driver for SEL4, a network stack that runs on SEL4, and a simple init process that loads the network stack with capabilities for the network device, and loads the application with a capability to the network stack. Make building an image for that as easy as compiling a binary, and you could eliminate maybe 10s of millions of lines of complexity from the deployment of most server applications. No Linux, no docker.
Wasn't this what unikernels were attempting a decade ago? I always thought they were neat but they never really took off.
I would totally be onboard with moving to seL4 for most cloud applications. I think Linux would be nearly impossible to get into a formally-verified state like seL4, and as you said most cloud stuff doesn't need most of the features of Linux.
Also seL4 is just cool.
In terms of security, I think even more secure than SEL4 or containers or VMs would be having a separate physical server for each application and not sharing CPUs or memory at all. Then you have a security boundary between applications that is based in physics.
Of course, that is too expensive for most business use cases, which is why people do not use it. I think using SEL4 will run into the same problem - you will get worse utilization out of the server compared to containers, so it is more expensive for business use cases and not attractive. If we want something to replace containers that thing would have to be both cheaper and more secure. And I'm not sure what that would be
Containers (meaning Docker) happened because CGroups and namespaces were arcane and required lots of specialized knowledge to create what most of us can intuitively understand as a "sandbox".
That might be why Docker was originally implemented, but why it "happened" is because everyone wanted to deploy Python and pre-uv Python package management sucks so bad that Docker was the least bad way to do that. Even pre-kubernetes, most people using Docker weren't using it for sandboxing, they were using it as fat jars for Python.
Even with Java things, where fat JARs exist, you at some point end up with OS-level dependencies like "and this logging thing needs to be set up, and these dirs need these rights, and this user needs to be in place", etc. Nowadays you can shove that into a container.
Cgroups and namespaces were added to Linux in an attempt to add security to a design (UNIX) which has a fundamentally poor approach to security (shared global namespace, users, etc.)
Namespacing of all resources (no restriction to a shared global namespace) was actually taken directly from plan9. It does enable better security but it's about more than that; it also sets up a principled foundation for distributed compute. You can see this in how containerization enables the low-level layers of something like k8s - setting aside for the sake of argument the whole higher-level adaptive deployment and management that it's actually most well-known for.
I hope something like SEL4 can replace Linux for cloud server workloads eventually.
Why not 9front and diskless Linux microVMs, Firecracker/Kata-containers style?
Filesystem and process isolation in one, on an OS that's smaller than K8s?
Keep it simple and Unixy. Keep the existing binaries. Keep plain-text config and repos and images. Just replace the bottom layer of the stack, and migrate stuff to the host OS as and when it's convenient.
which has a fundamentally poor approach to security
Unix was not designed to be convenient for VPS providers. It was designed to allow a single computer to serve an entire floor of a single company. The security approach is appropriate for the deployment strategy.
As it did with all OSes, the Internet showed up, and promptly ruined everything.
Docker's claim to fame was connecting that existing stuff with layered filesystem images and packaging based off that. Docker even started off using LXC to cover those container runtime parts.
namespaces were added to Linux in an attempt to add security to a design (UNIX) which has a fundamentally poor approach to security (shared global namespace, users, etc.)
If the "fundamentally poor approach to security" is a shared global namespace, why are namespaces not just a fix that means the fundamental approach to security is no longer poor?
which get capabilities to the network stack as initial arguments, and don't have access to anything more
Systemd does this and it is widely used.
    make tinyconfig

can get you pretty lean already. And I think this whole point about "virtualization", "security", making the most use of hardware, reducing costs, and so on, while true, is an "Enterprise pitch" targeted at heads of tech and security. Nice side effects, but I couldn't care less.
There are real, fundamental benefits to containers for a solo developer running a solo app on a solo server.
Why? My application needs 2 or 3 other folders to write or read files into, maybe 2 or 3 other runtime executables (jvm, node, convert, think of the dozens of OSS CLI tools, not compile-time libraries), maybe apt-get install or download a few other dependencies.
Now I, as an indie developer, can "mkdir" a few directories from a shell script. But that "mkdir" will work the first time. It will fail the second time, saying "directory already exists". I can "apt-get install" a few things, but upgrading and versioning is a different story altogether. It's a matter of time before I realize I need at least some barebones ansible or state management. I could tell you about the many times I've reinvented "smallish" ansible in shell scripts before docker.
Now if I'm in an enterprise, I need to communicate this entire state of my app to the sysadmin teams. Forget security and virtualization and all that. I need to explain every single part of the state, the versions of java and tomcat, the directories, and all of those are moving targets.
Containers reduce state management. A LOT. I can always "mkdir". I can always "apt-get install". It's an ephemeral image. I don't need to write half-broken shell scripts or use ansible or create mini-shell-ansible.
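For instance, a minimal sketch (package names and paths are placeholders for the jvm/convert/app example above): every build starts from a clean base image, so the "state" is exactly what the Dockerfile says, every single time:

    FROM debian:bookworm-slim
    RUN apt-get update && apt-get install -y --no-install-recommends \
            default-jre-headless imagemagick \
        && rm -rf /var/lib/apt/lists/*
    RUN mkdir -p /data/in /data/out     # never fails with "directory already exists"
    COPY app.jar /opt/app/app.jar       # assumes your build produces app.jar
    CMD ["java", "-jar", "/opt/app/app.jar"]

Rebuild it a hundred times and the mkdir and apt-get behave identically; there is no accumulated state to fight.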
If you use a Dockerfile with docker-compose, you've solved 95% of state management. The only 5% left is to docker-compose the right source.
Skip the enterprisey parts. A normal field engineer or solo developer, like me, who's deploying a service in the field, even on my raspberry pi, would still use containers. It boils down to one thing: "state management", which most people completely underestimate as "scripting". Containers grant a large degree of control over state management to me, the developer, and simplify it by making it ephemeral. That's a big thing.
Fast forward to now, though, and I feel like the benefit of containers for development has largely been undone with the adoption of Devcontainers. Because, at least from my perspective, the real value of containers for development was looser coupling between the run-time environment for the application you do your typing in, and the run-time environment where you do your testing. And Devcontainers are designed to make those two tightly coupled again.
The devcontainer also does not preclude the simple testing container.
The simplest (and arguably best) usage for a devcontainer is simply to set up a working development environment (i.e. to have the correct version of the compiler, linter, formatters, headers, static libraries, etc installed). Yes, you can do this via non-integrated container builds, but then you usually need to have your editor connect to such a container, so the language server can access all of that, plus when doing this manually you need to handle mapping in your source code.
Now, you probably want to have your main Dockerfile set up most of the same stuff for its build stage, although normally you want the output stage to only have the runtime stuff. For interpreted languages the output stage is usually similar to the "build" stage, but ought to omit linters or other pure development-time tooling.
If you want to avoid the overlap between your devcontainer and your main Dockerfile's build stage? Good idea! Just specify a stage in your main Dockerfile where you have all development time tooling installed, but which comes before you copy your code in. Then in your .devcontainer.json file, set the `build.dockerfile` property to point at your Dockerfile, and the `build.target` to specify that target stage. (If you need some customizations only for dev container, your docker file can have a tiny otherwise unused stage that derives from the previous one, with just those changes.)
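A sketch of that layout, assuming a Node project (stage names, tool choices and paths are illustrative, not prescriptive); the .devcontainer.json then points `build.dockerfile` at this file and `build.target` at `dev`:

    FROM node:20 AS dev
    # Development-time tooling only; note that no source has been copied in yet
    RUN npm install -g eslint prettier

    FROM dev AS build
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci
    COPY . .
    RUN npm run build                  # assumes a conventional "build" script

    FROM node:20-slim AS runtime       # output stage: runtime bits only
    WORKDIR /app
    COPY --from=build /app/dist ./dist
    COPY --from=build /app/node_modules ./node_modules
    CMD ["node", "dist/server.js"]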
Under this approach, the devcontainer is supposed to be suitable for basic development tasks (e.g. compiling, linting, running automated tests that don't need external services.), and any other non-containerized testing you would otherwise do. For your containerized testing, you want the `ghcr.io/devcontainers/features/docker-outside-of-docker:1` feature added, at which point you can just use just run `docker compose` from the editor terminal, exactly like you would if not using dev containers at all.
There’s no coupling being forced by devcontainers. It’s just a useful abstraction layer, versus doing it all manually. There is some magic happening under the hood where it takes your specified image or dockerfile and adds a few extra layers in there, but you can do that all yourself if you wanted to.
I will say, if you stray too far off the happy path with devcontainers, it will drive you insane, and you’ll be better off just doing it yourself, like most things that originated from MSFT. But those edge cases are pretty rare. 99% of workflows will be very happily supported with relatively minimal declarative json configuration.
I can always "apt-get install".
I don't think you can reliably fix a specific version of a package though, meaning things will still break here the same way they did before containers.
If you want to lock down versions on a system, Apt Pinning: https://wiki.debian.org/AptConfiguration#Using_pinning
If you have a herd of systems - prod environments, VMs for CI, lots of dev workstations, and especially if your product is an appliance VM: you might want to run your own apt mirror, creating known-good snapshots of your packages. I use https://www.aptly.info/
Containers can also be a great solution though.
Docker for Windows Containers itself was a horrible exercise in frustration just because of its own dependency issues; I thought it was a bad idea from the start because of it, and it diluted Docker for Linux IMO.
Docker/Containers and Compose are pretty great to work with, assuming your application has dependencies like Databases, Cache, etc. Not to mention options such as separating TLS certificate setup and termination from the application server(s) or scaling to larger orchestration options... though I haven't gone past compose for home-lab or on my own server(s).
I can also better position data storage and application configurations for backup/restore by using containers and volumes next to the compose/config. I've literally been able to migrate apps between servers with compose-down, rsync, dns change, compose up -d on the new server. In general, it's been pretty great all around.
You can make a Docker image deterministic/hermetic, but it's usually a lot more work.
But the images themselves are, and that is a great improvement on pre-docker state of the art. Before docker, if you wanted to run the app with all of the dependencies as of last month, you had _no way to know_ at all. With docker, you pull that old image and you get exactly the same version of every dependency (except kernel) with practically zero effort.
Sure, it's annoying that instead of few-kB-long lockfile you are now having hundred of MBs of docker images. But all the better alternatives are significantly harder.
Something trivial - like "hey, that function is failing... was it failing with last week's version as well?" - is very hard to arrange if you have any non-trivial dependencies. You have to build some homebrew lockfile mechanism (ugly!) and then you discover that most open-source mirrors don't keep old versions for that long, and so now you have to set up mirror as well... And then there is dependency resolution problems as you try to downgrade stuff...
And then at some point someone gets a great idea: "hey, instead of trying to get dpkg to do things it was not designed for, why don't we snapshot entire filesystem" - and then the docker is born again.
Basically the Linux world was actively designed to make apps difficult to distribute.
For a sysadmin, distros like Debian were an innovative godsend for installing and patching stuff. Especially compared to the hell that was Windows server sysadmin back in the 90s.
The developer oriented language ecosystem dependency explosion was a more recent thing. When the core distros started, apps were distributed as tarballs of source code. The distros were the next step in distribution - hence the name.
You should be installing it from a distro package!!
What about security updates of dependencies??
And so on. Docker basically overrules these impractical ideas.
You make software harder to distribute (so inconvenient for developers and distributors) but gain better security updates and lower resource usage.
Containers are a related (as the GP comment says) thing, but offer a different and varied set of tradeoffs.
Those tradeoffs also depend on what you are using containers for. Scaling by deploying large numbers of containers on a cloud providers? Applications with bundled dependencies on the same physical server? As a way of providing a uniform development environment?
Those tradeoffs also depend on what you are using containers for. Scaling by deploying large numbers of containers on a cloud providers? Applications with bundled dependencies on the same physical server? As a way of providing a uniform development environment?
Those are all pretty much the same thing. I want to distribute programs and have them work reliably. Think about how they would work if Linux apps were portable as standard:
Scaling by deploying large numbers of containers on a cloud providers?
You would just rsync your deployment and run it.
Applications with bundled dependencies on the same physical server?
Just unzip each app in its own folder.
As a way of providing a uniform development environment?
Just provide a zip with all the required development tools.
Those are all pretty much the same thing. I want to distribute programs and have them work reliably.
Yes, they are very similar in someways, but the tradeoffs (compared to using containers) would be very different.
You would just rsync your deployment and run it.
If you are scaling horizontally and not using containers you are already probably automating provisioning and maintenance of VMs, so you can just use the same tools to automate deployment. You would also be running one application per VM so you do not need to worry about portability.
Just unzip each app in its own folder.
What is stopping people from doing this? You can use an existing system like Appimage, or write a windows like installer (Komodo used to have one). The main barrier as far as I can see is that users do not like it.
Just provide a zip with all the required development tools.
vs a container you still have to configure it and isolation can be nice to have in a development environment.
vs installing what you need with a package manager, it would be less hassle in some cases but this is a problem that is largely solved by things like language package managers.
What is stopping people from doing this?
Most Linux apps do not bundle their dependencies, don't provide binary downloads, and aren't portable (they use absolute paths). Some dependencies are especially awkward like glibc and Python.
It is improving with programs written in Rust and Go which tend to a) be statically linked, and b) are more modern so they are less likely to make the mistake of using absolute paths.
Incidentally this is also the reason Nix has to install everything globally in a single root-owned directory.
The main barrier as far as I can see is that users do not like it.
I don't think so. They've never been given the option.
Most Linux apps do not bundle their dependencies, don't provide binary downloads, and aren't portable (they use absolute paths).
That is because the developers choose not to, and no one else chooses to do it for them. On the other hand lots of people package applications (and libraries) for all the Linux distros out there.
I don't think so. They've never been given the option.
The options exist. AppImage does exactly what you want. Snap and Flatpak are cross distro, have lots of apps, and are preinstalled by many major distros.
If you are considering bare-metal servers with deb files, you compare them to bare-metal servers with docker containers. And in the latter case, you immediately get all the compatibility, reproducibility, ease of deployment, ease of testing, etc... and there is no need for a single YAML file.
If you need a reliable deployment without catching 500 errors from Docker Hub, then you need a local registry.
Yes, and with debs you need a local apt repository
If you need a secure system without accumulating tons of CVEs in your base images, then you need to rebuild your images regularly, so you need a build pipeline.
Presumably you were building your deb with a build pipeline as well... so the only real change is that the pipeline now has to have a timer as well, not just "on demand"
To reliably automate image updates, you need an orchestrator or switch to podman with `podman auto-update` because Docker can't replace a container with a new image in place.
With debs you only have automatic-updates, which is not sufficient for deployments. So either way, you need _some_ system to deploy the images and monitor the servers.
To keep your service running, you again need an orchestrator because Docker somehow occasionally fails to start containers even with --restart=always. If you need dependencies between services, you need at least Docker Compose and YAML or a full orchestrator, or wrap each service in a systemd unit and switch all restart policies to systemd.
deb files have the same problems, but here dockerfiles have an actual advantage: if you run supervisor _inside_ docker, then you can actually debug this locally on your machine!
No more "we use fancy systemd / ansible setups for prod, but on dev machines here are some junky shell scripts" - you can poke the things locally.
And you need a log collection service because the default Docker driver sucks and blocks on log writes or drops messages otherwise. This is just the minimum for production use.
What about deb files? I remember bad old pre-systemd days where each app had to do its own logs, as well as handle rotations - or log directly to third-party collection server. If that's your cup of tea, you can totally do this in docker world as well, no changes for you here!
With systemd's arrival, the logs actually got much better, so it's feasible to use systemd's logs. But here is some great news! Docker has a "journald" driver, so it can send its logs to systemd as well... So there is feature parity there as well.
The key point is there are all sorts of so-called "best practices" and new microservice-y way of doing things, but they are all optional. If you don't like them, you are totally free to use traditional methods with Docker! You still get to keep your automation, but you no longer have to worry about your entire infra breaking, with no easy revert button, because your upstream released broken package.
To be fair, they are _usually_ pretty good about that, the last big breakage I've seen was that git "security" fix which basically broke git commands as root. There is also some problems with Ubuntu LTS kernel upgrades, but docker won't save you here, you need to use something like AMI images.
Basically the Linux world was actively designed to make apps difficult to distribute.
It has "too many experts", meaning that everyone has too much decision making power to force their own tiny variations into existing tools. So you end up needing 5+ different Python versions spread all over the file system just to run basic programs.
The "solution" for a long time was to spin up single application Virtual Machines, which was a heavy way to solve it and reduced the overall system resources available to the application making them stupidly inefficient solutions. The modern cloud was invented during this phase, which is why one of the base primitives of all current cloud systems is the VM.
Containers both "solved" the dependency distribution problem as well as the resource allocation problem sort of at once.
The traditional way tends to assume that there will be only one version of something installed on a system. It also assumes that when installing a package you distribute binaries, config files, data files, libraries and whatnot across lots and lots of system directories. I grew up on traditional UNIX. I’ve spent 35+ years using perhaps 15-20 different flavors of UNIX, including some really, really obscure variants. For what I did up until around 2000, this was good enough. I liked learning about new variants. And more importantly: it was familiar to me.
It was around that time I started writing software for huge collections of servers sitting in data centers on a different continent. Out of necessity I had to make my software more robust and easier to manage. It had to coexist with lots of other stuff I had no control over.
It would have to be statically linked, everything I needed had to be in one place so you could easily install and uninstall. (Eventually in all-in-one JAR files when I started writing software in Java). And I couldn’t make too many assumptions about the environment my software was running in.
UNIX could have done with a re-thinking of how you deal with software, but that never happened. I think an important reason for this is that when you ask people to re-imagine something, it becomes more complex. We just can’t help ourselves.
Look at how we reimagined managing services with systemd. Yes, now that it has matured a bit and people are getting used to it, it isn’t terrible. But it also isn’t good. No part of it is simple. No part of it is elegant. Even the command line tools are awkward. Even the naming of the command line tools fail the most basic litmus test (long prefixes that require too many keystrokes to tab-complete says a lot about how people think about usability - or don’t).
Again, systemd isn’t bad. But it certainly isn’t great.
As for blaming Python, well, blame the people who write software for _distribution_ in Python. Python isn’t a language that lends itself to writing software for distribution and the Python community isn’t the kind of community that will fix it.
Point out that it is problematic and you will be pointed to whatever mitigation that is popular at the time (to quote Queen “I've fallen in love for the first time. And this time I know it's for real”), and people will get upset with you, downvote you and call you names.
I’m too old to spend time on this so for me it is much easier to just ban Python from my projects. I’ve tried many times, I’ve been patient, and it always ends up biting me in the ass. Something more substantial has to happen before I’ll waste another minute on it.
UNIX could have done with a re-thinking of how you deal with software, but that never happened.
I think it did, but the Unix world has an inherent bad case of "not invented here" syndrome, and a deep cultural reluctance to admit that other systems (OSes, languages, and more) do some things better.
NeXTstep fixed a big swath of issues (in the mid-to-late 1980s). It threw out X and replaced it with Display Postscript. It threw out some of the traditional filesystem layout and replaced it with `.app` bundles: every app in its own directory hierarchy, along with all its dependencies. Isolation and dependency packaging in one.
(NeXT realised this is important, but also that it has to be readable and user-friendly, so it replaced the traditional filesystem layout with something more readable. 15 years later, Nix realised the first lesson but forgot the second, so it throws out the traditional FHS and replaces it with something less readable, which needs software to manage it. The NeXT way means you can install an app with a single `cp` command or one drag-and-drop operation.)
Some of this filtered back upstream to Ritchie, Thompson and Pike, resulting in Plan 9: bin X, replace it with something simpler and filesystem-based. Virtualise the filesystem, so everything is in a container with a virtual filesystem.
But it wasn't Unixy enough so you couldn't move existing code to it. And it wasn't FOSS, and arrived at the same time as a just-barely-good-enough FOSS Unix for COTS hardware was coming: Linux on x86.
(The BSDs treated x86 as a 2nd class citizen, with grudging limited support and the traditional infighting.)
But people don’t use Darwin for servers to any significant degree. I should have been a bit more specific and narrowed it down to Linux and possibly some BSDs that are used for servers today.
I see the role of Docker as mostly a way to contain the “splatter” style of installing applications. Isolating the mess that is my application from the mess that is the system so I can both fire it up and then dispose of it again cleanly and without damaging my system. (As for isolation in the sense of “security”, not so much)
a way to contain the “splatter” style of installing applications
Darwin is one way of looking at it, true. I just referred to the first publicly released version. NeXTstep became Mac OS X Server became OS X became macOS, iOS, iPadOS, watchOS, tvOS, etc. Same code, many generations later.
So, yes, you're right, little presence on servers, but still, the problems aren't limited to servers.
On DOS, classic MacOS, on RISC OS, on DR GEM, on AmigaOS, on OS/2, and later on, on 16-bit Windows, the way that you install an app is that you make a directory, put the app and its dependencies in it, and maybe amend the system path to include that directory.
All single-user OSes, of course, so do what you want with %PATH% or its equivalent.
Unix was a multi-user OS for minicomputers, so the assumption is that the app will be shared. So, break it up into bits, and store those component files into the OS's existing filesystem hierarchy (FHS). Binaries in `/bin`, libraries in `/lib`, config in `/etc`, logs and state in `/var`, and so on -- and you can leave $PATH alone.
Made sense in 1970. By 1980 it was on big shared departmental computers. Still made sense. By 1990 it was on single-user workstations, but they cost as much as minicomputers, so why change?
The thing is, the industry evolved underneath. Unix ended up running on a hundred million times more single-user machines (and VMs and containers) than multiuser shared hosts.
The assumptions of the machine being shared turned out to be wrong. That's the exception, not the rule.
NeXT's insight was to only keep the essential bits of the shared FHS layout, and to embed all the dependencies in a folder tree for each app -- and then to provide OS mechanisms to recognise and manipulate those directory trees as individual entities. That was the key insight.
Plan 9 virtualised the whole FHS. Clever but hard to wrap one's head around. It's all containers all the way down. No "real" FHS.
Docker virtualises it using containers. Also clever but in a cunning-engineer's-hacky-kludge kind of way, IMHO.
I think GoboLinux maybe made the smartest call. Do the NeXT thing, junk the existing hierarchy -- but make a new more-readable one, with the filesystem as the isolation mechanism, and apply it to the OS and its components as well. Then you have much less need for containers.
There’s also the overly restrictive dependency list, because each dep in turn is happy to break its API every 6 months.
engineers simply lost the ability to succinctly package applications and their dependencies into simple to distribute and run packages.
but this is what docker is
If anything, java kinda showed it doesn't have to suck, but as not all things are java, you need something more general
how managing multiple java runtime versions is supposed to work is still beyond me... it's a different tool at every company, and the instructions never seem to work
There is also a convention of using the `JAVA_HOME` environment variable to allow tools to locate the correct JDK directory. For example, in a unix shell, add `$JAVA_HOME/bin` to your `PATH`.
And sometimes you want to ship multiple services together.
In any case 'docker run x' is easier and seemingly less error-prone than a single 'sudo apt-get install'
They did have what you could call userspace container management via application servers, though.
Unless you just mean that using Kubernetes at all is replicating application servers, which was my point. Kubernetes makes language-specific application servers like Wildfly/JBoss or Websphere obsolete, and is much more powerful, generic, and an improvement in just about every respect.
The reason Spring includes those libraries is partly historical - Spring is old, and dates from the applications server days. Newer frameworks like Micronaut and Quarkus use more focused and performant libraries like Netty, Vert.x, and Undertow instead.
The switch was often much more than a minor upgrade, because it often made splitting up monoliths possible in ways that the Java ecosystem itself didn't have good support for.
Java at least uses binary dependencies very rarely, and they usually have the decency of bundling the compiled dependencies... But it seems Java and Go just saw the writing on the wall and mostly just reimplement everything. I did have problems with the Snappy compression in the Kafka libraries, though, for instance.
If you look at most projects in the C world, they only provide the list of dependencies and some build config Makefile/Meson/Cmake/... But the latter is more of a sample and if your platform is not common or differs from the developer, you have the option to modify it (which is what most distros and port systems do).
But good luck doing that with the sprawling tree of modern packages managers. Where there's multiple copies of the same libraries inside the same project just because.
It was pretty old, and required a very specific version of java, not available on modern systems. Plus some config files in global locations.
Packaging it in the docker container made it so much easier to use.
The problem is/was that buildpacks aren't as flexible and only work if the buildpack exists for your language/runtime/stack.
Of course you now need to build and maintain those abstract towers, so more jobs for everybody!
Put another way: stuff like Electron makes a pretty good case for the "cheap hardware leads to shitty software quality/distribution mechanisms" claim. But does Docker? Containers aren't generally any more expensive in hardware other than disk-space to run than any other app. And disk space was always (at least since the advent of the discrete HDD) one of the cheapest parts of a computer to scale up.
That's when sbuild[0], a tool to build deb packages in containers, was created. It was pretty innovative in that it started from clean container every time, and thus would build deb in a reliable way even if user's machine had some funky dependencies installed.
(Note: those were schroot containers; docker did not exist back then)
[0] https://metadata.ftp-master.debian.org/changelogs//main/s/sb...
Then the technology from OpenVZ slowly made its way into the mainline Linux, in the form of cgroups and namespaces. LWN called it a "container puzzle", with tens of moving pieces. And it was largely finished by early 2010-s.
I built my own container system in 2012 that used cgroups to oversubscribe the RAM, with simple chroot-based file namespaces for isolation. We even used XFS projects (raise your hand if you know what this is!) for the disk quota management. I remember that I had to use systemtap to patch the kernel to be able to find out which process died as a result of the OOM killer, there were no standard ways to do that.
We sold it as a part of our biotech startup to Illumina. Then we sold it again to Amazon as a part of another startup :)
The genius of Docker was the layered overlayfs-based image building. This one simple innovation made it possible to build images in a constructive way, without having to waste half an hour for each minor change. I was floored with its simplicity and power when I first saw it.
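A concrete illustration of why that mattered so much for iteration speed - order the Dockerfile so the expensive layers sit above the frequently changing ones (a hypothetical Python app, with the conventional file names):

    FROM python:3.12-slim
    WORKDIR /app
    COPY requirements.txt .
    # Expensive, but the layer is reused as long as requirements.txt is unchanged
    RUN pip install --no-cache-dir -r requirements.txt
    # Source changes only invalidate the layers from here down
    COPY . .
    CMD ["python", "main.py"]

Edit the source and rebuild, and only the last layers are redone; the dependency install comes straight out of the cache instead of costing another half hour.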
All my projects (primarily web apps) are using docker compose which configures multiple containers (php/python/node runtime, nginx server, database, scheduler, etc) and run as a dev environment on my machine. The source code is mounted as a volume. This same compose file is then also used for the deployment to the production server (with minor changes that remove debug settings for example).
This approach has worked well for me as a solo dev creating web apps for my clients.
It has also enabled extreme flexibility in the stacks that I use, I can switch dev environments easily and quickly.
I guess it's worth keeping in mind that Justin only quit Docker a few months ago, and his long tenure as CTO there will have (obviously) informed the majority of the opinions in the article. I think the deployment over development spin and some of the other takes there more closely reflect the conversations he had with large corp paying customers at the exec level than the workflows of solo devs that switch dev environments much more frequently than most etc.
Before Docker I was using Xampp and FTP’ing source code to the prod server.
The virtualization/isolation aspect came first; SWSoft's Virtuozzo was doing that quite well in the early 2000s. They even had [some] IO isolation, which I think took around a decade to support elsewhere. Then gradually pieces of Virtuozzo/OpenVZ reached the mainline in the form of cgroups/LXC, and the whole thing slowly brewed for a while until Docker added the two missing pieces: the fast image rebuilds and the out-of-the-box user experience.
Docker of course was the revolution, but by then sufficiently advanced companies have been already using containers for isolation for a full decade.
I vaguely remember having to turn on some features in VirtualBox at the time to speed up my VMs, it was a massive uplift in performance if you had a CPU that supported it.
I used it heavily back in the days, long before kernel namespaces went upstream and docker became a thing.
https://en.wikipedia.org/wiki/Cgroups
(arguably FreeBSD jails and various mainframe operating systems preceded Linux containers but not by that name)
Other companies like Yahoo, Whatsapp, Netflix also followed interesting patterns of using strong understanding of how to be efficient on cheap hardware. Notably those three all were FreeBSD users at least in their early days.
Anyways digging it up, looks like the primary author was at Facebook for a year before cgroupsv2, redhat for three years before that, and Google before that. So... I don't know haha you'd have to ask him.
When people say that static executables would solve the problem, they are wrong: a static executable just means that you can eschew constructing a separate file-system inside your container - and you will probably need to populate some locations anyway.
Properly configured containers are actually supposed to be secure sandboxes, such that any violation is a kernel exploit. However the Linux kernel attack surface is very large so no one serious who offers multi-tenant hosting can afford to rely on containers for isolation. They have to assume that a container escape 0day can be sourced. It may be more accurate to say that a general kernel 0day can be sourced since the entire kernel surface area is open for anyone to poke. seccomp can mitigate the surface area but also narrow down the usefulness.
Docker has made some strange decisions for default behavior but if you take a more hands on approach such as with bubblewrap/bwrap nothing will leak in.
Containers were around for a decade or more on FreeBSD and Solaris. They let you divvy up expensive big Unix iron.
Same as VMs were around on mainframes from the late 1960s and expensive Unix RISC servers from the late 1980s.
Linux didn't need it because it was cheap. So Linux replaced that older more expensive stuff, on cheap COTS hardware: x86.
Once everything was commoditised and cost-cut, suddenly, efficiency started to matter, so VMware thrived and was copied and VMs were everywhere.
Then the low usage and inefficiency of resource sharing of VMs made them look expensive, so they started to get displaced by the cheaper easier tech of containers, making "it works on my machine, so let's ship my machine" scale to production.
Unfortunately HPE has removed most of the HP-UX documentation from the Internet, so it is hard to point to it.
However there are still some digital traces,
HP-UX 10.24 release,
This is a Virtual Vault release of HP-UX, providing enhanced security features. Virtual Vault is a compartmentalised operating system in which each file is assigned a compartment and processes only have access to files in the appropriate compartment and unlike most other UNIX systems the superuser (or root) does not have complete access to the system without following correct procedures.
https://en.wikipedia.org/wiki/HP-UX
A forum discussion,
https://community.hpe.com/t5/operating-system-hp-ux/hp-virtu...
Now it would be great to get back those HP-UX Vault PDFs.
I wrote about containers being the next big thing in 2011:
https://www.theregister.com/Print/2011/07/18/brief_history_o...
I credit FreeBSD and Solaris and AIX, but I think I didn't know about HP/UX.
- Persistent state? Must declare a volume.
- IO with external services? Must declare the ports (and maybe addresses).
- Configurable parameters? Must declare some env variables.
- Transitive dependencies? Must declare them, but using a mechanism of your choosing (e.g. via the package manager of your base image distro).
Separation of state (as in persistency) and application (as in binaries, assets) makes updates easy. Backups also.
Having almost all IO visible and explicit simplifies operation and integration.
And a single, (too?!?) simple config mechanism increases reusability, by enabling e.g. lightweight tailoring of generic application service containers (such as mariadb).
Together, this bunch of forced yet leaky abstractions is just good enough to foster immense reuse and composability across a plethora of applications, all while letting you treat them almost entirely like black boxes. IMHO that is why OCI containers became this big, compared to other virtualization and (application-)container technologies.
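To make those four declarations concrete with a stock image (the values are made up; MARIADB_ROOT_PASSWORD is the variable the mariadb image documents):
# state -> named volume, IO -> published port, config -> env var,
# transitive deps -> whatever the mariadb image already bakes in.
docker run -d --name db \
  -v dbdata:/var/lib/mysql \
  -p 127.0.0.1:3306:3306 \
  -e MARIADB_ROOT_PASSWORD=example \
  mariadb:11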
Those meetings still happen sometimes, but "cloud" *aaS and containers really put a dent in that sort of thing.
Some highlights:
- How far behind Kubernetes was at the time of launch. Docker Swarm was significantly simpler to use, and the Apache Mesos scheduler could already handle 10,000 nodes (and was being used by Netflix).
- RedHat's early contributions were key, despite having the semi-competing project of OpenShift.
- The decision to open-source K8s came down to one brief meeting at Google. Many of the senior engineers attended remotely from Seattle, not bothering to fly out because they thought their request to go open source was going to get shot down.
- A brief part at the end where Kelsey Hightower talks about what he thinks might come after Kubernetes. He mentions, and I thought this was very interesting... Serverless making a return. It really seemed like Serverless would be "the thing" in 2016-2017, but containers were too powerful. Maybe now with Knative, or some future fusing of container orchestration and K8s?
Application composition from open source components became the dominant way of constructing applications over the last decade.
I'm just as interested in why this ^ happened. I imagine it's pretty unique to software? I don't hear of car companies publishing component designs free for competitors to use, or pharmaceuticals freely waiving the IP in their patents or processes. Certainly not as "the dominant way of" doing business.
I wonder if LLM coding assistants had come about earlier, whether this would have been as prevalent. Companies might have been more inclined to create more of their own tooling from scratch since LLMs make it cheap (in theory). Individuals might have been less inclined to work on open source as hobbies because LLMs make it less personal. Companies might be less inclined to adopt open-source LLM-managed libraries because it's too chaotic.
If I write some code, it needs a computer and environment to run. If I’m writing for what’s popular, that’s pretty much a given. In short, for code the design is the product.
If I design a pharmaceutical, someone still has to make it. Same for car parts. This effort is actually greater than the effort of design. If you include regulation, it’s way higher.
So, this great feedback loop of creation-impact-collaboration never forms. The loop would be too big and involve too much other stuff.
The closest thing isn’t actually manufacturing, it’s more like writing and music. People have been reusing each other’s stuff forever in those spaces.
Docker also made Go credible as a programming language,
Can someone explain why Docker was beneficial for Go?
And before docker, not many large applications were.
Where I find myself advocating today is very much a "rational check" on infrastructure, and curtailing accordingly. We have the tooling to ensure high availability, but does everything need to be HA? Do our SLAs for enterprise tooling really need five-nines of availability, or can we knock some applications down to a limited schedule? Does dev/test need to be live 24/7, or can we power it off when not in use? Why are we only focusing on availability and not scalability? The list goes on, but they're also not popular in enterprises with entrenched politics, which admittedly is where I find myself struggling against the current. If my social chops were better, I suspect I'd thrive in consultancy doing just that.
All that being said, I do like containers that are done right (properly documented, secure-by-default, ready for scaling), and I continue driving more applications towards containerization in the enterprise where possible. They're the right solution for ~60-80% of enterprise use cases, with the difficulty being getting vendors on board with the idea that their software won't have a dedicated VM or hardware anymore (which everyone fiercely resists, because container-based licensing can be a PITA to them). For the rest, VMs are more than fine, and we have a growing number of ways for both to exist peacefully in the same environment. As this area of technology matures (along with "backporting" from hyperscalers to private cloud again) further, I'm really looking forward to managing global estates in smaller teams for bigger firms - things that VMs, Containers, and Infrastructure-as-Code allow.
What people do with Docker is spin up a database or another service to develop or test against.
Yep. Being able to run
docker run --rm --publish=127.0.0.1:27017:27017 'mongo:3.6.8'
or docker run --rm --publish=127.0.0.1:27017:27017 'mongo:5.0'
and then get rid of it with a simple Ctrl-C is a godsend.
If the installation crashed (which was quite common; it happened to me once), it was easier to just format the computer completely and start again. Effed up the database? Probably easier to format everything and install from scratch.
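And when you do want the data to outlive the Ctrl-C, the usual move is a named volume, which you can throw away just as cheaply (the volume name is arbitrary; /data/db is the mongo image's documented data directory):
# keep the data across runs of the ephemeral container
docker run --rm -v mongodata:/data/db -p 127.0.0.1:27017:27017 'mongo:5.0'
# ...and "format" just the database, not the machine, when it's effed up
docker volume rm mongodata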
https://static.googleusercontent.com/media/research.google.c...
I liked the article. It's close to my own adventures with containers. I think the invention of Docker is indeed mostly packaging: the Dockerfile (functional) is pretty neat, Docker Hub (addressing a container) is awesome, and the ENTRYPOINT in a Dockerfile is great; it distinguishes Docker from .deb.
But indeed, beyond the Dockerfile things are bleak. Docker Compose never lived up to my expectations. To do serious things you need a load balancer, storage, and addressing, and these are beyond the scope of traditional containers.
hosting provider's ... desire to establish a clean, clear-cut separation between their own services and those of their customers
https://en.wikipedia.org/wiki/FreeBSD_jail
My guess is that Linux had been getting requests from various orgs for a while, so in true Linux fashion, we got a few different container-type methods years later.
I still think Jails are the best of the bunch, but they can be a bit hard to set up. Once set up, Jails work great.
So here we are :)
Do you use K8s? No! That's old! I use Thrumba! It's just a clone of K8s by some startup because people figured out that the easiest way to make gobs of money is/was to build platform products and then get people to use them.
I was always surprised someone didn't invent a tool for ftping to your container and updating the PHP
We thought of it, and were thankful that it was not obvious to our bosses, because lord forbid they would make it standard process and we would be right back where we started, with long lived images and filesystem changes, and hacks, and managing containers like pets.
Docker was easy to adopt as it did not change very much about how you used software.
What?! Docker images should separate the program from the data, something casual image maintainers fail to adhere to. I find Docker extremely hard to use for persistent state that is not a remote DB.
It is even less secure than VMs (which themselves seem to be like Swiss cheese if you try to break out). (Security-wise: rkt failed five years ago; could BSD jails do it better?)
Also, Docker was supposed to solve the performance issues of VMs by debloating them. Instead, we got hypervisors or full VMs as hosts for Docker images, just to get the image as a de facto meta package manager to ship software.
So for me, it can be convenient, but I have no idea what I am doing when adding Makefile-like statements to a YAML file to run full containers, just to run e.g. a web server with some dependencies. But that workflow was novel to me, having previously fought with Ruby and Python module installations on the host system.
In a past life, I remember having to juggle third-party repositories in order to get very specific versions of various services, which resulted in more than a few instances of hair-pull-inducing untangling of dependency weirdness.
This might be controversial, but I personally think that distro repos being the assumed first resort of software distribution on Linux has done untold amounts of damage to the software ecosystem on Linux. Containers, alongside Flatpak and Steam, are thankfully undoing the damage.
Linux is just a kernel; you need to ship your own userland with it. Therefore, early distros had to assemble an entire OS around this newfangled kernel from bits and pieces, and those bits and pieces needed a way to be installed and removed at will. Eventually this installation mechanism got scope creep, and suddenly things like FreeCiv and XBill were distributed using the same underlying system that bash and cron use.
This system of distro packaging might be good as a selling point for a distro, so people can brag that their distro comes with 10,000 packages or whatever. That said, I can think of no other operating system out there where the happiest path to releasing software is to simply release a tarball of the source, hope a distro maintainer packages it for you, hope they do it properly, and hope that nobody runs into a bug due to a newer or older version of a dependency you didn't test against.
Instead of designing a solution and perfecting it over time, it's endless tweaking, with a new redesign every year. And you're supposed to use the exact same computer as the dev to get their code to work.
Kubernetes was also not the obvious winner in its time, with Mesos in particular seeming like a possible alternative, back when it wasn't clear whether orchestration and resource management might be different product categories.
I was at Red Hat at the time, and my impression was that they did a pretty good job of jumping onto where the community momentum was, while doubtless influencing that momentum themselves.
This might be controversial, but I personally think that distro repos being the assumed first resort of software distribution on Linux has done untold amounts of damage to the software ecosystem on Linux.
Hard agree. After getting used to "system updates are... system updates; user software that's not part of the base system is managed by a separate package manager from system updates, doesn't need root, and approximately never breaks the base system (to include the graphical environment); development/project dependencies are not and should not be managed by either of those but through project-specific means" on macOS, the standard Linux "one package manager does everything" approach feels simply wrong.
development/project dependencies are not and should not be managed by either of those but through project-specific means" on macOS, the standard Linux "one package manager does everything" approach feels simply wrong.
This predates macOS. The mainframe folks did this separation eons ago (see IBM VM/CMS).
On Unix, it's mostly the result of getting rid of your sysadmins who actually had a clue. Even in Unix-land in the Bad Old Days(tm), we used to have "/usr/local" for a reason. You didn't want the system updating your Perl version and bringing everything to a screeching halt; you used the version of Perl in /usr/local that was under your control.
Yes, there was an idea of creating bespoke filesystems for apps, the custom mount structures that Plan 9 had, and containers did something semi-parallel to that. But container images as read-only overlays (with a final rw top overlay) feel like a very narrow craft. Plan 9 had a lot more to it (everything as a file), and containers have a lot more to them (process, user, and net namespaces; container images as pre-assembled layers).
I can see some shared territory, but these concerns feel mostly orthogonal. I could easily imagine a Plan 9-like entity arising amid the containerized world: these aren't really in tension with each other. There's also a decade-and-a-half-plus gap between Plan 9's heyday and the rise of containers.
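For the narrow read-only-layers-plus-writable-top part, the kernel-level picture is just an overlay mount, roughly like this (the paths are illustrative, not Docker's actual storage layout):
# two read-only image layers, one writable layer, merged into a single view
mount -t overlay overlay \
  -o lowerdir=/layers/app:/layers/base,upperdir=/layers/rw,workdir=/layers/work \
  /merged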
Enterprise software vendors sold libraries and then "application servers", essentially promising infrastructure (typically tied to databases).
Enterprise software developers -- Google in particular -- got tired of depending on others' licensed infrastructure. This birthed Spring and Docker, splitting the market.
(Fun aside: when is a container a vm? When it runs via Apple containerization.)
For instance, I could create my own login screen for a web service without having to worry about the package manager overriding my code, because I inject it into the container, which is already updated.
I can also much more easily reroute ports or network connections the way I want.
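For instance, something in this vein (the image name and paths are hypothetical; the point is that the override lives outside the image):
# overlay a custom login template into a vendor image and remap its port,
# without touching the image contents or the host's package manager
docker run -d \
  -v "$PWD/my-login.html:/usr/share/webapp/templates/login.html:ro" \
  -p 8443:443 \
  vendor/webapp:latest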
It let you create diffs of a filesystem, and layer them with configurations, similar to containers. Useful for managing computer labs at the time.
At some point in the future people are going to realize that every system should work that way.