Speeding up PostgreSQL dump/restore snapshots
pg_restore will handle roles, indexes, etc., assuming you didn't switch the flags around to disable them.
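For reference, a minimal sketch of a dump/restore cycle with those defaults intact (database names and paths are placeholders; one caveat is that roles are global objects and come from pg_dumpall --globals-only rather than pg_restore):

    # dump in directory format so both dump and restore can run in parallel
    pg_dump -Fd -j 4 -f /backups/mydb.dump mydb

    # target database must already exist (or add --create);
    # indexes and constraints are rebuilt in the post-data section
    # unless you skip them with e.g. --section=data
    pg_restore -j 4 -d mydb_restored /backups/mydb.dump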
If you're on EC2, hopefully you're using disk snapshots and WAL archiving.
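For anyone setting that up, a minimal sketch of the WAL-archiving side (the local archive path is a placeholder; on EC2 you'd more likely ship WAL to S3 with a tool like WAL-G or pgBackRest):

    # postgresql.conf -- archive_mode requires a restart to take effect
    wal_level = replica
    archive_mode = on
    archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'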
One of the reasons Uber mentioned when moving to MySQL in their engineering blog.
If I'm not mistaken, this was in 2016 (that's 10 years next year, time flies when you're having fun) -- which is practically an eternity in IT. I'm no DBA but I'm fairly sure many changes have been made to Postgres since then, including logical replication (which can be selective), parallel apply of large transactions in v16, and so on.
I'm not saying this means their points are invalid; I don't know Postgres well enough for that. But any point made almost 10 years ago against one of the most popular and most actively developed options in its field should probably be taken with a pinch of salt.
Heh, I remember the countless articles after that debacle back then pointing out all the reasons why their migration was entirely pointless and could've been summed up as "devs not knowing the tools they're working with" before starting multimillion-dollar projects to fuel their CV-driven development.
So even if you aren't willing to do so, their rationale for the migration was fully debunked even back then.
We encourage staff to play with both, and they can play with impunity since it's a copy that will get replaced soon-ish.
This makes it important that both work reliably, which in turn means we find out when our backups stop working.
We haven't had a disaster recovery situation yet (hopefully never), but I feel fairly confident that getting the DB back shouldn't be a big deal.
And, of course, your disaster recovery plan is incomplete until you've tested it (at scale). You don't want to be looking up Postgres documentation when you need to restore from a cold backup, you want to be following the checklist you have in your recovery plan and already verified.
in the 100-2TB range where probably most businesses lie
Assuming you mean that range to start at 100GB, I've worked with databases that size multiple times, but as a freelancer it's definitely not been "most" of the businesses I've worked with that fall in that range.
I'm also curious how Distributed Replicated Block Device (DRBD) would perform; it would add obvious latency, but perhaps it would be an easier and more efficient solution for a "hot spare" setup than using Postgres's native functionality. To my understanding, DRBD can be configured to protect you from hardware I/O errors by "detaching" from an erroring disk.
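For what it's worth, that detach behaviour is a one-line disk option in the resource config; a rough sketch (hostnames, addresses, and devices are placeholders):

    # /etc/drbd.d/r0.res
    resource r0 {
      disk {
        on-io-error detach;   # drop the failing local disk, keep running diskless off the peer
      }
      on node-a {
        device    /dev/drbd0;
        disk      /dev/nvme1n1;
        address   10.0.0.1:7789;
        meta-disk internal;
      }
      on node-b {
        device    /dev/drbd0;
        disk      /dev/nvme1n1;
        address   10.0.0.2:7789;
        meta-disk internal;
      }
    }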
I also don't know if it's a valid point, but I've heard people say that you don't want a fancy CoW filesystem under a database, since much of the functionality it offers is something databases already solve themselves, so you might be sacrificing performance for safety against things that "should not happen" (tm) anyway, depending on how it's set up, I guess.
However, we are developing pgstream in order to bring in data and sync it from other Postgres providers. pgstream can also do anonymisation and, in the future, subsetting. Basically this means that no matter which Postgres service you are using (RDS, CloudSQL, etc.) you can still use Xata for staging and dev branches.
I also recommend pg_repack[2] to squash tables on a live system and reclaim disk space. It has saved me so much space.
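In case it helps anyone, a minimal sketch of how I'd run it (database and table names are placeholders; the extension has to be installed in the target database, and it only takes brief exclusive locks at the start and end of the rewrite):

    psql -d mydb -c "CREATE EXTENSION IF NOT EXISTS pg_repack;"

    # rewrite one bloated table online and return the space to the OS
    pg_repack -d mydb -t bloated_table

    # or repack every table in the database
    pg_repack -d mydb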
If things go truly south, just hope you have a read replica you can use as your new master. Most SLAs are not written with 72h+ of downtime in mind. Have you tried the nuclear recovery plan, from scratch? Does it work?
Even if you have a DR plan you should assume your incremental backups are no good and you need to restore the whole thing from scratch.
"Restore from scratch" can mean a lot of different things, if you have a read replica that you can promote then in relative terms to 72h+ downtime, this should be fairly quick, no?
If you have block-level backups or snapshots (with ZFS, for example, as someone mentioned), it should also be relatively quick -- although I assume this would leave any hypothetical read replica split-brained.
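For completeness, promoting a standby is the quick part (the data directory path is a placeholder; the slow part is usually repointing clients and making sure the old primary stays down):

    # on the replica, either:
    pg_ctl promote -D /var/lib/postgresql/16/main

    # or, from a superuser session (PostgreSQL 12+):
    psql -c "SELECT pg_promote();"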