This is a shame. Hosting a high visibility server is no joke, and I don’t envy the admins and the very difficult work they do. It’s simultaneously an argument for and against decentralization. For - a single instance can get knocked out without taking out the whole fediverse. Against - it seems as though high visibility communities are potentially fairly easy to target and take down.
I think that decentralization wins out here in the end, but it does feel like there may be a need for some sort of fallback mechanism to be in place at an instance/community level. I suspect this might evolve somehow over time. It would require some way to expand trust between instances and/or portability of communities (which could be fraught with user trust/data integrity issues).
If things don’t evolve, it could grow into a whack-a-mole game for bad actors, or there might need to be more investment in server infrastructure (which could work against decentralization, if only because of economies of scale).
Or maybe there’s no issue after all? I’m just imagining potential implications of a scaling fediverse - it’s fascinating and exciting stuff!
Thoughts?
This is the primary reason why I’m OK with my instance not growing massively. We have 10K people and pretty good traffic, without overloading us or making us too much of a target. We still get new users since we allow registrations, but the application requirements keep the quality up.
I’m realizing that I signed up for a probably-at-risk instance (lemmy.ml). I’m quite left but not necessarily an anarchist so it would seem applying to lemmy.dbzer0.com wouldn’t be a good move. (But I did enjoy reading your application requirements!) Recs on other small but reliable instances?
You don’t need to be an anarchist to apply to lemmy.dbzer0.com. Just follow the rules.
Absolutely makes sense. If Lemmy is going to have any truly large communities though, investment in infrastructure/ops as well as function/moderation will absolutely be needed. (It’s an ‘if’, of course.)
Time will tell how the community will want to lead it.
a single instance can get knocked out without taking out the whole fediverse
Honestly, it may as well have in this case. LemmyWorld is the de facto “hub” for basically the entire Threadiverse right now. All the major communities are seeing the most activity through LemmyWorld. While I’m subscribed to a lot of communities from other instances, sometimes duplicates of ones found on LemmyWorld, losing LemmyWorld would still take out a huge chunk of the content that I’m trying to see.
I really do wish that more specialized instances would sprout up and that some of these communities could cluster together across multiple pockets of the Threadiverse. I feel like this makes it less likely to lose huge chunks of content, and also leaves fewer large targets for somebody to want to attack in the first place.
You don’t necessarily need to centralize to defend against DDoS or similar attacks. You can add things like Cloudflare for DDoS mitigation, a CDN, and maybe something like Kubernetes for horizontal scaling of servers (spinning up more servers to handle extended load) transparently behind the scenes. This can also get you the benefit of low geographical latency, e.g. a load balancer fetches your data from the geographically closest replica of a database.
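To make the “closest replica” idea concrete, here is a minimal Rust sketch of the selection step a load balancer might perform. The regions, endpoints and latency numbers are invented for the example, and this isn’t anything Lemmy ships today.

```rust
// Sketch only: pick the read replica with the lowest measured latency to the
// requesting user. In reality the latency would come from real measurements
// (or a geo-IP lookup); here it is hard-coded for illustration.
struct Replica {
    region: &'static str,
    endpoint: &'static str,
    latency_ms: u32, // hypothetical measured RTT from this user
}

/// Choose the replica with the smallest latency for this request.
fn pick_replica(replicas: &[Replica]) -> &Replica {
    replicas
        .iter()
        .min_by_key(|r| r.latency_ms)
        .expect("at least one replica configured")
}

fn main() {
    let replicas = [
        Replica { region: "eu-west", endpoint: "db-eu.example.internal", latency_ms: 5 },
        Replica { region: "us-east", endpoint: "db-us.example.internal", latency_ms: 90 },
    ];
    let closest = pick_replica(&replicas);
    println!("route this user's reads to {} ({})", closest.endpoint, closest.region);
}
```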
Of course, all this adds up in terms of cost, but I think this might be worth it for the largest instances. I suppose that can still be considered centralization.
If we wanted to encourage many small instances instead, perhaps there could be a transparent load-balancer layer for the fediverse that instances could sign up for, managed by a devops group. Alternatively, Lemmy could have built-in load-balancing, caching, etc. as part of its codebase that instance operators can set up with their own accounts at Cloudflare, etc.
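As a rough illustration of what “built-in caching” could look like at its simplest, here is a toy in-memory response cache in Rust. The endpoint path and TTL are placeholders, not Lemmy’s actual behaviour.

```rust
// Toy sketch: cache rendered responses for hot, anonymous endpoints for a
// short TTL so repeated hits don't touch the database. Path and TTL are
// placeholders for illustration.
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct ResponseCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>, // path -> (stored_at, body)
}

impl ResponseCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Return a cached body only if it is still fresh.
    fn get(&self, path: &str) -> Option<&str> {
        self.entries
            .get(path)
            .filter(|(stored_at, _)| stored_at.elapsed() < self.ttl)
            .map(|(_, body)| body.as_str())
    }

    fn put(&mut self, path: String, body: String) {
        self.entries.insert(path, (Instant::now(), body));
    }
}

fn main() {
    let mut cache = ResponseCache::new(Duration::from_secs(30));
    cache.put("/api/v3/post/list?sort=Hot".into(), "...rendered JSON...".into());

    // A second request within the TTL is served from memory, not the database.
    if let Some(body) = cache.get("/api/v3/post/list?sort=Hot") {
        println!("cache hit: {body}");
    }
}
```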
Agreed. Ultimately, that’s the point. There are solutions (with ongoing vigilance required), but they come with an ongoing cost, be it server infrastructure or human resources.
I think the federated load balancer might be interesting, but I expect there are many pitfalls that need to be considered and addressed with respect to security, trust, and integrity of data.
Anyway, it’s amazing to see this all grow and evolve.
Definitely, very exciting times!
Yeah everyone using Cloudflare is definitely centralisation, but maybe a kind of centralisation that allows for easier switching to something else if Cloudflare gets too crazy.
DDoS is a war of attrition - and the best way to win a war of attrition is to make it cost the attacker much more than $1 to make you spend $1, and to be able to outspend the attackers (e.g. the whole community bands together to support the victim against the attacker). I think the best response depends on who is attacking.
Network-level DDoS is likely using stolen bandwidth - but the person directing the attack is probably paying someone for the use of it (i.e. they didn’t compromise the equipment themselves; someone else builds botnets and rents them out). If you can identify what traffic is part of a DDoS, you can track down where it is coming from and alert the owner of the network it is coming from, which hurts the person providing the service to the attacker quite a lot. If I have a reputation of “attack me on someone else’s behalf and I’ll cost you a significant part of your business that will take months to build back up”, then you are not going to offer that service cheaply, or even at all.
Application-level DDoS usually relies on amplification of cost - the attacker does something relatively inexpensive (like sending a packet to open a connection), and it makes you do something really expensive involving databases, disk IO, etc. A good mitigation is to redesign the API to flip that on its head, so the requester has to do something expensive and the server does something relatively cheap. There is an open issue about using Hashcash to do just that: https://github.com/LemmyNet/lemmy/issues/3204 - the downside is that it forces users (even on mobile devices) to use more compute/power for every request to Lemmy, but I think there is a balance that can be struck where it isn’t too bad for users yet makes that type of attack infeasible.
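For anyone unfamiliar with how a Hashcash-style scheme flips the cost: the client has to grind through nonces until a hash meets a difficulty target, while the server verifies with a single hash. A rough Rust sketch follows (using the sha2 crate; the difficulty, encoding and challenge string are arbitrary choices for the example, not the design discussed in the linked issue).

```rust
// Rough illustration of a Hashcash-style check: the client searches for a
// nonce (expensive), the server verifies with one hash (cheap).
use sha2::{Digest, Sha256};

/// Count leading zero bits of a hash.
fn leading_zero_bits(hash: &[u8]) -> u32 {
    let mut bits = 0;
    for byte in hash {
        if *byte == 0 {
            bits += 8;
        } else {
            bits += byte.leading_zeros();
            break;
        }
    }
    bits
}

/// Client side: brute-force a nonce so that SHA-256(challenge:nonce) has at
/// least `difficulty` leading zero bits. Cost grows roughly as 2^difficulty.
fn solve(challenge: &str, difficulty: u32) -> u64 {
    let mut nonce = 0u64;
    loop {
        let digest = Sha256::digest(format!("{challenge}:{nonce}").as_bytes());
        if leading_zero_bits(&digest) >= difficulty {
            return nonce;
        }
        nonce += 1;
    }
}

/// Server side: one hash, constant cost, however hard the client worked.
fn verify(challenge: &str, nonce: u64, difficulty: u32) -> bool {
    let digest = Sha256::digest(format!("{challenge}:{nonce}").as_bytes());
    leading_zero_bits(&digest) >= difficulty
}

fn main() {
    let challenge = "per-request token issued by the server"; // placeholder
    let nonce = solve(challenge, 18); // noticeable but tolerable client work
    assert!(verify(challenge, nonce, 18));
    println!("valid nonce: {nonce}");
}
```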
I think this might be interesting:
permit separate, low-traffic, highly rate-limited, auth-only servers. They would only accept connections from whitelisted partner servers, because they only handle auth.
any partner server can authenticate a user and handle content for the server/auth-server pair, but only does so under certain conditions (determined by the partner - all the time, when a ping API call takes > n seconds, or manually, for example)
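Concretely, the “only under certain conditions” check might be as simple as the partner probing the primary with a timeout before deciding to serve content itself. A rough Rust sketch, where the host name, port and n-second threshold are placeholders rather than anything Lemmy implements today:

```rust
// Sketch of a failover check: a partner server probes the primary and only
// serves content itself if the primary hasn't answered within n seconds.
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Returns true if a TCP connection to the primary opens within `timeout`.
fn primary_reachable(primary: &str, timeout: Duration) -> bool {
    match primary.to_socket_addrs() {
        Ok(mut addrs) => addrs
            .next()
            .map(|addr| TcpStream::connect_timeout(&addr, timeout).is_ok())
            .unwrap_or(false),
        Err(_) => false,
    }
}

fn main() {
    let primary = "lemmy.world:443"; // placeholder primary
    let n_seconds = Duration::from_secs(3); // the "n seconds" from the comment

    if primary_reachable(primary, n_seconds) {
        println!("primary is up: stay passive / keep proxying");
    } else {
        println!("primary unreachable: partner serves content for the pair");
    }
}
```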
The problem with these types of redundancy schemes is that it simply takes an Internet backbone hiccough (or an AWS fuck-up) to cause there to be multiple primaries (i.e. lemmy.world is still online, but some portion of the internet can’t see it, so a replica promotes itself to primary, people use both, and then how do you reconcile it?).
This is not even beginning to talk about the nightmare scenarios possible if someone hacks a replica.
Edit: Still, this is a good thought and similar to how some actual software packages do things.
A lot of those issues of ‘multiple primaries’ can be resolved with intelligent data types and actions. That is, if we have a notion of how the data is organized, a lot of decisions can be made a priori. Ones that can’t can be made read-only during a split.
Comment groups are mergeable sets. Any unique comment is a valid comment.
For any individual comment, a tombstone causes the comment to be hidden (and ideally deleted). Edits are latest-wins.
A lot can be sorted out that way - enough to be usable. Some databases even support that at the DB level.
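As a toy Rust illustration of those merge rules (union of comments, tombstones win, edits are latest-wins); this is a sketch of the idea, not Lemmy’s actual data model:

```rust
// Toy merge of two replicas after a split heals:
//  - the comment set is merged as a union (any unique comment is valid),
//  - a tombstone on either side hides/deletes the comment,
//  - concurrent edits resolve latest-timestamp-wins.
use std::collections::hash_map::Entry;
use std::collections::HashMap;

#[derive(Clone)]
struct Comment {
    body: String,
    edited_at: u64, // logical or wall-clock timestamp of the last edit
    deleted: bool,  // tombstone
}

type Replica = HashMap<u64, Comment>; // comment id -> comment

/// Merge replica `b` into `a`.
fn merge(a: &mut Replica, b: &Replica) {
    for (id, theirs) in b {
        match a.entry(*id) {
            Entry::Vacant(slot) => {
                // Unique comment only one side saw: keep it.
                slot.insert(theirs.clone());
            }
            Entry::Occupied(mut slot) => {
                let ours = slot.get_mut();
                // Tombstone on either side wins.
                ours.deleted = ours.deleted || theirs.deleted;
                // Otherwise the most recent edit wins.
                if theirs.edited_at > ours.edited_at {
                    ours.body = theirs.body.clone();
                    ours.edited_at = theirs.edited_at;
                }
            }
        }
    }
}

fn main() {
    let mut primary: Replica = HashMap::new();
    primary.insert(1, Comment { body: "hello".into(), edited_at: 10, deleted: false });

    // Another instance kept accepting writes during the split:
    let mut other: Replica = HashMap::new();
    other.insert(1, Comment { body: "hello (edited)".into(), edited_at: 20, deleted: false });
    other.insert(2, Comment { body: "posted during the split".into(), edited_at: 15, deleted: false });

    merge(&mut primary, &other);
    assert_eq!(primary[&1].body, "hello (edited)"); // latest edit wins
    assert_eq!(primary.len(), 2);                   // comment sets are unioned
}
```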
Can’t post to OP… but… somebody’s just scared.