No clicks, no content: The unsustainable future of AI search
Personally, however, I don't want to be lured into some web store when I'm looking for some vaguely related information. Luckily, there's tons of information on the web provided not by commercial entities but by volunteers: Wikipedia, forum users (e.g. StackOverflow), blogs. (Sure, some people run blogs as a source of income, but I think that's a small percentage of all bloggers.)
Have you ever looked for a specific recipe just to end up on someone's cooking website where they first tell you their life story before - after scrolling for half a day - you'll finally find what you've actually come there for (the recipe!) at the bottom of their page? Well, if that was gone, I'd say good riddance!
"But you don't get it", you might interject, "it's not that the boilerplate will disappear in the future, the whole goddamn blog page will disappear, including the recipe you're looking for." Yeah, I get it, sure. But I also have an answer for that: "oh, well" (ymmv).
My point is, I don't mind if less commercial stuff is going to be sustainable in a future version of the web. I'm old enough to have experienced the geocities version of the early web that consisted of enthusiasts being online not for commercial interests but for fun. It was less polished and less professional, for sure, but less interesting? I don't think so.
If new people don't discover your site with useful user-created content, they won't contribute to it. You're also cutting off the pipeline for recruiting new users to your forum or Q&A site.
This trend was happening before LLMs entered the arena.
Once, before I realized this, I recommended users of a forum use Discord. The impact was severe and fortunately brief. We all realized we would not be leaving the usual, often high-value info for others and ourselves to benefit from in the future.
We unwound that mess and now carry on in the usual way.
Discord has carved out a huge chunk of discussion people will wish was available in the future.
I haven't really tried one as a QA or knowledge sharing site, perhaps they're much less good at that.
The excellent communities I have been a part of can be searched. People can read them and learn. The Discord ones, unless they publish to a wiki or something, just don't exist.
Discord is great for chatting with your friends, gaming, etc. but man it's a horrible knowledge repository.
horrible knowledge repository
I don't disagree, but that does not change the fact that people have moved from sites like SO to Discord for this purpose.
There are Q&A channels, so not everything is chat, but Discord search is abysmal
Slack is another place where former SO content / answers are happening. Discourse too. The tl;dr is that it has become more fragmented, for better or worse
SO has a related problem to Reddit. Some mods high on their status and power
Every time I boot into Manjaro to do some gaming, almost always there’s a new update for Discord available and guess what? The updates to Manjaro are always lagging behind a few days to a week, and Discord won’t run with a slightly out of date client.
The only way to get it working is using the snap, and who doesn’t want to use some 3rd party package manager just to send some kbit/s voice data?
Additionally the interface sucks and is really bloated
But that's just like, my opinion, dude.
If there's an old question the most upvoted answer will be at the top. Better solutions are often available if the previous answer was 10 years ago, but they will be buried.
Solution is obviously to scroll down as well as read the comments, but that can be time consuming.
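To make the burying effect concrete, here is a minimal sketch, assuming a toy scoring rule where votes are discounted by the age of the answer. This is not how Stack Overflow actually ranks anything; the `Answer` class, the `half_life_years` parameter, and the sample numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    label: str
    votes: int
    age_years: float


def by_votes(answers):
    # Plain "most upvoted first" ordering, as described above.
    return sorted(answers, key=lambda a: a.votes, reverse=True)


def by_decayed_votes(answers, half_life_years=3.0):
    # Hypothetical alternative: halve the weight of an answer's votes
    # for every `half_life_years` that have passed since it was posted.
    def score(a):
        return a.votes * 0.5 ** (a.age_years / half_life_years)

    return sorted(answers, key=score, reverse=True)


# Made-up example data.
answers = [
    Answer("old accepted answer", votes=900, age_years=10.0),
    Answer("newer, better answer", votes=150, age_years=1.0),
]

print([a.label for a in by_votes(answers)])
print([a.label for a in by_decayed_votes(answers)])
```

With these numbers the old answer wins on raw votes, while the newer one comes out on top once age is discounted. Whether a three-year half-life is sensible is a separate question; it is only there to show the mechanism.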
Stack Overflow for example is in a steep decline, only a small fraction of questions are posted today compared to the peak.
...which might be beneficial. A problem they'd been trying to deal with for over a decade was the massive influx of low quality duplicate and "do my homework for me" questions from people who don't even bother looking for a solution. If they've all moved off to AI things, problem solved and maybe SO can return to its high-quality origins?
Stack Overflow for example is in a steep decline
This is because they're a big bunch of assholes and no one wants to deal with that. Their decline started way before ChatGPT came in.
The people who made SO are not going anywhere, there will always be a SO, a wikipedia, a search engine. Let it evolve to the next thing.
People will have less or no motivation to create them, because well, why would they? It will be just a food for AI of some corporation.
And more importantly, people won't be finding and joining communities that produce the websites like stack overflow.
It was nice while it lasted, but likely it will be something that existed only for one generation.
The monkey-see-monkey-do effect, where people start these activities because they see others doing them, will disappear entirely.
People will have less or no motivation to create them
Not sure if we surf the same internets... In the web I am surfing, the more "motivation" (trying to get ad revenue) the author has, the crappier the content is. If I want to find high quality information, invariably I am seeking authors with no "motivation" whatsoever to produce the content (wikipedia, hacker news, reddit with a heavy filter etc.) I'm pretty sure we would be better off if the whole ad industry vanished.
It will be just a food for AI of some corporation.
Food that said corporation makes a profit off while paying the author nothing.
And that's assuming a world where people only ask LLMs and don't care about the provenance or trustworthiness of information at all. Which seems unlikely, even conspiracy nuts have some sources they trust more than others. The web will be fine. It will change drastically with the death of click-based advertising, but I don't see a future where it disappears
And that's assuming a world where people only ask LLMs and don't care about the provenance or trustworthiness of information at all. Which seems unlikely, even conspiracy nuts have some sources they trust more than others.
They will just have one model that they trust more because (a) it aligns with their views, or (b) it's a sycophant and agrees with anything and everything.
They definitely are not the people most likely to care about clicking "source links".
But unfortunately sites that generate high quality information (eg independent research, reviews, journalism) will also struggle and be more reliant on paywalls and subscriptions.
In the short term it will feel liberating; in the long term it will kill the web.
In the short term it will feel liberating
They won't feel so liberated when they find ads embedded in the model's response in ways that make it difficult to uBlock.
Luckily, there's tons of information on the web provided not by commercial entities but by volunteers
The question is: is there content which is useful, but not provided by volunteers? We see more and more content behind paywalls, and it is a loss for many people who can't pay, because they won't be able to access the same content for free supported by ads.
So the result is poor people are going to lose access to certain contents, while well to do people will still have access.
many people who can't pay
Everybody is already paying for Spotify and for Netflix. They can pay for mass syndication of textual content. But it needs to be like Spotify or YouTube, where everything and anything goes. Poor people always had access to read newspapers.
On the other, I think it's unlikely the fun old geocities era comes back.
We'll probably get stuff that looks like it, but it's hawking nationalist revisionist propaganda instead of occult shapeshifting magic lifted from Sabine Baring-Gould, and a thousand Temple OS-inspired clones instead of python.
Lots of people might be willing to run websites for fun or personal satisfaction or whatever, but how many people will continue to be willing to do so when they don't actually get to present the content to visitors and it's instead just regurgitated by AI? Half the fun of hosting your own website is personalizing it and choosing how to share the content. Even people blogging for fun put a lot of thought into their posts on how to phrase an argument or tell a story. But what's the point when nobody will ever see your actual post, just your thoughts rearranged and presented by AI? Maybe some people only care about the information being out there in any form, but I'd be willing to bet that's yet a smaller subset of even the people who would contribute in a return to geocities version of the web.
Recently I've noticed google is even less effective than normal because it's turning up ad-filled pages that vaguely relate to the query but are clearly entirely generated by an LLM that also doesn't know what you're looking for.
So, we went from 95% crap to 100% crap over the course of about two years. There is obviously still stuff out there worth finding, but I can't imagine LLMs are going to get us there.
But great content creators will want to be rewarded one way or another. Book writers get paid, movie makers as well, so why shouldn't those who share their wisdom in a blog post? If someone is making money off your content and you aren't, you will not be happy. When it's not even attributed to you, it's even worse.
SEO is a response to Google's incentives, and Google can fix it if it wants.
the geocities version of the early web that consisted of enthusiasts being online not for commercial interests but for fun
LLMs are making many of the enthusiasts who were online just "for fun" feel sick for contributing to their training set.
someone's cooking website where they first tell you their life story before
I haven't seen a recipe page without a "Jump to Recipe" page button in forever.
Not saying this is good, but it’s the reason behind it.
We might see a resurgence in curated content but I have my worries. Google gets worse and worse but also traditional curated sites have started simply reposting what's trending on Twitter.
ChatGPT, Google, and its competitors are rapidly diverting traffic from publishers. Publishers are fighting to survive through lawsuits, partnerships, paywalls, and micropayments. It’s pretty bleak, but unfortunately I think the situation is far worse than it seems. The article focuses mainly on the publishing industry, news and magazine sites that rely primarily on visits to their sites and selling ads.
I'm not sure where this comes from. The way forward for publishers of content like newspapers is subscription fees and has been for a long time.
The New York Times, The Wall Street Journal, and The Economist revenues are subscription fee dominant, for example.
https://www.pewresearch.org/journalism/fact-sheet/social-med...
Folks who want more traditional journalism will pay for it.
Folks who want more traditional journalism will pay for it.
If that is a tiny minority of people then there won't be a critical mass available to pay real journalists. No journalist can afford to work on long form investigative stories on minimum wage.
Even relatively straightforward legwork on a completely local story requires some driving around doing interviews. A whistleblower isn't going to just do a Zoom call with a journalist. A journalist can't get a first-hand account of an event from watching a webcam.
Good journalism isn't cheap. It doesn't have to be lavishly expensive but it's definitely not cheap. If only the New York Times can pay to hire journalists there won't be any meaningful journalism because they simply cannot scale to cover the world let alone the country.
Conversely a lot of 'news' in its raw form is posted to social media.
What you're talking about is long form journalism which is expensive and not popular with the 30 second soundbite population we've grown.
There you go, realtime. As soon as the people you want it to parrot have posted about it.
You can find examples of national papers of record having successful subscription models for text content. If you're only subscribing to one paper, it'll be one of those. And you can find 1-5 person outfits with strong personal brands (often built in other media) and a loyal following, who specifically want a given person's take on things and want to read everything they write. Basically financials that resemble the "1000 true fans" model.
But between those extremes it is a lot harder to make subscriptions work.
Mostly useful when you're only looking for the presence of words or terms in the output (including the presence of related words), rather than a coherent explanation with the quality of human-written text.
Sometimes the response is accidentally a truthful statement if interpreted as human text. The quality of a model is judged by how well-tuned it is for increasing the rate of these accidents (for the lack of a better word).
[1] EDIT: In the sense of "semantic web"; not in the sense of "actually understanding meaning" or any type of psychological sense.
the trade-off is that instead of returning links it returns a generated soup of words that are semantically close to your "query".
I get links in my responses from Gemini. I would also not describe the response as soup, the answers are often quite specific and in the terms of my prompt instead of the inputs (developer queries are a prime example)
I'll stop calling them a soup when the part that generates a human-readable response is completely separate from the knowledge/information part; when an untrained program can respond with "I don't know" due to deliberate (/debuggable) mapping of lack of data to a minimal subset of language rules and words that are encoded in the program, rather than having "I don't know" be a series of tokens generated from the training data.
My personal experience has led me to increase my monthly spend, because the ROI is there, the UX is much improved
Hallucinations will never go away, but I put them in the same category as clicking search results to outdated or completely wrong blog posts. There's a back button
I'm keen to build an agent from scratch with copilot extension being open source and tools like BentoML that can help me build out the agentic workflows that can scale on a beefy H100 machine
Content can’t be free if you want it to be of any quality.
Likewise, a lot of content produced with commercial interest in mind is total garbage (this is e.g. where the term "click-bait" originates from).
There's always quality stuff and crap, no matter whether it's been produced for free or not.
Or worse, if its content were distributed in short videos: "Want to know what's that giant fireball in the sky? Watch until the end!", with a like-and-subscribe animation covering the bottom 20% of the video every 5 seconds.
Wikipedia can also only work because the upstream scientific and academic work to produce what gets posted there is largely subsidized. Wikipedia posters and maintainers do not have to pay the true cost of the content they are posting and very little of it is original.
This model won’t work for, say, journalism, which is very expensive. It won’t work for difficult polished software products. It won’t work for truly original artistic or literary work which takes tremendous amounts of time to produce. If, for example, authors can’t charge for a novel, then only people with trust funds or who are independently wealthy can afford to invest the time it takes to write a book.
The people pointing out how bad ad supported content is are proving my point, which was that there must be some kind of economic model. If there is no working one, content producers default to ads which leads to enshittification.
Content can’t be free if you want it to be of any quality.
There are lots of smaller local websites which can produce useful local content because of ad support. Those may not have enough subscribers to continue behind a paywall.
The big channels nowadays usually have 2 websites: one that is free and full of ads and pop-ups with very superficial news (seemingly written by interns) and one with actual quality analysis, journalism etc. that allows you access to 3 articles a month before you need to pay, or something of that sort.
I think the “serving ads” business hasn’t worked for a while.
Do I as a user have to do a micro transaction whenever an LLM generates an answer on one of those paywalled articles? Because as a user, I do not wish to read the quality journalist analysis, I wish for it to be part of the LLM answer that is tailored towards me.
But this is a huge simplification of course, and another thousand problems arise from this model. So I have no idea what’s the “good enough” solution we’ll head towards, or whether the web will change completely from this.
Many widely used machine-learning models rely on copyrighted data. For instance, Google finds the most relevant web pages for a search term by relying on a machine learning model trained on copyrighted web data. But the use of copyrighted data by machine learning models that generate content (or give answers to search queries rather than link to sites with the answers) poses new (reasonable) questions about fair use. By not sharing the proceeds, such systems also kill the incentives to produce original content on which they rely. For instance, if we don’t incentivize content producers, e.g., people who respond to Stack Overflow questions, the ability of these models to answer questions in new areas is likely to be lower. The concern about fair use can be addressed by training on data from content producers who have opted to share their data. The second problem is more challenging. How do you build a system that shares proceeds with content producers?
https://www.gojiberries.io/generative-ai-and-the-market-for-...
There's a simple solution. People that publish things can put up a paywall and people can pay what the content is worth.
The thing that AI endangers is not valuable content, it's the SEO clickbait cashcow, and as far as I'm concerned, the faster AI kills that off, the better.
That monetization model is corrupt as hell, produces all sorts of perverse incentives, and is the epitome of the enshittification of the web.
Burn, baby, burn.
Valuable content is endangered because writers feel demotivated if their material is just stolen by overfunded big corporations.
Paywalls only work for known publications and not for someone who writes the perfect tutorial on how to solve boot issues in Debian. Why would anyone write that if it's just stolen and monetized without attribution?
Long term that vendor lock in will go away and prices will go down to something reasonable.
Long ago CPUs were super expensive too, now they're so cheap we put them in toothbrushes
Repeat this loop a million times with diverse students and you get a distribution of what kind of explanations work. The model gets better at explaining through its own experience.
When AI provides a response it is possible to judge that response in hindsight. You look at the next 20 messages or sessions from next days and judge based on what followed. The chat logs provide a way to do long range credit assignment.
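A rough sketch of what judging a response in hindsight could look like, assuming a chat log stored as `(role, text)` pairs and a crude keyword check standing in for a real quality signal. The `POSITIVE` and `NEGATIVE` phrase lists, the `discount` factor, and the sample log are all hypothetical; an actual system would need something far more robust than substring matching.

```python
# Hypothetical feedback phrases; a real signal would be much richer.
POSITIVE = ("thanks", "that worked", "got it")
NEGATIVE = ("still broken", "doesn't work", "that's wrong")


def hindsight_score(log, index, window=20, discount=0.9):
    """Score the assistant reply at log[index] from the user messages after it.

    Later user messages count for less (multiplied by `discount` each time),
    so feedback close to the reply dominates the score.
    """
    score = 0.0
    weight = 1.0
    for role, text in log[index + 1 : index + 1 + window]:
        if role != "user":
            continue
        lowered = text.lower()
        if any(p in lowered for p in POSITIVE):
            score += weight
        if any(n in lowered for n in NEGATIVE):
            score -= weight
        weight *= discount
    return score


# Made-up conversation.
log = [
    ("user", "My Debian box won't boot."),
    ("assistant", "Try reinstalling GRUB."),
    ("user", "Still broken after that."),
    ("assistant", "Check the fstab UUIDs."),
    ("user", "That worked, thanks!"),
]

print(hindsight_score(log, 1))  # first suggestion, judged by what followed
print(hindsight_score(log, 3))  # second suggestion
```

Here the first suggestion ends up with a slightly negative score because the immediate follow-up was a complaint, while the second scores positively. Note that the first reply still picks up partial credit from the later "thanks", which is exactly the kind of ambiguity long-range credit assignment has to deal with.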
The modern abuse of copyrights by the likes of Disney does not negate this otherwise wonderful institution.
Copyright is a huge benefit to society.
Is it? We don't have the technology to duplicate the Earth in, say, 1776 but without copyright, and run an experiment, so all we can point at is a logical argument that we need to incentivize writers and artists and creators. Which I mean, sure. I want to write the next great American novel and not have to work for the rest of my life, and for my children and their children to not have to work either. Is that really for the betterment of society though? You can give some additional logical arguments in favor of that, but without Earth duplication technology, there's nothing that really constitutes real actual proof. The closest comparison we have that I know of is to look at China, which has far weaker intellectual property laws, and, well, they haven't fallen into lawless anarchy.
It's not lawless anarchy, it's just less of the works that copyright rewards (you get more of what you reward). So all those free books/music/movies that are made each year you still have, but you have fewer professionals taking a year off to go write their book. You have less decent funding for educational books. You have less sharing of knowledge and ideas, and I would say that makes society worse. If people were going to release it for free, they would be doing that today. People just don't work like that.
Very few creative works require a kazillion-dollar budget, and presumably many of the current ones are subject to technical improvements making them accessible (you could probably produce a film/series with "1995 broadcast TV" production quality with consumer equipment today).
Copyright enables the runaway success "I made enough selling records/prints of my painting/copies of my novel that none of my children for five generations have to work." But we only say that to a handful of people per year. A reasonable grant programme could say "I can spend a couple years touring the country playing small town amphitheatres, writing my dream novel, or trying to put together a movie with friends and still be able to eat at the end of the day", to thousands of creative types every year.
There is a renaissance of extremely high quality YouTube videos from creators with very few subscribers. Particularly in Math and Science content.
But the general web is full of a lot of really bad websites that effectively just waste people's time.
When it comes to “I have a specific question I need answering and then I’m done” the Web feels horribly clumsy and full of absolute garbage to wade through because they don’t want you to get the answer and go away. They want to milk your eyeballs for impressions and attention.
I prefer a million times today's web, which serves everybody and where I can find all kind of thoughts and ideas, without restriction. You just need to make an effort to find it. I prefer a million times a well stocked supermarket with all the ingredients I need to make anything I want, rather than a restaurant which serves only one meal made perfectly.
.. mainly that’s because that’s the only game left
Money is not the only motivation to create.
Maybe we will just go back to pay content as it was before the Internet era? Magazines and such
I enjoy reading some very niche magazines and, at the moment, I cannot tell if it's AI generated or not. So, as long as the reader is entertained, magazines will do just fine.
Either that, or we will develop some sort of proof that the content is human generated, not AI.
If this book came out today, in 2025, how would you know that the 420 pages are actually worth your time and not just a bunch of hallucinated LLM slop?
I've been wondering whether Wikipedia and libraries in 2030 will be in a better overall place quality-wise, or will just be overrun.
The last few times I looked for information on YouTube (by typing a keyword phrase or question instead of looking up a specific channel/creator), most of the top results were AI-narrated presentations. Some of those were filled with comments of people correcting obvious mistakes in the content (which as a layperson I would not have seen as mistakes)
but i wouldn't mind getting back to the internet of the 80/90s where you could easily find more genuine content and less aggregators, replicators, marketeers and clickfarms. if that's "killing the internet" then it couldn't happen soon enough (i guess marketeers will not go away no matter what, that's a given).
the fear of decline of original content doesn't seem serious. much of what there is now is endless regurgitation anyway. while most of the free stuff nowadays is indeed just noise, the most valuable, original and quality stuff is free, has always been, and it's there. people have been contributing interesting stuff for multiple reasons and in multiple ways for decades, and still do; it is just buried under tons of rubbish. i see no reason why they would stop. if anything, a less noisy internet could be an incentive, and if gaps in knowledge form that will be even more reason to share and contribute, and things like stackoverflow will come back once llms become obsolete enough.
I believe there's strong overlap between technically minded people and ad blockers. Maybe the challenge is that AI search appeals to less technically-minded people, who would have otherwise been exposed to ads?
With that said there are some queries AI does better than search engines. If I have something I'm trying to remember from the distant past with poorly refined definitions I can iterate to a solution much better than reading spam filled sites.
Every single time I open up google and try searching for the information... I get frustrated being forced to do the agentic work and sift through the crap... and I fall back on ChatGPT or Gemini.
People want signal and answers, not the 10 blue links as this post tries to argue.
The other thing is this: most high quality and valuable content can now be produced by individuals and finds distribution on social networks where they can occasionally charge for it as well. The drawback to google indexing those links was also that SEO-companies started targeting these mediums (eg: reddit, medium, forums). We needed an early regulation to minimize the needless hacking of SEO, but we let the market play it out, so it should still play out.
If LLMs ruin the economic viability of corporate blogspam, that's a net positive for society in my eyes. One of the few net positives we can expect from the AI bubble, as far as I can tell.
Of course, the new problem is that we have a bunch of LLMs trained on corporate blogspam, producing low-quality information that only feels plausible because of its correct grammar and neutral voice.
I think I'm going to miss the world where I could more easily trace the provenance of information.
Good riddance.
We can be forgiven for not seeing how social media was going to become weaponized against us, how streaming's promise of no ads was only temporary. There's no excuse for not seeing it coming this time.
The AI companies in turn will hoover in the deluge because they need something new to train their models with, embedding the ad copy deeply into the model itself.
Your local AI of the future will be just as ad-riddled.
Just wait until the foundational models are all fed with increasing amounts of ads.
Which foundational models?
Not all model providers are in the ad business and while the chances of building a supercomputer in your basement to train such a model are zero, some of the companies that build such models aren't exactly huge. Mistral for example is (according to Wikipedia) 150 people - this means that a company that can make their own model from scratch doesn't need to be some giant corporation. Which in turn means that it is possible for new companies to pop up, if there is a need for them - in this case, if some models become ad-infested, chances are other models will use their ad-free status as a feature.
And this is assuming only companies make such models. But some days ago i was reading here in HN about a new foundational model being trained by ETH Zurich and somehow i doubt a public university will inject ads in it.
• Small models have performed really well in recent years.
• Newer phones will have more RAM.
• Private LLMs do not necessarily need to run locally.
for 20 years, websites that released idol nudes dominated (https://www.change.org/p/allkpop-com-shut-down-allkpop )
then the first wave of slop that figured out you can buy links (https://x.com/paulkimio/status/1550532282288508929)
and now for the past three and a half years it's been indian news websites that dominate
good riddance to the search-powered internet and seoers, i don't mind bleeding-- i've been bleeding this whole time :)
Note this is a personal blog without any seo incentives.
AI is killing the web – can anything save it? - https://news.ycombinator.com/item?id=44623361 - July 2025 (448 comments)
Other related threads?