this post was submitted on 05 Mar 2024
170 points (95.2% liked)

In an age of LLMs, is it time to reconsider human-edited web directories?

Back in the early-to-mid '90s, one of the main ways of finding anything on the web was to browse through a web directory.

These directories generally had a list of categories on their front page. News/Sport/Entertainment/Arts/Technology/Fashion/etc.

Each of those categories had subcategories, and sub-subcategories that you clicked through until you got to a list of websites. These lists were maintained by actual humans.

Typically, these directories also had a limited web search that would crawl through the pages of websites listed in the directory.

Lycos, Excite, and of course Yahoo all offered web directories of this sort.

(EDIT: I initially also mentioned AltaVista. It did offer a web directory by the late '90s, but this was something it tacked on much later.)

By the late '90s, the standard narrative goes, the web got too big to index websites manually.

Google promised the world its algorithms would weed out the spam automatically.

And for a time, it worked.

But then SEO and SEM became a multi-billion-dollar industry. The spambots proliferated. Google itself began promoting its own content and advertisers above search results.

And now with LLMs, the industrial-scale spamming of the web is likely to grow exponentially.

My question is, if a lot of the web is turning to crap, do we even want to search the entire web anymore?

Do we really want to search every single website on the web?

Or just those that aren't filled with LLM-generated SEO spam?

Or just those that don't feature 200 tracking scripts, and passive-aggressive privacy warnings, and paywalls, and popovers, and newsletters, and increasingly obnoxious banner ads, and dark patterns to prevent you cancelling your "free trial" subscription?

At some point, does it become more desirable to go back to search engines that only crawl pages on human-curated lists of trustworthy, quality websites?
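As a rough illustration of that idea, here is a minimal sketch of a crawler that starts from a hand-curated directory and never follows a link off the curated hosts. Everything in it is an assumption made for illustration: the `DIRECTORY` contents, the `.example` sites, and the `fetch_links` callback (which stands in for real page fetching and link extraction) are hypothetical, not any existing search engine's design.

```python
from collections import deque
from urllib.parse import urljoin, urlsplit

# Hypothetical directory: category -> hand-picked seed URLs.
DIRECTORY = {
    "Technology": ["https://tech.example/"],
    "Arts": ["https://arts.example/"],
}


def curated_hosts(directory):
    """Collect the hostnames of every site listed in the directory."""
    return {urlsplit(u).hostname for urls in directory.values() for u in urls}


def crawl(directory, fetch_links, max_pages=100):
    """Breadth-first crawl that never leaves the curated hosts.

    fetch_links(url) is a caller-supplied function returning the links
    found on a page; no network code is included in this sketch.
    """
    allowed = curated_hosts(directory)
    queue = deque(u for urls in directory.values() for u in urls)
    seen = set(queue)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            target = urljoin(url, link)  # resolve relative links
            # Links to hosts outside the directory are simply dropped.
            if urlsplit(target).hostname in allowed and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited
```

The point of the design is that the human-edited list is the only way a host gets into the index; a page on a listed site linking out to a spam farm does not pull that farm in.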

And is it time to begin considering what a modern version of those early web directories might look like?

@degoogle #tech #google #web #internet #LLM #LLMs #enshittification #technology #search #SearchEngines #SEO #SEM

(page 2) 33 comments
[–] [email protected] 2 points 8 months ago

And now with LLMs, the industrial-scale spamming of the web is likely to grow exponentially.

True, but these things can also be used by us to curate and maintain a high-quality link collection. However, I'm not sure 'pages' will be read by humans in 5 years, so I have a feeling we won't need such a collection anymore. Well, not for humans, but probably for our individual LLMs.

[–] [email protected] 2 points 8 months ago

@ajsadauskas @degoogle hopefully they don't look like Dmoz, because i still have unpleasant flashbacks of that dark time 😋

[–] [email protected] 2 points 8 months ago

I remember a time when you could buy a paper magazine every other week with curated lists of links on various topics. There were ads, but just paper ads :)

[–] [email protected] 2 points 8 months ago

@ajsadauskas @degoogle a bit of history of Yahoo here, started as a web directory https://www.wired.com/1996/05/indexweb/

[–] [email protected] 2 points 8 months ago
[–] [email protected] 2 points 8 months ago (1 children)

@ajsadauskas @degoogle So, classic mid-90s Yahoo. Or LookSmart, which was initially curated by Reader's Digest.

[–] [email protected] 2 points 8 months ago (1 children)

@ajsadauskas @degoogle I mean, we could still use all the modern tools. I'm self-hosting a SearXNG instance, and there is currently an ever-growing block list of AI-generated websites that I regularly import to keep it up to date. You could also turn it into an allow list: block all websites by default and allow them gradually.
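A minimal sketch of the difference between the two modes, in plain Python. This is not SearXNG's own configuration or API; `filter_results` and its parameters are hypothetical names, and the domain list is assumed to be a simple set of hostnames such as you might load from an imported block list.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def matches(host: str, domains: set[str]) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def filter_results(urls, domains, mode="block"):
    """Filter result URLs against a domain list.

    mode="block": drop URLs whose host is on the list.
    mode="allow": keep only URLs whose host is on the list.
    """
    if mode not in ("block", "allow"):
        raise ValueError("mode must be 'block' or 'allow'")
    keep_if_listed = mode == "allow"
    return [u for u in urls if matches(host_of(u), domains) == keep_if_listed]
```

With a block list, anything not yet listed gets through, so the list has to keep growing to stay useful. With an allow list, nothing gets through until someone has vetted it, which is closer to the old directory model.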

[–] [email protected] 2 points 8 months ago (1 children)
[–] [email protected] 1 points 8 months ago

Oooh, I like it! Thank you so much for sharing this here! :)

[–] [email protected] 1 points 8 months ago (1 children)

@ajsadauskas @degoogle it sounds a bit like Kagi's Small Web initiative and search. have you seen it? https://blog.kagi.com/small-web

[–] [email protected] 1 points 8 months ago
[–] [email protected] 1 points 8 months ago

@ajsadauskas @degoogle I've always wanted to try or contribute to one of these!

[–] [email protected] 1 points 8 months ago

@ajsadauskas @degoogle

It would be sad to go back to walled gardens like AOL, particularly since they were corporate-owned. But a sort of Kite Mark, certifying a site is free of LLMs, would be useful. Then users could choose for themselves.

[–] [email protected] 1 points 8 months ago
[–] [email protected] 1 points 8 months ago

@ajsadauskas I think GitHub's awesome lists are kind of like this. They're human-maintained catalogues of worthwhile websites on a specific topic.

[–] [email protected] 1 points 8 months ago

@ajsadauskas sounds like you want https://curlie.org/ - which seems to be up to date and interesting.

[–] [email protected] 1 points 8 months ago

@ajsadauskas @degoogle ah the good ol' days. I was a curator on Yahoo's directory for a few years, before it ended.
