I wonder if it's possible to bring public opinion into the error function: find weights for ChatGPT such that the next token is predicted correctly, but also such that the overall output falls within the average public opinion. But then, is that a "good enough" metric?
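To make that idea concrete, here is a minimal sketch of a loss with an extra "public opinion" penalty. It assumes we could somehow measure how the model's answers to a survey question are distributed (`model_opinion`) and compare that to the public's answers (`public_opinion`). Those names, the `weight` knob, and the choice of KL divergence as the penalty are all assumptions for illustration, not how ChatGPT is actually trained.

```python
import math

def cross_entropy(token_probs, target_index):
    """Usual next-token loss: negative log-probability of the correct token."""
    return -math.log(token_probs[target_index])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions over the same answer options."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def combined_loss(token_probs, target_index, model_opinion, public_opinion, weight=0.1):
    """Next-token loss plus a penalty for drifting away from the public's answers.

    `weight` (hypothetical) controls how much opinion alignment matters
    relative to predicting the next token correctly.
    """
    prediction_term = cross_entropy(token_probs, target_index)
    opinion_term = kl_divergence(public_opinion, model_opinion)
    return prediction_term + weight * opinion_term

# Same next-token prediction, two different opinion profiles.
aligned = combined_loss([0.5, 0.5], 0, [0.3, 0.7], [0.3, 0.7])
drifted = combined_loss([0.5, 0.5], 0, [0.9, 0.1], [0.3, 0.7])
print(aligned < drifted)
```

The penalty is zero when the model's answer distribution matches the public's and grows as they diverge, which is exactly why the "good enough" question matters: the loss only rewards matching the average, whatever that average happens to be.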
Algorithmic bias is typically controlled through additional human-developed layers that counteract the bias picked up when you ingest large datasets for training. But that's extremely labor-intensive. I've seen some interesting hypotheticals where algorithms designed specifically to identify bias are used to tune layers with custom weighting, in an attempt to pull bias back down to acceptable levels. Even then, we'll probably need to watch how this changes the language used about the groups that are subject to bias.
I think the trouble with human oversight is that it's still going to carry whatever bias the overseer has.
AI is programmed by humans or trained on human data. Either we're dealing in extremes, where it's impossible not to have bias (which is important framing for measuring bias), or we're talking about how to minimize bias rather than eliminate it entirely.
"Particularly underrepresented groups include Mormons and those over 65..."
What a disaster! I hope someone gets on that ASAP! /s
I don't see how being misaligned with public opinion = bias.
The public is already hugely biased; we surrender the general education of the entire adult population to news media, social media, and the entertainment industry, all of which sway public perception for their own financial and political gain.
Tbh the "public" as a mass entity is going to be more wrong than a language model; I see who y'all vote for, I wouldn't trust you with anything 🫠
Bias shouldn’t exist in a language model. Human beings continue to complicate reality because of boredom.
I'm lost here, what are you trying to say?
@Gaywallet I'm coming to think that expecting models to produce human-like values and underlying representations is a mistake, and we should recognize them as cognition tools which are entirely possible to misuse.
Why? LLMs get worse at tasks as you attempt to train them with RLHF - and those with access to the base models will use them without filtering for a significant intelligence-at-scale advantage. They'll give the masses the moralized, literally dumber version.