James Briggs is a freelance machine learning (ML) engineer, startup advisor, and developer advocate at Pinecone.
He has an article and video describing how to improve responses from OpenAI ChatGPT using context and data provided at the time a question is asked.
There are many niche or less popular subjects that ChatGPT has not learned well.
There are two options for helping our LLM (Large Language Model) better understand the topic and answer the question more precisely.
1. We fine-tune the LLM on text data covering the domain in question (in James's example, fine-tuning sentence transformers).
2. We use retrieval-augmented generation, meaning we add an information retrieval component to our GQA (Generative Question-Answering) process. Adding a retrieval step allows us to retrieve relevant information and feed this into the LLM as a secondary source of information.
This gives us human-like interaction with machines for information retrieval (IR), also known as search. For example, we could take the top twenty pages from Google or Bing and have the chat system scan and summarize those sources.
There are also useful public data sources. The dataset James uses in his example is the jamescalam/youtube-transcriptions dataset hosted on Hugging Face Datasets. It contains transcribed audio from several ML and tech YouTube channels.
James massages the data. He uses Pinecone as his vector database.
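In his article, James merges the dataset's short transcript snippets into larger overlapping chunks before indexing them. A minimal sketch of that kind of merging step is below; the field names and the window/stride values are illustrative assumptions, not James's exact code.

```python
from typing import Dict, List

def merge_snippets(snippets: List[Dict], window: int = 20, stride: int = 4) -> List[Dict]:
    """Merge short transcript snippets into larger overlapping chunks.

    Each input snippet is assumed to look like {"id": ..., "url": ..., "text": ...}.
    `window` is how many snippets go into one chunk; `stride` controls overlap.
    """
    merged = []
    for i in range(0, len(snippets), stride):
        chunk = snippets[i:i + window]
        merged.append({
            "id": chunk[0]["id"],    # keep the first snippet's id
            "url": chunk[0]["url"],  # and its source video URL
            "text": " ".join(s["text"] for s in chunk),
        })
        if i + window >= len(snippets):
            break  # the last window already reached the end
    return merged
```

Overlapping chunks mean a sentence cut off at one chunk boundary still appears whole in the next chunk, which helps retrieval quality.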
The OpenAI Pinecone (OP) stack is an increasingly popular choice for building high-performance AI apps, including retrieval-augmented GQA.
The pipeline during query time consists of the following:
* OpenAI Embedding endpoint to create vector representations of each query.
* Pinecone vector database to search for relevant passages from the database of previously indexed contexts.
* OpenAI Completion endpoint to generate a natural language answer considering the retrieved contexts.
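The three steps above can be sketched roughly as follows. The model names, metadata layout, and prompt template are illustrative assumptions, and the sketch uses the pre-1.0 `openai` client that was current when the article was written.

```python
def build_prompt(query: str, contexts: list) -> str:
    """Assemble the retrieved passages and the user query into one prompt."""
    context_block = "\n---\n".join(contexts)
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )

def answer(query: str, index, top_k: int = 5) -> str:
    """Retrieval-augmented GQA: embed, retrieve from Pinecone, then generate."""
    import openai  # deferred import so the prompt helper stays dependency-free

    # 1. OpenAI Embedding endpoint: vector representation of the query.
    emb = openai.Embedding.create(
        input=[query], model="text-embedding-ada-002"
    )["data"][0]["embedding"]
    # 2. Pinecone vector database: search previously indexed contexts.
    res = index.query(vector=emb, top_k=top_k, include_metadata=True)
    contexts = [m["metadata"]["text"] for m in res["matches"]]
    # 3. OpenAI Completion endpoint: answer considering the retrieved contexts.
    completion = openai.Completion.create(
        model="text-davinci-003",
        prompt=build_prompt(query, contexts),
        max_tokens=256,
    )
    return completion["choices"][0]["text"].strip()
```

The key design point is that the LLM never sees the whole knowledge base; it only sees the handful of passages Pinecone scores as most relevant to this particular query.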
LLMs alone work incredibly well but struggle with more niche or specific questions. This often leads to hallucinations that are rarely obvious and likely to go undetected by system users.
By adding a “long-term memory” component to the GQA system, we benefit from an external knowledge base to improve system factuality and user trust in generated outputs.
Naturally, there is vast potential for this type of technology. Despite the technology being new, we are already seeing it used in YouChat, several podcast search apps, and, if rumors hold, an upcoming challenger to Google itself.
Generative AI is what many expect to be the next big technology boom, and, being AI, it could have implications far beyond what we'd expect.
One of the most thought-provoking use cases of generative AI belongs to Generative Question-Answering (GQA).
Now, the most straightforward GQA system requires nothing more than a user text query and a large language model (LLM).
We can test this out with OpenAI’s GPT-3, Cohere, or open-source Hugging Face models.
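A bare-bones GQA system of that kind is just a question passed straight to a completion model. The sketch below assumes the pre-1.0 `openai` Completion API and an illustrative model name and prompt format.

```python
def make_prompt(question: str) -> str:
    """Format a user question for a plain completion model."""
    return f"Q: {question.strip()}\nA:"

def ask(question: str, model: str = "text-davinci-003") -> str:
    """Send the question straight to the LLM, with no retrieval step."""
    import openai  # pip install openai (pre-1.0 client, matching the article's era)

    response = openai.Completion.create(
        model=model,
        prompt=make_prompt(question),
        max_tokens=128,
    )
    return response["choices"][0]["text"].strip()
```

This works well for popular topics but, as the article argues, tends to hallucinate on niche ones, which is exactly the gap retrieval augmentation fills.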
However, sometimes LLMs need help. For this, we can use retrieval augmentation, which, when applied to LLMs, can be thought of as a form of "long-term memory."
Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technology and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.
5 thoughts on “Adding Context, Additional Expertise and Memory to Open AI ChatGPT”
I tried to get ChatGPT to write a slightly mocking anti-MSR essay highlighting that we don’t have them today, while they were first experimented with in the 1950s. The AI’s response was that it couldn’t comply because my statements were false, misleading, and not accurate compared to a world wide web of blog posts that state the opposite. I argued with it for about 10 minutes and got it to admit various points it made were not factual, but it still wouldn’t take the con position against MSRs. Any student of expository writing has had to write arguments on either side of contentious issues… The AI has been out for 3 months and it is already a liberal parrot.
Have fun with your AIs. They aren’t that remarkable – sounds like it watches CNN and doesn’t want to hurt feelings. Kinda worthless.
If it won’t write the con side opposite of the majority opinion on a benign issue it is worthless. Maybe it can write speeches for AOC and the squad. I can’t believe the AI called me a liar. WTF? Who made these settings? Fired Twitter censors?
It has a layer of “laws” its owners impose on it to bias its responses.
Funnily enough, it seems to be a set of natural-language prompts that it also reads every time you talk to it, before reading your own.
Having read the prompts for Bing’s implementation, there isn’t really anything in them that says “Only take the liberal side”.
The reason this happens is because it is not a reasoning machine, it is a statistical answering machine. It takes the side with the preponderance of written words in its web-accessible training set. Because of the biases in what gets written (or search-indexed) on the web, it isn’t even a really good method of conducting a poll.
An actual reasoning machine could see a million bad arguments and dismiss them all with one good argument. This will continue to be a problem with similar architectures and large language models should flat out not be used to resolve disputes between popular and unpopular viewpoints.
No, I understand a lot of the bias was introduced during training, where any output the trainers disliked got flagged for the AI. Since the trainers were basically all left-wing, (Just because the IT community trends that way.) that biased the system quite a bit.
But they’re also running user prompts through their “moderation endpoint” system, which is designed to filter out anything THEY deem to violate their usage policies.
“We prohibit building products that target the following use-cases:
Illegal or harmful industries
Misuse of personal data
Deceiving or manipulating users
Trying to influence politics”
“We also don’t allow you or end-users of your application to generate the following types of content:
Now, obviously some of these categories are extremely subjective. “Illegal or harmful industries”? What’s a “harmful” but legal industry? I’m guessing firearms would be an example, based on their politics. And “hate”? I suppose we’ve all seen left-wingers who declare a wide range of objectively factual statements to be hateful.
This is probably the point where users find that the system simply refuses to answer some questions, while willingly answering others that are logically the same but have opposing political valence.