Understanding the Overuse of "Delve" in ChatGPT Responses
Chapter 1: The Mystery of "Delve"
Have you ever wondered why ChatGPT tends to use the word "delve" so often? This peculiar tendency can be attributed to the model's construction and training methods.
The enigma behind ChatGPT's frequent use of "delve" (one of its ten most-used words) has finally been unraveled, and the explanation is quite unexpected. Words like "delve" or "tapestry" rarely come up in everyday conversation (I must admit, I'm unsure what "tapestry" even means), yet ChatGPT seems to favor them, and you may have noticed it overusing them in its responses.
Recent data reveal a sharp increase in the use of "delve" in medical literature. A chart documenting this, published in March 2024, covers usage through the full year of 2023, coinciding with ChatGPT's rise in popularity.
The term "delve," along with phrases like "as an AI language model...," has woven itself into the fabric of ChatGPT's identity, often signaling that a piece of text was generated by an AI rather than a human.
This leads to a puzzling question that has intrigued me and machine-learning professionals alike: given that ChatGPT is trained on human-generated data, how did it come to use the word "delve" so frequently? Is this emergent behavior? And why this particular term?
A compelling article from The Guardian titled "How cheap, outsourced labour in Africa is shaping AI English" may shed light on this conundrum. The key to understanding this phenomenon may lie in Africa and the methods used in ChatGPT's development.
Section 1.1: The Origins of "Delve"
Returning to the question, "What caused ChatGPT to adopt the term 'delve' so prominently?" If ChatGPT's usage of "delve" is disproportionate compared to everyday language and online content, it suggests that its language patterns were modified after being trained on a broad data set.
Following extensive pre-training on vast amounts of data, additional measures are taken to keep the AI focused and aligned with human expectations. This is achieved through supervised fine-tuning and Reinforcement Learning from Human Feedback (RLHF), in which human annotators assess the model's outputs. Their evaluations are crucial for refining the model.
Here's a concise overview of this process (a toy code sketch follows the list):
- A base Large Language Model (LLM) is trained on a comprehensive data set.
- A Reward Model (RM) is introduced to discern what humans deem "good" and "aligned."
- The LLM produces multiple outputs, from which humans select the most appropriate.
- The RM learns from these human selections, translating preferences into rewards or scores.
- The LLM receives feedback and adjusts its behavior to garner more rewards.
- This cycle continues: the LLM generates outputs, humans provide feedback, and the model iteratively improves.
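To make the loop concrete, here is a deliberately toy Python sketch of it. Everything in it is invented for illustration: the "LLM" samples from a fixed phrase list, the "reward model" is a simple score table, and the "annotator" has a hard-coded preference. Real systems use far more sophisticated policy optimization, but the feedback dynamic is the same.

```python
import random

PHRASES = ["delve into", "look into", "explore", "dig into"]

# The reward model: scores learned from human preference picks.
reward_scores = {p: 0.0 for p in PHRASES}

def llm_generate(k=2):
    """The LLM produces k candidate outputs (here: weighted samples)."""
    weights = [1.0 + max(reward_scores[p], 0.0) for p in PHRASES]
    return random.choices(PHRASES, weights=weights, k=k)

def human_pick(candidates):
    """A human annotator selects the preferred output. Toy rule: this
    annotator always favors 'delve into' when it appears."""
    return min(candidates, key=lambda p: PHRASES.index(p))

for step in range(100):
    candidates = llm_generate()
    chosen = human_pick(candidates)
    # The reward model learns: the chosen output is scored up, rejected ones down.
    for c in set(candidates):
        reward_scores[c] += 1.0 if c == chosen else -0.2

print(reward_scores)  # "delve into" ends up with the highest score
```

Run it and the annotator's preference flows straight into the reward scores, so the model generates "delve into" more and more often, which is exactly the mechanism at issue here.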
This RLHF framework is what keeps ChatGPT cautious and focused. Its effectiveness, however, hinges on human evaluation, which can take the form of simple approval or disapproval of outputs, or of ideal responses that the LLM is expected to reproduce; two illustrative feedback formats are sketched below.
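As a rough illustration (the field names are hypothetical, not any particular vendor's schema), a single unit of human feedback might look like either of these records:

```python
# (a) A preference pair: the annotator approves one output over another.
preference_record = {
    "prompt": "Explain photosynthesis briefly.",
    "chosen": "Plants turn sunlight, water, and CO2 into sugar and oxygen.",
    "rejected": "Let's delve into the rich tapestry of photosynthesis...",
}

# (b) A demonstration: the annotator writes the ideal response directly.
demonstration_record = {
    "prompt": "Explain photosynthesis briefly.",
    "ideal_response": "Plants use sunlight to convert water and CO2 "
                      "into sugar, releasing oxygen.",
}
```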
The volume of feedback given is minuscule compared to the vast text corpus used for the initial training. Nonetheless, providing sufficient feedback is labor-intensive and costly, leading large AI firms to outsource this work to regions in the global south, where English-speaking labor is more affordable.
As noted by The Guardian, an extensive network of cost-effective freelancers in Africa undertakes these annotation tasks. In Nigeria, the term "delve" appears more frequently in business English than it does in the UK or the US. Consequently, the workers who are training the AI systems provide input that reflects this linguistic preference, resulting in an AI that communicates in a manner reminiscent of African English.
This is a textbook case of selection bias: the evaluators differ from the intended user demographic, which can skew the model's writing style. For an AI assistant aimed at English speakers worldwide, better sampling means recruiting annotators from diverse backgrounds and writing styles; the small simulation below illustrates why.
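The regional preference rates in this sketch are invented purely to show the mechanism, not measured values:

```python
import random

random.seed(0)

def preference_rate(pool_weights, delve_rates, n_judgements=10_000):
    """Fraction of judgements favoring 'delve', given the share of
    annotators drawn from each region and each region's preference."""
    regions = list(pool_weights)
    favoured = 0
    for _ in range(n_judgements):
        region = random.choices(regions,
                                weights=[pool_weights[r] for r in regions])[0]
        favoured += random.random() < delve_rates[region]
    return favoured / n_judgements

# Illustrative (made-up) preference rates for "delve" per region.
delve_rates = {"region_A": 0.60, "region_B": 0.15}

# A pool dominated by region_A produces a very different aggregate
# signal than a balanced one, even though nothing else changed.
print(preference_rate({"region_A": 0.9, "region_B": 0.1}, delve_rates))  # ~0.56
print(preference_rate({"region_A": 0.5, "region_B": 0.5}, delve_rates))  # ~0.38
```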
Section 1.2: Addressing the Biases
Whether this alone accounts for ChatGPT's subtle stylistic biases is unclear, but the underlying issue most likely stems from the RLHF phase rather than from the initial training. ChatGPT's writing style is already somewhat robotic and easy to identify, with or without "delve." Still, there's a significant lesson here: knowing where this process can go wrong helps prevent similar pitfalls in future research and development.
Chapter 2: Making ChatGPT More Human-Like
So, how can we address this issue? A straightforward way to make ChatGPT sound more human and reduce its reliance on awkward terms like "delve" is Prompt Engineering. Here are several strategies (a short code sketch combining all three follows the list):
- Directly instruct ChatGPT to avoid using "delve" or specific words.
- Assign ChatGPT a specific role or persona.
- Employ few-shot learning by providing it with examples of text you want it to emulate, such as your own blog posts.
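As a minimal sketch, assuming the official openai Python client with an API key in the environment; the model name, persona, and example texts are placeholders, not recommendations:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messages = [
    # Strategies 1 + 2: forbid specific words and assign a persona.
    {"role": "system", "content": (
        "You are a plain-spoken tech blogger. "
        "Never use the words 'delve', 'tapestry', or 'landscape'."
    )},
    # Strategy 3: a few-shot example of the style to emulate.
    {"role": "user", "content": "Summarize: RLHF aligns models with human feedback."},
    {"role": "assistant", "content": (
        "RLHF is simple at heart: people rank the model's answers, "
        "and the model learns to produce more of what they liked."
    )},
    # The actual question.
    {"role": "user", "content": "Explain why ChatGPT overuses certain words."},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```

The system message handles the word ban and the persona, while the seeded user/assistant pair acts as a one-shot example of the target style.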
The downside to these approaches is the time they require. I prefer not to spend five minutes prompting ChatGPT before asking it a question. I'm in search of a quick, reliable method that integrates seamlessly with ChatGPT, perhaps akin to a Chrome extension.
If you've discovered a solution to this issue or know of a dependable tool, please share it in the comments. This appears to be a widespread challenge.
Thank you for reading,
— Hesam
The first video titled "Why ChatGPT Cannot Help You Land a Job + HOW To Fix It" discusses the challenges of using AI in job applications and offers solutions to enhance effectiveness.
The second video, "Can ChatGPT's AI analyze data better than a human being?", explores the capabilities of AI in data analysis compared to human performance.