It's been two months since OpenAI released ChatGPT, its latest large language model, and it's already making waves in the tech world. ChatGPT presents itself as a tool for translation, information retrieval, poetry, essays, and program code, but it has the potential to revolutionize the search industry and threaten Google's ad business.
ChatGPT is also changing academia: it has passed law school exams at the University of Minnesota and Google's coding interview questions for a junior programmer, all without any additional training.
While ChatGPT is not the only large language model available, it's the first of its kind to be easily accessible to the public and to developers. As ChatGPT becomes more widely used and integrated into other software, it's important to consider the implications for security, privacy, and power dynamics.
Bias and Misinformation
One of ChatGPT's well-known weaknesses is its tendency to produce false information and reproduce implicit biases. For example, asking it to summarize an episode of a TV show may yield mixed-up characters, repetition, or omitted plot points.
This problem is not unique to ChatGPT; it affects language models in general. Microsoft's 2016 release of the chatbot Tay on Twitter is a prime example of how user inputs can quickly poison the well and lead to inflammatory output. ChatGPT may not be as outrageous as Tay, but biases remain and can propagate as the model is integrated into other software.
The real-world impact of biased decision-making models can be seen in the Finnish case of Svea Ekonomi AB, an online loan provider whose model favored Swedish-speaking applicants over Finnish-speaking ones, and in the Dutch child benefit scandal, which brought down the government. As governments around the world modernize their administrative infrastructure, tools like ChatGPT are likely to play a role.
Bias also plays a role in influencing consumer behavior through targeted advertisements, as seen in the 2016 US presidential campaign and the services of Cambridge Analytica. ChatGPT can generate realistic-looking disinformation at high speed, backed by strong rhetorical skill, and can be fine-tuned by developers with a political agenda.
Privacy
The use of ChatGPT also raises privacy concerns, particularly with regard to privacy legislation. The General Data Protection Regulation (GDPR) in the European Union grants users the right to be forgotten, i.e., to have their personal data deleted. However, when users' queries are used to train a language model, their inputs become encoded in the model's weights and cannot be easily deleted.
Processing requests that contain sensitive information, such as medical records, financial data, or trade secrets, is also likely to violate privacy laws: text submitted to ChatGPT is available to OpenAI in clear text. Hosting ChatGPT in multiple jurisdictions for better compliance, or self-hosting for large companies and government agencies, may be solutions, but small and medium-sized companies may lack the resources to do so.
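To see why the clear-text point matters, consider what an API request actually contains. The sketch below only constructs a request payload in the shape of OpenAI's public chat API (the endpoint and field names follow OpenAI's documentation; no request is actually sent): the user's text travels verbatim in the body, readable by the provider once TLS terminates at their servers.

```python
import json

# A user query containing sensitive information.
prompt = "Summarize this medical record: the patient was diagnosed with ..."

# Payload as it would be POSTed to a hosted API such as OpenAI's
# /v1/chat/completions endpoint. Only the payload is built here;
# nothing is transmitted.
payload = json.dumps({
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": prompt}],
})

# The sensitive text appears verbatim inside the request body.
print(prompt in payload)  # → True
```

Encryption in transit (HTTPS) protects the payload from third parties, but not from the API operator, who must decrypt it to run the model on it.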
Read also: The Rise of Large Language Models ~ Part 2: Model Attacks, Exploits, and Vulnerabilities