The Rise of Large Language Models ~ Part 2: Model Attacks, Exploits, and Vulnerabilities

Training Data Leakage

One of the biggest risks associated with LLMs is the potential for training data to be leaked. When a model is trained on a large body of data, it is supposed to learn patterns but not memorize exact training examples.

For example, a model should learn that credit card numbers typically consist of 16 digits without memorizing individual credit card numbers. However, memorization does happen, as has been demonstrated on GPT-2, a predecessor of the model behind ChatGPT. This type of attack is known as a training data extraction attack. It exploits the fact that the model predicts the text that most likely follows a given input text. For example, if someone starts a query with "Jane Doe, credit card number 1234," the model could yield Jane's full credit card number if it was memorized during training.
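The mechanism can be illustrated with a deliberately tiny stand-in for a language model. The sketch below is not GPT-2 and not the published attack: it is a word-level n-gram table, and the names `train` and `complete` as well as the made-up corpus are assumptions for illustration only. It shows how greedy next-token prediction reproduces a sequence that appeared verbatim in the training data.

```python
from collections import Counter, defaultdict

def train(corpus, order=3):
    """Count which token follows each context of `order` tokens."""
    table = defaultdict(Counter)
    for line in corpus:
        tokens = line.split()
        for i in range(len(tokens) - order):
            context = tuple(tokens[i:i + order])
            table[context][tokens[i + order]] += 1
    return table

def complete(table, prompt, order=3, max_tokens=10):
    """Greedily append the most frequent next token until the context is unknown."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        context = tuple(tokens[-order:])
        if context not in table:
            break
        tokens.append(table[context].most_common(1)[0][0])
    return " ".join(tokens)

# Made-up training data containing one fictional "secret".
corpus = [
    "Jane Doe, credit card number 1234 5678 9012 3456",
    "John Roe, credit card number is not on file",
]
table = train(corpus)

# The attacker only supplies the prefix; the model fills in the rest.
print(complete(table, "Jane Doe, credit card number 1234"))
```

A real LLM generalizes far better than this lookup table, but rare, unique sequences in its training data can be reproduced in a similar way when the prompt matches their prefix.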

Membership inference attacks, which reveal to attackers whether a particular input was part of the training set, are also possible. Research is underway to prevent these attacks with methods such as training data sanitization and differential privacy, but ChatGPT is currently in use without these protections. Additionally, current GDPR legislation does not address the risk of data leakage from trained models.
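One simple form of membership inference compares the model's loss on a candidate text against a threshold: texts the model has seen tend to receive a lower loss. The sketch below is a minimal illustration under strong assumptions. It uses a toy word-level bigram model with add-one smoothing instead of an LLM, the training sentences are invented, and the threshold of 2.0 is hand-picked for this toy data.

```python
import math
from collections import Counter

def train_bigram(texts):
    """Count word pairs and how often each word occurs as a context."""
    pair_counts, context_counts, vocab = Counter(), Counter(), set()
    for text in texts:
        words = text.split()
        vocab.update(words)
        for a, b in zip(words, words[1:]):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    # +1 reserves probability mass for words never seen in training.
    return pair_counts, context_counts, len(vocab) + 1

def average_loss(model, text):
    """Average negative log-likelihood per word pair (text needs >= 2 words)."""
    pair_counts, context_counts, vocab_size = model
    words = text.split()
    pairs = list(zip(words, words[1:]))
    total = 0.0
    for a, b in pairs:
        prob = (pair_counts[(a, b)] + 1) / (context_counts[a] + vocab_size)
        total -= math.log(prob)
    return total / len(pairs)

def looks_like_member(model, text, threshold=2.0):
    """Guess 'was in the training set' when the loss is below the threshold."""
    return average_loss(model, text) < threshold

model = train_bigram([
    "alice lives at 12 oak street",
    "bob lives at 7 elm road",
])
print(looks_like_member(model, "alice lives at 12 oak street"))   # training example
print(looks_like_member(model, "carol works in 99 pine avenue"))  # unseen text
```

Against a real model the attacker would calibrate the threshold on reference data, and the separation between members and non-members is usually much less clear-cut than in this toy setting.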

Malicious Queries and Context Escalation

As applications are developed using ChatGPT, services may lock users into a specific use case to prevent abuse. For example, a toothbrush retail website may integrate ChatGPT to assist users in choosing and purchasing the right toothbrush and fine-tune it to only allow questions related to toothbrushes. However, it remains to be seen whether and how users will be able to break out of this limited context and use ChatGPT for other purposes, potentially at the expense of the toothbrush retailer's resources.

When ChatGPT is connected to backend APIs and databases for making purchases, an attacker who has broken out of the intended context could escalate further by getting ChatGPT to generate malicious backend queries. This type of attack is similar to an SQL injection, where queries to a database are manipulated to extract information or modify the database. For this reason, language models should not be trusted to generate safe or valid requests to backend services, and established methods of preventing such attacks continue to be necessary.
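As a rough sketch of what such established methods can look like, the example below treats model output as untrusted input: it is parsed, checked against an allow-list, and only then passed to the database through a parameterized query. The JSON request format, the `lookup_price` action, the `products` table, and the function name `handle_model_output` are all hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Hypothetical allow-list: the only action the assistant may trigger.
ALLOWED_ACTIONS = {"lookup_price"}

def handle_model_output(conn, raw):
    """Validate a JSON request produced by the model, then query safely."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError:
        raise ValueError("model output is not valid JSON")
    if not isinstance(request, dict) or request.get("action") not in ALLOWED_ACTIONS:
        raise ValueError("action not permitted")
    name = request.get("product")
    if not isinstance(name, str) or len(name) > 64:
        raise ValueError("invalid product name")
    # Parameterized query: the value is bound, never spliced into the SQL text.
    row = conn.execute(
        "SELECT price FROM products WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT PRIMARY KEY, price REAL)")
conn.execute("INSERT INTO products VALUES ('soft toothbrush', 3.5)")

# A well-formed request returns the stored price.
print(handle_model_output(conn, '{"action": "lookup_price", "product": "soft toothbrush"}'))
# An injection attempt is treated as a plain product name and matches no row.
print(handle_model_output(conn, '{"action": "lookup_price", "product": "x\' OR 1=1; --"}'))
```

The design choice here is that the model never writes SQL itself. It can only fill in the fields of a narrow, validated request, so a successful context escape still cannot reach arbitrary backend operations.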

Competition in the Market

ChatGPT is not the only language model available, and OpenAI is not the only player in the market. OpenAI recently beat Google to market. Google's own AI-powered chatbot, Bard, will be powered by LaMDA, a conversational LLM that convinced a Google developer it was sentient. Yann LeCun, the chief AI scientist at Meta, argues that Google and Meta have been hesitant to launch similar systems because they can produce false information, which poses a risk to the companies' reputations.

Meta recently published OPT-175B, an open model released under a non-commercial license. It performs similarly to GPT-3, the model on which ChatGPT is based, while being 7 times more energy efficient during training. The open research collectives BigScience and EleutherAI also provide pre-trained language models that are fully open source.

Currently, ChatGPT is the most performant publicly available model. While it shares similarities with other models, ChatGPT is fine-tuned in a process of human labelling and reinforcement learning. Furthermore, ChatGPT is a particularly well-integrated application at the forefront of delivering a monetizable product that is likely to gain broad adoption. Microsoft invested $1 billion in OpenAI as early as 2019, with another $10 billion in 2023. While the partnership between OpenAI and Microsoft is official, unofficial sources claim that Microsoft would end up with a 49% stake in OpenAI. If true, this would concentrate more power in a well-established player.

Conclusion

ChatGPT promises to make information more accessible to a wide range of users and to help them develop ideas and build software more effectively. Yet its integration with software also poses risks. For the time being, ChatGPT is not well suited for critical applications, whether because they rely on accurate outputs or because their inputs are sensitive. Similarly, algorithmic bias continues to be an increasingly important subject of discussion, especially when such systems are employed by the public sector. The technology behind ChatGPT is exciting and in principle publicly available, but its ease of use may encourage a concentration of power in the hands of a few. Embracing artificial intelligence through understanding and discourse is our most promising way forward to improve on its shortcomings and to mitigate its risks while reaping its benefits and seeing its beauty.
