Artificial intelligence firm OpenAI is working on a tool to watermark AI-generated text, so that people cannot take such text and pass it off as their own work. The news comes from OpenAI guest researcher Scott Aaronson, who described the project on his blog.
Although the scheme is still under discussion, the watermark would use a cryptographic pseudorandom function to add an unnoticeable signal to the text: a detectable signature woven into the words produced by OpenAI's text-generating AI models.
GPT models work with language in the form of tokens: input text is read as tokens, and output is produced as tokens. At each step, the model emits a list of candidate tokens, each with an associated score, from which an external algorithm chooses a winner to emit as the next output token.
That external algorithm is hosted on OpenAI's servers. It chooses a single winner from the list, weighted by score, and incorporates some amount of randomness to keep the output unique and varied. This randomness is exactly where OpenAI plans to implement the watermarking scheme.
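The unwatermarked selection step described above can be sketched as ordinary weighted random sampling. This is an illustrative reconstruction, not OpenAI's actual server code; the function name and the softmax `temperature` knob are assumptions for the example.

```python
import math
import random

def sample_next_token(token_scores, temperature=1.0, rng=random):
    """Pick the next token from a list of (token, score) pairs.

    Scores are turned into probabilities with a softmax; one token is
    then drawn at random, weighted by its probability, so higher-scoring
    tokens win more often but the output stays varied.
    """
    tokens, scores = zip(*token_scores)
    exp_scores = [math.exp(s / temperature) for s in scores]
    total = sum(exp_scores)
    probs = [e / total for e in exp_scores]
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical candidate list for one generation step
candidates = [("cat", 2.1), ("dog", 1.8), ("car", 0.3)]
print(sample_next_token(candidates))
```

Because the draw is random, repeated runs on the same candidate list can produce different winners, which is the variety the watermark scheme will replace with something deterministic but statistically invisible.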
Instead of using a typical random function, OpenAI is creating its own pseudorandom function with a secret key that only the company can access. From the list of candidate tokens, the one that maximizes this function would be selected as the next output.
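One concrete construction consistent with this description is Aaronson's "exponential sampling" trick: score each candidate with a keyed pseudorandom function and pick the token maximizing r ** (1 / p), where r is the pseudorandom value and p the model's probability for that token. The sketch below uses HMAC-SHA256 as a stand-in keyed PRF; the key, context string, and probabilities are all hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"openai-private-key"  # assumption: only OpenAI holds this key

def prf_score(key, context, token):
    """Map (key, context, token) to a pseudorandom float in (0, 1),
    sketched here with HMAC-SHA256 as the keyed pseudorandom function."""
    digest = hmac.new(key, (context + token).encode(), hashlib.sha256).digest()
    n = int.from_bytes(digest[:8], "big")
    return (n + 1) / (2**64 + 1)

def pick_watermarked_token(key, context, token_probs):
    """Choose the candidate maximizing r ** (1 / p).  This keeps the
    overall output distribution statistically unchanged while making the
    choice deterministic given the secret key -- the watermark."""
    return max(
        token_probs,
        key=lambda tp: prf_score(key, context, tp[0]) ** (1 / tp[1]),
    )[0]

candidates = [("cat", 0.55), ("dog", 0.40), ("car", 0.05)]
print(pick_watermarked_token(SECRET_KEY, "The pet is a ", candidates))
```

Unlike plain sampling, this selection is repeatable: the same key, context, and candidate list always yield the same token, yet without the key the output looks like an ordinary random draw.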
With this custom pseudorandom function, any given text string could be analyzed to check whether it maximizes the function the way GPT output would. Even if a human lightly edits GPT output, the average function value would remain high enough to indicate a GPT origin.
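Detection then amounts to averaging the keyed pseudorandom score over a token sequence: text generated by maximizing the score stays well above the roughly 0.5 average expected of human writing. A minimal sketch, again assuming HMAC-SHA256 as the PRF and a hypothetical key:

```python
import hashlib
import hmac

SECRET_KEY = b"openai-private-key"  # assumption: only OpenAI holds this key

def prf_score(key, context, token):
    """Keyed pseudorandom function mapping (context, token) to (0, 1),
    sketched with HMAC-SHA256."""
    digest = hmac.new(key, (context + token).encode(), hashlib.sha256).digest()
    n = int.from_bytes(digest[:8], "big")
    return (n + 1) / (2**64 + 1)

def watermark_score(key, tokens):
    """Average PRF value of each token given its preceding context.
    Watermarked generation picks high-scoring tokens, so the average
    stays elevated even after light human edits; unrelated human text
    averages near 0.5."""
    total = 0.0
    for i, token in enumerate(tokens):
        context = "".join(tokens[:i])
        total += prf_score(key, context, token)
    return total / len(tokens)

tokens = ["The", " pet", " is", " a", " cat"]
print(round(watermark_score(SECRET_KEY, tokens), 3))
```

Because light edits change only a fraction of the tokens, most of the high-scoring ones survive, which is why the average still flags modified GPT output.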