Mirage of Generative AI Self-Governance

The use of generated artefacts shows no signs of slowing down. As we integrate generative AI into live customer applications and daily-use software like MS Office, we will find it increasingly difficult to monitor and govern this capability.

The concept that many vendors are chasing involves machines governing themselves (or other machines).

This automated governance comes in three main flavours (a sketch of the patterns follows the list):

  1. A small model supervised by a large model
  2. Two different but similarly performant models, with one governing the output of the other
  3. Self-supervision through reflection, where the model governs its own output
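To make these flavours concrete, here is a minimal sketch of how such a pipeline is typically wired up. Everything in it is hypothetical: `generate` stands in for whatever LLM API is in use, and the model names, prompt text, and PASS/FAIL verdict format are placeholders, not any vendor's actual interface.

```python
def generate(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call (e.g. an HTTP request to a hosted model)."""
    raise NotImplementedError("wire up a model provider here")

# The governing model receives ALL of its extra context through this prompt.
GOVERNANCE_PROMPT = (
    "You are a reviewer. Assess the following answer for policy violations, "
    "factual errors, and off-topic content. Reply PASS or FAIL with a reason.\n\n"
    "Answer:\n{answer}"
)

def govern(generator: str, governor: str, user_prompt: str) -> tuple[str, str]:
    """Flavours 1 and 2: one model's output is reviewed by another model.
    In flavour 1 the governor is a larger model than the generator; in
    flavour 2 it is a different but similarly capable model."""
    answer = generate(generator, user_prompt)
    verdict = generate(governor, GOVERNANCE_PROMPT.format(answer=answer))
    return answer, verdict

def self_reflect(model: str, user_prompt: str) -> tuple[str, str]:
    """Flavour 3: the same model reflects on and governs its own output."""
    answer = generate(model, user_prompt)
    verdict = generate(model, GOVERNANCE_PROMPT.format(answer=answer))
    return answer, verdict
```

Note that in all three flavours the governor's only per-request context arrives through the governance prompt, which is exactly the limitation discussed next.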

The problem with all these methods is that we expect the governing model to know ‘something’ more than the model whose output is being governed.

This extra ‘thing’ can usually only come through the prompt, as there is no other way to tune the governing model with additional information on a per-request basis.

The Paradox

Because these models are trained with human oversight and guidance, they are implicitly trained to respond like a human but with machine-like precision and alignment. This leads to a trade-off between exploration and repeatability.

So what happens if we ask a state-of-the-art model to describe the ‘features’ of text written by an AI model? The text under review answers the question ‘Tell me something about fruits’.

This is what we get from GPT-4o:

The text exhibits several characteristics that suggest it was likely written by an AI model. It has a structured and coherent presentation of ideas about fruits, including their nutritional benefits, aesthetic appeal, and personal anecdote regarding watermelons. The transition from general statements about fruits to a personal story is somewhat abrupt and lacks deeper emotional context, which is common in AI-generated text. Additionally, the phrasing is generic and resembles the informative and light-hearted style often found in AI compositions.

GPT-4o

For Gemini Flash 1.5:

The text exhibits characteristics commonly found in AI-generated text:

  • Generic and positive statements: Phrases like “Fruits are nature’s candy” and “They are also good to eat and come in a wide variety of flavours” are generic and lack specific detail or nuance. This is typical of AI models trained on vast datasets of general knowledge.
  • Simple sentence structure: The sentences are short, simple, and declarative. More complex sentence structures and varied sentence beginnings are less common in AI-generated text.
  • Abrupt shift in topic: The transition from a general description of fruit to a personal anecdote about watermelons is abrupt and feels somewhat unnatural. A human writer would likely create a smoother transition.
  • Lack of unique perspective: The writing lacks a strong, individual voice or unique perspective. The statements about fruit are quite common and widely known.

While not definitive proof, the combination of these factors strongly suggests the text was likely generated by an AI model.

Gemini Flash 1.5

Now this is quite interesting: both models think that simple sentence structures, a lack of strong individual voice, and abrupt shifts in topic point to AI-generated text. But these features can just as easily show up in text written by humans, especially when the writer simply wants to finish the task or is not proficient in English.

And both models were trained by different vendors!

So as we train AI, under human guidance, to behave more like humans, AI comes to treat those very features as the hallmarks of AI-generated content. This will reduce variance and therefore fail the requirement of a human-like yet precise and aligned response.

Why is this important? Thinking back to the earlier point about the governing model needing to know ‘more’: as we build models to govern AI-generated content, we are likely to set up a biased system, one trained to emulate human biases yet asked to judge AI-generated content.

Take for example the following piece of text:

Fruits are nature’s candy. They provide many of the required vitamins and minerals. They are also good to eat and come in a wide variety of flavours. Finally, they are amazing to look at with vibrant colours. But fruits like peaches can cause allergy.

Let me tell you a story about watermelons. They are my favourite fruit ever since I was a child.

This was written by me, but both GPT-4o and Gemini Flash 1.5 thought it was written by AI. It seems I write like an AI model!

What can we do about it?

One solution is to introduce variability within the generation process itself, so that during different phases of generation the model can adopt a different style, tone, and so on, making the text read as human-like yet precise and aligned.

This means finding some way of adapting the model’s weights to the current input while text generation is ongoing.
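One existing technique in this spirit is a hypernetwork: a small auxiliary network that produces, or adjusts, the weights of the main network from the current input. Below is a toy NumPy sketch of the idea; the layer sizes, the initialisation, and the notion of a ‘context summary’ are all illustrative assumptions, not a description of how any production model works.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy hidden size

# Base weights, frozen after training -- the "static skyscraper".
W_base = rng.normal(size=(D, D))

# A tiny hypothetical hypernetwork: maps a context summary to a weight delta.
W_hyper = rng.normal(size=(D * D, D)) * 0.01

def adaptive_layer(x: np.ndarray, context: np.ndarray) -> np.ndarray:
    """A layer whose effective weights flex with the current context."""
    delta = (W_hyper @ context).reshape(D, D)  # context-dependent adjustment
    return np.tanh((W_base + delta) @ x)       # effective weights differ per step

# Each decoding step can pass a different context summary (prompt, task,
# partial output), so the same layer behaves differently across phases
# of generation.
x = rng.normal(size=D)
print(adaptive_layer(x, context=rng.normal(size=D)))  # early-generation context
print(adaptive_layer(x, context=rng.normal(size=D)))  # late-generation context
```

Because the weight delta depends on the context, the same layer can behave differently in different phases of generation, which is the kind of in-flight adaptation being argued for here.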

The temperature setting lets us tailor the sampling process, but it is applied after the fact and does not affect the model’s ‘thinking’.
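To see why temperature is ‘after the fact’, here is a minimal NumPy sketch of temperature sampling: it rescales logits the model has already produced, reshaping the output distribution without touching the weights or activations that produced those logits. The logit values are made up for illustration.

```python
import numpy as np

def sample_next_token(logits: np.ndarray, temperature: float = 1.0) -> int:
    """Temperature enters here, after the forward pass: the logits are
    already fixed, so the model's internal computation is unchanged."""
    scaled = logits / max(temperature, 1e-8)  # reshape, don't rethink
    probs = np.exp(scaled - scaled.max())     # numerically stable softmax
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))

logits = np.array([2.0, 1.0, 0.5, 0.1])            # pretend model output
print(sample_next_token(logits, temperature=0.2))  # almost always token 0
print(sample_next_token(logits, temperature=1.5))  # noticeably more varied
```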

A way to visualise this: current models are designed as static structures, like old-school skyscrapers that cannot exceed a certain height because they cannot adapt to the stress caused by wind. What is needed is a design that adapts to its environment, like modern skyscrapers that flex with the wind.

The environment for the model includes the prompt, the task, the data, and the partially generated output.

Static vs Flex…
