What actually is AI?
ArthurTalks — Post #4 10 minute read
I'm writing this post to explain, in my own words, how "Artificial Intelligence (AI)" works and to so-called demystify it, and to show how at a base level it was never that smart until the last 3 years or so, since the release of ChatGPT in late 2022. Most people use it every day and still don't understand how it works.

Basic AI has been in use for a long time, from video games through to the golden era of 2016ish, when you had Siri and Alexa, and now we have sophisticated reasoning models. Modern models generate their answers rather than look them up, with a bit of randomness in which word gets picked each time, which is why you get a semi-unique answer every time you ask ChatGPT something.
Let's go through it in multiple layers:
Level 0 → Level 1 → Level 1.5 → Level 2 → Level 3
Level 0: Basic bot behaviour
This is the absolute most basic of them all: a dialogue tree. It's everywhere, for example in a simple chatbot on a website that pops up at the bottom right asking
"What can I help you with today?"
It then offers some basic options, e.g. billing, help, etc., followed by another set of questions depending on what you click, until ultimately you try to find the button that says "Connect me to a human", and even then it's hard to verify whether there's actually someone on the other end.

This is most commonly found in video games when you interact with an NPC. Ultimately it's just a flowchart that looks like a big family tree, nothing more: if you pick Answer A, it returns Response A1, A2 or A3.
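To make the flowchart idea concrete, here is a minimal sketch of a dialogue tree in Python. The menu text, the option names and the `walk` helper are all made up for illustration; the point is only that the "bot" is a lookup through nested data.

```python
# A dialogue tree is just nested data: each node has a prompt and some options.
# All menu text and option names below are invented for illustration.
DIALOGUE_TREE = {
    "prompt": "What can I help you with today?",
    "options": {
        "billing": {
            "prompt": "Is this about a refund or an invoice?",
            "options": {
                "refund": {"prompt": "Refunds take 5 working days.", "options": {}},
                "invoice": {"prompt": "Invoices are emailed monthly.", "options": {}},
            },
        },
        "help": {"prompt": "Connecting you to a human...", "options": {}},
    },
}


def walk(tree, choices):
    """Follow a list of button clicks down the tree and return the final reply."""
    node = tree
    for choice in choices:
        # No understanding happens here: an unknown choice simply has no branch.
        if choice not in node["options"]:
            return "Sorry, I didn't get that."
        node = node["options"][choice]
    return node["prompt"]


print(walk(DIALOGUE_TREE, ["billing", "refund"]))
```

Anything the tree's author did not anticipate falls straight through to the "Sorry" reply, which is why these bots feel so rigid.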
Level 1: Chat and voice assistants
This is where it started to seem smarter. With voice assistants such as Siri or Alexa you could ask something less rigid than a hard-coded question; you could ask Siri different versions of the same question, e.g.
- what is the weather like today?
- how hot is it today?
- hey will it be cloudy today?
- Is London going to rain?

This is a form of pattern matching. A developer behind the scenes wrote a bunch of code with rules such as:
If the user says "weather" or a related word, maybe along with a location → call the Weather API → read out the result returned
It's essentially a script, with a short pause while it retrieves the results.
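A rule like that could be sketched roughly as below. This is only an illustration, not how Siri or Alexa are actually built: the keyword lists, the `handle` function and the canned `fake_weather_api` are all assumptions standing in for the real thing.

```python
import re

# Hand-written keyword lists: this is the "pattern" being matched.
WEATHER_WORDS = {"weather", "hot", "cold", "cloudy", "rain", "sunny"}
KNOWN_CITIES = {"london", "paris", "tokyo"}


def fake_weather_api(city):
    """Stand-in for a real weather service; returns canned data."""
    return {"london": "light rain, 14C"}.get(city, "no data")


def handle(utterance, default_city="london"):
    """Reply to a weather question if any keyword matches, else give up."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    if words & WEATHER_WORDS:
        cities = sorted(words & KNOWN_CITIES)
        city = cities[0] if cities else default_city
        return f"Weather in {city.title()}: {fake_weather_api(city)}"
    return "Sorry, I can't help with that."


for question in ["how hot is it today?", "Is London going to rain?", "tell me a joke"]:
    print(question, "->", handle(question))
```

All four example questions above trigger the same rule because each contains one of the keywords. Ask about "drizzle" and it falls over, since nobody wrote a rule for that word.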
Level 1.5: Semantic search
There is another setup that uses semantic search, which can be classed as Level 1.5. It doesn't match exact words; instead it converts your question into a list of numbers (this is called an embedding) in a vector space, which tries to capture the meaning of your question. It then goes into a database of pre-written answers (each stored as its own list of numbers) and grabs the closest match.
Essentially, let's say my question of:
"What is the weather like today?" lands closest to entry number 17 in this database; it then looks up what number 17's answer is and returns it. That's it.

It's just like me asking my Mom where I kept my running shoes. She knows where to find them and hands them to me; she doesn't know why I keep them in the cupboard or how long they've been there, she just knows they're there.
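Here is a rough sketch of that "closest match" step. In a real system the numbers come from a trained embedding model; here the 3-number vectors are hand-made (think weather-ness, shopping-ness, travel-ness) and the answers are invented, so treat this as a toy under those assumptions. The comparison itself uses cosine similarity, a common way to measure how close two vectors point.

```python
import math

# Hand-made 3-number "embeddings": (weather-ness, shopping-ness, travel-ness).
# A real system would get these numbers from a trained embedding model.
ANSWERS = {
    "Today is mild with light rain.": [0.9, 0.0, 0.1],
    "Your order ships in 2 days.": [0.0, 0.9, 0.1],
    "The next train leaves at 9am.": [0.1, 0.0, 0.9],
}


def cosine(a, b):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def closest_answer(query_vector):
    """Return the stored answer whose vector is most similar to the query."""
    return max(ANSWERS, key=lambda text: cosine(query_vector, ANSWERS[text]))


# Pretend this is the embedding of "Will I need an umbrella?"
print(closest_answer([0.8, 0.1, 0.2]))
```

Notice the question never mentions "weather" or "rain", yet it still lands on the weather answer because its numbers sit nearby. That is the step up from Level 1.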
Level 2: ChatGPT, Claude, Microsoft Copilot and other LLMs
So this is where it starts to actually get interesting and less pre-defined, with Large Language Models: Claude, ChatGPT, Perplexity, Gemini and co. None of this was really in everyday use until the end of 2022, when my housemate at university called me down to try this cool new "website", only for me to find out it could code up an assignment, a game of Sudoku in Java that I had worked on day and night for weeks, in a few minutes… that's when I knew I was actually cooked.

They don't fetch pre-written answers but rather generate answers, one word at a time too.
An LLM is basically trained to predict the next word, and that's really it. It picks each next word on the fly, and it's right most of the time because it has been trained on trillions of words of text.
Its training data is vast, ranging from books, Wikipedia, code repositories and news articles to even Reddit forums.
So let's say you have an input of something like the word "How".
It predicts in order: given "How", there's a high probability the next word is… "Are"… and given "How Are", the next word is… "You"
- How
- Are
- You
- ?
In more poetic terms, it has literature in its training data too, so let's say we have Shakespeare,
INPUT: "To be or not to ___"
Based on the input words, there is an almost certain chance that the next word is "be", but it still only predicted it.
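The idea of "probability of the next word" can be shown with a toy model that just counts which word follows which in a tiny made-up corpus. To be clear, this is a simplification: real LLMs use neural networks that look at the whole context, not a count table over one previous word, and the `CORPUS`, `train` and `next_word_probs` names here are my own.

```python
from collections import Counter, defaultdict

# Tiny invented training text; a real model sees trillions of words.
CORPUS = "how are you ? how are things ? how are you doing ? to be or not to be"


def train(text):
    """Count which word follows which."""
    following = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def next_word_probs(model, word):
    """Turn the raw counts for one word into probabilities."""
    counts = model.get(word, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


model = train(CORPUS)
print(next_word_probs(model, "how"))  # every "how" in the corpus is followed by "are"
print(next_word_probs(model, "are"))  # "you" is twice as likely as "things"
print(next_word_probs(model, "to"))   # "to" is always followed by "be" here
```

Nothing in there "knows" English or Shakespeare. It has only seen that "be" tends to come after "to", and that is enough to finish the line.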
Here's another, higher-level bit of knowledge: LLMs don't work in words but in something called "tokens", another buzzword that people throw around. Tokens are just little chunks of text, and as a very rough average a token is about 3/4 of a word.

A fun fact: for the longest time, the reason AI had trouble with Maths is that it is meant to predict the next number based on the context before it. Take a simple case: after "1 + 1", is the next character an "=" or another "+"? It's down to probability, not calculation.
I've probably used the word "sophisticated" a lot in this blog, but broken into tokens it's about 3, as it splits into "Soph", "istic" and then "ated" (the exact split depends on the tokenizer).
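As a rough picture of how a word gets chopped up, here is a toy tokenizer that greedily grabs the longest chunk it recognises. The vocabulary is invented to match the example above, and greedy longest-match is a simplification: real tokenizers learn their chunks from data with schemes such as byte-pair encoding, so their splits may differ.

```python
# Invented vocabulary; real tokenizers learn tens of thousands of chunks from data.
VOCAB = {"Soph", "istic", "ated", "How", " are", " you", "?"}


def tokenize(text, vocab=VOCAB):
    """Split text into the longest known chunks, one character at a time otherwise."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, shrinking until something fits.
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in vocab or end == i + 1:
                tokens.append(piece)
                i = end
                break
    return tokens


print(tokenize("Sophisticated"))
print(tokenize("How are you?"))
```

Note that the space is part of the token in " are" and " you". That is a common convention, and one reason token counts don't line up neatly with word counts.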
Now this is the part where you tend to hear the phrase "AI hallucinating". LLMs are designed to always give you a positive response. But they tend to make stuff up, because they produce text that "sounds right" based on patterns, and most of the time they don't actually look anything up.

For example, it may have learned in training that Tottenham Hotspur have always had so many fans in London, that Tottenham have often finished in the "Top 4", and that they pushed Leicester City for the Premier League title in the season Leicester won it. It can have so much information painting Tottenham as a good club that, based on this, it deduces that they can win the league. Even though it's probably never true!

Level 3: Reasoning models and deep research
So now this is where it actually seems like true intelligence, and starts to drain AI companies' budgets, as these "upgrades" they give you access to eat into their compute. As with the tokens at Level 2, all of this costs money, because there is a computer running in a warehouse somewhere doing this "thinking".
A "reasoning" model is a version of ChatGPT that doesn't just blurt out the answer but does something closer to what humans usually do, where it "thinks before it speaks". Instead of giving you the answer directly, the model has a thought process: it breaks your problem down step by step, checks itself, and then writes the final answer.

This is the equivalent of having 30 seconds to think and consider the question more closely at a pub quiz, instead of saying the first thing that comes into your head. In return, it spends the "tokens" we talked about earlier on its thought process as well as on the answer.
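A back-of-the-envelope way to see why thinking is expensive: the model has to generate the thinking tokens just like the answer tokens. The token counts below are made up purely for illustration, and `tokens_generated` is my own helper name.

```python
def tokens_generated(thinking_tokens, answer_tokens):
    """Total tokens the model has to produce for one reply."""
    return thinking_tokens + answer_tokens


# Made-up numbers: same 200-token answer, with and without a thought process.
quick = tokens_generated(thinking_tokens=0, answer_tokens=200)
reasoned = tokens_generated(thinking_tokens=3000, answer_tokens=200)

print(f"quick reply: {quick} tokens")
print(f"reasoned reply: {reasoned} tokens ({reasoned / quick:.0f}x as many)")
```

Same visible answer, many times the work behind it, which is the budget problem in one line.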
The final boss of it all is what we currently, as of 19th May 2026, call Deep Research or Agentic Search.
This is where the model is empowered with lots of tools, such as web search. Now when you ask it something like
"Write me a report on how the commodity market has changed over the course of the last 5-10 years, with live prices".
It fetches data for live prices, looks at the "commodity" market as a whole, works through the tools it's been given and does a literal "deep dive". It plans, searches, reads, searches some more and then ultimately writes.
You can think of it as:
- a normal LLM is asking a friend a question in the hallway at work, and they tell you what comes to mind right away
- a reasoning model is sitting down with your friend for a coffee, where they can collect their thoughts and think better
- deep research is where your friend went to the library and comes prepared with a full list of notes

This all still stems from next-token prediction; what changes is how many layers of prediction there are, and it's just a huge loop of clever next-token predictions.
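The plan, search, read, write loop can be sketched like this. Everything here is a stand-in: `fake_search` returns canned placeholder text instead of hitting the web, and `plan` is hard coded where a real agent would ask the LLM to decide what to look up next. It is only meant to show the shape of the loop, not any product's actual implementation.

```python
def fake_search(query):
    """Stand-in for a web search tool; returns canned placeholder snippets."""
    pages = {
        "commodity market history": "Ten-year commodity overview (placeholder text).",
        "commodity prices today": "Gold and oil prices (placeholder text).",
    }
    return pages.get(query, "")


def plan(question):
    """A real agent would ask the LLM to write this plan; here it is hard coded."""
    return ["commodity market history", "commodity prices today"]


def deep_research(question, max_steps=5):
    """Plan, then search and take notes for each step, then write up the notes."""
    notes = []
    for query in plan(question)[:max_steps]:
        result = fake_search(query)  # search
        if result:
            notes.append(f"{query}: {result}")  # read and take notes
    # A real agent would hand the notes back to the LLM to write the report.
    return "REPORT\n" + "\n".join(notes)


print(deep_research("How has the commodity market changed over the last 5-10 years?"))
```

In a real system every one of those steps (writing the plan, choosing a query, summarising a page, drafting the report) is itself the model predicting tokens, which is why it all adds up so quickly.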
The bottom line
So the bottom line of this whole long boring post is that every time you see something produced by Gemini, it has likely come from a long iteration of working out what the next word is. It's in a sense a lot of pattern recognition, not unlike how humans work: if I'm going down the stairs and I see people walk down on the left and people come up on the right, then it's likely I should go on the left.
Spelling arrows are someting interasting: to an LLM they don't actually matter that much, as it reads your sentence based on context, and because it was trained on the internet it has seen all kinds of errors.
Another good thing to know: as smart as "AI" is, models have a knowledge cutoff, which is simply the date the training data goes up to. If you ask about a recent event, the model won't know about it unless it does a live web search there and then, and even then it only remembers it within that chat window's context.
There's a non-zero chance that this whole blog was written by Claude, but I guess we'll never know, eh. Also, if you ever see this larger dash " — " instead of a smaller one like this " - ", there's a good chance it's AI, as through all their training the models have learned that this " — " is what to use. It's a good indicator to fact-check something at first glance.
Nothing else from me, cheers all