If you have already tinkered with AI, you have probably used a chatbot. You ask it a question, wait a second, and it gives you an answer. It's impressive, but it is also limited. The field is now moving beyond chatbots and into the world of AI Agents: autonomous or semi-autonomous systems that can reason, plan, and interact with tools in order to complete a task.

Let's start with what AI Agents are.

What are AI Agents?

At their core, AI Agents are LLMs. Used on their own, LLMs have limitations:

They cannot obtain information from after their training cutoff
They cannot interact with external systems
They do not reliably reason through or plan multi-step tasks

An AI Agent addresses these limitations: it is an LLM that can reason, plan, and interact with external systems through tools.

This ability to reason, plan, and interact isn't a single feature. It's a "stack" of several distinct concepts and techniques that work together.

Let's break down the "must-know concepts" that give an AI agent its power. We can group them into how an agent thinks and acts.

How Agents "Think" (The Reasoning Engine)

At the core of an agent is its ability to reason. We are all already accustomed to how an LLM works. You give it a prompt and it gives you the answer. You don’t know how the LLM got to the answer and the LLM lacks the power to plan its steps. This is something that needed to be solved for AI Agents to become usable and reliable. The agents needed to:

Have the ability to reason and plan
Have the ability to "think" in plain English (for debugging purposes)

A simple prompt-and-response is not enough. Agents need strategies to solve complex problems. This is where "thoughts" come in. There are two important concepts that enable agents to think: Chain-of-Thought (CoT) and Tree of Thoughts (ToT).

Chain-of-Thought (CoT)

This is the foundational concept. Instead of just asking an LLM for the final answer, you prompt it to "think step-by-step." This simple instruction pushes the model to lay out its logical progression. Let's take an agent that I built as an example: a recruiting agent. The goal of this agent is to extract all of the information from candidate resumes and generate each resume in a uniform format.

Without CoT, the agent might just output: "Skills: Python." With CoT, the agent's internal monologue looks like this: "1. My goal is to extract skills from the provided text. 2. I should first scan the document for common section headers like 'Skills', 'Expertise', or 'Technical Proficiencies'. 3. I found a section titled 'Technical Proficiencies' on page 2. 4. Now, I will read the bullet points under this section and list them. 5. The skills are: Python, React, SQL, and AWS."
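To make the difference concrete, here is a minimal sketch of the two prompting styles for the skills example. The function names, the prompt wording, and the convention of ending the reply with a "Skills:" line are my own assumptions for illustration. No model is called here; you would send these prompts to whatever LLM client you use.

```python
def build_standard_prompt(resume_text: str) -> str:
    """Ask directly for the final answer, with no visible reasoning."""
    return (
        "Extract the candidate's skills from the resume below.\n\n"
        f"Resume:\n{resume_text}\n\n"
        "Skills:"
    )


def build_cot_prompt(resume_text: str) -> str:
    """Ask the model to reason step by step before it answers."""
    return (
        "Extract the candidate's skills from the resume below.\n"
        "Think step by step: first name the section headers you see, "
        "then pick the one that lists skills, then read its entries.\n"
        "Finish with a single line that starts with 'Skills:'.\n\n"
        f"Resume:\n{resume_text}\n"
    )


def parse_final_answer(model_output: str) -> list[str]:
    """Pull the skills out of the last 'Skills:' line of the model's reply."""
    for line in reversed(model_output.strip().splitlines()):
        if line.strip().lower().startswith("skills:"):
            items = line.split(":", 1)[1]
            return [s.strip() for s in items.split(",") if s.strip()]
    return []  # the model did not follow the requested format
```

Asking for a fixed final line is one simple way to keep the reasoning readable for debugging while still getting an answer your code can parse.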

Compared with asking for the answer directly, this step-by-step process tends to be:

More accurate
More reliable
Traceable

[Figure: Standard prompting vs. Chain-of-Thought prompting]

Tree of Thoughts (ToT)

This is the next evolution. CoT follows a single step-by-step path. But what if that path is a dead end? Resumes are messy. What if the "Skills" section is poorly formatted? ToT allows the model to explore several different paths, like branches on a tree, and compare them before committing to one.

Example: Finding the candidate's most recent job.

Branch A (The "Simple" Path): "I'll just take the first job listed under the 'Experience' header. I've extracted 'Software Engineer at OldJob Inc, 2018-2020'."
Branch B (The "Smarter" Path): "I'll scan all job entries, extract their date ranges, and look for the one with an end date of 'Present' or the latest year. I found 'Senior Engineer at NewJob Corp, 2020-Present'."
Branch C (The "Cross-Check" Path): "I'll look at the candidate's 'Summary' at the top of the resume. It says, 'Senior Engineer with 5 years of experience...' This matches Branch B."

A ToT-enabled agent can evaluate all three "thoughts," conclude that Branch A is incorrect, and confidently proceed with the correct information from Branch B.
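The branch comparison above can be sketched in a few lines. This is a simplified, single-level illustration and not a full Tree of Thoughts search: in a real agent the LLM would generate the branches and score them, while here hand-written functions stand in for both. The data layout and the scoring rule are assumptions I made for the example.

```python
# Jobs in the order they appear on the (messy) resume; end=None means "Present".
JOBS = [
    {"title": "Software Engineer", "company": "OldJob Inc", "start": 2018, "end": None if False else 2020},
    {"title": "Senior Engineer", "company": "NewJob Corp", "start": 2020, "end": None},
]
SUMMARY = "Senior Engineer with 5 years of experience"


def branch_first_listed(jobs):
    """Branch A: trust the order of the 'Experience' section."""
    return jobs[0]


def branch_latest_end_date(jobs):
    """Branch B: pick the entry that ends latest, treating 'Present' as infinity."""
    return max(jobs, key=lambda j: float("inf") if j["end"] is None else j["end"])


def evaluate(candidate, summary):
    """Score a branch's answer; the summary check plays the role of Branch C."""
    score = 0
    if candidate["end"] is None:  # the job is still ongoing
        score += 1
    if candidate["title"].lower() in summary.lower():  # cross-check with the summary
        score += 1
    return score


def most_recent_job(jobs, summary):
    """Run every branch, score each proposal, and keep the best one."""
    proposals = [branch(jobs) for branch in (branch_first_listed, branch_latest_end_date)]
    return max(proposals, key=lambda c: evaluate(c, summary))


best = most_recent_job(JOBS, SUMMARY)
print(best["company"])  # NewJob Corp
```

The important part is the shape: several candidate "thoughts" are produced, each one is evaluated, and weak branches are dropped before the agent moves on.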

This is how our AI agents can "think". Let's now explore how they can act.

How Agents "Act"

Thinking is important. It shows that our agents can work through the problem at hand, and it lets them tell us their thought process so we can correct them. But it is only half of the story. Our agents also need to be able to act, or more specifically, to use tools.

Tools give the agent access to external systems, such as:

Google Search
Files
Various useful APIs

To be useful, our resume agent must be able to act on its thoughts and interact with the digital world. This is where it goes from a "thinker" to a "doer."

Function Calling & Tool Use

Function calling is the bridge to the outside world. We give our LLM a list of tools (which are just our own code functions) that it can ask us to call.

By itself, a Large Language Model can only process and generate text; it cannot check the weather, browse a website, or save a file. Function calling solves this by providing the LLM with a "toolbox" of predefined code functions (like get_weather(location) or save_resume_data(data)). When the LLM's reasoning determines it needs external information, it doesn't try to run the code itself. Instead, it outputs a structured JSON object that essentially says, "I need to use the get_weather tool with the location set to 'Novi Sad'." Your application then executes this real function, gets the real-time weather, and feeds that data back to the LLM to continue the conversation.
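Here is a minimal sketch of the application side of that exchange. The JSON shape with "name" and "arguments" keys is an assumption for illustration (each LLM provider defines its own format), and get_weather is a stub that returns canned data instead of calling a real weather service.

```python
import json


def get_weather(location: str) -> dict:
    """Stub tool: a real version would call a weather API."""
    return {"location": location, "temperature_c": 30}


# The "toolbox": tool names the model may request, mapped to our own functions.
TOOLS = {"get_weather": get_weather}


def execute_tool_call(raw_model_output: str) -> str:
    """Parse the model's JSON request, run the matching function, return JSON text."""
    call = json.loads(raw_model_output)
    name = call["name"]
    if name not in TOOLS:
        # Report the problem back to the model instead of crashing.
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**call.get("arguments", {}))
    return json.dumps(result)


# What the model might emit when it decides it needs the weather:
model_output = '{"name": "get_weather", "arguments": {"location": "Novi Sad"}}'
print(execute_tool_call(model_output))
```

The string returned by execute_tool_call is what you would append to the conversation so the model can continue with real data.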

For our AI Recruiter Agent, we would define a set of tools like:

read_pdf_text(file_path): A tool that takes a PDF and returns its raw text.
extract_structured_info(text, schema): A powerful tool that takes the raw text and a schema (e.g., {skills: [], projects: []}) and uses the LLM to "fill in the blanks."
save_to_database(candidate_data): A tool to save the extracted JSON object to our company database.
generate_uniform_resume(data, template_id): The final tool that takes the structured data and generates the new, client-ready resume.

The LLM doesn't run this code; it just knows when to ask for it by outputting a special JSON command.
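For the model to know these tools exist, each one is usually described to it with a name, a short description, and its parameters. The sketch below uses a generic JSON-Schema-like layout; the exact field names differ between providers, so treat this shape, the tool() helper, and validate_call() as assumptions for illustration.

```python
def tool(name, description, **params):
    """Build a tool description; every parameter is treated as required."""
    return {
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": {p: {"type": t} for p, t in params.items()},
            "required": list(params),
        },
    }


RECRUITER_TOOLS = [
    tool("read_pdf_text", "Return the raw text of a PDF.", file_path="string"),
    tool("extract_structured_info", "Fill a schema from raw resume text.",
         text="string", schema="object"),
    tool("save_to_database", "Save extracted candidate data.",
         candidate_data="object"),
    tool("generate_uniform_resume", "Render the client-ready resume.",
         data="object", template_id="string"),
]


def validate_call(call: dict) -> list[str]:
    """Return the problems with a requested tool call (an empty list means OK)."""
    by_name = {t["name"]: t for t in RECRUITER_TOOLS}
    spec = by_name.get(call.get("name"))
    if spec is None:
        return [f"unknown tool: {call.get('name')}"]
    given = call.get("arguments", {})
    return [f"missing argument: {p}"
            for p in spec["parameters"]["required"] if p not in given]
```

Checking a request before running it is cheap, and the error messages can be fed back to the model so it can fix its own call.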

Function calling and tool use become far more powerful when paired with our next concept.

ReAct Framework

This is the heartbeat of the agent. 'ReAct' stands for "Reasoning" and "Acting," and it provides the fundamental loop that connects thinking (CoT) and doing (Function Calling). Instead of trying to solve a complex problem in one go, ReAct teaches the agent to work in a continuous cycle. First, the agent reasons about the problem (e.g., "The user wants the weather, I need to find the right tool"). Then, it acts by choosing a tool (like get_weather) or forming a response. Finally, it observes the result of its action (e.g., "The weather is 30°C"). This observation feeds back into its next "Reasoning" step, allowing it to plan, self-correct, and take a series of logical actions until the final goal is complete.

Let's see our AI Recruiter Agent in action, handling one new resume using the ReAct loop:

Cycle 1

Thought: "The user has uploaded 'candidate_resume_v4.pdf'. My first step is to read the text from this file."
Action: call tool: read_pdf_text(file_path='candidate_resume_v4.pdf')
Observation: (The tool runs and returns 2 pages of raw, jumbled text.)

Cycle 2

Thought: "I have the raw text. It's messy, but I can see sections for skills and experience. I need to use my extraction tool to get the structured data."
Action: call tool: extract_structured_info(text='...', schema={...})
Observation: (The tool runs and returns a clean JSON object: {name: 'Jane Doe', skills: ['JavaScript', 'LangGraph', 'Python'], ...})

Cycle 3

Thought: "The extraction was successful! The data is clean. Now I must save this to our central database so the candidate is in our system."
Action: call tool: save_to_database(candidate_data={...})
Observation: (The tool runs and returns: {success: true, candidate_id: 'c-456'})

Cycle 4

Thought: "The candidate is saved. The final task is to generate the standardized resume for the client using our 'modern' template."
Action: call tool: generate_uniform_resume(data={...}, template_id='modern')
Observation: (The tool runs and returns: {file_path: 'client_resume_jane_doe.pdf'})

Cycle 5

Thought: "The final file has been generated. My work is complete."
Action: (Respond to user: "The resume for Jane Doe has been processed and the new file is ready.")

This "Thought-Action-Observation" loop is what allows an agent to complete a complex task that a simple chatbot never could.
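The five cycles above can be sketched as a small loop. Everything here is a stand-in: the four tools are stubs that return canned data, and scripted_model replaces the LLM with a hard-coded decision per step, so the example runs without any API. In a real agent, the model would choose each thought and action from the conversation so far.

```python
def read_pdf_text(file_path):
    return "Jane Doe\nSkills: JavaScript, LangGraph, Python"  # canned text

def extract_structured_info(text, schema):
    return {"name": "Jane Doe", "skills": ["JavaScript", "LangGraph", "Python"]}

def save_to_database(candidate_data):
    return {"success": True, "candidate_id": "c-456"}

def generate_uniform_resume(data, template_id):
    return {"file_path": "client_resume_jane_doe.pdf"}

TOOLS = {f.__name__: f for f in (read_pdf_text, extract_structured_info,
                                 save_to_database, generate_uniform_resume)}


def scripted_model(observations):
    """Stand-in for the LLM: pick the next step from what has been observed."""
    step = len(observations)
    if step == 0:
        return {"thought": "Read the uploaded file first.",
                "tool": "read_pdf_text",
                "args": {"file_path": "candidate_resume_v4.pdf"}}
    if step == 1:
        return {"thought": "Extract structured data from the raw text.",
                "tool": "extract_structured_info",
                "args": {"text": observations[0], "schema": {"name": "", "skills": []}}}
    if step == 2:
        return {"thought": "Save the clean data to the database.",
                "tool": "save_to_database",
                "args": {"candidate_data": observations[1]}}
    if step == 3:
        return {"thought": "Generate the standardized resume.",
                "tool": "generate_uniform_resume",
                "args": {"data": observations[1], "template_id": "modern"}}
    return {"thought": "The final file has been generated.",
            "final": "The resume for Jane Doe has been processed and the new file is ready."}


def run_agent(model, tools, max_steps=10):
    """Thought -> Action -> Observation, repeated until the model gives a final answer."""
    observations, trace = [], []
    for _ in range(max_steps):
        step = model(observations)                          # Thought
        trace.append(step["thought"])
        if "final" in step:
            return step["final"], trace
        result = tools[step["tool"]](**step["args"])        # Action
        observations.append(result)                         # Observation
    raise RuntimeError("agent did not finish within max_steps")


answer, trace = run_agent(scripted_model, TOOLS)
print(answer)
```

The max_steps cap is a small safety choice: if the model never produces a final answer, the loop stops instead of calling tools forever.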

Conclusion: From Chatbot to Collaborator

This is how we bridge the gap. By combining sophisticated reasoning patterns like Chain-of-Thought and Tree of Thoughts with the practical ability to act using Function Calling and the ReAct loop, we fundamentally change what an AI is capable of.

We're no longer limited to a simple chatbot that just answers questions. We now have the building blocks for a true collaborator, an agent that can understand a complex, multi-step goal, plan a course of action, interact with external tools, and see a task through to completion. As we saw with the AI Recruiter example, these concepts are what allow an agent to take a messy PDF and turn it into a perfectly formatted, database-ready entry, all on its own.

And this is just the beginning. Concepts like Long-term Memory, Self-Correction, and even Multi-agent Collaboration are already building on this foundation, creating systems that can learn from past interactions and work in teams. The shift from chatbot to agent isn't just an upgrade; it's the start of an entirely new way to build and interact with software.

[Figure: AI Agents: From chatbots to collaborators]