Have you tried these reasoning techniques with Copilot?

I have spent the last few months working closely with Microsoft and some of their largest customers as they implement Copilot (Office 365) across their organizations. I'll cover implementation and the secure connection of enterprise (i.e., non-Office 365) data sources to Copilot in another post; here I'll focus on how your approach to prompting Copilot can drastically change your experience.

To do this, let me introduce you to the concept of "reasoning techniques". Think of these as cognitive strategies for reaching an end goal, each taking a different approach to how you process information, reason, make decisions and determine your next actions based on what you have learned so far.

These reasoning techniques, when applied to agent cognitive architectures, make use of prompt engineering and task planning.

Put more simply: the question you ask, and the framework you give the model for answering it, will change the results you see.

Let's see it in action and learn more about three of the most popular techniques: ReAct, Chain-of-Thought (CoT) and Tree-of-Thoughts (ToT).

ReAct (Reasoning + Acting) Prompting

ReAct is a paradigm that combines reasoning with action planning to enable LLMs to solve complex tasks through interleaved reasoning traces and actions. The key aspects of ReAct include:

  • Generates both verbal reasoning traces and task-specific actions that can interact with external environments or tools
  • Allows models to retrieve additional information through actions and use reasoning to determine what to retrieve next
  • Helps overcome issues like hallucination and error propagation by enabling interaction with external sources

Example Use Case: Troubleshooting a computer issue

Prompt:

Task: Help diagnose why a laptop won't connect to WiFi
Use the following format:

Thought: [reasoning about the current situation]
Action: [specific action to take]
Observation: [result of the action]

Expected Result:

Thought: First check if WiFi is enabled on the laptop
Action: Verify WiFi switch/button is turned on
Observation: WiFi is enabled but still not connecting

Thought: Should check if other devices can connect
Action: Test WiFi connection with smartphone
Observation: Other devices connect successfully

Thought: Issue likely specific to laptop's WiFi settings
Action: Reset network adapter in Device Manager
[Process continues until resolution...]
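To make the loop concrete, here is a minimal sketch of how a ReAct-style loop could be orchestrated in code. Note that call_llm() and the TOOLS registry are hypothetical stand-ins for illustration, not part of Copilot's API; when you prompt Copilot directly, the model interleaves these steps for you.

# A minimal ReAct-style loop (illustrative sketch, not Copilot's internals).
# call_llm() is a hypothetical stand-in for whatever LLM client you use;
# it should return the model's next "Thought:" / "Action:" block as text.
from typing import Callable, Dict, Optional

def call_llm(prompt: str) -> str:
    raise NotImplementedError("Plug in your own LLM client here.")

# Hypothetical tool registry: each action name maps to a function that
# performs the step and returns an observation string.
TOOLS: Dict[str, Callable[[], str]] = {
    "check_wifi_enabled": lambda: "WiFi is enabled but still not connecting",
    "test_with_smartphone": lambda: "Other devices connect successfully",
}

def parse_action(step: str) -> Optional[str]:
    # Naive parser: pull the text after the first "Action:" line, if any.
    for line in step.splitlines():
        if line.strip().startswith("Action:"):
            return line.split("Action:", 1)[1].strip()
    return None

def react_loop(task: str, max_steps: int = 5) -> str:
    transcript = f"Task: {task}"
    for _ in range(max_steps):
        step = call_llm(transcript)                      # model produces Thought + Action
        transcript += "\n" + step
        action = parse_action(step)
        if action is None:                               # no action proposed: we're done
            break
        observation = TOOLS.get(action, lambda: "Unknown action")()
        transcript += f"\nObservation: {observation}"    # feed the result back to the model
    return transcript

Each pass through the loop asks the model for its next thought and action, executes the action, and feeds the observation back in, which is exactly the Thought/Action/Observation cadence shown above.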

Chain-of-Thought (CoT) Prompting

Chain-of-Thought is a prompting technique that breaks down complex problems into sequential reasoning steps. Its main characteristics are:

  • Structures responses by guiding LLMs through intermediate steps toward a final answer
  • Works best with larger models (≥100B parameters) as smaller models may produce illogical chains
  • Particularly effective for tasks requiring multi-step reasoning like math problems and logical deduction

Example Use Case: Calculating total cost for a party budget

Example Prompt:

Let's solve this step by step:

Calculate the total cost for a party with:

  • 25 guests
  • $12 per person for food
  • $8 per person for drinks
  • $200 flat fee for venue rental

Expected Result:

Let me break this down:

  1. Food cost: 25 guests × $12 = $300
  2. Drinks cost: 25 guests × $8 = $200
  3. Venue rental: $200 flat fee
  4. Total cost: $300 + $200 + $200 = $700

The total party cost would be $700.
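As a sanity check, the chain the model produces maps directly onto a few lines of arithmetic. Here is a quick sketch in Python that reproduces the same steps, purely to verify the example above:

guests = 25
food_per_person = 12      # $12 per person for food
drinks_per_person = 8     # $8 per person for drinks
venue_fee = 200           # $200 flat fee for venue rental

food_cost = guests * food_per_person       # 25 x 12 = 300
drinks_cost = guests * drinks_per_person   # 25 x 8 = 200
total_cost = food_cost + drinks_cost + venue_fee

print(f"Total party cost: ${total_cost}")  # prints: Total party cost: $700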

Tree-of-Thought (ToT) Prompting

Tree-of-Thoughts extends CoT by enabling exploration of multiple reasoning paths simultaneously. Key features include:

  • Maintains a tree structure where thoughts represent coherent language sequences serving as intermediate steps
  • Allows for backtracking and exploring different branches when a particular path proves unsuccessful
  • Combines generation and evaluation of thoughts with search algorithms like breadth-first and depth-first search

Example Use Case: Planning a weekend trip itinerary

Prompt:

Imagine three travel experts planning a weekend in Paris. Each expert will suggest one activity for each time slot, then evaluate the others' suggestions. If an expert realizes their suggestion isn't optimal, they'll withdraw it.

Plan an itinerary for:

Saturday morning (8AM-12PM)
Saturday afternoon (12PM-6PM)
Saturday evening (6PM-10PM)

Expected Result:

Expert 1: Morning - Visit Louvre early to avoid crowds
Expert 2: Morning - Start with Eiffel Tower for sunrise views
Expert 3: Morning - Begin with local café and Notre-Dame

[Experts evaluate]
Expert 2 withdraws: "Eiffel Tower too crowded in morning"
Expert 3 withdraws: "Notre-Dame currently under renovation"

Final morning plan: Louvre visit (Expert 1's suggestion)
[Process continues for afternoon and evening slots...]
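Behind a prompt like this, ToT implementations pair thought generation with a search over the tree of candidate paths. The sketch below shows a breadth-first variant with pruning; generate_thoughts() and score_thought() are hypothetical stand-ins for LLM calls, not a specific library's API.

# Breadth-first Tree-of-Thoughts search (illustrative sketch only).
from typing import List, Tuple

def generate_thoughts(path: str, k: int = 3) -> List[str]:
    raise NotImplementedError("Ask the LLM for k candidate next steps.")

def score_thought(path: str) -> float:
    raise NotImplementedError("Ask the LLM to rate how promising this path is (0 to 1).")

def tree_of_thoughts(task: str, depth: int = 3, beam_width: int = 2) -> str:
    # Each frontier entry is (score, reasoning path so far).
    frontier: List[Tuple[float, str]] = [(1.0, task)]
    for _ in range(depth):
        candidates: List[Tuple[float, str]] = []
        for _, path in frontier:
            for thought in generate_thoughts(path):        # branch into multiple paths
                new_path = path + "\n" + thought
                candidates.append((score_thought(new_path), new_path))
        candidates.sort(key=lambda c: c[0], reverse=True)
        frontier = candidates[:beam_width]                 # prune weak branches (backtracking)
    return frontier[0][1]                                  # best-scoring reasoning path

The beam_width pruning is what gives ToT its backtracking behaviour: weak branches are dropped each round (like the withdrawn Eiffel Tower suggestion), and the search continues down the more promising ones.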

Key Differences

Technique | Primary Focus            | Structure         | External Interaction
ReAct     | Reasoning + Action       | Interleaved steps | Yes, through actions
CoT       | Sequential Reasoning     | Linear chain      | No
ToT       | Multiple Reasoning Paths | Tree structure    | No

ReAct differs from CoT by incorporating actions and external knowledge sources, while ToT expands on CoT by exploring multiple potential solutions simultaneously rather than following a single chain of reasoning.

Summary

So, what have we learned? That there are multiple strategies for engineering your prompts, each with its own advantages and disadvantages depending on your use case.

Understanding these strategies is growing in importance as Copilot expands rapidly into agentic AI through Copilot Studio and Custom Engine Agents.

But like everything in life, there is a counter-argument. As models and their workflows begin to add reasoning and deeper-thinking steps automatically, the need for users to perform these steps at prompt time is reduced or, in some cases, removed entirely. Abram Jackson (Microsoft Copilot Extensibility PM) covers this in great detail here.

Try the prompts above in your own Copilot to see them in action!