The Event Horizon of Thought: Where Attention Meets Architecture

09/12/2025

32 min listen

Bartosz Lenart

Get Instant Insight

Your attention has limits. Stop fighting them. Design around them.

The Core Idea

Event horizon = boundary of no return. Your attention works the same way:

Working memory: 4-7 items max
LLM context window: fixed token budget
Beyond these: inaccessible

Overload these limits → performance collapses. Not weakness. Physics.

Solution: Don't fight the boundary. Architect within it.

The Paradox

Constraints grant freedom.

Vague space: infinite but paralyzed
Constrained space: smaller but actionable

Example:
"What should we build?" → generic
"Bootstrap. $5M deals. 5-person team. 18 months." → specific, novel

Constraints don't shrink the solution space. They focus it. Like light through a lens.

Three Layers

  1. Physiological: Sleep, circadian rhythm, nutrition. For AI: GPU memory, architecture, tokens.
  2. Environmental: Physical space, noise, interruptions. For AI: interface design, context switching.
  3. Social: Norms reward shallow or deep attention. Culture shapes cognition at scale.

Design implication: Match complexity to capacity. Reduce cognitive load. Create rituals.

Infrastructure > Willpower

Willpower is fragile. Infrastructure is durable.

Don't memorize prompt tips. Instead:

Build templates that encode patterns
Create system messages that auto-establish constraints
Develop rituals (clarify → reason → synthesize)
Design interfaces where good practices are easy

When practice becomes infrastructure, attention flows without draining willpower.

Five Techniques (All Externalize Reasoning)

  1. Chain of Thought: Ask for step-by-step reasoning. Makes invisible thought visible, testable, refinable.
  2. Tree of Thought: Generate multiple paths. Evaluate, prune, backtrack. Escapes where a single path gets stuck.
  3. Self-Consistency: 5-10 independent paths. Select the most consistent. Redundancy beats fragility.
  4. RAG: Retrieve real sources instead of trusting model memory. Ground in data, not patterns.
  5. Constraints: Make horizons explicit. State what fits, what decision you're making, what reshapes the space.

Pattern: All respect attention limits by externalizing reasoning.

The Shift: Attention → Intention

Attention economy: Maximize time on site. Infinite scroll. Result: fragmentation, fatigue.
Intention economy: Design for explicit goals. Minimize friction to completion. Reduce clutter.

You accomplish goals faster, then disengage. No endless browsing.

Relativistic Jet

Your expertise is a singularity surrounded by a chaotic accretion disk. Without structure, this energy radiates as useless thermal heat.

Architecture acts as the magnetic field. It twists that chaos into a Relativistic Jet: a highly focused beam of coherent signal.

Without architecture: Thermal glow (random heat). With architecture: Relativistic jet (infinite reach).

Your decisive output is that jet. It is the only way the weight of your internal singularity effectively touches the world.

The Choice

Either you architect your attention, or it architects you.

If you design it: coherence, focus, intention
If you don't: scattered, habitual, pulled by defaults

Choose consciously.

Quick Reference

Want | Use | Why
Transparent reasoning | Chain of Thought | Externalizes thought
Explore multiple paths | Tree of Thought | Escapes local maxima
High confidence | Self-Consistency | Redundancy beats fragility
Grounded output | RAG | External memory
Focused space | Constraints | Reshapes possibility
Expert simulation | Role-Based Prompting | Adopts domain styles
Multi-step workflows | Prompt Chaining | Sequential logic
Complex tasks | Few-Shot Prompting | Learns from examples

Bottom Line

Your attention limit isn't a bug. It's a feature.

Constraints don't restrict you. They focus you.

Stop fighting the boundary. Architect within it.

That's where genius lives.

[ NEURAL COMPRESSION COMPLETE ]

80% signal retained.
Full depth below.

In physics, an event horizon is a boundary of no return.

In human cognition and artificial intelligence, it's the edge of what you can simultaneously hold in mind.

This essay explores why understanding that boundary transforms how you work with AI.[1][2][3]

Event Horizon

Abstract

The Event Horizon of Thought argues that the hard limits of attention, whether human working memory or AI context windows, are not obstacles to overcome but structures to navigate.

Drawing on the physics of black holes, this essay posits that intelligence functions best not by fighting these constraints but by orbiting them as attractor states that focus reasoning.

The analysis decomposes attention into physiological, environmental, and social layers, showing how techniques like Chain of Thought and Retrieval Augmented Generation serve as "attention infrastructure." These tools externalize reasoning, bypassing the fragility of willpower and the quadratic complexity of transformer models.

The work advocates shifting from an Attention Economy to an Intention Economy, providing a framework for designing workflows that respect the "physics of thought," treating cognitive boundaries as gravity wells that compress vast possibility spaces into actionable, high velocity decisions.

The Point of No Return

Imagine standing at the edge of a black hole. Nothing escapes once it crosses that boundary - not matter, not energy, not even light. The event horizon isn't a wall you can push through. It's a fold in spacetime itself where the rules change.

Outside the event horizon: escape remains possible. Inside the event horizon: things curve toward the singularity.

Context Window Architecture

Language models have an event horizon too. It's called the context window: the fixed number of tokens the model can process at once.[1][2][3]

Information outside your LLM's context window is computationally inaccessible, regardless of whether it was in the training data. This is not a temporary technical problem. It reflects the quadratic complexity of the attention mechanism itself.[2][3]

This isn't a limitation to work around. It's the structure to design within.

The same principle applies to human attention. Your working memory holds approximately 4-7 items simultaneously.[4][5][6][7]

That's your cognitive event horizon.

Beyond it, information requires deliberate retrieval. Overload working memory with 20 simultaneous concerns and performance degrades. Not because you lack capability, but because attention is a bounded resource.[8][6]

The insight: Both humans and AI systems operate within hard constraints that function as gravity wells. Intelligence doesn't fight these constraints. It orbits them.[9][10]

What happens when you stop treating these boundaries as problems?

When Constraints Become Design Parameters

Most people think of constraints as obstacles. They're not. They're attractor states, the natural places where systems settle when forced to organize themselves.[9]

Consider a vague prompt to an LLM: "What should we build?"

Without constraints, the model's attention disperses across its entire training distribution. It defaults to generic patterns: "Build a SaaS," "Find an underserved market," "Focus on retention." Training data attractors. Not bad advice, just the most common patterns in internet text.

Now add constraints: "Bootstrap only. $5M enterprise deals. 5-person team. 18-month runway."

Everything changes. The solution space geometrically reconfigures. Different strategies surface.

Constraints don't limit intelligence. They focus it.[9][10]

This is why Norbert Wiener's cybernetics framework remains radical: systems controlled by explicit feedback loops and stated constraints produce coherent behavior.[10][11]

Wiener recognized that complex systems are driven by feedback loops: communication that forms cycles rather than linear flows, with systems using past behavior to model and improve future outcomes.[10]

Application to your AI workflow: The most effective prompts don't ask for everything at once. They specify your event horizon: what information fits in this conversation, what decision you're making, what constraints shape the solution space. State these clearly and the model's attention focuses like light through a lens.[9][10][11]
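As a sketch, that event-horizon statement can even live in code. Everything below (the function name, the field labels) is illustrative scaffolding for the idea, not a standard API:

```python
# A minimal sketch of "stating your event horizon" as a reusable prompt
# builder: goal, hard constraints, and the decision at stake are made
# explicit instead of left implicit. Field names are illustrative.

def build_constrained_prompt(goal: str, constraints: list[str], decision: str) -> str:
    """Assemble a prompt that makes the event horizon explicit."""
    lines = [
        f"Goal: {goal}",
        "Hard constraints (the solution space must respect all of these):",
    ]
    lines += [f"- {c}" for c in constraints]
    lines.append(f"Decision to make: {decision}")
    return "\n".join(lines)

prompt = build_constrained_prompt(
    goal="Choose what to build next",
    constraints=["Bootstrap only", "$5M enterprise deals",
                 "5-person team", "18-month runway"],
    decision="Pick one product direction and justify it against each constraint.",
)
```

The point is not the string formatting but the discipline: every session starts from the same explicit boundary.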

Three Layers of Attention Architecture

Three Layers of Attention

Attention doesn't operate on a single level. It's structured across multiple interacting layers:

Layer 1: Physiological/Computational Foundation

In humans: Sleep quality, circadian phase, and nutritional state set baseline attention capacity.[12][6]

Physics: Time moves slower closer to a massive object.

Application: Deep focus (closeness to the singularity of the problem) dilates subjective time. 90 minutes of "Event Horizon" work feels different than 90 minutes of shallow, low gravity browsing.

In LLMs: GPU memory, model architecture (attention mechanism design, parameter count), and token budget set baseline processing capacity.[2][3]

Design implication: Match task complexity to available cognitive resources. Schedule deep reasoning work during peak attention hours. Break complex problems into sessions that respect both human and model constraints.

Layer 2: Environmental/Interface Architecture

In humans: Physical environment profoundly shapes attention.[6] Circadian alignment predicts cognitive function.[12] Acoustic environment affects cognitive load. Open offices fragment attention more than enclosed spaces.[6]

In LLMs: Interface design determines attention topology.[13] Systems optimized for rapid engagement train shallow attention through constant context switching.

Systems designed for depth introduce strategic friction at distraction points while lowering friction for sustained focus.

Design implication: Create interfaces that reduce cognitive load. Make conversation history easily scannable. Provide clear boundaries between distinct reasoning tasks. Design for sustained engagement rather than compulsive checking.

Layer 3: Social/Institutional Architecture

In organizations: Attention is contagious. Shared environments synchronize focus.

Norms that reward immediate responsiveness train shallow attention. Norms that honor deep focus cultivate sustained attention at scale.[6]

Design implication: Implement organizational rituals around AI use. Start sessions with goal clarification. End with structured synthesis. Create visibility around good practice.

The Infrastructure vs. Willpower Paradox

Willpower Decay and Infrastructure

Here's a truth that changes how you work: willpower is a fragile capacitor. It runs out.[6]

You cannot maintain constant vigilance through sheer mental effort. Relying on willpower to stay focused, remember complex prompting strategies, or follow best practices in every conversation fails.

Infrastructure is different. Infrastructure encodes desirable flows into stable structures.[13] A road doesn't require you to decide the direction at every step. A template doesn't require you to remember the structure. A ritual doesn't require sustained intention. It becomes automatic.

In practice: Don't memorize prompt engineering tips and resolve to use them perfectly. Instead:

Build templates that encode good patterns
Create system messages that establish constraints automatically
Develop interaction rituals (clarify the objective first; always request reasoning chains for complex work; synthesize insights before closing)
Design interfaces that make good practices easy and poor ones slightly harder

When these become infrastructure, they reshape attention flow without taxing willpower.[13]
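A minimal sketch of what that infrastructure might look like in code, assuming a chat-style messages API; the system message wording and ritual names (which mirror clarify → reason → synthesize) are illustrative:

```python
# Prompting infrastructure as data: a fixed system message and a fixed
# session opener, so no conversation depends on remembering best practices.

RITUAL = ("clarify", "reason", "synthesize")  # the essay's interaction ritual

SYSTEM_MESSAGE = (
    "Before answering, restate the objective in one sentence. "
    "For complex questions, show your reasoning step by step. "
    "End every session with a short synthesis of the key decisions."
)

def open_session(user_goal: str) -> list[dict]:
    """Start every conversation with the same scaffolding."""
    return [
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": f"Objective to clarify first: {user_goal}"},
    ]
```

Once this lives in a template file rather than in your head, following the ritual costs no willpower at all.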

What does this infrastructure look like in practice?

Modern Prompting Techniques: Attention Scaffolding

The following techniques work because they externalize reasoning from working memory:

Chain of Thought: Externalizing the Reasoning Path

Ask for step-by-step reasoning rather than direct answers.[14] This makes the model's reasoning trajectory visible, enabling verification and correction.

Performance: On arithmetic and commonsense reasoning tasks, chain of thought prompting improves accuracy, particularly in large models (≥100B parameters).[14]

When to use: Any task where transparency and correctness matter more than speed.[14]

Cognitive science grounding: Externalizing reasoning offloads working memory.[6] Internal reasoning is invisible. Written reasoning creates artifacts enabling reflection and refinement.[14]
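In practice the technique reduces to a template plus a convention for extracting the final answer. A minimal sketch (the "Answer:" marker is our own convention, not part of the published technique):

```python
# Chain of Thought as reusable infrastructure: one function wraps any
# question in a step-by-step instruction, another pulls the final answer
# out of the now-visible reasoning.

def cot_prompt(question: str) -> str:
    return (
        f"{question}\n\n"
        "Think step by step. Number each step and show intermediate results. "
        "Finish with a line starting with 'Answer:'."
    )

def extract_answer(model_output: str) -> str:
    """Return the text after the last 'Answer:' marker, or '' if absent."""
    for line in reversed(model_output.splitlines()):
        if line.startswith("Answer:"):
            return line[len("Answer:"):].strip()
    return ""
```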

Tree of Thought: Structured Exploration

Tree of Thought (ToT) generates multiple reasoning paths, evaluates each, and prunes poor branches.[15] Unlike chain of thought's single trajectory, ToT explores a branching tree of alternatives.

Performance: On Game of 24, ToT achieved 74% success vs. 4-9% for standard chain of thought.[15] On mini crosswords, 60% vs. ~15%.[15]

When to use: Tasks requiring backtracking, strategic planning, or multiple intermediate decisions where early mistakes are correctable.[15]

Implementation: "Generate 3 possible next moves. Evaluate each: sure/maybe/impossible. Keep promising branches. Backtrack if stuck. Show reasoning for each evaluation."[15]

Attention rationale: ToT externalizes tree search that would otherwise occur in working memory.[4][6] By making branches explicit, it offloads cognitive burden to external representation, allowing deeper exploration within attention constraints.[15]
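The search loop ToT externalizes can be sketched in a few lines. Here `propose` and `evaluate` are stand-ins for model calls, and the sure/maybe/impossible labels mirror the prompt above:

```python
# A toy Tree-of-Thought loop: propose candidate next steps, score each,
# prune "impossible" branches, keep at most `beam` promising ones.

def tree_of_thought(state, propose, evaluate, depth=2, beam=3):
    """Breadth-first search over reasoning branches with pruning."""
    frontier = [state]
    for _ in range(depth):
        candidates = [s for st in frontier for s in propose(st)]
        scored = [(evaluate(s), s) for s in candidates]
        kept = [s for label, s in scored if label != "impossible"]
        # backtrack: if every branch was pruned, keep the old frontier
        frontier = kept[:beam] or frontier
    return frontier
```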

Going deeper: For an extended framework that builds on Tree-of-Thought with parallel reasoning across ten distinct cognitive modes, see Yggdrasil: Parallel AI Reasoning Architecture.

Self Consistency: Verification Through Redundancy

Generate 5-10 independent reasoning paths and select the most consistent answer.[16] Redundancy reduces dependence on single paths that may capture spurious patterns.

Performance: Improves accuracy by 4-18% over single-path chain of thought depending on the benchmark (e.g., +17.9% on GSM8K, +12.2% on AQuA, +11.0% on SVAMP, +6.4% on StrategyQA, +3.9% on ARC-Challenge).[16]

When to use: When high-confidence factual accuracy is needed: mathematical calculations, logical reasoning, verifiable problems.[16]

Wang et al. (2023) demonstrated that sampling multiple diverse reasoning paths through chain of thought and marginalizing out the sampled reasoning paths produces more robust results than greedy decoding.[16]
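The selection step is simple majority voting over sampled answers. A sketch, with `sample_fn` standing in for one temperature-sampled reasoning path:

```python
from collections import Counter

# Self-consistency in miniature: draw n independent reasoning paths and
# return the most common final answer (majority vote).

def self_consistent_answer(sample_fn, n=7):
    """Run sample_fn n times and return the most frequent result."""
    answers = [sample_fn() for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

In a real workflow each call to `sample_fn` would be a full chain-of-thought generation; only the final answers are voted on.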

Retrieval Augmented Generation: External Memory

RAG reduces hallucination by augmenting models with real-time access to authoritative sources.[17][18] Instead of trusting model memory, retrieve relevant passages, rank them, then generate grounded answers with citations.[17]

Modern approaches (2025):

Hybrid search: Combine keyword and semantic search for maximum recall[17]
Graph RAG: Build knowledge graphs enabling cross-document reasoning[17]
Re-ranking: Use cross-encoders to sort by true relevance, not just embedding similarity[17]

Attention rationale: RAG externalizes memory, functioning as the computational equivalent of human external memory systems (notes, writing, databases).[18][19]

This expands the effective attention horizon: the model can "attend" to information far beyond its context window by retrieving it on demand.[17]

External memory systems enabled human cognitive evolution because they overcome working memory limitations.[19][7]
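The retrieve-then-generate pattern can be sketched with plain word overlap standing in for the embedding search and re-ranking a production system would use:

```python
# A minimal retrieve-then-generate sketch. Word overlap is a stand-in for
# the retriever; real systems use embeddings, hybrid search, and re-rankers.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k passages sharing the most words with the query."""
    q = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda doc: len(q & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]

def grounded_prompt(query: str, corpus: list[str]) -> str:
    """Assemble a prompt that forces the answer to cite retrieved sources."""
    passages = retrieve(query, corpus)
    cited = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (f"Answer using ONLY the sources below, citing [n].\n"
            f"{cited}\n\nQuestion: {query}")
```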

Additional Modern Prompting Techniques

Few Shot Prompting / In Context Learning

Few-shot prompting (also called in-context learning) provides the model with multiple examples (shots) in the prompt. This helps LLMs generalize to new tasks by learning from contextual examples, improving accuracy and output quality, especially for nuanced or complex tasks.[20][21]
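A sketch of the prompt shape; the "Input:/Output:" labels are an illustrative convention, not part of the technique's definition:

```python
# Few-shot prompting in miniature: a task description, labeled examples,
# then the new query in the same format, left for the model to complete.

def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n{shots}\nInput: {query}\nOutput:"
```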

Zero Shot Prompting

Zero-shot prompting asks the model to perform a task without providing examples. It relies on the model's generalization from its pretraining data. Zero-shot is efficient for well-known tasks but may underperform on ambiguous or novel problems.[20][22]

Role Based Prompting / Persona Prompting

This technique instructs the LLM to assume a specific role or persona (e.g., "act as a software engineer"). Role-based prompting encourages adoption of context-sensitive communication styles and can improve quality, engagement, or specialty simulation.[23][24][25][26] Clearly define the role, context, and constraints for best results.

ReAct (Reasoning + Acting)

ReAct prompts models to interleave explicit reasoning traces with action steps (e.g., "Thought: … Action: … Observation: …"). It is used for complex problem solving, especially where planning and tool use are interconnected. This approach supports dynamic adjustment of actions based on reasoning output and feedback.[27] For how this "thought" language relates to prompt-elicited registers rather than genuine self-monitoring, see The Mirror Has No Face.
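The Thought → Action → Observation loop can be sketched with a stub policy in place of the model and plain functions as tools; the names and transcript format are illustrative:

```python
# A toy ReAct loop. `policy` stands in for the LLM: given the latest
# observation it emits a (thought, action, argument) triple; `tools`
# maps action names to plain functions.

def react_loop(policy, tools, max_steps=5):
    transcript, observation = [], None
    for _ in range(max_steps):
        thought, action, arg = policy(observation)
        transcript.append(f"Thought: {thought}")
        if action == "finish":
            transcript.append(f"Answer: {arg}")
            break
        observation = tools[action](arg)          # act, then observe
        transcript.append(f"Action: {action}({arg})")
        transcript.append(f"Observation: {observation}")
    return transcript
```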

Meta Prompting / Self Refinement

Meta prompting uses the LLM to analyze and optimize its own prompts, or to iteratively critique and refine outputs (self-refine loops). It can enhance clarity, reduce bias, and improve output alignment through iterative prompt and answer improvements.[21][28][29]
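A sketch of the self-refine loop's control flow, with the separate critique and revise model calls reduced to plain callables:

```python
# Self-refine in miniature: critique the current draft, revise it with the
# feedback, repeat. An empty critique means the loop can stop early.

def self_refine(draft, critique_fn, revise_fn, rounds=2):
    for _ in range(rounds):
        feedback = critique_fn(draft)
        if not feedback:          # nothing left to fix: stop early
            break
        draft = revise_fn(draft, feedback)
    return draft
```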

Prompt Chaining

Prompt chaining involves structuring tasks as sequential, interconnected prompts. The output of one step is fed as input into the next, enabling complex multi-step processes and dynamic workflows.[24][30][31]
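A sketch of the chaining skeleton; the `{prev}` placeholder convention is ours, and `run` stands in for a model call:

```python
# Prompt chaining in miniature: each template receives the previous step's
# output through a {prev} placeholder; the final step's output is returned.

def chain(templates, run):
    prev = ""
    for template in templates:
        prev = run(template.format(prev=prev))
    return prev
```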

Prompt Decomposition

Prompt decomposition breaks complex, multi-faceted challenges into manageable subtasks, each given as a focused sub-prompt. This improves result quality for intricate tasks and enables modular, interpretable workflows.[21][32][33]
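A sketch of decomposition as a prompt generator; the wording of the sub-prompts is illustrative:

```python
# Prompt decomposition in miniature: one intricate task becomes a list of
# focused sub-prompts that can be run and inspected independently.

def decompose(task: str, subtasks: list[str]) -> list[str]:
    return [
        f"Overall task: {task}\n"
        f"Your subtask ({i + 1} of {len(subtasks)}): {sub}\n"
        "Answer only this subtask."
        for i, sub in enumerate(subtasks)
    ]
```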

What happens when the systems you use are designed to capture your attention rather than serve your goals?

The Attention Economy vs. The Intention Economy

Scattered Attention vs Focused Intention

Modern digital systems operate on attention economy logic: capture users and maximize time on site. This creates pressure toward shallow attention patterns.

An emerging alternative is the intention economy: systems designed around explicit user intent, minimizing friction to goal completion rather than maximizing engagement.[34]

Attention economy design: Infinite scroll, push notifications, algorithmic feeds optimized for engagement. Result: fragmented attention, cognitive fatigue.

Intention economy design: Clear goal specification, efficient task pathways, reduced clutter, cognitive load reduction as explicit goal.

The distinction matters for how you design AI interaction. Attention based designs encourage extended casual browsing. Intention based designs help users accomplish goals and then disengage.

The Event Horizon in Practice: A Decision Framework

Problem: Should we pivot our product?
Constraints: 5-person team, 12-month runway, zero market presence.
Event Horizon: Define what information fits in this decision.

Step 1: State the horizon
"This decision requires market opportunity analysis, execution risk assessment, and financial sustainability evaluation. We'll spend 90 minutes in focused analysis with a 10 minute midpoint break."

This respects attention boundaries.[4] It compresses vast possibility into manageable scope.

Step 2: Use Tree of Thought
"Generate 3 independent reasoning paths: (1) Market opportunity, (2) Execution risk, (3) Financial viability. For each path, generate 2 analyses. Identify consistent conclusions and disagreements. Flag each step: sure/maybe/impossible."[15]

This externalizes the tree search you'd otherwise attempt in working memory.[4]

Step 3: Retrieve grounding data
"Query knowledge base: customer interviews mentioning target segments, past B2B SaaS pivot case studies, competitive positioning data."[17]

This prevents attention drift to generic patterns. Grounding in actual data anchors reasoning.[17]

Step 4: Compress and iterate
Reduce the analysis to 5 metrics (market opportunity, execution risk, financial risk, competitive advantage, strategic alignment). Score each 1-10. Calculate the weighted average.

This respects working memory limits.[4] It converts diffuse analysis into a clear decision point.
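Step 4 can be made concrete as a small scoring function. The weights below are illustrative, not prescribed by the framework:

```python
# Collapse the pivot analysis into five 1-10 scores and one weighted
# average. Weights are illustrative and must sum to 1.0.

WEIGHTS = {
    "market_opportunity": 0.30, "execution_risk": 0.20, "financial_risk": 0.20,
    "competitive_advantage": 0.15, "strategic_alignment": 0.15,
}

def decision_score(scores: dict[str, float]) -> float:
    """Weighted average of the five metrics, rounded to 2 decimals."""
    assert set(scores) == set(WEIGHTS), "score every metric exactly once"
    return round(sum(scores[k] * WEIGHTS[k] for k in WEIGHTS), 2)
```

A single number doesn't make the decision, but it gives the iteration in Step 5 a concrete gap to attack.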

Step 5: Iterate on gaps
"Execution risk scored 4/10 (high). How did [competitor examples from retrieval] handle this? What minimum viable test could we run?"[10]

Feedback loops refine understanding. Each iteration changes the event horizon, reshaping the solution space.[10]

Full Neuroinclusivity

This entire framework supports more than human-AI cooperation. It also benefits both neurotypical and neurodivergent profiles, because explicit attention architecture helps everyone.

Clear structures, external memory, ritual scaffolding, reduced cognitive load: these universally enable better thinking.[6]

Organizations that implement these principles aren't accommodating outliers. They're building systems that respect how attention mechanisms actually work.[6]

The Relativistic Jet Insight

Hawking Radiation of Thought

Your expertise is a singularity too dense to transmit directly. Surrounding it is a chaotic accretion disk of thoughts, memories, and inputs spinning at high speed.

Without structure, this energy just radiates as heat: thermal noise that warms the room but moves nothing.

But active black holes possess a mechanism to solve this: magnetic fields. These fields act as architecture, twisting the chaotic energy of the disk and launching it outward in Relativistic Jets: highly focused beams of plasma traveling at near the speed of light.

Without architecture: Thermal glow. Random heat. Entropy.

With architecture: Relativistic jet. Coherent signal. Infinite reach.

The "collapsed thought" you deliver, that single, decisive email or strategy, is the jet.

It is the only way the weight of the singularity can effectively touch the world outside.

The Paradox That Rewires Everything

Here's the paradox that changes your approach to work: explicit constraints grant more freedom, not less.[9]

Vague possibility space is infinite but unusable. Constrained possibility space is smaller but actionable. The constraints aren't limitations. They're enabling devices that focus attention where it matters.[10]

In physics, objects don't escape gravity through strength. They orbit efficiently by accepting the constraint and organizing motion around it.

In AI work, the same principle applies.

The question isn't whether constraints exist. They always do.[9]

The question is: Will you make them explicit and harness them, or let them operate invisibly?[9][10]

When you specify event horizons clearly, when you design attention architecture deliberately, when you treat bounded resources as design parameters rather than problems to overcome, intelligence stops fighting its own nature and starts orchestrating it.

That's where coherent, powerful thinking lives.[11]

Quick Reference: When to Use Each Technique

Technique | Best For | Why It Works
Chain of Thought | Step-by-step reasoning, transparency | Externalizes reasoning from working memory
Tree of Thought | Complex reasoning, backtracking, planning | Externalizes tree search, enables pruning
Self-Consistency | High-confidence answers, verification | Redundancy reduces noise in single paths
RAG | Factual grounding, citations, fresh data | Expands attention horizon via retrieval
Constraint Definition | All strategic decisions | Reshapes solution topology, focuses attention
Feedback Loops | Iterative refinement, discovery | Cybernetic control through sensing error
Few-Shot Prompting | Nuanced/complex tasks, generalization | Learns from contextual examples
Zero-Shot Prompting | Generic tasks, efficiency | Generalizes from pretraining data
Role-Based Prompting | Simulated experts, context-sensitive tasks | Adopts role-relevant styles
ReAct | Planning, integrated tool use | Reasoning steps interleaved with actions
Meta Prompting | Self-optimization, refinement | Prompt/answer improvement loops
Prompt Chaining | Multi-step workflows | Sequential logic for complex tasks
Prompt Decomposition | Intricate challenges | Modular, interpretable workflows

What does it feel like when all of this converges?

The Architecture of a Well Ordered Thought

At the end of a day structured by intentional attention, coherence is possible. The mind feels stitched together. Fragments align into patterns.

Small acts of focus become arguments in favor of a life designed rather than merely endured.

This is what attention architecture makes possible.[8][6][13]

It's technical. Token budgets, embedding dimensions, inference latency.

It's ethical. Neurodiversity justice, cognitive respect, agency preservation.

It's aesthetic. The satisfaction of coherent problem solving, of thought moving like a practiced instrument.

Designing attention, whether for individual cognition, AI systems, or organizations, requires awareness of physiology, environment, culture, and systems thinking. It requires humility about complexity.[9][10][11]

The reward isn't immediate heroism. It's the patient generosity of coherence. A mind (or system) that moves through work like a tuned instrument, producing structures that survive erosion. In that survival, work gains meaning. And in that meaning, the accumulated hours of focused thought become a place worth inhabiting.


What's Next

This framework emerged from exploring how attention limits shape both human cognition and AI systems. For a deeper dive into how AI can reason within these constraints using parallel thought exploration, read:

Beyond Tree-of-Thought: Yggdrasil - Parallel AI Reasoning Architecture

Learn how to navigate complex solution spaces by spawning multiple reasoning paths simultaneously, each operating in distinct cognitive modes while respecting attention boundaries.

If you want a practical framework for using agentic AI without outsourcing your thinking, read:

AI as Cognitive Prosthetic: You are the art, AI is an interface, the result is value.


License

This work is licensed under a Creative Commons Attribution 4.0 International License.

You are free to:
Share: copy and redistribute the material in any medium or format
Adapt: remix, transform, and build upon the material for any purpose, even commercially

Under the following terms:
Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made.

Footnotes

  1. Understanding AI. (2025, November). Context rot: The emerging challenge that could hold back LLM advances. Understanding AI.

  2. IEEE. (2025, January). Longer attention span: Increasing transformer context length with sparse graph processing techniques. IEEE Xplore, Article 11078479.

  3. Keles, F. D., Wijewardena, P. M., & Hegde, C. (2022). On the computational complexity of self-attention. arXiv preprint, arXiv:2209.04881. https://arxiv.org/abs/2209.04881

  4. Cowan, N. (2010). The magical mystery four: How is working memory capacity limited, and why? Current Directions in Psychological Science, 19(1), 51-57. https://doi.org/10.1177/0963721409359277

  5. Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24(1), 97-185. https://doi.org/10.1017/S0140525X01003922

  6. Postle, B. R. (2006). Working memory as an emergent property of the mind and brain. Neuroscience, 139(1), 23-38. https://doi.org/10.1016/j.neuroscience.2005.06.005

  7. National Institutes of Health. (2024, July). Modeling working memory capacity: Is the magical number four, seven, or does it depend on what you are counting? Frontiers in Psychology, 15, 1262884.

  8. Anthropic. (2025, September). Effective context engineering for AI agents. Anthropic Blog.

  9. Spivack, N. (2025, October). The geometric nature of consciousness: A new framework connecting physics, information and mind.

  10. Max Planck Institute for Neuroscience. (2024, April). From cybernetics to AI: The pioneering work of Norbert Wiener. Max Planck Neuroscience.

  11. Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. MIT Press.

  12. Natarajan, R., Gaspari, T., Nambiar, R., et al. (2024). Associations between circadian alignment and cognitive functioning in a nationally representative sample. Scientific Reports, 14, 13857. https://doi.org/10.1038/s41598-024-64309-9

  13. Nielsen Norman Group. (2025, January). Working memory and external memory. Nielsen Norman Group.

  14. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems 35 (NeurIPS 2022). arXiv:2201.11903.

  15. Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. (2023). Tree of thoughts: Deliberate problem solving with large language models. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023). arXiv:2305.10601.

  16. Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., & Zhou, D. (2023). Self-consistency improves chain-of-thought reasoning in language models. arXiv preprint, arXiv:2203.11171. https://arxiv.org/abs/2203.11171

  17. Data Nucleus. (2025, September). RAG in 2025: The enterprise guide to retrieval-augmented generation. Data Nucleus.

  18. Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W., Rocktäschel, T., Riedel, S., & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 9459-9474.

  19. Wikipedia contributors. (2023, November). Retrieval-augmented generation. In Wikipedia, The Free Encyclopedia.

  20. Learn Prompting. (2025, March). Few-shot prompting. Learn Prompting. https://learnprompting.org/docs/basics/few_shot

  21. Schulhoff, S., Ilie, M., Balepur, N., et al. (2024). The prompt report: A systematic survey of prompting techniques. arXiv preprint, arXiv:2401.14423. https://arxiv.org/abs/2401.14423

  22. Su, J., Tu, Y., & Xie, X. (2024). Zero-shot prompting techniques for large language models. arXiv preprint, arXiv:2410.11123. https://arxiv.org/abs/2410.11123

  23. GeeksforGeeks. (2025, July). Role-based prompting. GeeksforGeeks. https://www.geeksforgeeks.org/artificial-intelligence/role-based-prompting/

  24. Watkins, J. (2024, August). Mastering advanced prompting techniques for large language models. LinkedIn Pulse.

  25. AAAI. (2024, March). Role-based prompting in social media contexts. Proceedings of the International AAAI Conference on Web and Social Media, Article 35923.

  26. Wang, S., Li, P., & Wu, Z. (2023). Persona-based prompting for large language models. arXiv preprint, arXiv:2308.07702. https://arxiv.org/abs/2308.07702

  27. Relevance AI. (n.d.). Implement ReAct prompting to solve complex problems. Relevance AI. https://relevanceai.com/prompt-engineering/implement-react-prompting-to-solve-complex-problems

  28. Intuition Labs AI. (2025, November). Meta prompting: LLM self-optimization. Intuition Labs AI. https://intuitionlabs.ai/articles/meta-prompting-llm-self-optimization

  29. Learn Prompting. (2024, September). Self-refine prompting. Learn Prompting. https://learnprompting.org/docs/advanced/self_criticism/self_refine

  30. Data Unboxed. (2024, December). The complete guide to prompt engineering: 15 essential techniques for 2025. Data Unboxed. https://dataunboxed.io/blog/the-complete-guide-to-prompt-engineering-15-essential-techniques-for-2025

  31. Pan, L., Albalak, A., Wang, X., & Wang, W. Y. (2024). Prompt chaining for multi-step reasoning. arXiv preprint, arXiv:2405.18369. https://arxiv.org/abs/2405.18369

  32. PromptLayer. (2023). Prompt decomposition. PromptLayer Glossary. https://promptlayer.com/glossary/prompt-decomposition

  33. Feng, S., Shi, B., Cai, Z., et al. (2025). Prompt decomposition for complex reasoning tasks. arXiv preprint, arXiv:2402.07927. https://arxiv.org/abs/2402.07927

  34. Machine. (2025, January). Big ideas for 2025: The intention economy. Machine.