
The viral Claude Opus 4.7 and Obsidian "second brain" recipe widely shared online suffers from fundamental implementation flaws: it lacks effective prompt engineering and a proper AI-agent architecture. It can, however, be improved significantly by applying Claude's sub-agent system and vector-based semantic search.
Obsidian as a Knowledge Base
Obsidian is free, Markdown-based note-taking software that links documents via keywords wrapped in double square brackets, creating a network of interconnected files. It is not, however, a Retrieval-Augmented Generation (RAG) system: links are resolved by simple keyword matching (effectively a regex), not by semantic search or vector similarity, so on its own it is only a raw, limited knowledge base.
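The double-square-bracket linking described above can be reproduced with a plain regular expression, which is the point: it is literal string matching, not semantics. A minimal sketch (the note text is an invented example):

```python
import re

# Matches Obsidian-style [[keyword]] wikilinks.
WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")

def extract_links(markdown_text):
    """Return the keywords found inside double square brackets."""
    return WIKILINK.findall(markdown_text)

note = "Bought a [[house]] near the [[lake]]. See also [[house]] insurance."
print(extract_links(note))  # ['house', 'lake', 'house']
```

Two notes linked this way share nothing but the literal string between the brackets; no meaning is compared.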
Misconceptions Around RAG and Claude 4.7
A RAG system has a retrieval module that queries a vector database for semantically relevant documents ranked by cosine similarity, optionally reranks them, and injects the winners into the language model's context before it answers. Obsidian's linking is keyword-based, with no embeddings and no semantic comparison, so it is not a true RAG setup. Claude 4.7 can read and navigate Markdown directories, but without a retrieval layer its understanding is limited to raw keyword matches.
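The retrieve-then-rerank pipeline can be sketched in a few lines. Here a toy bag-of-words counter stands in for a real neural embedding model (which is an assumption made purely for the demo; a real system would call an embedding API or a local model):

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real RAG system uses a neural model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """The 'retrieval' step: rank documents by similarity to the query.
    A reranker would then refine this top-k before the LLM sees it."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = ["the car engine needs oil",
        "markdown files link notes together",
        "buying a new automobile"]
print(retrieve("car maintenance", docs, k=1))  # ['the car engine needs oil']
```

With real embeddings, "automobile" would also score highly against "car"; the bag-of-words stand-in shows the ranking mechanics, not the semantic quality.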
Challenges with Context Window and Distractors
Large documents overloaded with irrelevant "distractors" can reduce AI comprehension by 20-60%, so clean, structured data is critical. Parsing source documents, such as PDFs or Word files, must preserve the logical structure of the text while minimizing irrelevant content. Tools such as Mistral's free OCR-powered Document AI can clean documents and fold images, with annotations, into Markdown, improving data quality.
Limitations of Karpathy’s Prompt and Current Popular Models
Andrej Karpathy proposed a three-tier Markdown folder architecture: raw data, a wiki folder of standardized files, and a claude.md instructions file. However, his shared prompt does not handle long documents efficiently: it injects entire documents into the model's context, saturating Claude's context window, burning tokens rapidly, and impairing performance.
Agentic Systems and Claude’s Sub-Agent Solution
Claude 4.7 supports creating multiple independent sub-agents, each operating in its own context window and handling an isolated task such as ingestion, indexing, or querying. Because only final results are returned to the main agent, context saturation is avoided and workflows can be parallelized, a strategy absent from the viral solutions currently circulating.
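The isolation pattern can be sketched conceptually: each sub-agent works in a fresh context, and only its final answer flows back to the parent. This is an illustrative sketch of the pattern, not Anthropic's API; all function and variable names here are invented:

```python
def run_subagent(task, payload):
    """Each sub-agent gets a fresh, isolated context (an empty list here).
    All intermediate work stays local; only the final answer is returned."""
    local_context = []                 # forked context, discarded afterwards
    local_context.append(f"working on: {task}")
    local_context.extend(payload)      # heavy documents live only here
    answer = f"{task}: processed {len(payload)} chunks"
    return answer                      # the parent never sees local_context

parent_context = ["user: index my Obsidian vault"]
for task, chunks in [("ingest", ["doc1", "doc2"]), ("index", ["doc1"])]:
    parent_context.append(run_subagent(task, chunks))

print(parent_context)
```

The parent context grows by one short line per delegated task, regardless of how many pages each sub-agent had to chew through.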
Why Influencer Approaches Fail at Scale
Influencers promoting Obsidian combined with Claude 4.7 as a second brain often ignore the need for efficient prompt engineering and agent structures. Their methods inject all documents into a single conversation, rapidly exhausting token limits and data budgets, making these approaches impractical beyond very short texts.
Integrating Vector Embeddings for Semantic Search
To enable semantic similarity searching, a vector embedding system is needed. This can be implemented locally with models such as Qwen 3 Embedding, requiring significant GPU RAM (6-12 GB VRAM) and system RAM (~64 GB recommended). Embeddings allow matching related concepts (e.g., “car” and “automobile”) beyond exact keyword matches and are essential for scaling knowledge bases effectively.
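The "car"/"automobile" point can be made concrete with cosine similarity over embedding vectors. The three-dimensional vectors below are made-up illustrative numbers standing in for real embeddings (models like Qwen 3 Embedding output thousands of dimensions):

```python
import math

# Illustrative 3-d vectors standing in for real embedding outputs;
# the numbers are invented for the demo, not produced by any model.
VECS = {
    "car":        [0.90, 0.10, 0.00],
    "automobile": [0.88, 0.15, 0.02],
    "banana":     [0.00, 0.20, 0.95],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

print("car" == "automobile")                          # False: keyword match fails
print(cosine(VECS["car"], VECS["automobile"]) > 0.9)  # True: embeddings agree
print(cosine(VECS["car"], VECS["banana"]) < 0.3)      # True: unrelated concepts
```

Exact keyword matching sees two different strings; the vector comparison sees two nearby points, which is what lets a semantic index answer queries phrased differently from the source text.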
Practical Hardware and Setup Recommendations
Efficient local vector search requires a modern GPU (e.g., 6 GB+ VRAM), ample DDR4 RAM (at least 64 GB for larger datasets), and a capable processor. Data must be chunked into digestible segments (up to 4000 tokens each) before indexing to avoid token overflow during queries.
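A minimal chunker along these lines, splitting on paragraph boundaries and approximating tokens as whitespace-separated words (a real pipeline would use the embedding model's own tokenizer, and the 4000-token cap comes from the text above):

```python
def chunk_text(text, max_tokens=4000):
    """Split text into segments under max_tokens, breaking on paragraph
    boundaries. Token count is approximated as whitespace words here.
    A single paragraph longer than max_tokens is emitted as-is."""
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        n = len(para.split())
        if current and count + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks

# 10 paragraphs of 52 words each, chunked with a small limit for the demo:
doc = "\n\n".join(f"paragraph {i} " + "word " * 50 for i in range(10))
print(len(chunk_text(doc, max_tokens=120)))  # 5
```

Chunking before indexing keeps each embedding input inside the model's limit and lets retrieval return focused passages instead of whole documents.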
Summary of an Optimized Knowledge System Architecture
An optimal architecture uses:
- a raw folder for original sources and a wiki folder of cleaned, standardized Markdown files;
- sub-agents with forked contexts for ingestion, indexing, and querying, so that only final answers reach the main conversation;
- documents chunked into segments of at most 4000 tokens before indexing;
- BM25 keyword indexing as a first pass, with local vector embeddings (e.g., Qwen 3 Embedding) as a semantic fallback;
- a retrieval and reranking step that injects only the most relevant passages into the context.
Critique of Viral Marketing and Transparency Issues
The viral marketing around Claude Opus 4.7 and Obsidian as a "second brain" relies on hype rather than technical reality. The Obsidian plugin that adds vector search is not free and requires an additional purchase, and much of the excitement ignores the technical complexity and cost involved.
Training and Expertise in Prompt Engineering
Effective use of Claude 4.7 demands advanced prompt engineering and system design skills, which remain scarce even among developers who excel in machine learning. The simplistic prompts circulating online do not harness Claude’s true capabilities, especially in managing massive knowledge bases or implementing agentic workflows.
Future Directions and Improvements
Properly leveraging Claude 4.7 involves crafting prompt systems that:
- delegate ingestion, search, and comparison to sub-agents running in forked contexts;
- return only final answers to the main conversation, keeping the parent context lean;
- chunk and clean source documents before indexing to eliminate distractors;
- combine BM25 keyword search with vector embeddings for semantic retrieval.
While Obsidian and Claude 4.7 hold promise for personal knowledge management, critical architectural and engineering refinements are essential to realize that potential. Current social media trends overlook these challenges, promoting impractical methods that strain AI models and budgets. Sub-agent workflows, clean data pipelines, and vector-based semantic retrieval are the keys to efficient, scalable AI-assisted knowledge bases.
In this video, we're going to debunk a topic: Claude Opus 4.7. You're hearing about this viral subject everywhere; 17 million people have flocked to it, and every influencer has jumped on the bandwagon, talking about building a more powerful second brain with Obsidian and Claude 4.7, all based on Karpathy's architecture. Andrej Karpathy, a co-founder of OpenAI, dropped a marketing bombshell: what if we replaced RAG with a local knowledge base on your computer? In this video, we're going to debunk what the influencers are telling you. Can you tell who is telling the truth in the field of AI, and who actually has expertise? At the end of this video, I'll show you the structures to use to develop a knowledge-base system inside Claude Opus 4.7: the entire structure, the prompts, the links. In short, you'll learn everything about the Obsidian knowledge base and Claude 4.7. Let's get started. First point to understand: what is the Obsidian brain? Obsidian is very similar to Notion-type knowledge bases. The software lets you create a document, a writing surface in Markdown format; it's primarily a Markdown file reader. And it has a distinctive feature: you can insert keywords using double square brackets. If you wrap a term such as "house" in double square brackets, then create a brand-new document containing the same term, you've built a link between the two documents. In other words, Obsidian searches for a specific keyword within the documents you create. So how do you use this to build a knowledge base with Claude Opus 4.7? Good news: Obsidian is free, and to download it, simply click the "download" button, then "Download Obsidian."
You'll find all the compatible platforms, and you can install it. Now, what you need to understand is this: is the graph view you're seeing a RAG-type knowledge base? This technical point is extremely important to grasp. From a distance you see little dots moving around in space, but is it really a RAG system? A RAG is a Retrieval-Augmented Generation system built on semantic search and cosine distance. That means that at some point, a vector database of your content was created, defining the distance between the concepts it contains. Such a system has three parts: the user asks a question; a module called "retrieval" queries the vector database and pulls back the most relevant elements to inject into the context; an optional "reranking" function keeps only the best documents; and the selected information is handed to the LLM, which delivers the final answer. The first thing you notice when you examine Obsidian's linking is that it is nothing like this: it's simply the identification of a keyword, what we'd call a regex. Matching a specific keyword is what creates the link. My documents are completely empty, they don't talk about the same thing at all, and yet Obsidian tells me they're linked. So, first of all, Obsidian isn't a RAG system at all; it's a set of words compiled in Markdown format, essentially a raw knowledge base. Now, can we use Claude to navigate this database? Yes, absolutely. When you create a vault, you get a directory, and that directory has a path ("copy your path"). Whether you're in Claude Code, Claude Cowork, or another system, you can point it at that path and work from there.
You can add a directory, create a project, and specify the location of the documents you want to use, and therefore of the entire Markdown knowledge base. Markdown is a structured text format, generally using hashes, separators, and headings, which makes it relatively easy for AI to parse blocks of text. Markdown is also how we write instructions for AI systems. As it happens, you have before you the system prompt for Claude Opus 4.7, extracted just a few hours ago. Yes, the Opus 4.7 prompt has just been completely unlocked, and we have access to its internal instructions. It teaches us a lot. When the Markdown gets very long, it's structured with an XML layer. So why work with Markdown/XML formatting? Because the longer the context, the less the model is able to correctly understand what's in it. This is called context loss, also known as "lost in the middle" in prompt engineering. We've covered this problem in previous videos, and it explains a crucial point: when you introduce documentation or data containing "distractors", that is, irrelevant elements, as few as four of them can reduce an AI's comprehension by 20 to 60%, and it's even worse when you ask the model several questions simultaneously. So the first thing to understand is that the data you feed in must be cleaned as much as possible. We use several methods for this. We use PDF parsers and strip the document down to the logical structure of its text. That's the first step. It lets you retrieve the structure and import it properly: you can create the file manually, add the document structure inside, and give it a title. This already reduces the risk of introducing distractors into your context.
This work can be automated with AI, but doing so consumes a huge number of tokens, so you need a system that codifies the extraction method and rules based on the length of your document. At 50 or 100 pages, that's a lot of tokens, and as you know, AI is expensive. Sometimes doing it manually is far more cost-effective: a task that takes an automated pipeline 40 minutes might take you 1 minute 30 by hand. The first issue with basic parsers is that we lose all the images, so we can't recover their information. How do we fix that? Mistral has developed an OCR system that lets us pull the tables and images in your document into the model's knowledge base. Mistral offers a completely free tool called Document AI that accepts up to 10 documents at a time and supports PDFs, images, and Word files. By configuring the interface, the page range, and which images to include, you can run a cleanup pass and retrieve only the useful information, converted to Markdown with image descriptions embedded. You can customize further by adding image annotations with structured data models: you define "datasets" specifying the image description, the value annotated within the image, and whether each field is mandatory or optional. Structured data lets you capture complete descriptions and more nuance, so the database you build becomes increasingly precise and recovers what a plain OCR pass would have missed. So, why am I doing this manually?
Because while it can be automated, the model won't identify on its own that this section here is a distractor. As we mentioned, distractors are rich in keywords but don't actually carry useful content, and here we have two full pages of them. So in the configuration for this specific case, I realize I can stop at page 2 and drop pages 3 and 4: I'll extract only pages 1 and 2. That keeps my system from getting cluttered. All that remains is to download the result to my computer and add it to my knowledge base. This gives you optimized, clean data. You only need to give the file a title, since the content integration has already been done by the model, and you can go further still, because you also get all the images with their corresponding JSON structures. If you wish, you can now fold the full image descriptions into the knowledge base so the model builds a much more accurate representation of your content. We're building a knowledge system for your AI, and as you've understood, what matters is data quality: the quality of your data defines what the model will understand. Back in the graph view, you realize something: you can't rely on that representation at all, because it's based on keyword recognition. Since my document references files, Obsidian only checks whether the name appears in them. Again, Obsidian is not a RAG (Retrieval-Augmented Generation) system; it's simply a regex, the identification of a keyword within content. And as you've seen, you can absolutely create links manually with the double-square-bracket function, weaving relationships into the system by hand. Now, how do we connect all of this to Claude's functions?
What we're building is essentially a knowledge base with a huge amount of context, and the longer the context, the more the model's performance degrades. For that, there's only one answer: an extremely clean base with no distractors, otherwise we crash the model's capabilities. Second, Karpathy proposed a three-tiered architecture. Let's look at what it actually brings and how it could be improved. Karpathy says: use pure Markdown files, extract the original sources into a folder called "raw", then have Claude build long-term memory from them; and to help Claude orient itself, create an index of the directory telling it where the files live. So we have a "raw" folder holding your original files, the raw data: it could be a PDF or anything else. The "wiki" folder is the layer the model has rewritten in a standardized structure, generally Markdown. And a "claude.md" file provides instructions on how the model should work with this directory. Now let me explain the problem with this architecture. It's not good, and here's why. With Claude Opus 4.7, what you need to understand is that we have an agentic system: Claude can dispatch AI agents to work in parallel or sequentially, completely independently of the main context. For those unfamiliar with what that means, put another way: when you use a single conversation to write prompts, think, and call tools, you accumulate context in that one conversation, overloading the model, which automatically pushes you into its weakest operating regime.
You load all the documents, 20, 40, 50 pages, you activate the tools, it connects to the internet, it analyzes everything, and you quickly blow past 250,000 tokens. That's where the problem lies, because you'll have to switch to the 1-million-token model, which costs three times as much, and you'll still saturate the context window and land in the lowest-performance zone for comprehension. So what does Claude advise? Several strategies exist: the "compact" function, "sub-agents", the "rewind" function, and the "clear" function, each with a different purpose. With a lot of documents, which one would I choose, and why? What we want is to search our documentation without crashing the model's context window, and that's possible. Here's how. In the official Claude 4.7 documentation, they explain that the most flexible approach is to have Claude use sub-agents in the conversation. Each sub-agent has a defined function and returns only its answer to the main conversation. This architecture completely solves context saturation: your main work happens in the "parent" conversation, while a sub-agent performs the compaction, search, and comparison over your Obsidian vault and delivers just the answer, and only that answer is added to your discussion. This completely changes the architecture compared to everything you've been told on social media about using Obsidian as a second brain. So in this segment, we'll look at how influencers told you to use it, and I'll show you how it should actually be done: you have to code AI agents.
"In this video, we're all going to create a second brain, and then we're going to take all the text that was on the current link." The link in question is Karpathy's initial prompt, and I'm going to tell you why that prompt is bad. I have no problem analyzing what it does, and I'll explain exactly what needs fixing. Most people today don't have prompt-engineering skills. In France, I think we have very good developers and very good machine-learning experts, but not very good prompt-engineering experts, so allow me a rather firm opinion. If you read Karpathy's prompt objectively, and we'll take the time to do so because, for me, Karpathy just tossed out an idea, you'll see its actual level of prompt engineering. Don't worry, I'll give you the solution later. "Copy and paste my prompt into Claude Code, Claude, or Codex. Here's the prompt. Most users' experience is working within a RAG system. You download a collection of LLMs, blah blah blah. You'll create different folders. You have an architecture with raw sources, a wiki, and a schema. You'll ingest, query, and regularly check the wiki's status. You'll create an index and a log, and here are some tips. This document is intentionally abstract, describing the concept rather than a specific implementation." Now, that is not a prompt. An actual system prompt is an extremely codified, rational structure that balances the model's freedom, decision-making, and thought processes. Let me show you what happens. He copied and pasted the prompt, and here's the result on GitHub: when you look at the generated instructions, the model did exactly one thing. It created commands: "ingest a document and produce an index file." So where's the problem?
Well, it works fine if your document is 800 lines long. But if it's 200 pages, how will the system cope? It won't. The other issue is the "query" command, a search within your document. Will it actually search it? Not really: the model will inject your entire document into Claude's context, and that's precisely what we don't want. It's exactly what Anthropic warns about: with long contexts, injecting oversized context is what degrades performance. Instead, you create sub-agents so each has its own context window, and therefore its own workspace, performs the necessary tasks, and injects only the response back. So the problem with Karpathy's prompt is that, ultimately, it never says how to solve this. Worse, when you read the "how to use sub-agents in Claude Code" documentation, which explains how to get the most out of Claude 4.7 in Claude Code, it states very clearly that a sub-agent is an isolated Claude instance with its own context window: it takes a task, executes it, and returns the result. The idea, then, is to create instances and functions associated with each agent. Every agent starts from scratch, unburdened by the conversation history of the others, so several agents can run in parallel, each with different permissions. You must distinguish the main agent from the smaller agents that execute the tasks we just discussed. When should you use sub-agents? It's very clear: when context gathering is required and dozens of files need to be processed. That's exactly our case; we'll have dozens, even hundreds of documents. It's therefore absolutely essential to separate the execution function from the orchestration.
So, you need to create several independent tasks, and in certain architectures decide whether agents run in parallel, in batches, or simultaneously; I've covered this in other videos linked in the description. A reminder that you have just over 24 hours left before I raise the price of all my training courses, as announced last week. I believe these courses represent one of the highest levels of expertise in prompt engineering available today, covering AI agent systems, ChatGPT, Claude, and agentic workflows: how to create datasets, use them for your business, optimize AI-agent performance, and build AI agents for your company. All of it gets more expensive in 24 hours, so it's up to you. What you need to set up is a "pipeline" workflow in an agentic system, and I'll show you how later: one agent to explore documents, another to examine and analyze them, working in parallel or not (I've shown how to dispatch parallel agents in other videos). The structure is actually very simple: we define three sub-agents with different functions, and to use the optimized system, each is declared with a name, a function description, the tools the model may use, optionally the reasoning model, and, if you want it to work in a separate context, the so-called "fork" option; then you give the system its instructions. Don't worry, I'll show you the final result and where to find it at the end of this video. The point is this: the Karpathy prompt, as it stands, absolutely does not produce this system.
Consequently, the Karpathy prompt leads to architectures like the one this influencer built, which unfortunately keeps all the context, all the information, and injects it into the same conversation. That's what you shouldn't do. With 20 lines it's fine; with 200 pages it's unmanageable. The problem is that Karpathy's post has been seen 14 million times, and out of those 14 million, I haven't seen many people say, "but that doesn't work." Everyone jumped straight into a marketing and advertising frenzy. Let me give you an example. "Just validated the use of Obsidian combined with LLMs. One of his latest tweets got no less than 17 million views." Yes, 17 million; I thought it was 14. "All the projects I'm working on, quite simply. Maybe it would even be a separate section of the vault now, because my biggest challenge at the moment is syncing the vault across several devices..." What they showed you was a sphere of dots, but you now understand that those dots just mean a keyword is present in the database; it doesn't mean the documents are actually related. That's the first point to understand. Then you have other people heading in the same direction who, you'll realize, if you thought Claude 4.7 was expensive before, are now sending entire blocks of context in runs that can take 40 minutes. I don't know which of you won't burn through your usage allowance with this method. That's why we need to completely abandon these proposed methods. "So, I'm going to get the prompt. You'll find it in the description. Look, it's here: Knowledge Vault Generator Pilot." So here's the link to the prompt; I'll put all of this in the resources and tell you where to look later. The problem is exactly the same: there are no agents, only functions, and not even proper commands.
So when you ask an AI "what are the blind spots, the overlaps, the redundant elements?", how is the model supposed to compare documents? For something 10 pages long it's possible, but for three or four multi-page documents it isn't, because you'll overload the system again. And that's where the last part of this video comes in: why you need vector databases in this kind of structure. First, because the writing process alone takes 20 to 30 minutes, which means you're paying for tokens for 20 to 30 minutes; the cost is exorbitant. Then, since there's no AI agent, everything happens in the same conversation, so you're constantly leaning on the 1-million-token context. The entire cluster-writing phase takes 45 to 65 minutes; without a $200 plan, it's not even worth attempting. So, here's the solution, and where to find it. First, don't use Karpathy's prompt at all; as you've seen, it isn't good prompt engineering. Second, we'll apply Claude's official process for creating custom sub-agents. This is the official documentation, and what does it say? You configure a sub-agent with a YAML structure: name, description, tools, model, followed by the system's workflow instructions. Within that, you can add effort variables and tool functions, plus two complementary elements: hook functions, which I won't cover here because they're technical, and, the part that interests me, context-fork functions, which make your agent work in a fresh context and inject only its response back. That's exactly the template we're going to use. Here's what you need to do for your system.
In the code structure, you're actually going to create three agents: the Wiki Ingestor for Claude 4.7, which works with Obsidian; the Wiki Librarian; and the Wiki sub-agent used for indexing and cleaning the databases. When you create agents, you also have to create what are called commands, or triggers. At the bottom left you have the "command" function, and that's where you trigger your agent: you call an agent for a specific function. Look closely at the structure, because this is what matters. When you trigger the Wiki Ingestor, you launch a complete workflow whose steps execute inside a sub-agent, here named wiki-ingestor.md. Look at how it's written: you have the name, as I said; the function description; the model that will run; the tools; and the famous fork context. That changes everything. It means all the work happens in a parallel window: all the tokens processed to answer your request are spent there, and only the final response is injected into the conversation, so your main agent keeps an optimized context. That changes the whole AI game. But what I'm about to reveal is the most important point. First, the link: you can access all of this directly via the "receive the basics of AI by email" page, and I'll include it in the Claude training, with all the links and directories you can copy, paste, and install; the code is already written. But now for a much more important and technical topic: what we're doing here still systematically sends large blocks of context, entire pages of it.
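A sub-agent definition file of the kind described here might look like the following. This is a sketch in the Claude Code sub-agent format (a Markdown file with YAML frontmatter; `name`, `description`, `tools`, and `model` are documented fields, while the agent name and instruction text are illustrative, and the exact key for the fork-context option should be checked against Anthropic's current documentation):

```markdown
---
name: wiki-ingestor
description: Ingests a raw source document, cleans it, and writes a
  standardized Markdown page into the wiki/ folder. Use proactively
  whenever a new file appears in raw/.
tools: Read, Write, Grep
model: sonnet
# the video also mentions a fork-context option; verify its exact key
# in the current Claude Code docs before relying on it
---
You are the ingestion sub-agent. Work entirely inside your own context:
read the raw document, strip distractors, chunk long sections, and write
the cleaned result to wiki/. Return ONLY a one-line summary of what was
written; never paste document contents back to the main conversation.
```

The final instruction line is what keeps the parent conversation lean: whatever the sub-agent reads or writes, only its one-line summary crosses back.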
My main concern, and I'll explain the solution, is that we still don't have a RAG system. The only way to compare the meaning of words is to use vector bases, and therefore similarity values computed with cosine functions, all of which RAG pipelines handle well. So how do we integrate that into this system? Let me explain what I found. When Karpathy talked about Obsidian and kindly shared his prompt, he didn't actually solve any of the problems. Not one. My view on Obsidian: it's an interesting product for building Markdown file databases on your computer, but you can't do what was suggested with it, not at all. And if you look closely, I think it's a big marketing ploy, though I'll explain how to solve the problem in a moment. Because a plugin appeared for generating those vector graphs, and when you click on it, you discover it isn't free: you get the graph view, but you have to pay for the vectorization. Vectorization isn't free. So I have to wonder: wasn't it Mr. Karpathy who used his public profile and network? 17 million people flocked to the topic, every influencer jumped on the bandwagon, yet without really understanding how a RAG, a query, or a context works, because clearly none of them saw what their method was actually producing. And then you realize that if you want to automate it without the hassle, you have to buy the plugin. So perhaps it's a marketing ploy that won't admit it, as is often the case on social media: a topic gets launched, and the product is nothing more than a Markdown database. That's the observation. To optimize it, you need to create AI agents. But if you want to do it yourself, here's what you must do: it's mandatory to implement a keyword search system within your documents.
If you don't, it's impossible for the model to optimize its searches, and it will burn through your entire usage allowance in under half an hour. So how do we do it? I've spent 22 hours coding the system, but here's the principle. 1. Implement an indexing system based on a BM25 function, meaning keyword detection inside your search layer: in your documents, you detect keywords, much like a regex. If the model finds the keyword, you retrieve the document and send it to the context. (I'll mention an even more technical strategy later.) 2. If it isn't found because your wording differs, for example you search for "car" and the document says "automobile", keyword search fails, so you need a vector system as a fallback. We can run that locally with Qwen 3 Embedding, and for that you need a capable computer. Here are the specs: for Qwen 3 Embedding 4B, honestly, 6 GB of VRAM on the graphics card is sufficient, and for system memory I recommend at least 64 GB of DDR4, more if you have it. Those with more resources can use the 8B embedding model, which needs at least 10 GB of VRAM. Install this and build your database directly on the machine. How much space does it take? Roughly, 100,000 chunks (which is already a lot) take about 600 MB on disk and 1 to 2 GB of RAM; 1 million chunks take roughly 6 GB on disk and 8 to 12 GB of RAM. That's why I said 64 GB of DDR4 is usually sufficient. This calls for a relatively recent machine, but the processor isn't the most important factor; what's crucial is RAM and the graphics card. 3. Finally, a retrieval/rerank stage selects the most relevant passages to inject into the context.
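The BM25 first pass described in step 1 can be sketched in pure Python using the standard Okapi BM25 formula (a production setup would use a library or search engine rather than this hand-rolled version; the documents below are invented examples):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the Okapi BM25 formula."""
    toks = [d.lower().split() for d in docs]
    N = len(docs)
    avgdl = sum(len(t) for t in toks) / N
    # document frequency: in how many docs does each term appear
    df = Counter(term for t in toks for term in set(t))
    scores = []
    for t in toks:
        tf = Counter(t)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(t) / avgdl))
        scores.append(s)
    return scores

docs = ["the car engine needs oil",
        "automobile repair manual",
        "notes about gardening"]
print(bm25_scores("car oil", docs))
```

Note that the "automobile" document scores zero for the query "car oil": exactly the failure mode described above, and the reason step 2 adds a vector-embedding fallback behind the keyword pass.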
And all of this runs in an AI agent separate from the main agent's context, because we always use the fork function to isolate the work. So you can see that an agentic architecture is the architecture that works. I'll share the diagram link if you want to study how it's structured. What you need to understand is that when you ingest a file, you can't send 200 pages, or even 40, because vectorization systems are limited to about 4000 tokens per input. So you need to implement a chunking function, and therefore a workflow defining how you chunk and on what basis, all implemented on your own machine. So you realize the problem is much more complex than advertised when it comes to building a real second brain from an Obsidian vault with Claude 4.7. If you stick with the basic structure, you'll completely saturate the context window, and Anthropic says so very clearly: you'll overload the conversation with a million tokens. The minimum viable solution is a sub-agent system; that's the baseline we need to implement. And if you want a real knowledge base behind it, you'll need to add an embedding and retrieval model to search your data. Once vectorized, search is extremely fast, and it genuinely compares the semantic distance between concepts. I'll show you everything on the "basics of AI" page. And remember, this is the last weekend before training prices increase; I believe it's among the best AI training available today for applying these skills in a business setting. See you soon, until next time. Subscribe if you haven't already. See you later.