
Tech • IA • Crypto
A structured “second brain” built with local databases and AI agents can outperform brute-force large-context approaches, which suffer from cost and accuracy limits.
Expanding datasets alone is unlikely to achieve human-level intelligence, as current language models lack key structural capabilities. Increasing context size introduces diminishing returns, with higher computational costs and declining accuracy beyond certain thresholds. Large inputs can overwhelm systems, leading to inefficiencies rather than better reasoning.
Excessive context usage significantly impacts both cost and model performance. With a 1-million-token window, the speaker cites about 76% accuracy for Sonnet 4.6 against about 36% for Claude 4.7, and notes that past a certain context size you pay more and lose accuracy. This makes indiscriminate document injection impractical for real-world applications, especially when handling large knowledge bases.
Obsidian functions as a Markdown-based knowledge system that allows structured text storage and visual linking between concepts. Unlike traditional RAG (Retrieval-Augmented Generation) pipelines, it does not inherently rely on vector databases. Instead, it emphasizes local, human-readable organization combined with lightweight retrieval mechanisms.
A key architectural shift involves stopping at the “chunking” stage rather than pushing data into vector databases. Retrieved chunks are stored locally and accessed through AI agents. This reduces infrastructure complexity while maintaining fast access to relevant information.
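As a rough illustration of stopping at the chunking stage, the sketch below splits a Markdown note at heading boundaries and caps each chunk at a fixed size. The function name `chunk_markdown` and the `max_chars` limit are assumptions made for this example, not the author's actual script.

```python
import re


def chunk_markdown(text: str, max_chars: int = 1200) -> list[str]:
    """Split Markdown into chunks at heading boundaries, then by size.

    Hypothetical helper: the 1200-character cap is an arbitrary choice.
    """
    # Split just before every line that starts with a Markdown heading.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks: list[str] = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        # Sections longer than the cap are cut into fixed-size pieces.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks
```

Each returned chunk can then be written to its own local file, which is what the agents would later read instead of a vector database.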
The system relies on three specialized AI agents: a search agent, a clustering function, and a semantic retrieval agent. Initial searches use keyword-based algorithms such as BM25 and TF-IDF, delivering results in milliseconds. If no match is found, semantic indexing takes over to identify relevant content.
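To make the keyword stage concrete, here is a minimal sketch of Okapi BM25 scoring written from scratch, with an empty result acting as the signal to fall back to the semantic step. The tokenizer, the `k1` and `b` defaults, and the function names are assumptions for illustration; the author's implementation is not shown in the source.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and keep alphanumeric runs (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: in how many documents each term appears.
    df: Counter = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def keyword_search(query: str, docs: list[str], top_k: int = 5) -> list[tuple[int, float]]:
    """Return (doc_index, score) pairs; an empty list means 'no keyword match'."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:top_k] if scores[i] > 0]
```

Because this is plain arithmetic over local text, it consumes no model context, which is the point of running it before any agent is involved.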
A central index file acts as a navigation layer across directories and subfolders. This enables efficient routing and avoids scanning entire datasets repeatedly. Maintaining and updating this index manually is critical, as fully autonomous management remains unreliable.
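One way to picture the navigation layer: a small script that collects every `index.md` under the vault, down to a chosen depth, and writes links to them into a single main index. The names `build_main_index`, `max_depth` and `exclude` are hypothetical, and the script is meant to be run by hand after you change folders, in line with the advice to keep index maintenance under human control.

```python
from pathlib import Path


def build_main_index(root: Path, max_depth: int = 3, exclude: tuple[str, ...] = ()) -> str:
    """Return Markdown text linking to every index.md found under root.

    max_depth counts folder levels below root; exclude lists folder names to skip.
    """
    lines = ["# Main index", ""]
    for path in sorted(root.rglob("index.md")):
        rel = path.relative_to(root)
        folders = rel.parts[:-1]
        if len(folders) > max_depth:
            continue  # too deep for the requested search depth
        if any(name in exclude for name in folders):
            continue  # folder explicitly excluded from routing
        label = rel.parent.as_posix()
        lines.append(f"- [{label}]({rel.as_posix()})")
    return "\n".join(lines) + "\n"
```

The returned text can be saved to whatever file you use as the main index, so the query agent reads one short file instead of scanning the whole vault.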
Instead of loading entire documents, the system retrieves only precise chunks of relevant data. Tests show minimal context usage—around 11–12%—while still delivering accurate results. This approach prevents unnecessary token consumption and keeps operations fast.
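The context percentages quoted here are simple ratios, and a tiny helper makes the warning zone explicit. The 70% warning threshold mirrors the "70 to 80" zone mentioned in the transcript; the function name and signature are assumptions for illustration.

```python
def context_usage(used_tokens: int, window_tokens: int, warn_at: float = 0.70) -> tuple[float, bool]:
    """Return (fraction of the window used, whether the warning zone is reached)."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    ratio = used_tokens / window_tokens
    return ratio, ratio >= warn_at
```

For example, 120,000 tokens used in a 1,000,000-token window gives a ratio of 0.12 and no warning, while 750,000 tokens would trip the flag.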
Fully autonomous knowledge systems are not yet feasible across diverse domains. Mixing subjects such as finance, marketing, and personal data creates complexity that current models cannot manage independently. Human intervention is required to curate, structure, and guide retrieval processes.
Running AI agents locally or semi-locally reduces reliance on expensive API calls. Previously processed data can be cached, avoiding repeated charges. However, restarting workflows or poorly structured queries can still increase costs significantly.
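The caching described in the video happens on the provider's side. As a loosely related local analogue, and not the same mechanism, the sketch below stores the result of an expensive step on disk, keyed by a hash of its input, so the same chunk is never processed twice. `cached_call` and its arguments are hypothetical names.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Callable


def cached_call(cache_dir: Path, key_text: str, compute: Callable[[str], Any]) -> Any:
    """Return compute(key_text), reusing a JSON file on disk when one exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(key_text.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{digest}.json"
    if cache_file.exists():
        # Cache hit: no recomputation, so no repeated cost.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    result = compute(key_text)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result
```

The result must be JSON-serializable for this to work; anything else would need a different storage format.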
Building such a system requires substantial effort, with development taking roughly 40–50 hours including testing and optimization. The architecture involves multiple components, including agents, scripts, indexing logic, and validation workflows.
Despite the benefits of local systems, RAG architectures remain the most efficient for precision tasks requiring targeted data injection. They offer stronger accuracy and scalability when properly implemented, though at higher complexity and cost.
Efficient AI knowledge systems depend less on massive context inputs and more on structured retrieval, indexing, and agent-based design, highlighting the need for hybrid approaches over brute-force scaling.
I've been a professor at NYU, New York University, for 23 years. Large language models (LLMs) are revolutionary, but they're not a direct path to human-level intelligence. They lack something essential to achieve that kind of intelligence. So the still widespread idea that simply increasing the size of the datasets is enough to reach human-level intelligence is, in my opinion, completely false.
So how do you create a secure second brain for your data? In this video, I'll explain how to create a database with Obsidian in the Claude interface. What I'll teach you for Claude is also valid with GPT. The goal is to optimize how our system operates. What you absolutely must avoid is entering the context overload zone. Overloading the model with too much context will exhaust your usage allowance and, more importantly, will cause a performance hit. We're going to code, deploy, and test a local Obsidian second-brain system using Claude Code, and we'll do it together.
What's the main difference between Obsidian and a RAG system? Obsidian is a Markdown database, meaning it's a system that lets you read text. The advantage is that it also lets us create visual representations where we define relationships. But be careful, it's absolutely not a RAG. In previous sessions, we looked at RAG systems. We worked with data sources, inserted metadata, and performed chunking. After this step, we sent the chunks to a vector database. This time, we're going to stop at the chunking stage. The goal is to create a chunk database directly on the computer, and we're going to code AI agents that will retrieve this data.
So the challenge, now that we've cleaned up our databases, is to avoid saturating Claude's context. Claude is an AI that quickly becomes expensive. So, to optimize Claude CLI, we're going to code AI agents. Here's what we'll code in this video. Don't worry if you don't code.
I've already coded everything. But there might be bugs, and I want to show you how we'll fix them and how we'll optimize. It'll be a way to learn.
Our system will be designed with three AI agents. One agent will search the database. A function called the cluster function runs a keyword search launched by Claude, and we'll use two algorithms, BM25 and TF-IDF. Both respond on the order of milliseconds. It's incredibly fast. If the two algorithms can't find a match in the Obsidian database, an agent will use the indexed structures. When we create our second-brain knowledge base with Obsidian, we need to create an index structure. This index will let us search for keywords through semantic recognition, so we're no longer relying on the algorithm. This way, we'll avoid overloading the interface. We'll test it, deploy it together, and if it works, I'll put the corrected version in the training materials, in the courses.
In the previous tutorial, I showed you how to extract data and identification metadata from PDF documents. Once we've prepared our database, we put it directly into the interface. So we'll find our root index and the structure inside the chunks.
What interests us now is deploying the software. For deployment, I'll probably use Claude. I'm going to start a new conversation. You press Escape twice, clear the conversation, and then switch to the Sonnet model. For reasoning, we'll use high, so Claude Sonnet 4.6 on high, and on top of that I'll add an advisor system to handle certain issues, with Opus 4.7 in that role. We'll optimize things this way. Basically, I have everything: the coded elements are here. We're going to deploy it using Opus. So I'm going to tell it the installation directory and the root directory. To get the directories, you right-click and copy the path. We'll give it the root directory for the commands, agents, and scripts, which will be the main directory.
We're going to do this live, as an installation tutorial. We'll see if it works and if there are things to correct. For a system install, I always create a README file. It lets me tell the model to follow the steps to install the packages and to ask me questions if anything is unclear. What I advise you to do is switch to plan mode and check how it receives the information. So we'll start there and see what questions it asks me and what it needs. It sees that we have the BM25 README file, so it will explore the complete structure of the folder.
So, I always create an installation file. It's a bit like a program where you explain the order in which things need to run, what packages are needed, what tests to run at the end, and how the model needs to verify its actions. Loops must always consist of these three sequences to work. You always need to be in an agent-based loop system. Every time you work with AI, keep in mind that your prompt must tell it what to retrieve, when, and what action to take. In my opinion, you shouldn't start directly by saying, "Code something for me." I think you should always prepare the entire structure beforehand, with the code elements already inside, then tell it to take the elements from the directories and install them, and then we check whether the tests work, since we have a test suite.
If you want to learn how to use artificial intelligence, you have to move beyond the logic of the perfect, workable prompt. What you'll learn in this training is what a dataset is and how to optimize contexts and memory. You'll learn to code your own AI agents for your business. Whether you're a beginner or advanced, this training is designed to make you the best in the market in less than 15 days. 80 hours of coursework, including updates, to complete at your own pace. All the information is in the description.
So, the installation: where to install BM25.
Okay, we'll create a BM25 folder, that works for me. So, let's confirm.
In your interface, I recommend setting up a caching system like I did here. This allows me to constantly monitor how full my context memory is. As soon as you enter the 70 to 80% zone, you know it's going to be full. So, anticipate the problem, especially with Active Directory. Otherwise, you're still dealing with 1 million tokens. And the problem with 1 million tokens is that on Sonnet 4.6 we have 76% accuracy, which is pretty good. But that's not the case at all with Claude 4.7. There's a very large drop; it's at 36%. So today, we have to wait for Anthropic to resolve this accuracy issue in the context window. This means that when you exceed 2500 tokens, you pay more and lose a lot of accuracy, which will inevitably lead to errors. So always keep control over your working window.
Another thing to understand is that when you send a system, the model will read the entire package. That means everything that's coded will be sent to the context window and will be cached. The advantage is that once it's cached, you don't pay for it a second time. Everything that has already been sent to the Anthropic system will remain in memory. We won't pay for that multiple times. What we will pay for, however, is restarting the system, to tell it, "Now you're going to execute."
The crucial starting point is to always begin a system in plan mode. This lets us check whether the model understands what it has to do and clarify anything it doesn't understand, and it's the only way to get lengthy workflows to run. Right now, if you look closely, it's been working for about 6 or 7 minutes, and installing everything could take maybe 15 minutes. So, to stabilize your systems, always start in plan mode. As a reminder, the shortcut for plan mode is the Tab key. You press it and switch to plan mode, which you can see here at the bottom.
Let me add a quick note. You can see that I have access to the model's entire reasoning. I've activated the reasoning display by default; you don't have it. You have to go into the settings to access it. In the settings, there's a variable called verbosity, and that's where you activate it. This gives you access to its entire reasoning.
So, we're going to check the model's behavior. We'll install the BM25 kit, which includes the RP function algorithm, from the files. It's very important to have coded beforehand. This saves time and lets us also check the directories. So, it understands my installation directory, the Claude root folder, the source, and the location where it will place the algorithms and libraries. We'll use PowerShell to install the scripts and assets, deploy the assets to Claude, and substitute the paths within the deployed assets. This is perfect, because when you're coding you can't necessarily specify your own directory. When you're coding, you use generic directories, so there needs to be a substitution of directories. This is one of the commands I entered in the code. And normally there are the environment variables and a verification test at the end. So for me, everything is good; it understood everything.
Then you have the code section; here you have the response from the LLM. What I always look at is how it reasoned. I'm interested in understanding that. Once everything is good, I can send it and we'll move on to bypassing permissions. I'll just show you the command if you don't know it. You type `claude --dangerously-skip-permissions`, with "permissions" in the plural. This puts Claude in unrestricted mode for writes. Okay, let's go. We'll start the deployment and installation.
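The directory substitution mentioned above can be sketched as a small script that replaces generic placeholders in the deployed files with real paths. The placeholder syntax `{{INSTALL_DIR}}`, the function name, and the list of file suffixes are all assumptions for this example; the author's installer uses PowerShell and its exact commands are not shown.

```python
from pathlib import Path


def substitute_paths(
    asset_dir: Path,
    mapping: dict[str, str],
    suffixes: tuple[str, ...] = (".md", ".json", ".py"),
) -> int:
    """Replace placeholder strings in text assets; return how many files changed."""
    changed = 0
    for path in asset_dir.rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue  # skip folders and binary or unrelated files
        original = path.read_text(encoding="utf-8")
        updated = original
        for placeholder, real_path in mapping.items():
            updated = updated.replace(placeholder, real_path)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Returning the number of modified files gives the installer something to assert on in its final verification step.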
I'm giving you my opinion here, based on what I've seen in Obsidian tutorials and all the advice from influencers. First, we can't send 58-page documents every time you ask a question, because we'll blow our usage allowance in four seconds. Second, and I'm telling you honestly, a self-managing system capable of handling Obsidian on its own by storing everything in memory doesn't exist. It's not possible. I won't go into too much technical detail, but the model isn't capable of working with all types of documents across different subjects. If you only deal with one type of subject, you can configure a system. But if you work on different subjects, such as accounting, marketing, and maybe even personal documents, you can't create a system that manages everything at once. You could, but it's very complex to set up. So in my opinion, it's much better to intervene as a human at the right time, to have only two functions, and not to overwhelm the machine. The two functions are asking questions and managing the indexes. The whole architecture is designed to work on a small computer. So there's no RAG mode, but we'll use indexes.
So now we'll test it and see what happens. Okay, let's open a new session. I'm going to switch to a split terminal so we can see if there are any corrections to make. Regarding Sonnet, this time I'm going to stay on medium for the reasoning effort. Medium isn't amazing; as I told you, the documentation says it's the minimum. We'll try it anyway and see what happens.
What we're going to try now is a wiki query. I'm going to check that my functions are present. Here it's waiting for my request. So, wiki query. Since the documentation mentions DeepSeek, we'll ask it a question about that. Now, a point we'll address right away, since the functions are regex functions.
Regex functions are functions that search for keywords. We won't be using cache or context at first. We'll send Python algorithms to perform the search in parallel. So what we want to search for has to be typed the way it's probably written in the document, which is in English. I'll give it some words to search for. I'll tell it to search for the author and the date, just these two elements to start. I just want to see if we get a match.
You saw what happened: it was incredibly fast. It didn't find a direct match; it switched directly to the AI agent. And we'll see how the agent works. I'll check whether it looks at the correct file. So, it took the index file. If you press Ctrl+O, you have access to the entire reasoning process as it unfolds. We see that it delegated to wiki query. This is the function I typed. It sent a search for the author and the date. We have the pre-configured values: I set a top-K of 5 and a depth of 3. For the first element, it used a glob and grep function. It did its job well, searching the entire directory and finding metadata. It was the original file with the terms "author" and "date", so it matched two words. That's interesting. That's why I'm telling you that metadata is extremely important. So we have a match, and it's the original document.
I'm thinking about this with you now: how we can optimize it and how we can implement it. What we need to avoid in the end is injecting the entire document. That's not the goal; we wouldn't have saved any time. From the index, it was able to retrieve information, namely the publication date. And from the index, it retrieved result number 4, which is chunk 01. This is exactly where the information is located. So we can go and check it there. If we go to chunk 01, we see that the author is there, not a named person but DeepSeek, and there's the date. So the information is here. This is where I wanted it to go, and that's what it did. OK, we'll do another test.
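The glob-and-grep pass seen in this test can be approximated in a few lines: walk the Markdown files down to a given depth, count how many of the search terms each file contains, and keep the best few. This is a sketch under assumptions; `grep_markdown` is a hypothetical name, and the defaults simply reuse the top-K of 5 and depth of 3 mentioned in the video.

```python
import re
from pathlib import Path


def grep_markdown(root: Path, terms: list[str], depth: int = 3, top_k: int = 5) -> list[tuple[int, str]]:
    """Return (number of terms matched, relative path) for the best-matching files."""
    patterns = [re.compile(re.escape(term), re.IGNORECASE) for term in terms]
    hits: list[tuple[int, str]] = []
    for path in root.rglob("*.md"):
        rel = path.relative_to(root)
        if len(rel.parts) - 1 > depth:
            continue  # deeper than the allowed search depth
        text = path.read_text(encoding="utf-8", errors="ignore")
        matched = sum(1 for pattern in patterns if pattern.search(text))
        if matched:
            hits.append((matched, rel.as_posix()))
    # Most matched terms first; ties broken alphabetically for stable output.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:top_k]
```

A file whose metadata block contains both `author` and `date` ranks above one that contains only `date`, which is why well-filled metadata makes this cheap first pass effective.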
We'll see what happens. We'll start with a clear and test whether it works. You saw the context usage, right? Honestly, we didn't load that much. We stayed at 11%. So it's really light. We didn't have 58 pages going into the interface.
My goal now is to ask another question. Normally, all the documents in the RA folder are excluded from the search, but we'll check anyway. So we do a clear and start with a clean context. We run a wiki query. I'm going to ask it to find the sparse concept, and I'll see if it matches within the documents. What's quite interesting is that the model still searches directly in the files. So the setup doesn't seem too bad. It has excluded all the elements that are in the RA folder. That's pretty good. Ctrl+O, and we'll try to see the rest.
Now, what I don't see in the way it works here is a start from the index file. So we'll add that. I tell it: modify the instructions, start from the index file, with the patterns `index` or `x index` in the directory. So we'll require it to read the index file at startup. As I said, that's how we're going to correct it and optimize it to make it more and more efficient and lightweight, because this time it loaded more documents, both for the semantic wiki query and for our wiki query system.
So, did the model respond? Look at what it answered. It found the documentation and injected the explanation of the sparse method, so it extracted the data and provided the representation of the documents inside. We were able to retrieve this information: the mechanics of the sparse system, the mathematical formulas, the advantages. We're at 12%, which isn't huge; it's pretty good. The only thing that bothers me is that I didn't see the work on the index. So we'll try again, specifically checking the index reading. And what it did is interesting.
Look closely at the modification I gave it earlier; I told it to handle `index` and `x index`. You have to think about the two possible patterns. So it will look for the main `index.md` file in the directory. When you have several indexes, there will be a global index that routes through the different directories.
But then, nothing prevents us from helping it. I'm going to show you what I mean with the AI. Sometimes you shouldn't be foolish, meaning you shouldn't push the AI to waste time unnecessarily. If I tell it to use a query tool on the sparse system, I can do that easily, because I know that when you have a database that's starting to get large, it's still good to have an idea of where the elements are located. To optimize my system, I can also simply give it the general directory. I copy the path and give it the path. So we'll save time, and I'll ask it to explain the subject to me. I often write in English, sorry, it's become a habit.
This time I want to check whether it correctly goes to a query on the index as the first step. Let's expand to see what it did. First step, it found the term in several elements, if we look closely. And this is the first part, where it only uses the algorithm. So this costs us nothing in terms of power, processor, or context. We just execute the algorithm.
So, whenever you make changes, you need to update your main index file. This is what lets you find the location of data when you don't know where it is; this file routes through all the directories. And this is where you define the search depth: level 1, level 2, level 3. If you want to go into subfolders, you give the AI agent more depth so it can search for more data. That's all the advice for this section. Now we need to put it into practice.
We've seen two different methods for coding AI agents that will use the working files stored on your computer. What's important, as we've seen, is to prepare your database, maintain it, and clean it. Don't assume that the AI will do it all by itself. It's not possible; it's too complicated to set up. And that's why, once again, I'm telling you that RAG-type databases are expensive to implement. The price is relative, considering the service provided and the precision. The more an AI lets you address a business problem and work selectively with your documents, the faster your execution speed becomes. That's a clear return on investment. The more generic your requests, where the AI can't understand what it needs to retrieve from the data, the more the model will fall back on the LLM's training data rather than on your documents. That's the logic you need to understand.
RAG (Retrieval-Augmented Generation) makes sense when you need to work with your examples and your data, to inject ultra-precise information into specific sequences, and you need a system capable of retrieving it and injecting it at the precise point. You can't send dozens of irrelevant documents. So we have short, precise sequences based on search systems. The first filtering element, if we're not using a retrieval system, is keyword search using BM25. Behind that, we combine a cluster function, and these are complementary functions. On top of that, we add a semantic retrieval system that uses indexing to search for sequences in addition to the keywords identified in the documents.
So, feel free to tell me if you've found other strategies. Leave them in the comments, and I'll see you soon. In the next video, we'll work on these exciting database topics again.
In this section, we'll talk about the program's architecture. Personally, I didn't deploy and install it directly with Claude. I coded it with ChatGPT. Why?
Because it's cheaper in terms of tokens. I coded everything with ChatGPT 5.3, and the final pass was done with ChatGPT 5.5. The parameters I used were high reasoning for ChatGPT 5.3 Codex and high reasoning for ChatGPT 5.5. That was more than enough. The project's development time for me was 3 days. Three days means I spent 30 hours on reasoning and about 8 hours setting everything up, fixing bugs, and developing the system. The complete system is included in the one-to-one coaching. As for the methodology and variables, I'm providing everything. So, from this draft, you'll be able to rebuild everything if you use the prompts with ChatGPT. You just need to follow the structure exactly, which is as follows.
I coded using default system directories, and the default environment variables are as follows. For top-K, I raised the value to 8 to get eight files, because 8 means roughly 2,000 tokens to inject. So it's fine. We can do less, but I think a sample of eight files is pretty good. If you need more, you just modify the starting value. By default, the system will create a RAG folder. That is to say, I don't call it a wiki, I call it a RAG, but you can call it whatever you want. The path structure is agents, commands, example scripts, tests, package, test config, and system speed config. These are the default names used in Claude Code, and I prefer to stick with Claude's architecture. If you give this structure to Claude, it tells it exactly which folders to build.
The logic part is there too. All the logic is there, in compressed form. It tells the model that you have two functions: a wiki query and a wiki index. The query system is launched by the command in the directory. Its function is to launch index.js and then a subagent called semantic query. You have the indexing logic with BM25 and the installed libraries for the script functionality that needs to be implemented.
These libraries are to be installed in the directory. What I'm going to add is the BM25 architecture logic file in ASCII. The ASCII logic file describes how the agents work. It goes with the files in the architecture logic directory. So we're going to create two complementary logic files that will allow you to code these sections.
We'll start in plan mode and launch it; it will certainly ask me for the installation directory. There, it will summarize the methods used to develop the components. What needs to be set up is the agent structure. For the agents, nothing surprising: as usual, we have the same variables we've seen since the beginning of the videos. It's just a matter of who calls what. The command calls the agent. The query agent calls the subagent. If the BM25 values are less than 3, the semantic query agent will start reading the index files.
Developing this type of program, you realize it ultimately takes three and a half days between testing, running, and optimization. And if you're selling services or creating automation like this for companies, you have to realize it's a lot of work, because 40 hours of work plus updates and modifications easily turn into 50 hours. This requires expertise. Ultimately, if I had to choose, I'd choose RAG over this system. Of course, not everyone wants to use RAG, and not everyone wants to export data to databases, but in the end RAG remains the most efficient. This is a fast tool, but you have to be able to code it.
So the first step is to give this to GPT Codex. If you're using Claude, then work with Claude Opus 4.7 for this type of structure. It's quite complex in the end. This is the original file I use for the system installation. And the two files I'm going to give you are the supplementary files that provide instructions on how the agents are coded. After that, you'll need to customize the installation directories, of course.
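The call chain described here (command, then query agent, then semantic subagent when BM25 comes up short) can be sketched as a simple router. The transcript's "less than 3" is ambiguous; this example assumes it means fewer than three keyword hits. `route_query`, `min_hits`, and the two callables are hypothetical names.

```python
from typing import Callable

Hit = tuple[str, float]  # (chunk identifier, score)


def route_query(
    query: str,
    keyword_search: Callable[[str], list[Hit]],
    semantic_search: Callable[[str], list[Hit]],
    min_hits: int = 3,
) -> tuple[str, list[Hit]]:
    """Run the keyword stage first; call the semantic stage only when needed."""
    hits = keyword_search(query)
    if len(hits) >= min_hits:
        # Enough keyword matches: no model context is spent on retrieval.
        return "bm25", hits
    # Too few keyword matches: let the semantic stage read the index files.
    return "semantic", hits + semantic_search(query)
```

Keeping the two stages as injected functions makes the routing rule testable without any model or file system involved.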
If you want to name it something other than RAG, name it something else; mine is RA by default. And my installation directory is BM25. One of the points I've incorporated, as I mentioned regarding the loops, is that I always run a series of tests to verify that the logic of my workflow, folders, and paths is correct, to avoid deploying something that doesn't work. This doesn't guarantee 100% that everything will be functional, but it generally lets you detect minor problems.
So, we'll look at the architectures together. For example, when you're coding the BM25 architecture section, you simply give it all of that. It's all there; all the information is compressed. It explains that the user query will be launched by the wiki query function. It finds the directory and resolves with a top-K of 8 and a search depth of 4. The index loop starts with the main index. So you'll have a file called main index. This is the one you'll need to update for all directories. Then a search is performed using the other relative index files. It will exclude all the folders that are in the RA directories. So the exclusion directories, as we said, are the RA directories.
From there, the semantic chunk stage starts. The semantic query agent launches once BM25 has finished with the grep, and it retrieves only the semantic chunks with a score above 85. So I give it a score threshold; it retrieves at most 8 final chunks (top-K 8) and builds a lexical index based on the BM25 scores and the semantic retrieval. It then performs a rank fusion and sends the synthesis to your system. All of this is done within an external function. So here you have all the internal logic, the architecture, and the constants. You can modify the constants if you want a different value for the semantic threshold. I set mine to 85. Generally, it's 82 or 80.
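The source says only that a "rank fusion" combines the BM25 and semantic results. One common way to do that is reciprocal rank fusion (RRF), sketched below as a stand-in; it is an assumption, not necessarily the author's formula. Semantic scores are expressed here on a 0 to 1 scale, so the 85 threshold becomes 0.85, and `k = 60` is the conventional RRF constant.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Sum 1 / (k + rank) for each chunk over every ranking it appears in."""
    fused: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            fused[chunk_id] = fused.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return fused


def fuse_results(
    bm25: dict[str, float],
    semantic: dict[str, float],
    threshold: float = 0.85,
    top_k: int = 8,
) -> list[str]:
    """Drop weak semantic chunks, fuse both rankings, and keep the top_k chunk ids."""
    # Keep only semantic chunks scoring strictly above the threshold.
    kept = {chunk: score for chunk, score in semantic.items() if score > threshold}
    bm25_ranking = sorted(bm25, key=lambda c: (-bm25[c], c))
    semantic_ranking = sorted(kept, key=lambda c: (-kept[c], c))
    fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
    # Highest fused score first; ties broken by chunk id for stable output.
    return sorted(fused, key=lambda c: (-fused[c], c))[:top_k]
```

RRF works on ranks rather than raw scores, which avoids having to normalize BM25 scores against semantic similarity scores before combining them.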
Well, 85 is a rather high score to really only get highly relevant files. I'm including the Gon package for the architecture and the logic of how the agents work, the mapping, how it functions. So you have the part that is the query agent, we've already talked about it, we just covered what time it takes, and the index system. We haven't talked about the index as much. The goal is to discover all the index files in a directory and build the main index file. Update the main index file. So the sole purpose of this system is to build and update the main index. This is what the query function will be used for. So you need to update it if you make regular changes. And I'm not in favor of saying Lia will do it for you because it won't. You don't know what it's going to add. So what's important is actually to specify how and when it should update. Well, as you know, when you add new folders, you have to update the system. If you set up an automatic run, you can't even imagine how many times it will scan your entire database and you'll burn through your data plan too quickly. So you have the operation, the tools, the initial conditions, and the environment variables that I've integrated. So you have the whole flow operation, how it works. So this is the structure; if you give this structure to Cloud, it knows exactly which folders it needs to build. The logic part is actually there; all the logic is there, it's compressed. This is compressed data. To tell it, you have two functions: a wiki query and a wiki index. The query system is launched by the command in the directory. Its function is to launch index.js. So, we'll look at the architectures together. For example, when you code the BM25 architecture, you simply give it all this. It's all there, all the information, it's compressed. It explains that the user query will be launched by the wiki query function. So, you'll find the directory, you'll resolve with a K8, a top K, search depth 4. 
The index loop will start with the main index. So, you'll have a file called main index. This is the one you'll need to update for all directories. And behind that, you have a search that will be performed using the other relative index files. It will exclude all the folders that are in the RA directories. So the exclusion directories, as we said, are the RA directories. From there, you have the semantic chunk that starts. You have the system, which will be the semantic query agent, which launches once the BM25 has finished with the GREP, and it will retrieve only the semantic chunks greater than 85. So I give it a score value; it will retrieve a maximum top K8 final chunk and it will build a lexical index based on the scores of the BM25 and the semantic relative. It will perform a rank fusion and send the synthesis to your system. So all of this is done in an external for loop. So there you have all the internal logic and the logic of how the agents work. How does the mapping work? So you have the part that is the query agent, we've already talked about it, we just covered what time it takes, and the index system. We haven't talked about the index as much. The goal is to discover all the index files in a directory and build the main index file. Update the main index file. The goal is to update the main index file. So the sole purpose The main function of this system is to build and update the primary index. This is what the query function will use. So, it needs to be updated if you make regular changes. And I don't believe in saying Lia will do it for you because she won't. In this section, we'll talk about the program's architecture. Personally, I didn't deploy and install it directly with Claude. I coded it with Chat GPT. Why? Because it's cheaper in terms of tokens. I coded everything with Chat GPT 5.3, and the final pass was done with Chat GPT 5.5. The parameters I used were high reasoning for Chat GPT 5.3 Codex and high reasoning for Chat GPT 5.5. 
Those high-reasoning settings were more than enough. The project's development time, for me, was 3 days: about 30 hours on reasoning and about 8 hours setting everything up, fixing bugs, and developing the system and its constraints. So, with the three or four files I gave you, you're capable of coding everything yourself.
If you want to learn how to use AI professionally and scale your work, check out "Mastering the Best of AI 2026." You'll learn to become a certified entrepreneur who automates their business in less than 15 days. AI today is an extremely dynamic field, and all the updates are included. We're number one in terms of updates. You have over 85 hours of coursework that you can complete at your own pace. All the main AI models are covered in this training: ChatGPT, ChatGPT Codex, AI agents, and of course Claude, Claude Cowork, and the Claude CLI. You'll become a pro with a clear career path. At a truly professional level, you'll learn how to feed an agent with your data in a usable, clean, and useful way. Check out the training courses in the description.
The architecture alone took me three days to think through, because you have to work out what needs optimizing, in what order, and how. Nothing is perfect, don't forget that. For me, the ideal solution is RAG. Go look it up. If you want to do this properly, you need to implement RAG. What I'm showing here is a secondary option that's fast, lets you store the data on your own computer, and also works with a local LLM. Don't forget that. If you want to centralize everything and stop working remotely, you can use a local LLM. Be careful with local LLMs, though; you need to choose one with agentic capabilities. From experience, I can tell you that not every model works. So, in my opinion, you should try Qwen 3, ideally at least 32B, because anything less will be too weak. You could easily pair it with a small plan when you go to Surama.
If you have an unlimited plan at €200 a month with Claude, that's not a problem. Otherwise, the model to use is Qwen 3, but in 35B. I would have liked to try it locally in 27B or 35B. I think it gives you a view of all the thinking, so that's nice. There's also Nemotron, locally. And if you have the plans for it, switch to GLM5 and DeepSeek V4 for professional use. So, those are the possibilities. How much does access cost? €17 per month. I can't tell you in advance whether they will work; I haven't tested them all. What I can tell you is that it can lower your bill, because we can still run a number of things with AI that has been trained. I think DeepSeek was trained with Claude; there's a good chance. I say this because they actually used I don't know how many thousands of Claude accounts to do reinforcement learning. So I think DeepSeek is a very good candidate, especially for the development and agentic part, where it has very good scores. I think you should try it.
So, code your system and use whatever interface you like, Claude or ChatGPT, to develop it. You have all the structures, you provide the logic, and you give it the files. Code each file in a directory, following the instructions in the README file, one file at a time. That's how you start. You switch to outline mode, structure things a little, and restructure them in sequential Markdown format. It's also good to give it the list of files. In any case, you switch to outline mode and give it the four files; it reads them, structures them, establishes an outline, starts coding, and runs tests. Then you proceed to the installation as we did at the beginning of the video.