
A method combining NotebookLM, Obsidian, and a ChatGPT Chrome extension makes it possible to build an optimized “second brain,” provided data quality and structure are well controlled.
The approach relies on integrating several tools: Obsidian to structure data, NotebookLM for analysis via Gemini, and a ChatGPT Chrome extension to automate queries. The goal is to create a personal research system capable of leveraging text, images, and metadata within a unified environment.
Direct file imports, especially PDFs, significantly degrade answer quality. Models lose table structure, mix text and images, and generate inconsistent data. This introduces “distractors” that disrupt model attention.
Studies show that the presence of just four distractors can reduce answer quality by 50 to 60%. All major models are affected, from ChatGPT to Claude to DeepSeek, with different behaviors: Claude remains cautious, while ChatGPT tends to answer even when uncertain.
The solution is to convert documents into clean formats via OCR, producing “RAW” files enriched with metadata. This structuring removes noise and optimizes model usage, a key requirement for an effective RAG system.
Extraction tools like Mistral OCR Document AI can separately recover images and their descriptions. These elements are then reinserted into the database as structured Markdown, with precise metadata indicating content, position, and context.
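As a rough illustration of that reinsertion step, the sketch below renders one extracted image and its metadata as a Markdown block. The `ImageRecord` type, its field names, and the output layout are assumptions made for this example; they are not the format produced by Mistral OCR or required by NotebookLM.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    """One image recovered by OCR (hypothetical field names)."""
    file_name: str    # e.g. "img-0.jpeg"
    page: int         # page where the image appeared
    position: str     # free-text location in the document
    description: str  # what the image contains
    context: str      # why the image matters in the document


def image_to_markdown(img: ImageRecord) -> str:
    """Render the image link, then its metadata right below it."""
    lines = [
        f"![{img.description}]({img.file_name})",
        "",
        f"- page: {img.page}",
        f"- position: {img.position}",
        f"- content: {img.description}",
        f"- context: {img.context}",
    ]
    return "\n".join(lines)
```

Keeping the metadata directly under the image link means the description, position, and context travel with the image when the Markdown file is uploaded as a source.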
Once data is cleaned, NotebookLM becomes an advanced search engine capable of processing text, tables, and images simultaneously. Its closed system ensures only provided sources are used, improving answer relevance.
The Chrome extension enables the creation of skills, scripts that automate complex tasks. These agents control a browser, send queries to NotebookLM, retrieve responses, and reinject them into the user interface.
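A skill of this kind is described in the video as metadata (at least a name and a description), a trigger, and ordered workflow steps. The sketch below models that structure in Python so it can be validated before use. The `Skill` class and the text layout it renders are hypothetical and do not reproduce the actual skill file format.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical in-memory model of a skill definition."""
    name: str
    description: str
    trigger: str
    steps: list[str] = field(default_factory=list)

    def validate(self) -> None:
        """Reject skills that lack the minimum metadata or have no steps."""
        if not self.name.strip() or not self.description.strip():
            raise ValueError("a skill needs at least a name and a description")
        if not self.steps:
            raise ValueError("a skill needs at least one workflow step")

    def render(self) -> str:
        """Render the skill as plain text with numbered workflow steps."""
        self.validate()
        numbered = [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        return "\n".join([
            f"name: {self.name}",
            f"description: {self.description}",
            f"trigger: {self.trigger}",
            "workflow:",
            *numbered,
        ])
```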
The process involves several steps: opening the interface, sending formatted queries, retrieving responses with citations (pages, images, tables), then delivering structured outputs. The agent acts as an intermediary between the user and the knowledge base.
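Two of those steps, formatting the query and pulling citations out of the response, can be sketched with the standard library alone. The bracketed citation style (`[page 4]`, `[image 1.4]`, `[table 2]`) and the query layout are assumptions for illustration; the real NotebookLM output may look different and would need its own pattern.

```python
import re
from dataclasses import dataclass

# Assumed citation style: "[page 12]", "[image 1.4]", "[table 3]".
CITATION_RE = re.compile(r"\[(page|image|table)\s+(\d+(?:\.\d+)?)\]", re.IGNORECASE)


@dataclass(frozen=True)
class Citation:
    kind: str  # "page", "image" or "table"
    ref: str   # kept as text so a reference like "1.4" survives


def format_query(question: str, keywords: list[str]) -> str:
    """Build a query that asks for explicit, parseable citations."""
    return (
        f"{question.strip()}\n"
        f"Keywords: {', '.join(keywords)}\n"
        "Cite page, image and table numbers in square brackets."
    )


def extract_citations(answer: str) -> list[Citation]:
    """Return the unique citations in order of first appearance."""
    seen: list[Citation] = []
    for kind, ref in CITATION_RE.findall(answer):
        citation = Citation(kind.lower(), ref)
        if citation not in seen:
            seen.append(citation)
    return seen
```

Asking for a fixed citation format in the query is what makes the response parseable afterwards, which is why the two functions are shown together.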
Each answer can include precise references, such as page or image numbers. This traceability improves information verifiability and strengthens overall system reliability.
Thanks to structuring and automation, the system becomes faster, more accurate, and better suited for professional use. It surpasses the limits of Obsidian alone by fully leveraging Gemini’s native multimodality.
Installing skills carries serious risks. A malicious script can access the browser, local data, or environment variables and exfiltrate sensitive information. Reading and verifying code is essential before any use.
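A first pass at that verification can be automated. The sketch below flags lines in a skill's files that read environment variables, run shell commands, or send data over the network. The pattern list is a small, assumed sample and is far from exhaustive: a clean result does not prove a skill is safe, it only tells you where to start reading.

```python
import re
from pathlib import Path

# Illustrative patterns only; real malicious code can easily avoid them.
RISKY_PATTERNS = {
    "reads environment variables": re.compile(r"os\.environ|process\.env|\bprintenv\b"),
    "runs shell commands": re.compile(r"subprocess|os\.system|child_process"),
    "sends data over the network": re.compile(r"\bcurl\b|\bwget\b|requests\.(post|put)|fetch\("),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, reason) for every line matching a risky pattern."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings


def scan_skill_dir(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every file under a skill folder and report the files with findings."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            hits = scan_text(path.read_text(encoding="utf-8", errors="ignore"))
            if hits:
                report[str(path)] = hits
    return report
```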
Building an effective second brain depends less on tools than on data quality and structure, which are essential to fully leverage AI models while minimizing risks.
In this video, we'll use the latest Chrome extension, the ChatGPT Chrome Extension, which I showed you how to activate in the European preview in the previous video. We'll use it to build a second brain within Codex ChatGPT 5.5: a database built with Obsidian and stored in NotebookLM. The power of this setup lies in having a personal search database that lets us leverage text, images, and the power of Google. Combined with Obsidian, you'll create your first knowledge base, the second brain, using a skill. I'll show you how to code all of this, and the skill I used as a basis for my own code is included in this tutorial. The aim is to implement AI professionally, for businesses and entrepreneurs.

In previous videos, we discussed creating Obsidian databases. The problem with an Obsidian system is that the entire database we build is sent into the context in full, which automatically saturates it. The solution, as we saw earlier, is to build the database with metadata and then split it into pieces, a step called chunking. That gives us all the elements of a RAG system. But what if we found a way to create an ultra-lightweight RAG using the NotebookLM ecosystem? That's exactly what we're going to do in this tutorial. We'll take the entire dataset we built in the previous videos and set it up within NotebookLM.

For those using NotebookLM for the first time, here's a very quick overview. The interface lets you run a number of queries for free. Each notebook you create is a closed system, meaning you only work on your own documents, using Google Gemini. The principle is as follows: you click the "Create" button and add sources. These sources can be text, Google Drive files, websites, or imported files.
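The metadata-and-chunking step mentioned above for a RAG system can be sketched as follows. This is a minimal example that groups paragraphs into chunks under a size limit and attaches the source name and position to each one. The `Chunk` type and the 800-character default are assumptions chosen for illustration, not values from the video.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # file the chunk came from
    index: int   # position of the chunk within that file
    text: str


def chunk_paragraphs(text: str, source: str, max_chars: int = 800) -> list[Chunk]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    chunks: list[Chunk] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # The current chunk is full: store it and start a new one.
            chunks.append(Chunk(source, len(chunks), current))
            current = para
        else:
            current = candidate
    if current:
        chunks.append(Chunk(source, len(chunks), current))
    return chunks
```

Splitting on paragraph boundaries rather than at a fixed character offset keeps sentences and table rows intact, which matters when the point of the exercise is to avoid feeding the model broken fragments.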
Now, I'll show you the main problem. I just imported a PDF, the latest DeepSeek V3 study. Many people make this mistake: they take a document, a PDF, and upload it straight to the interface. NotebookLM is a multimodal system, meaning it's designed to extract both images and text. But here's what happens. First, the model fails to structure the data correctly, and you lose all the table formatting. Then, the pages and images are placed at random within the text.

This is extremely dangerous for an AI, and I'll explain why. There are always two crucial issues to resolve with AI, and the first is data quality. AIs are predictive machines: they predict tokens by paying attention to the tokens they receive. If you introduce distractors, that is, information that isn't coherent, you disrupt that process. That is exactly what's happening here: we've introduced incomprehensible elements, placed images in incomprehensible places, and lost the format of the tables. The data we have here isn't structured data; it's random data. On top of that, stray characters get into the structure.

So everything you are currently working on, and all those tutorials you've seen that never addressed this issue, overlook the hallucinations induced by unstructured data. This is precisely what you need to consider when creating a knowledge base, a second brain, in ChatGPT, Claude, or Obsidian, and it is why distractors must be avoided at all costs. The Chrome study explains that if you increase the number of distractors to four (and as you've seen, there are many more than four here), the models will focus on the distractors. This is a flaw in the models' behavior. Four distractors reduce the quality of an LLM's response by 50 to 60%. Whether it's ChatGPT, DeepSeek, or Claude, all the models are affected.
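One simple way to spot the stray characters described above, before a file is ever uploaded, is to measure how much of each line is made of symbols. The sketch below is an assumed heuristic with an arbitrary threshold of 0.4. It only catches obvious extraction debris and does not measure distractors in the sense used by the study.

```python
def noise_ratio(line: str) -> float:
    """Share of characters that are neither letters, digits nor whitespace."""
    if not line:
        return 0.0
    odd = sum(1 for ch in line if not (ch.isalnum() or ch.isspace()))
    return odd / len(line)


def flag_noisy_lines(text: str, threshold: float = 0.4) -> list[int]:
    """Return the 1-based numbers of non-empty lines whose noise ratio exceeds the threshold."""
    return [
        line_no
        for line_no, line in enumerate(text.splitlines(), start=1)
        if line.strip() and noise_ratio(line) > threshold
    ]
```

Flagged lines are candidates for manual review, not automatic deletion, since legitimate content such as formulas can also be symbol-heavy.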
So the first part, which we saw in the previous videos, is about creating structured data optimized for second-brain systems. The study also shows that the most robust model when faced with distractors remains Claude. Claude is currently a cautious model; we'll define it as such. ChatGPT is the complete opposite: it tends to be overconfident, so even when it doesn't know the answer, it will respond. That's precisely what we need to avoid. When we create our second-brain system using the ChatGPT Chrome extension, our data must be perfect.

The first solution to this problem, as we saw in the previous videos, is what we call a RAW file: a perfectly structured file that we extracted using an OCR system. So you have a document that's no longer a raw PDF but a structured document optimized for knowledge bases. We've built a knowledge system here, but what we've lost are the images, so we added metadata recording the presence of the images. Now we're going to retrieve the images themselves, and I'll show you how.

We'll go to the OCR section of Mistral and switch to the Document AI interface. This is where we can upload files, so we'll load our document. What interests us this time in the options (we've already used this software, so I'll go through it quickly) is breaking out the image section using the code editor and creating structured responses. We'll extract the image data, which we'll then insert into our second-brain system in Markdown format. When you run this function, the output gives us both the images and the text. I haven't configured the JSON section yet, because I'm showing you each step so you can identify them. Depending on the type of data you have, you'll need to define the structure of what you want to extract from the images.
Here, we've added metadata, and what we retrieve now by clicking "Download" is the metadata and the images. This time, we'll be able to transfer all this information into NotebookLM.

If you want to learn how to use AI professionally and scale your work, go to "Mastering the Best of AI 2026." You'll learn how to become a certified entrepreneur who automates their business in less than 15 days. AI is an extremely dynamic field today, and all updates are included; we're number one when it comes to updates. You get more than 85 hours of classes that you follow at your own pace. All the main AI models are covered in this training: ChatGPT, ChatGPT Codex, AI agents, and of course Claude, Claude Cowork, and Claude CLI. You'll become a pro with a clear path, at a truly professional level. You'll learn how to feed an agent with your data in a usable, clean, and useful way. See the training courses in the description.

We're now building our second brain, having fixed all the issues we had with the previous data extraction. The crucial point to remember is that you should never import a source directly, because you'll contaminate the AI's responses. All the tutorials you've seen so far that told you to "insert your PDFs, go get information from websites into the interface" make this mistake, and you must correct it right now.

Now I'm going to show you what happens to the data once we've cleaned it, optimized it, and added all the images extracted with Mistral OCR. We now have extremely clean, structured data: the image description, then the image itself with its descriptive metadata right below it (you can see it there), which indicates where it's located and what it contains. From this point on, the model has the entire knowledge base, and you can see that the quality of the layout is completely transformed. NotebookLM now has highly optimized structured data with which to understand the entire context.
Until now, if we needed to ask a question, we had to type it into the NotebookLM interface. To automate this, we'll leverage the ChatGPT Chrome Extension's ability to take control of interfaces and build what's called a skill. We could have used an MCP system, but today we'll focus on skills. Skills give an AI the ability to perform tasks the way a human workflow would. Our goal is to retrieve questions and answers from the NotebookLM interface, retrieve citations with the document numbers used, and activate a skill that queries our database.

So, we're going to transfer all the knowledge we have in Obsidian into NotebookLM, and NotebookLM, with the Gemini 3.1 engine, will perform the searches. The advantage is multimodal search, meaning we can include images, videos, tables, and text. We'll therefore have a much more powerful second-brain system than with Obsidian alone, because we'll be using Gemini's native multimodality in NotebookLM.

I'll show you how it works, and then I'll show you how to code a skill, with a brief overview of skill structure for those who haven't had a chance to look at it. I sent the system a request to retrieve information from the NotebookLM interface. It types in questions related to my query about DeepSeek V3's sparse mode: "Explain the DeepSeek V3 sparse mechanism system to me in detail." It sends a number of keywords to search the knowledge base. NotebookLM retrieves the information, including the source name, page number, tables, and other details. Then ChatGPT, using the Chrome extension that we'll activate, retrieves that information and injects it back into the interface. If you want to see how it works, I'll show you right after. Just below, you have the Q&A between the two interfaces.
Here, I'm showing you the screen that was launched so you're aware that queries can run in the background. This is where you see the AI agents at work. It's an agent that we're going to code, which launches and queries the NotebookLM database on my behalf, using the NotebookLM ecosystem as the knowledge base and returning the information. In the information returned, you have the references for the images that appear: detailed architecture in image 1.4, image 38. So if we want to retrieve elements from the sources, we can retrieve all the information from the context.

I left the screen open because I wanted the skill to show that a window is opened in the background. This means something extremely important in terms of security. I'm going to give you a link where you can find and download this skill. I've customized it: I changed the entire code structure and optimized it for my workflow. But be careful when you install skills because, technically, if you don't know how to read the code, someone can take control of your browser and your PC and send information without your knowledge. I say this because there are entire collections of skills that people click on and download, and very often these people have no coding skills or knowledge of AI, so they're completely unable to understand what's happening on their device. Be careful when you download anything; always check the security. What personal data does the system collect? As I showed you in the previous video about the ChatGPT Chrome Extension, environment variables can be read from your system and sent to a remote server using this method. So don't download skills without first checking the code. Here's the skill I used as a basis for my own code. I got it from this address.
Now I'm going to detail the modifications I made to optimize it, to make it more efficient and faster, and to adapt it from Claude to ChatGPT Codex. You'll find much more detail and many more steps in the course, where we take more time to explain each part of the creation process, but I'll still remind you of the basics of how a skill works.

A skill consists of an area called metadata, which contains at least the name and description. Then you can add agent functions. In that section, we have the trigger and then the workflow steps. That's the basic structure.

When a skill gets a bit more technical, which is the case for what we're going to develop now, we integrate a set of processes, meaning we call a workflow with steps 1, 2, 3, and 4. We tell it: "You'll log in by launching Chromium. Then you'll go to the page, take control of the page, and use Playwright to take control of the interface. You'll type the questions, which will be in a specific format, and then you'll retrieve the answers and feed them back into the interface." So we'll create an interface agent and optimize the queries. We'll tell it how the questions and answers should be structured at the system level. If you You need to gather all the data to understand how the