How it works¶
The system intervenes at two distinct moments: before sending the question to NotebookLM and after receiving the response. The user doesn't need to do anything differently — both interventions happen behind the scenes.
The three-phase flow¶
Every interaction goes through three phases.
- Phase 1: question structuring. The user asks a question naturally. Claude reads the guidelines contained in the ask_question tool description, which specify how to structure the request. Based on these instructions, Claude transforms the question by adding a thematic output format, a citation format, a completeness signal, and a placeholder for missing information.
- Phase 2: transit. The MCP server receives the structured question from Claude and passes it to NotebookLM without modification. NotebookLM processes the request by consulting the documents loaded in the selected notebook and returns a source-grounded response.
- Phase 3: response control. NotebookLM's response returns to Claude through the MCP server. Before Claude presents it to the user, two control mechanisms kick in. The first is a completeness reminder, automatically appended by the server to every response, which pushes Claude to compare the received response with the user's original question and ask NotebookLM further questions if something is missing or unclear. The second is the faithful presentation instructions, contained in the same guidelines read in phase 1, which tell Claude to present the response without adding external knowledge or "improvements".
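The server side of the three phases above can be sketched in a few lines of Python. This is a minimal illustration, not the server's actual code: send_to_notebooklm is a hypothetical stand-in for the real automation, and the reminder text is a paraphrase, not the actual constant.

```python
# Minimal sketch of the server side of the three-phase flow.
# Names and strings are illustrative, not the server's actual code.

FOLLOW_UP_REMINDER = (
    "\n\nReminder: compare this response with the user's original "
    "question. If anything is missing or unclear, ask NotebookLM a "
    "follow-up question before presenting the result."
)

def send_to_notebooklm(question: str) -> str:
    """Hypothetical stand-in for the automation that queries the notebook."""
    return f"(source-grounded response to: {question})"

def ask_question(question: str) -> str:
    # Phase 2: pass the already-structured question through unchanged.
    # Structuring (phase 1) happened in Claude, guided by the tool
    # description, not here.
    response = send_to_notebooklm(question)
    # Phase 3: append the completeness reminder before returning to Claude.
    return response + FOLLOW_UP_REMINDER
```

Note that the server never inspects or rewrites the question itself; its only active contribution is the appended reminder on the return path.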
What happens outbound: the structuring¶
In a test conducted on the seven official DaVinci Resolve 20 manuals, loaded into a NotebookLM notebook, an intentionally generic request was made:
List the AI-based features of DaVinci Resolve.
Claude transformed this simple question into a structured prompt similar to this:
List the AI-based features of DaVinci Resolve.
Organize the response by thematic topics.
Try to cover all aspects discussed in the documents.
For each topic:
- TOPIC: [identifying title]
- DESCRIPTION: [summary with context, connecting information
across different documents]
- EVIDENCE: "direct quote" [Source: document]
If a topic appears in multiple documents, show evidence
from each one.
If not present in documents: [NOT FOUND IN DOCUMENTS]
The user didn't write any of this. The structuring was automatic — the original question was preserved as-is, but Claude added a structured output format, citation formatting, and instructions for missing information. The result was an organized catalog of AI features with direct quotes from the documentation.
Recognized question types¶
Claude adapts the prompt structure based on the detected question type. The approach is task-oriented — each question type has a specific pattern that focuses on output structure and cross-references between documents.
- Comparison: organizes the response by comparison points with similarities, differences, and cross-references between different documents.
- List: organizes by thematic topics with description and citations. If the same item appears in multiple documents, shows all occurrences and any discrepancies.
- Analysis: organizes by thematic topics with summaries and connections between different documents.
- Explanation: structures the response starting from the base concept with supporting citations, examples from the documents, related concepts, and known limitations.
- Extraction: the default type for all other questions; organizes by thematic topics with descriptions, citations, and cross-references between sources.
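For illustration, the per-type patterns above could be tabulated as a small lookup with the fallback the text describes. The strings here are paraphrases of the guidelines, and in the real system the selection is done by Claude reading the tool description, not by server code like this.

```python
# Illustrative lookup of per-type structuring patterns. In the real
# system these live as prose in the ask_question tool description and
# Claude applies them in-context; this table is only an assumption.

QUESTION_PATTERNS = {
    "comparison": "Organize by comparison points: similarities, "
                  "differences, cross-references between documents.",
    "list": "Organize by thematic topics with descriptions and citations; "
            "show all occurrences and any discrepancies.",
    "analysis": "Organize by thematic topics with summaries and "
                "connections between different documents.",
    "explanation": "Start from the base concept, with supporting "
                   "citations, examples, related concepts, limitations.",
    "extraction": "Organize by thematic topics with descriptions, "
                  "citations, and cross-references between sources.",
}

def pattern_for(question_type: str) -> str:
    # Extraction is the default for any unrecognized question type.
    return QUESTION_PATTERNS.get(question_type, QUESTION_PATTERNS["extraction"])
```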
What happens inbound: Claude's added value¶
At this point, one might ask — if the structuring already produces a well-organized, source-grounded response, why not show NotebookLM's response directly?
Because Claude on the return path is not a mere relay. It's in the return phase that Claude's capabilities become a concrete advantage over using NotebookLM directly, provided these capabilities are channeled in the right direction. What Claude adds is the quality of presentation, synthesis, and active research — not external knowledge.
Active research¶
Every NotebookLM response, before reaching Claude, is enriched by the server with a hidden reminder. This reminder asks Claude to stop and evaluate whether the response actually covers everything the user asked, whether something is unclear or incomplete, whether something is missing. Claude can autonomously ask another question to NotebookLM within the same session, without the user needing to intervene.
In practice, a single user question can generate two or three successive queries to NotebookLM, each more targeted than the previous one, before Claude presents a complete response. The user sees only the final result.
Synthesis and organization¶
Claude can rework the received information with its own communication capabilities — from summarizing long responses to reorganizing content into a clearer structure, to highlighting key points. Most importantly, it can combine results from queries to different notebooks into a single coherent analysis, a capability that NotebookLM alone doesn't offer since it always works on one notebook at a time.
The fidelity constraint¶
These capabilities are channeled by a precise constraint — Claude does not add knowledge that doesn't come from the documents. The structuring guidelines contain explicit instructions for the return phase that constrain Claude to faithful presentation of document content. If information is not present in the documents, the response declares it with the placeholder [NOT FOUND IN DOCUMENTS], instead of filling the gap with general knowledge.
The constraint concerns content, not form. Claude can and should use its own organization and synthesis capabilities, but the material it works with remains exclusively what comes from the documents. As described in this article on external notebook access, the more elaborate the output, the greater the risk of contamination, and the fidelity constraint is the mechanism that prevents it.
Multilingual support¶
The system works with any language supported by Claude. The structuring instructions guide Claude to adapt labels and constraints to the conversation's language, without specific configuration. For example, "TOPIC/DESCRIPTION/EVIDENCE" becomes "ARGOMENTO/DESCRIZIONE/EVIDENZE" in Italian, and "[NOT FOUND IN DOCUMENTS]" becomes "[NON PRESENTE NEI DOCUMENTI]".
The system has been thoroughly tested with Italian.
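As an illustration only, the label pairs mentioned above can be laid out side by side. In the real system there is no lookup table: Claude adapts the labels in-context for whatever language the conversation uses.

```python
# English/Italian label pairs from the article's own examples.
# Purely illustrative: the real adaptation is done by Claude in-context,
# for any language, with no table like this in the server.

LABELS = {
    "en": {
        "topic": "TOPIC",
        "description": "DESCRIPTION",
        "evidence": "EVIDENCE",
        "not_found": "[NOT FOUND IN DOCUMENTS]",
    },
    "it": {
        "topic": "ARGOMENTO",
        "description": "DESCRIZIONE",
        "evidence": "EVIDENZE",
        "not_found": "[NON PRESENTE NEI DOCUMENTI]",
    },
}
```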
Language consistency
For optimal results, it's advisable to maintain the same language throughout the conversation. Switching languages during the session may produce unpredictable structuring results.
Verifying what happens behind the scenes¶
NotebookLM saves chats, which means you can open the notebook directly at notebooklm.google.com to see the structured prompt sent by Claude, along with NotebookLM's original response with all internal reference links.
This transparency mechanism allows you to verify that structuring was applied correctly and to compare the original response with the presentation made by Claude.
Under the hood
The system's architecture is based on a precise division of roles. The structuring instructions — which patterns to use, how to format citations, how to handle missing information — are defined in the MCP server code, within the ask_question tool description. When Claude Desktop loads the server, it reads these instructions and applies them. The actual structuring therefore happens in Claude, which acts as the client, but following rules written by the server. The MCP server itself never modifies the questions in transit to NotebookLM; it remains a transparent intermediary.
This design choice has several advantages. Multilingual support is automatic since Claude natively handles any language. The structuring logic can be updated by modifying a single file in the server code, without touching Claude. And the system adapts to the conversation context without needing fixed templates for each language or question type.
The return control operates on two distinct and complementary levels. The first is a constant in the server code (FOLLOW_UP_REMINDER) that is concatenated to every NotebookLM response before returning it to Claude, pushing it to verify completeness and ask additional questions if necessary. The second is the "Response Handling" section in the structuring guidelines, which instructs Claude on faithful presentation. The separation between the two levels is intentional — the completeness check works even if the guidelines are modified, and vice versa.
Compared to the original server version, which handled structuring server-side with templates for each language and different enhancement modes, the fork simplified the architecture by moving the structuring logic into the tool description. This eliminated hundreds of lines of code (multilingual templates, language detection, response wrapping) in favor of a lighter and more maintainable approach.
A technical note — NotebookLM doesn't handle decorative lines in prompts well. Character sequences at the beginning of the request such as === or --- cause timeouts. The structuring instructions specify using only plain text headings, avoiding any decorative typography.
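The workaround described above lives in the structuring instructions themselves, but one could also enforce it defensively with a small filter before sending. This is a hypothetical sketch of such a filter, not something the server is described as doing.

```python
import re

# Hypothetical defensive filter: drop decorative separator lines
# (===, ---, and similar) that NotebookLM reportedly mishandles,
# keeping plain-text headings and body text intact.
DECORATIVE_LINE = re.compile(r"^\s*[=\-_*~]{3,}\s*$")

def sanitize_prompt(prompt: str) -> str:
    lines = [ln for ln in prompt.splitlines()
             if not DECORATIVE_LINE.match(ln)]
    return "\n".join(lines)

print(sanitize_prompt("=====\nHEADING\n-----\nBody text"))
# prints "HEADING" and "Body text"; the separator lines are dropped
```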