Save Claude Tokens by Integrating NotebookLM

Offload the heavy document crunching to Google's free infrastructure. Keep Claude tokens for orchestration and polish.

April 16, 2026 · 7 min read · Tips
TL;DR

Install notebooklm-py, give Claude a skill that knows how to drive it, and let NotebookLM handle the big-document Q&A. Claude spends tokens only on orchestration and final polish. The result: a $20/mo Pro plan does what used to cost $200/mo.

"Claude usage limit reached. Your limit will reset at 7pm." If that message shows up at 2pm on a research day, you already know the problem. Claude has no persistent memory: every document you feed into its context costs tokens, and a 30-document analysis costs a lot of them.

The fix isn't a bigger plan. It's a bridge. Route heavy analytical work to Google NotebookLM (free, 50 sources per notebook) and keep Claude's tokens for the parts where Claude is uniquely good: orchestration and final polish.

Why the combination works

Claude Code is fantastic at multi-step orchestration — reading a repo, spinning up subagents, writing files, running scripts. It's expensive at bulk document Q&A because everything has to fit through its context window.

NotebookLM is the inverse. It's a RAG tool built for cross-document questions. Upload PDFs, articles, YouTube transcripts, Docs, audio, images. Ask questions and get cited answers across all sources at once. Free tier: 50 sources per notebook. Pro: 300. Processing cost: zero.

The catch: no official API. Browser-only. You can't script it.

The bridge: notebooklm-py

Developer Teng Ling reverse-engineered NotebookLM's internal protocols and shipped notebooklm-py, an open-source CLI. It drives NotebookLM entirely from the terminal — create notebooks, upload sources, run queries, generate slide decks, podcasts, and flashcards.

You need Python 3.10+, a Google account, and a terminal. The repo lives at github.com/teng-lin/notebooklm-py. Follow the README to install and authenticate.

Teaching Claude to use it via skills

A skill is a set of instructions in a SKILL.md file that Claude reads when it detects a matching task. Think of it as a playbook on your machine — Claude loads it automatically when your request matches.

Skills follow an open standard. They work in Claude Code, Cursor, Gemini CLI, Codex, and others. Personal skills live in ~/.claude/skills/skill-name/SKILL.md. Project-scoped skills (shareable via git) live in .claude/skills/skill-name/SKILL.md.
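A minimal SKILL.md for this setup might look like the sketch below. The frontmatter `name` and `description` are what Claude matches your request against; the body is the playbook. The field values and steps here are illustrative, not the file the installer actually writes:

```markdown
---
name: notebooklm-research
description: Offload bulk document Q&A to NotebookLM via the notebooklm CLI. Use when the user asks to research, analyze, or summarize many documents.
---

# NotebookLM research

1. Create a notebook: `notebooklm create "<project name>"`
2. Add every source: `notebooklm source add <file-or-url> ...`
3. Answer cross-document questions with `notebooklm ask "<question>"` instead of reading files into context.
4. Only polish the final output locally.
```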

Install the NotebookLM skill in one command:

notebooklm skill install

Verify with notebooklm skill status. Once installed, Claude knows the command syntax without being told each session. Say something like "research B2B sales strategies and put together a report" and Claude picks up the skill automatically. Or invoke it directly with /notebooklm.

Build your own skills: run /skill-creator in Claude Code. Anthropic's meta-skill interviews you, generates the SKILL.md, runs test prompts, and packages the result. From idea to a tested skill in minutes.

Workflow A: research without burning tokens

You need to analyze 30+ documents. Doing it locally kills your token budget. Here's the loop:

  1. Collect sources — PDFs, articles, YouTube transcripts (Claude can pull them with yt-dlp).
  2. Claude creates the notebook: notebooklm create "My Research Project"
  3. Claude uploads every source with notebooklm source add "./file.md" "https://url" "./report.pdf"
  4. Claude queries instead of processing locally: notebooklm ask "what are the three most important themes across all sources?"
  5. Claude generates outputs — slide decks, flashcards, mind maps, data tables, audio — via notebooklm generate commands.
  6. Claude polishes locally. Only this final step spends your Claude tokens.
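The loop above can be sketched as a shell script. This is a dry run that prints each command instead of executing it; the notebook name, file paths, and URL are placeholders, so check `notebooklm --help` for the exact syntax before running for real:

```shell
#!/bin/sh
# Dry-run sketch of the Workflow A loop: prints each notebooklm command.
# Replace `echo "+ $*"` with `"$@"` to actually execute the commands.
run() { echo "+ $*"; }

run notebooklm create "My Research Project"

# Upload every collected source (paths and URLs are placeholders)
for src in ./report.pdf ./notes.md https://example.com/article; do
  run notebooklm source add "$src"
done

# Query on Google's side instead of loading documents into Claude's context
run notebooklm ask "What are the three most important themes across all sources?"
```

The `run` wrapper keeps the sketch safe to paste: you see the full command sequence before committing to it.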

The expensive analytical work runs on Google's infrastructure. Claude tokens are reserved for orchestration and polish. That's how you stretch a $20 plan into what used to be a $200 month.

Workflow B: build expert agents from web research

You want a custom Claude skill for a specific domain, say B2B sales. Vague prompts produce vague agents. Use NotebookLM's Deep Research for autonomous expert-knowledge collection, then structure it into a skill.

  1. Run Deep Research in the NotebookLM browser UI. Pick the "web" source type. Give it a specific query: "advanced B2B multi-channel outbound sales strategies, retention loops, and re-engagement sequences." Deep Research crawls hundreds of pages and builds a cited report.
  2. Structure the output with the DBS framework: Direction (step-by-step logic, becomes the core SKILL.md), Blueprints (static reference material — templates, tone, rules — becomes companion files), Solutions (deterministic code needs — API calls, formatting, calculations — becomes scripts).
  3. Paste the DBS output into Claude Code and run /skill-creator. Claude generates the full skill package.
  4. Test and deploy. skill-creator stress-tests the new skill with generated prompts and lets you iterate to the quality bar.
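The three DBS buckets map onto a skill package roughly like this (a hypothetical layout; the file and folder names are illustrative):

```
b2b-sales/
├── SKILL.md           # Direction: the step-by-step playbook Claude loads
├── references/
│   └── templates.md   # Blueprints: static templates, tone, rules
└── scripts/
    └── sequencing.py  # Solutions: deterministic calculations or API calls
```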

Workflow C: memory that survives sessions

You spent three hours teaching Claude your naming conventions and project quirks. You closed the terminal. All gone.

Fix it with a /wrap-up skill that extracts session gains and saves them to a persistent "Master Brain" NotebookLM notebook. Claude consults it at the start of every subsequent session.

Each time it runs, the skill distills the session's durable gains into a dated summary.

Instead of leaving the summary stranded in a local file, wrap-up pushes it to the Master Brain: notebooklm use master-brain-notebook-id "./session-summary-2026-04-06.md". Then add this line to your CLAUDE.md:

"Before answering questions about project architecture, historical decisions, or my preferences, query the Master Brain notebook via the NotebookLM CLI."
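The wrap-up step might look like the following sketch. It writes a dated summary file, then prints (rather than runs) the push command; the notebook id and the summary's section headings are placeholders:

```shell
#!/bin/sh
# Sketch of a /wrap-up step: save session gains, then push to the Master Brain.
summary="./session-summary-$(date +%F).md"

# Section headings are illustrative; a real skill would fill these in.
{
  echo "# Session summary $(date +%F)"
  echo "## Decisions"
  echo "## Conventions learned"
  echo "## Open questions"
} > "$summary"

# Print the push command instead of running it (notebook id is a placeholder)
echo "+ notebooklm use master-brain-notebook-id \"$summary\""
```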

Over weeks the notebook accumulates hundreds of summaries. NotebookLM indexes them and builds semantic links. Claude pulls exactly the context it needs without loading everything into its window. Your agent actually remembers.

Workflow D: visual knowledge management with Obsidian

Claude-generated research, session summaries, and analysis pile up as invisible files in terminal directories. Launch Claude Code from the root of an Obsidian vault and everything shows up in a visual knowledge graph instead.

cd ~/Documents/MyVault
claude

Add a CLAUDE.md at the vault root that defines folder structure, required metadata, link rules (wrap concepts in [[double brackets]] for the Obsidian graph), and formatting standards. Then build vault-specific skills on top.
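A vault-root CLAUDE.md might start like this. The folder names and frontmatter fields are illustrative, not a required schema:

```markdown
# Vault rules

- New research notes go in `Research/`; session summaries go in `Sessions/`.
- Every note starts with YAML frontmatter: `date`, `tags`, `source`.
- Wrap key concepts in [[double brackets]] so they appear in the Obsidian graph.
- Use H2 headings for sections; the filename is the title, so no H1 in the body.
```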

As Claude creates files you see them in Obsidian. Miscategorized a note? Fix it and ask Claude to update CLAUDE.md. The feedback loop trains it to your preferences. Pair this with the Karpathy LLM knowledge base pattern for a compounding research layer.

What can go wrong

This stack is powerful. It also rides on an unofficial tool that could break tomorrow. Go in with eyes open.

Watch your actual usage. Before and after you install the bridge, check your Claude token usage. The savings are dramatic only if you actually route the heavy work. If you're still pasting 30 PDFs into Claude, the skill isn't firing — fix the description so it matches your requests.

Command quick reference

  - notebooklm create "<name>" — create a notebook
  - notebooklm source add <file-or-url> ... — upload sources
  - notebooklm ask "<question>" — ask a question across all sources
  - notebooklm generate … — produce slide decks, flashcards, podcasts, and more
  - notebooklm use <notebook-id> <file> — push a file to an existing notebook
  - notebooklm skill install / notebooklm skill status — install and verify the Claude skill

The bottom line

The heavy lifting doesn't have to happen inside Claude's context window. Route bulk document analysis to NotebookLM, keep Claude for orchestration and polish, and your token budget stops being the bottleneck. Install the bridge, install the skill, and try Workflow A on your next research project. For a compounding knowledge layer to pair with it, read Build a Karpathy-Style LLM Knowledge Base in Obsidian, or jump to the cheatsheet for the full Claude Code reference.

Want the full Claude Code reference? Open the cheatsheet →