Knowledge Base Building Guide
Types of Documentation
Linear Notes
Ordinary text documents, from plain .txt files to Word documents, all count as linear notes.
If you’re a programmer, Markdown is still the top recommendation (and if you’re not a programmer? Still go with Markdown, because its syntax is quite simple). Other tools can add many special effects, but Markdown has the strongest portability. LLMs now use Markdown as their main input/output format, and the recent llms.txt proposal is built on it as well.
MDX combines Markdown with React components (JSX), which gives it top-tier expressiveness at the cost of some portability. Markdoc is a similar technology.
Of course, the trade-off between portability and expressiveness is yours to decide.
Currently popular online note-taking software includes:
- Notion (more of a database than a note-taking app, with many features to explore)
- Lark Docs (actually, Lark can handle all the types mentioned later, making it an excellent choice for documentation)
You can also choose frameworks like Astro or Next.js for personal blogs to fully enjoy MDX capabilities.
For local tools, I recommend Logseq and Obsidian. Both support bidirectional linking—Logseq tends to solve everything with bullet lists, while Obsidian is a tool for freely writing Markdown.
Plugins and templates can make Obsidian incredibly fancy, and if you master it and have an audience, you could even make serious money: many knowledge creators sell Obsidian templates, claiming that the templates plus their theories will improve your life.
Freeform
Physical Handwriting
Physical handwriting is the most primitive and the freest form of note-taking. These days, though, the only time I can recall using it is for high school notes; digital tools now cover almost every scenario.
However, there’s an exception: bullet journaling.
Digital Whiteboards
- Favorites of knowledge creators and international students: iPad handwriting, including PDF annotation.
- A digital doodle-style solution: Excalidraw. For reference, here are two flow diagrams I drew earlier:
  [Figures: two flow diagrams]
- A tool suited to programmers drawing UML: diagrams.net
- Of course, Lark Docs also has whiteboard functionality.
Semi-Structured
Now you might ask: with freeform and linear options, why do we need semi-structured?
Mind maps are the epitome of semi-structured notes.
Semi-structured means you can do less, but less doesn’t necessarily mean worse. For example, you can free your brain from node arrangement and efficiently create diagrams. (This specifically refers to digital versions—handwritten mind maps encourage using various colored pens, which is quite challenging but can produce beautiful results, though with high variance in quality.)
Moreover, flexible collapsing and expanding is something that fully freeform note formats don’t have. Additionally, mind maps can naturally display node connections (expressing relationships between nodes) and node summaries. These two operations would look unnatural in dense linear documents and require extra operational costs in freeform whiteboards, while mind maps perfectly bridge the two.
For desktop, I recommend Xmind and MindElixir (contact me if you need the lifetime version—I’ll give you a discount! 🤗). Online, Lark Docs works well too.
Mermaid is a diagram representation language widely used in programmer circles. Compared to mind maps’ “no-code” approach, Mermaid uses a low-code method, making it slightly more hardcore. The advantage is that as a widely accepted diagram language, it can be used directly on platforms like GitHub, and expressing diagrams with characters saves space and makes data retrieval very convenient.
Mermaid Playground: https://mermaid.live/edit
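To make the low-code idea concrete, here is a minimal Python sketch that turns a nested outline into Mermaid flowchart source, which can then be pasted into the playground above. The function name `outline_to_mermaid`, the nested-dict outline format, and the generated node ids (`n0`, `n1`, ...) are assumptions made for illustration; only the `graph TD` header, the `id["label"]` node form, and the `-->` arrow are Mermaid syntax.

```python
from itertools import count


def outline_to_mermaid(outline: dict) -> str:
    """Render a nested dict (label -> children) as Mermaid flowchart source.

    Hypothetical helper for illustration. Labels are wrapped in double
    quotes, so labels that themselves contain double quotes are not handled.
    """
    ids = count()
    lines = ["graph TD"]

    def walk(label, children, parent_id=None):
        node_id = f"n{next(ids)}"
        # Declare the node, then connect it to its parent if it has one.
        lines.append(f'    {node_id}["{label}"]')
        if parent_id is not None:
            lines.append(f"    {parent_id} --> {node_id}")
        for child_label, grandchildren in children.items():
            walk(child_label, grandchildren, node_id)

    for root_label, children in outline.items():
        walk(root_label, children)
    return "\n".join(lines)


outline = {
    "Documentation": {
        "Linear": {},
        "Freeform": {},
        "Semi-structured": {"Mind maps": {}, "Mermaid": {}},
    }
}
print(outline_to_mermaid(outline))
```

Because the diagram is plain text, it can live next to your notes, be diffed in version control, and be found with an ordinary text search.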
Fragment Recording
The hottest fragment note-taking app right now is probably Flomo, which is basically like a private Twitter that only you can see. I honestly don’t understand how this niche managed to produce such a popular app. Its strength lies in connecting with many tools for import/export workflows, plus plugins for quick information import.
For me, this kind of fragmented recording is indeed necessary—flashes of inspiration are truly fleeting, especially for those with ADHD.
But I choose to record directly in Obsidian, then convert to full versions when I have time, without needing to copy and paste between apps.
If you’re pursuing speed and diverse input methods, the best choice should be your phone’s built-in notes app—these typically support text, drawing, voice, and other input methods.
Functional
Why do we know many principles but still can’t live well? Because just knowing principles is useless—when we need to “apply” these principles, we simply can’t remember them. We just continue down the old path. Some books call this the “default setting.” What we need to do is not just know the principles, but set them as our default, so they actually take effect.
So for things that need constant reminding, I use my phone’s built-in notes app to write them down, then pin them to the home screen—this feature should be available on both Android and iOS. Even more extreme, you can set up timed reminders to make those words pop up… iOS’s Reminders app has this built-in functionality.
Speaking of reminders, another type of functional note is the most commonly used TODO list. Using TODOs can efficiently organize short-term goals and tackle them one by one. Todoist is a great choice—tasks sync across all devices, and basic features are free.
Why Build a Knowledge Base
Dealing with Information Overload
Your mind is for having ideas, not holding them. —David Allen
This is the idea of building a second brain. The limitations of the human brain are obvious: it’s hard to remember every detail, and details that are rarely used are easily discarded. Externalizing those details to a knowledge base works wonderfully.
Digital knowledge bases leverage efficient search capabilities—you just need to remember that something exists, and a quick search brings it up. Efficient and convenient!
Recording your brain’s memories externally also reduces your cognitive load, which is a very effective trick for dealing with insomnia.
In short, forgetting is a gift from natural evolution—don’t stuff everything into your brain. Use a knowledge base!
Strengthening Memory
Reading other people’s notes directly is usually not very useful, and similarly, directly reading AI-summarized content isn’t very practical either. Humans always need to spend time thinking things through themselves to better remember facts and principles—in other words, summarizing learned knowledge yourself strengthens memory.
The simplest way to strengthen memory is weaving a web. Humans rarely remember isolated knowledge well—it’s usually better remembered when interwoven with other knowledge.

Image from https://x.com/gapingvoid
Methods and Processes
The Feynman Technique
If you can’t explain a concept in simple language, then you don’t truly understand it.
After spending time collecting information, you still need to work through it yourself and express it in your own words. If you find you can’t, you don’t really understand it yet and need to go back and take in more. This is the classic idea of output driving input, better known as the Feynman Technique.
At a higher difficulty level, the Feynman Technique asks you to explain in the simplest language possible. For example, to explain a concept to a child with no relevant background knowledge, you must understand it thoroughly enough to recast advanced ideas as everyday analogies.
Zettelkasten
Zettel: slip + Kasten: box = Slip Box Method
- Fleeting Notes: Quickly captured raw thoughts and sparks of inspiration, usually recorded in portable tools.
- Literature Notes: Brief records made in your own words about specific content when reading literature (books, articles, etc.), with sources noted. Each literature note should also strive to be atomic.
- Permanent Notes: This is the core of Zettelkasten. They are distilled and refined from fleeting notes and literature notes, using your own language to fully articulate an independent thought and establish links with other related permanent notes.
- Index/Structure Notes (Maps of Content, MOCs): Serving as entry points or hubs for specific topics, they don’t contain specific knowledge but point to a series of related permanent notes through links, helping organize and navigate the knowledge network.

Zettelkasten is the most thorough practice of the web-weaving approach to knowledge base construction mentioned above. Its workflow has four steps:
- Capture: Record fleeting notes and literature notes anytime.
- Process: Regularly review fleeting notes and literature notes. For valuable content, think about its core viewpoints and develop them into one or more permanent notes using your own language.
- Link: Find and establish links between newly created permanent notes and existing permanent notes. Think about what relationships exist between these notes (support, refute, extend, analogy, etc.) and briefly explain at the link.
- Create/Update Index: Add new permanent notes to relevant index notes, or create new index notes as needed.
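The Link step above can be sketched as a small data structure. This is only an illustration under stated assumptions, not how any particular note-taking app stores its data: the `Note` and `SlipBox` classes, their field names, and the `unlinked` check are all hypothetical. Each link carries a short reason, matching the advice to briefly explain the relationship at the link.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    note_id: str
    text: str
    kind: str = "permanent"  # "fleeting", "literature", "permanent", or "index"
    links: dict = field(default_factory=dict)  # target note_id -> why the notes relate


class SlipBox:
    """Hypothetical in-memory slip box, for illustration only."""

    def __init__(self) -> None:
        self.notes = {}

    def add(self, note: Note) -> None:
        self.notes[note.note_id] = note

    def link(self, source_id: str, target_id: str, reason: str) -> None:
        if source_id not in self.notes or target_id not in self.notes:
            raise KeyError("both notes must exist before they can be linked")
        self.notes[source_id].links[target_id] = reason

    def backlinks(self, note_id: str) -> list:
        """Ids of all notes that link to note_id."""
        return sorted(n.note_id for n in self.notes.values() if note_id in n.links)

    def unlinked(self) -> list:
        """Permanent notes with no outgoing links and no backlinks."""
        return sorted(
            n.note_id
            for n in self.notes.values()
            if n.kind == "permanent" and not n.links and not self.backlinks(n.note_id)
        )


box = SlipBox()
box.add(Note("feynman", "Explaining something simply exposes gaps in understanding."))
box.add(Note("weaving", "Knowledge is remembered better when connected to other knowledge."))
box.add(Note("para", "P.A.R.A. sorts material by how actionable it is."))
box.link("feynman", "weaving", "supports: explaining forces you to connect ideas")

print(box.backlinks("weaving"))  # ['feynman']
print(box.unlinked())            # ['para']
```

A check like `unlinked` is one way to spot permanent notes that have not yet been woven into the web during a regular review.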
Cornell Note-Taking Method
Invented and promoted by Walter Pauk, an education professor at Cornell University, in the 1950s.

Cornell Process: 5R
- Record: In the main note-taking area (usually the largest area on the right side of the page), record the core content of lectures, readings, or meetings as completely and clearly as possible. Techniques include using concise phrases, symbols, abbreviations, and highlighting key points.
- Reduce: As soon as possible after class or reading, condense the main points from the note-taking area into keywords, questions, or brief cues in the cue column (the narrower area on the left). This step aims to clarify meaning, strengthen memory, and organize logic.
- Recite: Cover the main note-taking area and only look at the cues in the cue column, trying to completely recite the note content in your own words. Then check against the main note-taking area. This is a key step in active recall and self-testing.
- Reflect: Connect the note content with personal experience, existing knowledge, and other courses/topics, engage in critical thinking, and raise your own insights or questions. This step gives notes deeper meaning and prevents knowledge stagnation.
- Review: Regularly (such as daily or weekly) quickly review notes, especially the content in the cue column and summary area, to combat the forgetting curve and consolidate learning outcomes.
Sponge Reading Method
I previously read “The Sponge Reading Method” (Chinese) and found it quite inspiring.
Key points include:
- Progressive note-taking sequence: fragments -> thoughtful reorganization -> mind maps -> reading notes -> systematic reading articles
- Focus on recording new knowledge brought by the book
- Improve information-to-action ratio—take action after learning
Full version: Squeezing Out “The Sponge Reading Method” (Chinese)
Others
- P.A.R.A.:
  - Projects (short-term tasks)
  - Areas (long-term support)
  - Resources (resources supporting the first two)
  - Archives (logical deletion of expired content)
- Getting Things Done (GTD):
  - Capture: Record everything that catches your attention (regardless of size) into an “Inbox” or other collection tools
  - Clarify/Process: Process each item in the inbox one by one, determining what it is and whether action is needed
  - Organize: Place clarified items into appropriate lists or locations
  - Reflect/Review: Regularly (usually weekly) review the entire system, check all lists, ensure completeness, and make necessary updates and adjustments
  - Engage/Do: Based on context, available time and energy, and priorities, select and execute tasks from your lists
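As one possible way to apply P.A.R.A. to a folder of notes, the sketch below creates the four top-level folders and treats archiving as a move rather than a delete, which is the "logical deletion" idea from the list above. The numbered folder names, the helper names `init_para` and `archive`, and the sample project are assumptions for illustration; the demo runs in a temporary directory so it touches nothing real.

```python
from pathlib import Path
import shutil
import tempfile

# Folder names are an assumption; the numeric prefixes only keep them sorted.
PARA_FOLDERS = ("1-Projects", "2-Areas", "3-Resources", "4-Archives")


def init_para(root: Path) -> None:
    """Create the four P.A.R.A. folders under root if they are missing."""
    for name in PARA_FOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)


def archive(root: Path, item: Path) -> Path:
    """Move a finished item into Archives instead of deleting it."""
    target = root / "4-Archives" / item.name
    shutil.move(str(item), str(target))
    return target


# Demo in a throwaway directory.
root = Path(tempfile.mkdtemp())
init_para(root)

project = root / "1-Projects" / "blog-redesign"
project.mkdir()
(project / "notes.md").write_text("# Blog redesign\n", encoding="utf-8")

archived = archive(root, project)
print(archived.relative_to(root))
```

Moving instead of deleting keeps old material searchable while getting it out of the way of active work.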
Summary
Actually, these methods and processes all have some basic commonalities:
Collect (avoid collector’s fallacy) -> Think/Refine -> Rephrase -> Review/Organize
So understanding the logic is enough—there’s no need to rigidly follow one set of rules. You can use your existing note-taking architecture as a foundation and gradually incorporate the above ideas to build your knowledge base. What suits you is the best.
Note-Taking in the Post-AI Era
- Content Recording Optimization
  - In the post-AI era, we should record less information that can simply be looked up in documentation. LLMs retrieve so well that notes copied and pasted straight from official technical documentation have little value. At minimum, restructure the subject in your own way while learning it, and only then copy in key points from the documentation. In other words, focus on connecting knowledge points that sit far apart in the original documents rather than merely condensing the original content.
  - Record more of your own personal thoughts; maybe future large models can reconstruct you.
    - There’s already a project that mimics you using chat records: https://github.com/xming521/WeClone
- Build a personalized prompt knowledge base, for example: https://ssshooter.com/notebooklm-prompt/
- Multimodal capabilities rescue handwritten knowledge bases
Research and investigation assistance:
Document reading assistance: