Show HN: Atomic – Local-first, AI-augmented personal knowledge base
TL;DR Highlight
Atomic is a self-hosted, open-source personal knowledge graph app that automatically embeds, tags, and links notes, web clips, and RSS feeds, supporting semantic search, LLM-powered wiki synthesis, and MCP integration.
Who Should Read
Developers or researchers currently using personal knowledge management tools like Obsidian or Notion who want to self-host AI search and summarization features.
Core Mechanics
- Atomic manages all content (notes, saved articles, web clips) as 'atoms': each one is automatically vector-embedded, tagged, and linked on addition, eliminating manual folder structuring and enabling self-generating taxonomies.
- The platform provides semantic search functionality, locating conceptually similar notes even without exact keyword matches through vector embeddings.
- Wiki Synthesis automatically generates wiki-style documents from all notes, articles, and web clips under a specific tag, complete with inline citations linking back to original sources and dynamically updating with new content.
- Agentic Chat enables AI to automatically search and retrieve notes during conversations, allowing users to specify search scope (tag range or entire library) and reducing hallucinations through source citations.
- Atomic includes a built-in MCP (Model Context Protocol) server, allowing MCP clients like Claude and Cursor to directly access and interact with the knowledge base for search, reading, and generation without leaving existing workflows.
- The platform offers a Tauri-based desktop app, a self-hostable headless server, an iOS app, a browser extension, and an MCP server, enabling access via web, mobile, and desktop clients when self-hosted.
- Content can be added through various methods, including direct writing, URL input, RSS feeds, web clipping, mobile sharing, Obsidian sync, and a REST API.
- Atomic is open-source under the MIT license and has received 1k stars on GitHub; recent releases include an iOS app rebuild, an expanded MCP toolkit, a CodeMirror 6-based markdown editor, and a daily dashboard.
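For the MCP integration, an MCP client is typically pointed at the server in its configuration file. A minimal sketch of a Cursor-style `mcp.json`, assuming Atomic serves MCP over HTTP on localhost; the server name, port, and path here are placeholders, so check Atomic's README for the real endpoint:

```json
{
  "mcpServers": {
    "atomic": {
      "url": "http://localhost:3000/mcp"
    }
  }
}
```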
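The semantic-search mechanic described above can be sketched in a few lines: rank stored atoms by cosine similarity between a query embedding and each atom's embedding. The 3-dimensional vectors and atom names below are toy stand-ins for real model embeddings, not Atomic's actual data model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-dimensional "embeddings" standing in for real model output.
atoms = {
    "note-espresso": [0.9, 0.1, 0.0],
    "note-pour-over": [0.8, 0.2, 0.1],
    "note-kubernetes": [0.0, 0.1, 0.9],
}

def semantic_search(query_vec, k=2):
    """Return the k atoms whose embeddings are closest to the query."""
    ranked = sorted(atoms, key=lambda a: cosine(query_vec, atoms[a]), reverse=True)
    return ranked[:k]

# A "coffee brewing" query vector surfaces both coffee notes,
# even though neither note shares its exact keywords.
print(semantic_search([0.85, 0.15, 0.05]))
```

This is the core reason semantic search finds conceptually similar notes without keyword overlap: proximity in embedding space, not string matching, drives the ranking.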
Evidence
- Concerns arose after Karpathy's viral tweet sparked a surge in AI-powered knowledge base projects; one comment warned that the low barrier to entry could let incomplete designs become standardized, as happened with LangChain.
- Criticism centered on the "local-first" claim: commenters argued that core functionality defaults to remote operation, calling its true local-first nature into question.
- Questions were raised about differentiation from simply connecting Claude to an Obsidian vault, though Atomic's automated embedding, tagging, and wiki-synthesis pipeline appears to be the key differentiator.
- Skepticism was expressed about the practical utility of force-directed graph visualization, with doubts about its value in actual workflows.
- A philosophical objection asked whether AI-driven thinking and memory synthesis stifle new ideas, and there was critical feedback on copywriting quality after readers encountered LLM-generated marketing copy.
How to Apply
- If you have a large collection of personal research notes or technical documents that are hard to find, self-host an Atomic server and import existing notes via the REST API or Obsidian sync, then use semantic search to quickly locate relevant material.
- If you use AI coding tools like Claude or Cursor and want them to reference your knowledge base, run Atomic's MCP server locally and connect it to your MCP clients for automatic note retrieval and citation during conversations.
- To consolidate materials from various sources into a single, organized document, tag the related notes and run the Wiki Synthesis feature to automatically generate a cited wiki-style document.
- To capture useful articles found during web research, install the browser extension to clip them, or register RSS feeds for automatic embedding, tagging, and linking.
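The REST-import step above can be sketched as follows. The endpoint path and payload field names are assumptions for illustration only; consult Atomic's API documentation for the actual route and schema.

```python
import json

# Hypothetical endpoint -- check Atomic's API docs for the real route.
ATOMIC_URL = "http://localhost:3000/api/atoms"

def make_atom_payload(title, body, tags=()):
    """Build the JSON body for importing one note as an 'atom'.
    Field names ("title", "content", "tags") are illustrative guesses."""
    return json.dumps({"title": title, "content": body, "tags": list(tags)})

payload = make_atom_payload("Vector search notes", "Cosine similarity ...", ["search"])
# With the server running, POST it (e.g. via urllib):
#   req = urllib.request.Request(ATOMIC_URL, data=payload.encode(),
#                                headers={"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
print(payload)
```

Once imported, each note would be embedded, tagged, and linked by Atomic's pipeline automatically, so a batch import loop over existing markdown files is all the migration script needs.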
Terminology
Related Papers
Show HN: Airbyte Agents – context for agents across multiple data sources
Airbyte has launched a Context Store that pre-indexes data from multiple SaaS systems such as Slack, Salesforce, and Linear, so agents no longer have to trawl each API individually. It reportedly reduces token usage by up to 90% compared to the existing MCP approach.
A polynomial autoencoder beats PCA on transformer embeddings
A technique that attaches a quadratic polynomial decoder to a PCA encoder to substantially improve embedding compression quality in closed form; it can be implemented with numpy alone, without SGD.
From Unstructured Recall to Schema-Grounded Memory: Reliable AI Memory via Iterative, Schema-Aware Extraction
Storing memory in schema-defined structured records, instead of RAG-style text retrieval, yields dramatically higher accuracy on exact fact lookup, state tracking, and aggregate queries.
We replaced RAG with a virtual filesystem for our AI documentation assistant
Explains how Mintlify overcame RAG chunking limitations by building a virtual filesystem (ChromaFs) on top of Chroma DB that mimics UNIX commands, reducing session boot time from 46 seconds to 100ms.
Chroma Context-1: Training a Self-Editing Search Agent
Chroma's newly released 20B parameter agentic search model claims frontier-LLM-level retrieval performance at 1/10 the cost and 10x the speed — though a significant controversy over failure to cite prior work has emerged in the community.
Show HN: Gemini can now natively embed video, so I built sub-second video search