Creating a Personal Assistant with an LLM

Published: January 25, 2024

I’ve been getting increasingly frustrated with ChatGPT - especially when I want to give it a meaningful amount of background or context. This week, I set out to set up an LLM locally with as much personal context as I could find. My goal was simple: can I ask basic life/schedule/note-related questions to my own little personal assistant LLM?

@obsdmd vault + @NotionHQ dump + @MistralAI via @LangChainAI ... this should be fun. How smart can my little assistant get? What else can I do to improve the responses?

Can I add this to @raycastapp or something and tie it into my workflow? Is anyone doing this already?
— lane (@laneparton) January 23, 2024

As you can see from my tweet - the premise was simple. Can I write a RAG chatbot based on the following data:

Obsidian - My current, everyday note-taking application
Notion - My previous note-taking application, where I’ve archived a number of old, still relevant notes.

It’s extremely straightforward with LangChain - so straightforward that I can outline it in an ordered list…

1. Clone the RAG Chroma LangChain example
2. Point a DirectoryLoader at the directory where the Markdown files are
3. …profit? Not quite…
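
To make the idea behind those steps concrete, here is a minimal standard-library sketch of the load, chunk, and retrieve loop. It is not the LangChain example itself: a simple word-count vector stands in for real embeddings, an in-memory list stands in for Chroma, and the names `load_markdown`, `split_chunks`, and `retrieve` are hypothetical helpers made up for illustration.

```python
import math
import re
from collections import Counter
from pathlib import Path


def load_markdown(root):
    """Read every .md file under root and return (path, text) pairs."""
    return [
        (str(p), p.read_text(encoding="utf-8"))
        for p in sorted(Path(root).rglob("*.md"))
    ]


def split_chunks(docs):
    """Split each note into paragraph-sized chunks, remembering the source file."""
    chunks = []
    for source, text in docs:
        for para in re.split(r"\n\s*\n", text):
            if para.strip():
                chunks.append({"source": source, "text": para.strip()})
    return chunks


def vectorize(text):
    """Toy stand-in for an embedding (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(chunks, question, k=3):
    """Return the k chunks most similar to the question."""
    q = vectorize(question)
    ranked = sorted(
        chunks, key=lambda c: cosine(q, vectorize(c["text"])), reverse=True
    )
    return ranked[:k]
```

In a real setup, the retrieved chunks would be pasted into the prompt sent to the local model; swapping the toy vectors for proper embeddings is what the LangChain and Chroma pieces take care of.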

I quickly realized two things. First, Notion doesn’t export any useful date information when you export all your documents. Second, I needed to enrich my Obsidian notes with metadata (like modified date), for which I found and used this. This allowed me to ask things like “What’s for lunch on Tuesday?” (referring to my Weekly Meal Plan) or “What tasks am I working on tomorrow?” (referring to my Work Notepad) without getting results from deeper in my archive.
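
Here is a rough sketch of why the modified date helps. It assumes the date comes from the file system's modified time, which may not match how the tool I used derives it, and `load_with_modified` and `recent_only` are hypothetical names. The idea is simply to attach the date to each note and drop stale notes before retrieval.

```python
from datetime import datetime, timedelta
from pathlib import Path


def load_with_modified(root):
    """Read each Markdown note and record its modified time as metadata.

    Assumption: the file-system mtime is a good enough "modified date".
    """
    notes = []
    for path in sorted(Path(root).rglob("*.md")):
        notes.append(
            {
                "text": path.read_text(encoding="utf-8"),
                "metadata": {
                    "source": str(path),
                    "modified": datetime.fromtimestamp(path.stat().st_mtime),
                },
            }
        )
    return notes


def recent_only(notes, now, max_age_days=7):
    """Keep notes modified within the last max_age_days, newest first."""
    cutoff = now - timedelta(days=max_age_days)
    fresh = [n for n in notes if n["metadata"]["modified"] >= cutoff]
    return sorted(fresh, key=lambda n: n["metadata"]["modified"], reverse=True)
```

With a filter like this in front of retrieval, a question about Tuesday's lunch only sees this week's meal plan, not an old one from the archive. The seven-day window is an arbitrary default for illustration.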

After posting it on X/Twitter, it became apparent that this was just the beginning. Someone mentioned the idea of “re-ranking” - which was totally foreign to me. Googling around landed me at this repo. As you can see, there are a ton of things to think about to improve this. To quote the README:

better document parsing, hybrid search, HyDE enabled search, deep linking, re-ranking, the ability to customize embeddings, and more. The package is designed to work with custom Large Language Models (LLMs) – whether from OpenAI or installed locally.

That’s just a quick recap - leading me to even more ideas and things to research 🙂