This is your brain on vectors
We're in the middle of a Cambrian explosion of tools, ideas, startups, and more in the world of AI.
The funny thing is how much of it rests simply on chat becoming the new interface between humans and language models.
This was predictable over the very long term a few years ago, but the speed at which it happened was not.
As I work on the application side of all this, I find myself taking many notes on projects, announcements, and frameworks, primarily so I can keep track of it all as things unfold.
In my post on Baby-AGI about ten days ago, I concluded with an observation that will take years to play out. The crux of it is this: OpenAI (and others) don't (currently) appear to worry too much about preserving the state of the billions of chats that are happening daily.
That's probably not actually true. I'm sure there is a ton of telemetry and analytics happening on the backend, but it is not yet a product or even transparent.
This new round of "AGI" (it's not actually AGI) frameworks, like Baby-AGI and Auto-GPT, is working to fill in those gaps by combining vector databases with agent frameworks, giving these sessions a large degree of autonomy via goal-setting mechanisms and a memory of what has occurred.
Two questions will become a big deal once people stop treating these as toys:
- Who owns the data?
- Who sets the goals for the agents and where are the guard rails?
If answering these two questions is not a core part of your AI strategy, you've already f-ed up.
Meanwhile, Pinecone just raised $100M to try to answer the first question for you.
I have nothing against Pinecone as a company, nor anything against their investors, but my recommendation is that you do not build your intellectual property in AI on a cloud database such as this.
I'm not even sure you really need a full-fledged, specialized vector database for these applications, but time will tell on that question.
In the meantime, if you're looking to build agent systems backed by LLMs, take a look at open source alternatives like Chroma or even Postgres.
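To make the "do you even need a specialized vector database" question concrete, here is a minimal sketch of agent memory kept entirely in your own process. Everything in it is an assumption for illustration: `toy_embed` is a hypothetical stand-in for a real embedding model (it just hashes words into buckets), and `Memory`, `remember`, and `recall` are made-up names, not the API of Chroma, Postgres, Pinecone, or any agent framework.

```python
import hashlib
import math
from dataclasses import dataclass, field

DIM = 64  # toy embedding size; real embedding models use far more dimensions


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: a hashed bag of words."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[digest[0] % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Normalize to unit length so a dot product equals cosine similarity.
    return [x / norm for x in vec] if norm else vec


@dataclass
class Memory:
    """In-process store of (text, vector) pairs that you own and control."""
    items: list[tuple[str, list[float]]] = field(default_factory=list)

    def remember(self, text: str) -> None:
        self.items.append((text, toy_embed(text)))

    def recall(self, query: str, k: int = 3) -> list[str]:
        q = toy_embed(query)
        # Brute-force scan: score every stored item against the query.
        scored = [(sum(a * b for a, b in zip(q, v)), t) for t, v in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [t for _, t in scored[:k]]
```

An agent loop would call `remember` after each step and `recall` before the next prompt. A brute-force scan like this will not scale forever, but it shows that the core idea is small enough to keep on infrastructure you control until you know what you actually need.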