LLM engineers around the world are trying out different RAG (Retrieval Augmented Generation) systems, and all of these systems are fueled by vector databases. ⛽️
And since Milvus is the number one vector database in the world, with 25k stars on GitHub, I wanted to tell you more about it!
Milvus is so cool, and that's why I'm always supporting this project. So much so that during the session, somebody assumed I was a member of the Milvus team!
Shout out to my bestie! He's my favourite developer advocate at Milvus! 🫶
Thanks to the great people: Frank, Yujian, Christy, Chris, Emily, and Namee the legend for talking about awesome stuff!
Range Search:
The first thing I learned when I jumped into the conversation was range search!
Let's see what it is:
A range search filters search results by the distance between the query vector and the vectors in the database: only vectors whose distance falls inside the range you define are returned.
After loading your collection into memory (so Milvus is ready to search in memory), you just need to pass these parameters to your search method.
param = {
    # use `L2` or `IP` as the metric to calculate the distance
    "metric_type": "L2",
    "params": {
        # search for vectors with a distance smaller than 1.0
        "radius": 1.0,
        # filter out vectors with a distance smaller than 0.8
        "range_filter": 0.8,
    },
}
So, I just Googled it and found this in the Milvus docs. It looks like, instead of getting the top_k results, you get the list of vectors that fall within the range you define.
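To make the idea concrete for myself, here is a tiny brute-force sketch of what a range search does conceptually. This is not how Milvus implements it (Milvus uses indexes); it is plain Python with Euclidean distance, the function name `range_search` is my own, and I'm assuming the L2 convention where a hit must satisfy `range_filter <= distance < radius`.

```python
import math


def range_search(query, vectors, radius, range_filter=0.0):
    """Brute-force range search with Euclidean (L2) distance.

    Keeps every vector whose distance d to the query satisfies
    range_filter <= d < radius (assumed boundary convention).
    """
    hits = []
    for idx, vec in enumerate(vectors):
        d = math.dist(query, vec)
        if range_filter <= d < radius:
            hits.append((idx, d))
    # Closest hits first, like a normal similarity search.
    hits.sort(key=lambda hit: hit[1])
    return hits


vectors = [(0.5, 0.0), (0.9, 0.0), (0.0, 0.85), (2.0, 0.0)]
results = range_search((0.0, 0.0), vectors, radius=1.0, range_filter=0.8)
for idx, d in results:
    print(idx, round(d, 2))
```

Notice that the number of hits is not fixed: it depends on how many vectors happen to sit inside the ring between the two bounds.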
Multi-tenancy:
Multi-tenancy means serving several tenants from the same system at the same time, while each tenant's data and resources are used independently.
That way, you can isolate your tenants from each other so that nobody sees results they shouldn't see!
There are different ways to do that; consider the table below:
Partitioning + Metadata Filtering: 🪩
Partitioning and metadata filtering are a whole different topic, which is why I'm putting them under their own chapter.
- Partitioning:
From the table, what I can see is that under the hood Milvus collections consist of shards, partitions, and segments. Partitions let you separate your customers or users, so that each read targets only the relevant data.
- Metadata Filtering:
As far as I understand from the blog posts and other resources, Milvus does metadata filtering with boolean expressions. The documentation says:
expr: Boolean expression used to filter attribute. See Boolean Expression Rules for more information.
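Here is a small sketch of how I imagine the two ideas combining: a partition name narrows the search to one tenant, and a boolean expression filters on metadata. The helper `build_filtered_search`, the field names (`embedding`, `year`, `category`) and the partition name `tenant_a` are all hypothetical. The helper only builds the keyword arguments; as far as I know, pymilvus's `Collection.search` accepts `expr` and `partition_names`, but the actual call is left commented out because it needs a running Milvus instance.

```python
def build_filtered_search(query_vector, tenant_partition, min_year, limit=5):
    """Assemble keyword arguments for a partition-scoped, metadata-filtered search.

    All field and partition names used here are hypothetical.
    """
    return {
        "data": [query_vector],
        "anns_field": "embedding",  # hypothetical vector field name
        "param": {"metric_type": "L2", "params": {}},
        "limit": limit,
        # Only look inside this tenant's partition.
        "partition_names": [tenant_partition],
        # Boolean expression over (hypothetical) scalar fields.
        "expr": f'year >= {min_year} and category in ["blog", "docs"]',
    }


kwargs = build_filtered_search([0.1, 0.2, 0.3], "tenant_a", 2023)
print(kwargs["expr"])

# With a loaded pymilvus Collection named `collection` (not created here):
# hits = collection.search(**kwargs)
```

Keeping the tenant in the partition and the rest in the expression is just one way to split things; which strategy fits best depends on how many tenants you have.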
Zilliz Cloud Cardinal:
Since the meeting was a Q&A about Zilliz Cloud, of course I learned a lot about Cardinal.
But before diving into that, have you noticed how much I learned from the meeting? Before joining the call, I was wondering why I would even go to a Q&A session about a cloud platform. (I'm still a broke student.)
It turned out that I learned a lot of cool concepts and had great conversations with experts. It's all about showing up every day and learning. At least, that's my lifestyle.
Let's continue…
Zilliz Cloud has two different versions (totally different in terms of speed):
- Zilliz Cloud with the old engine
- Zilliz Cloud with Cardinal!
And you're definitely gonna be amazed when you see how much the optimizations change the performance! 🧮
Benchmarksss:
It seems that a lot of different kinds of optimizations have been implemented in the new engine compared to the old one, such as:
- Algorithm optimizations
- Engineering optimizations
- Low-level optimizations
- AutoIndex: search strategy selection
I'm not gonna tell you about things I don't understand, like the details of those optimizations. But if you understand that kind of stuff, definitely check out the blog post about Cardinal. Unfortunately, I don't understand kernel-level stuff! 🥲
For the x86 platform, Cardinal kernels use AVX-512's F, CD, VL, BW, DQ, VPOPCNTDQ, VBMI, VBMI2, VNNI, BF16 and FP16 extensions. Also, we are exploring the usage of a new AMX instruction set.
AutoIndex: search strategy selection
Milvus's crazy team implemented this AI-based dynamic strategy selection mechanism. Looks fantastic! Based on the distribution, it selects the best strategy for your query. Also, in the meeting one of the team members emphasized that using "auto" is the right call most of the time. Now I get why she said that.
Image from the blog post. I'm amazed by these results, how about you?
It's getting late for me (4:08 am), so I guess I gotta sleep, at least this time. 🫨
Dope Regards
Accelerate