LLM engineers around the world are trying out different RAG (Retrieval Augmented Generation) systems, and all these systems are fueled by vector databases. ⛽️
And since Milvus is the number one vector database in the world, with 25k stars on GitHub, I wanted to tell you more about it!
Milvus is so cool, and that's why I'm always supporting this project. So much so that during the session, somebody assumed I was a member of the Milvus team!
Shout out to my bestie! He's my favourite developer advocate at Milvus! 🫶
Thanks to the great people: Frank, Yujian, Christy, Chris, Emily, and Namee the legend, for talking about awesome stuff!
Range Search:
The first thing I learned when I jumped into the conversation was range search!
Let's see what it is:
A range search filters search results by the distance between the query vector and the vectors stored in the database.
After loading your collection into memory (so Milvus is ready to search it), you just need to pass these parameters to your search method.
param = {
    # use `L2` or `IP` as the metric to calculate the distance
    "metric_type": "L2",
    "params": {
        # search for vectors with a distance smaller than 1.0
        "radius": 1.0,
        # filter out vectors with a distance smaller than or equal to 0.8
        "range_filter": 0.8,
    },
}
So, I just Googled it and found this in the Milvus docs. It looks like, instead of getting the top_k results, you're getting a list of vectors within the range that you define.
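To make the idea concrete, here is a tiny brute-force sketch of what a range search does with the L2 metric. This is not the Milvus API: `l2_distance`, `range_search_l2`, and the toy vectors are hypothetical names I made up for illustration. I'm also assuming, based on the comments in the parameters above, that vectors at a distance of `range_filter` or less are dropped, so check the Milvus docs for the exact boundary behaviour.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_search_l2(query, vectors, radius, range_filter=0.0):
    """Brute-force range search (illustration only, not the Milvus API).

    Keeps every vector whose distance d to the query satisfies
    range_filter < d < radius, sorted from closest to farthest.
    """
    hits = []
    for vec_id, vec in vectors.items():
        d = l2_distance(query, vec)
        if range_filter < d < radius:
            hits.append((vec_id, d))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy data: id -> 2-d vector
vectors = {
    "a": [0.0, 0.5],  # distance 0.5: too close, dropped by range_filter
    "b": [0.9, 0.0],  # distance 0.9: inside the band, kept
    "c": [3.0, 4.0],  # distance 5.0: outside the radius, dropped
}

hits = range_search_l2([0.0, 0.0], vectors, radius=1.0, range_filter=0.8)
for vec_id, distance in hits:
    print(vec_id, round(distance, 3))
```

Notice there is no top_k anywhere: the number of results depends only on how many vectors fall inside the band you define.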
Multi-tenancy:
Multi-tenancy means serving different tenants at the same time while keeping each tenant's data and resources independent.
That way, you can isolate your tenants from each other so they never see results that we don't want them to see!
There are different ways to do that; consider the table below:
Partitioning + Metadata Filtering: 🪩
Partitioning and metadata filtering are completely different concepts, which is why I'm putting them in their own chapter.
- Partitioning:
From the table, what I can see is that, under the hood, Milvus collections consist of shards, partitions, and segments. You can use partitions to separate your customers or users, so that each read targets only their data.
- Metadata Filtering:
As far as I understand from the blog posts and other resources, Milvus does metadata filtering with boolean expressions. The documentation says:
expr: Boolean expression used to filter attribute. See Boolean Expression Rules for more information.
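Here is how I picture the two ideas working together, as a toy in-memory model. It's a sketch under my own assumptions, not Milvus code: `ToyCollection`, its methods, and the field names (`lang`, `year`) are all hypothetical. The partition decides which tenant's rows get read at all, and the predicate plays the role of the boolean `expr`.

```python
from collections import defaultdict


class ToyCollection:
    """In-memory stand-in for a collection split into named partitions."""

    def __init__(self):
        # partition name -> list of rows (each row is a dict of metadata)
        self.partitions = defaultdict(list)

    def insert(self, partition, row):
        self.partitions[partition].append(row)

    def query(self, partition, predicate=lambda row: True):
        """Read a single partition, then apply a boolean metadata filter."""
        rows = self.partitions.get(partition, [])
        return [row for row in rows if predicate(row)]


docs = ToyCollection()
docs.insert("tenant_a", {"id": 1, "lang": "en", "year": 2023})
docs.insert("tenant_a", {"id": 2, "lang": "tr", "year": 2021})
docs.insert("tenant_b", {"id": 3, "lang": "en", "year": 2023})

# Similar in spirit to a boolean expression like: lang == "en" and year > 2022
result = docs.query(
    "tenant_a",
    lambda row: row["lang"] == "en" and row["year"] > 2022,
)
print(result)
```

Row 3 also matches the filter, but it lives in another tenant's partition, so it never shows up. That's the isolation part.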
Zilliz Cloud Cardinal:
Since the meeting was a Q&A about Zilliz Cloud, of course I learned a lot about Cardinal.
But before diving into that, have you noticed how much I learned from the meeting? Before joining the call, I was wondering why I would even go to a Q&A session about a cloud platform. (I'm still a broke student.)
It turned out that I learned a lot of cool concepts and had great conversations with experts. It's all about showing up every day and learning. At least, that's my lifestyle.
Let's continue…
Zilliz Cloud has two different versions (totally different in terms of speed):
Zilliz Cloud old engine,
Zilliz Cloud with Cardinal!
and you're definitely gonna be amazed when you see how much the optimizations change the performance! 🧮
Benchmarksss:
It seems like a lot of different kinds of optimizations have been implemented in the new engine compared to the old one, such as:
Algorithm optimizations
Engineering optimizations
Low-level optimizations
AutoIndex: search strategy selection
I'm not gonna talk about what I don't understand, namely those different optimizations. But if you understand that kind of stuff, definitely check out the blog post about Cardinal. Unfortunately, I don't understand kernel-level stuff! 🥲
For the x86 platform, Cardinal kernels use AVX-512's F, CD, VL, BW, DQ, VPOPCNTDQ, VBMI, VBMI2, VNNI, BF16 and FP16 extensions. Also, we are exploring the usage of a new AMX instruction set.
AutoIndex: search strategy selection
Milvus's crazy team implemented this AI-based dynamic strategy selection mechanism. Looks fantastic! Based on the distribution, it selects the best strategy for your query. Also, in the meeting, one of the team members emphasized that using "auto" is the right choice most of the time. Now I get why she said that.
Image from the blog post. I'm amazed by these results, how about you?
It's getting late for me (4:08 am), so I guess I gotta sleep, at least this time. 🫨
Dope Regards
Accelerate






