feat: update docs

milvus-io · Aug 16, 2024 · f34d636 · f34d636
1 parent 86e1f40
commit f34d636
Show file tree

Hide file tree

Showing 1,661 changed files with 364,664 additions and 1,283 deletions.
diff --git a/package.json b/package.json
@@ -3,6 +3,7 @@
 	"version": "1.0.0",
 	"description": "Welcome to Milvus documentation!",
 	"scripts": {
+		"translate": "node scripts/translate.js",
 		"test": "echo \"Error: no test specified\" && exit 1"
 	},
 	"author": "",

diff --git a/site/de/about/comparison.md b/site/de/about/comparison.md
@@ -0,0 +1,82 @@
+---
+id: comparison.md
+title: Comparison
+summary: This article compares Milvus with other vector search solutions.
+---
+
+# Comparing Milvus with Alternatives
+
+When exploring various vector database options, this comprehensive guide will help you understand Milvus's unique features, ensuring you choose a database that best fits your specific needs. Notably, Milvus is a leading open-source vector database, and [Zilliz Cloud](https://zilliz.com/cloud) offers a fully-managed Milvus service. To objectively evaluate Milvus against its competitors, consider using [benchmark tools](https://github.com/zilliztech/VectorDBBench#quick-start) to analyze performance metrics.
+
+## Milvus highlights
+
+- **Functionality**: Milvus goes beyond basic vector similarity search by supporting advanced functionalities like [sparse vector](https://milvus.io/docs/sparse_vector.md), [bulk-vector](https://milvus.io/docs/single-vector-search.md#Bulk-vector-search), [filtered search](https://milvus.io/docs/single-vector-search.md#Filtered-search), and [hybrid search](https://milvus.io/docs/multi-vector-search.md) capabilities.
+
+- **Flexibility**: Milvus accommodates various deployment modes and multiple SDKs, all within a robust, integrated ecosystem.
+
+- **Performance**: Milvus guarantees real-time processing with high throughput and low latency, powered by optimized indexing algorithms such as [HNSW](https://milvus.io/docs/index.md#HNSW) and [DiskANN](https://milvus.io/docs/disk_index.md), and advanced [GPU acceleration](https://milvus.io/docs/gpu_index.md).
+
+- **Scalability**: Its bespoke distributed architecture effortlessly scales, accommodating anything from small datasets to collections exceeding 10 billion vectors.
+
+## Overall comparison
+
+To compare between Milvus and Pinecone, two vector database solutions, the following table is structured to highlight differences across various features.
+
+| Feature | Pinecone | Milvus | Remarks |
+| --- | --- | --- | --- |
+| Deployment Modes | SaaS-only | Milvus Lite, On-prem Standalone & Cluster,  Zilliz Cloud Saas & BYOC | Milvus offers greater flexibility in deployment modes. |
+| Supported SDKs | Python, JavaScript/TypeScript | Python, Java, NodeJS, Go, Restful API, C#, Rust | Milvus supports a wider array of programming languages. |
+| Open-source Status | Closed | Open-source | Milvus is a popular open-source vector database. |
+| Scalability | Scale up/down only | Scale out/in and Scale up/down | Milvus features a distributed architecture for enhanced scalability. |
+| Availability | Pod-based architecture within available zones | Available zone failover and cross-region HA | Milvus CDC (Change Data Capture) enables primary/standby modes for higher availability. |
+| Perf-Cost (Dollar per million queries) | Starts at $0.178 for a medium dataset, $1.222 for a large dataset | Zilliz Cloud starts at $0.148 for a medium dataset, $0.635 for a large dataset; free version available | Refer to [Cost Ranking report](https://zilliz.com/vector-database-benchmark-tool?database=ZillizCloud,Milvus,ElasticCloud,PgVector,Pinecone,QdrantCloud,WeaviateCloud&dataset=medium&filter=none,low,high&tab=2). |
+| GPU Acceleration | Not supported | Support NVIDIA GPU | GPU acceleration significantly enhances performance, often by orders of magnitude. |
+
+## Terminology comparison
+
+Although both serve similar functions as vector databases, the domain-specific terminology between Milvus and Pinecone shows slight variations. A detailed terminology comparison is as follows.
+
+| Pinecone | Milvus | Remarks |
+| --- | --- | --- |
+| Index | [Collection](https://zilliz.com/comparison) | In Pinecone, an index serves as the organizational unit for storing and managing vectors of identical size, and this index is closely integrated with the hardware, known as pods. In contrast, Milvus collections serve a similar purpose but enable handling multiple collections within a single instance. |
+| Collection | [Backup](https://milvus.io/docs/milvus_backup_overview.md#Milvus-Backup) | In Pinecone, a collection is essentially a static snapshot of an index, used mainly for backup purposes and cannot be queried. In Milvus, the equivalent feature for creating backups is more transparent and straightforwardly named. |
+| Namespace | [Partition key](https://milvus.io/docs/use-partition-key.md#Use-Partition-Key) | Namespaces allow the partitioning of vectors in an index into subsets. Milvus provides multiple methods like partition or partition key to ensure efficient data isolation within a collection. |
+| Metadata | [Scalar field](https://milvus.io/docs/boolean.md) | Pinecone's metadata handling relies on key-value pairs, while Milvus allows for complex scalar fields, including standard data types and dynamic JSON fields. |
+| Query | [Search](https://milvus.io/docs/single-vector-search.md) | Name of the method used to find the nearest neighbors for a given vector, possibly with some additional filters applied on top. |
+| Not available	 | [Iterator](https://milvus.io/docs/with-iterators.md) | Pinecone lacks a feature for iterating through all vectors in an index. Milvus introduces Search Iterator and Query Iterator methods, enhancing data retrieval capabilities across datasets. |
+
+## Capability comparison
+
+| Capability | Pinecone | Milvus |
+| --- | --- | --- |
+| Deployment Modes | SaaS-only | Milvus Lite, On-prem Standalone & Cluster,  Zilliz Cloud Saas & BYOC |
+| Embedding Functions | Not available	 | Support with <a href="https://github.com/milvus-io/milvus-model">pymilvus[model]</a> |
+| Data Types | String, Number, Bool, List of String | String, VarChar, Number (Int, Float, Double), Bool, Array, JSON, Float Vector, Binary Vector, BFloat16, Float16, Sparse Vector |
+| Metric and Index Types | Cos, Dot, Euclidean<br/>P-family, S-family | Cosine, IP (Dot), L2 (Euclidean),  Hamming, Jaccard<br/>FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, HNSW, SCANN, GPU Indexes |
+| Schema Design | Flexible mode | Flexible mode, Strict mode |
+| Multiple Vector Fields | N/A | Multi-vector and hybrid search |
+| Tools | Datasets, text utilities, spark connector | Attu, Birdwatcher, Backup, CLI, CDC, Spark and Kafka connectors |
+
+### Key insights
+
+- **Deployment modes**: Milvus offers a variety of deployment options, including local deployment, Docker, Kubernetes on-premises, Cloud SaaS, and Bring Your Own Cloud (BYOC) for enterprises, whereas Pinecone is limited to SaaS deployment.
+
+- **Embedding functions**: Milvus supports additional embedding libraries, enabling the direct use of embedding models to transform source data into vectors.
+
+- **Data types**: Milvus supports a wider range of data types than Pinecone, including arrays and JSON. Pinecone supports only a flat metadata structure with strings, numbers, booleans, or lists of strings as values, whereas Milvus can handle any JSON object, including nested structures, within a JSON field. Pinecone limits the metadata size to 40KB per vector.
+
+- **Metric and index types**: Milvus supports a broad selection of metric and index types to accommodate various use cases, while Pinecone has a more limited selection. While an index for vector is mandatory in Milvus, an AUTO_INDEX option is available to streamline the configuration process.
+
+- **Schema design**: Milvus offers flexible `create_collection` modes for schema design, including a quick setup with a dynamic schema for a schema-less experience similar to Pinecone and a customized setup with predefined schema fields and indexes akin to a relational database management system (RDBMS).
+
+- **Multiple vector fields**: Milvus enables the storage of multiple vector fields within a single collection, which can be either sparse or dense and may vary in dimensionality. Pinecone does not offer a comparable feature.
+
+- **Tools**: Milvus offers a more extensive selection of tools for database management and utilization, such as Attu, Birdwatcher, Backup, CLI, CDC and Spark and Kafka connector.
+
+## What's next
+
+- **Trial**: Experience Milvus firsthand by starting with the Milvus [quickstart](https://milvus.io/docs/quickstart.md) or [signing up for Zilliz Cloud](https://docs.zilliz.com/docs/register-with-zilliz-cloud).
+
+- **Learn more**: Dive deeper into Milvus's features through our comprehensive [Terminology](glossary.md) and [User Guides](https://milvus.io/docs/manage-collections.md).
+
+- **Explore alternatives**: For a broader comparison of vector database options, explore additional resources on [this page](https://zilliz.com/comparison).
diff --git a/site/de/about/limitations.md b/site/de/about/limitations.md
@@ -0,0 +1,85 @@
+---
+id: limitations.md
+title: Milvus Limits
+related_key: Limitations
+summary: Learn about the limits while using Milvus.
+---
+# Milvus Limits
+
+Milvus is committed to providing the best vector databases to power AI applications and vector similarity search. However, the team is continuously working to bring in more features and the best utilities to enhance user experience. This page lists out some known limitations that the users may encounter when using Milvus.
+
+## Length of a resource name
+
+| Resource      | Limit  |
+| ----------- | ----------- |
+| Collection      | 255 characters      |
+| Field   | 255 characters        |
+| Index   | 255 characters       |
+| Partition   | 255  characters      |
+
+## Naming rules
+
+The name of a resource can contain numbers, letters, and underscores (\_\). A resource name must start with a letter or an underscore (\_\).
+
+## Number of resources
+
+| Resource      | Limit |
+| ----------- | ----------- |
+| Collection     | 65,536       |
+| Connection / proxy   | 65,536        |
+
+## Number of resources in a collection
+
+| Resource     | Limit|
+| ----------- | ----------- |
+| Partition      | 4,096       |
+| Shard   | 16        |
+| Field   | 64        |
+| Index   | 1        |
+| Entity   | unlimited        |
+
+## Length of a string 
+| Data type      | Limit  |
+| ----------- | ----------- |
+| VARCHAR      | 65,535       |
+
+
+
+## Dimensions of a vector
+| Property      | Limit |
+| ----------- | ----------- |
+| Dimension      | 32,768       |
+
+## Input and Output per RPC
+| Operation      | Limit |
+| ----------- | ----------- |
+| Insert      | 64 MB    |
+| Search   | 64 MB     |
+| Query   | 64 MB      |
+
+## Load limits
+In current release, data to be load must be under 90% of the total memory resources of all query nodes to reserve memory resources for execution engine.
+
+## Search limits
+| Vectors      | Limit |
+| ----------- | ----------- |
+| <code>topk</code> (number of the most similar result to return)   | 16,384       |
+| <code>nq</code> (number of the search requests)    | 16,384       |
+
+## Index limits on different search types
+
+The following table provides an overview of the support for various search behaviors across different index types.
+
+|                                      | HNSW | DISKANN | FLAT | IVF_FLAT | IVF_SQ8 | IVF_PQ | SCANN | GPU_IFV_FLAT | GPU_IVF_PQ | GPU_CAGRA | GPU_BRUTE_FORCE | SPARSE_INVERTED_INDEX | SPARSE_WAND         | BIN_FLAT | BIN_IVF_FLAT |
+|--------------------------------------|------|---------|------|----------|---------|--------|-------|--------------|------------|-----------|-----------------|-----------------------|---------------------|----------|--------------|
+| Basic search                         | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | Yes          | Yes        | Yes       | Yes             | Yes                   | Yes                 | Yes      | Yes          |
+| Partition search                     | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | Yes          | Yes        | Yes       | Yes             | Yes                   | Yes                 | Yes      | Yes          |
+| Basic search with raw data retrieved | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | Yes          | Yes        | Yes       | Yes             | Yes                   | Yes                 | Yes      | Yes          |
+| Basic search with pagination         | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | Yes          | Yes        | Yes       | Yes             | Yes                   | Yes                 | Yes      | Yes          |
+| Filtered search                      | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | Yes          | Yes        | Yes       | Yes             | Yes                   | Yes                 | Yes      | Yes          |
+| Range search                         | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | No           | No         | No        | No              | No                    | No                  | Yes      | Yes          |
+| Grouping search                      | Yes  | No      | Yes  | Yes      | No      | No     | No    | No           | No         | No        | No              | No                    | No                  | No       | No           |
+| Search with iterator                 | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | No           | No         | No        | No              | No                    | No                  | No       | No           |
+| Hybrid search                        | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | Yes          | Yes        | Yes       | Yes             | Yes(Only RRFRanker)   | Yes(Only RRFRanker) | Yes      | Yes          |
+| Query/Get                            | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | Yes          | Yes        | Yes       | Yes             | Yes                   | Yes                 | Yes      | Yes          |
+| Query with iterator                  | Yes  | Yes     | Yes  | Yes      | Yes     | Yes    | Yes   | No           | No         | No        | No              | Yes                   | Yes                 | Yes      | Yes          |
diff --git a/site/de/about/milvus_adopters.md b/site/de/about/milvus_adopters.md
@@ -0,0 +1,13 @@
+---
+id: milvus_adopters.md
+title: Milvus Adopters
+related_key: Milvus adopters
+summary: Learn about companies that have adopted Milvus in production applications.
+---
+# Milvus Adopters
+
+Milvus stands as the leading vector database among enterprise users, boasting endorsements from prominent companies such as Nvidia, Roblox, AT&T, and more. Below is a list of enterprises that have successfully integrated Milvus into their applications.
+
+![Milvus Adopters](../../../assets/milvus-adopters.png "Milvus adopters.")
+
+Visit our [Milvus use case page](https://milvus.io/use-cases) to discover how industry leaders leverage Milvus to power up their AI applications and accelerate business growth.