infiniflow.org (AI tool)

InfinityFlow

Pricing plans

Detailed pricing plans are not available yet for this tool.

Detailed overview
Core Features

Infinity offers top performance, flexibility, ease of use, and advanced features for the challenges of future AI applications.

Incredibly fast
Achieves 0.1 millisecond query latency on million-scale vector datasets.
Delivers up to 15K QPS on million-scale vector datasets.

Powerful search
Supports hybrid search over dense embeddings, sparse embeddings, tensors, and full text, in addition to filtering.
Supports several types of rerankers, including RRF, weighted sum, and ColBERT.

Rich data types
Supports a wide range of data types, including strings, numerics, vectors, and more.

Ease-of-use
Intuitive Python API.
A single-binary architecture with no dependencies, making deployment a breeze.

Join our community on Twitter, GitHub, and Discord.

---

Infinity
Version: v0.7.0-dev4

Quickstart: A quickstart guide.
Build from Source: Build Infinity from source, then build and run unit/functional tests.
Deploy Infinity using binary: Deploy the Infinity database using binary.

---

Infinity v0.2 delivers the most comprehensive hybrid search solution to date, including vector search, full-text search, sparse vector search, and tensor search. It also provides three fusion reranking methods: RRF, Weighted Sum, and ColBERT Reranker. How effective are these search and ranking solutions in practice? This blog article delves into the details.

Infinity v0.2 was released, offering the most comprehensive and fastest multi-way retrieval in the industry. This blog post explains how Infinity achieves this. Infinity is a database with sophisticated designs at both the storage engine and execution engine levels. In Infinity's execution engine, after the API queries are bound, the execution plan is compiled into a pipeline execution plan. This mechanism differs from those commonly seen in modern data warehouses.
Pipelines in data warehouses are designed mainly for parallel query execution. Infinity's pipeline serves both parallel query execution and concurrent query execution: it optimizes scheduling strategies and CPU affinity for query operators under high concurrency, and it avoids the overhead of unnecessary context switches. This design reduces end-to-end query overhead and yields an overall query latency comparable to that of running a single retrieval library.

Infinity v0.2 was released, introducing two new data types: sparse vector and tensor. Besides full-text search and vector search, Infinity v0.2 offers more retrieval methods. Users can now combine as many retrieval paths as they wish (N ≥ 2) in a hybrid search, making Infinity the most powerful database for RAG so far.

Since Infinity was open-sourced, it has received a widely positive response from the community. Regarding the essential RAG technique we promote, multiple recall (vector recall, full-text search, and structured data query), some have suggested that vectors alone can meet the requirements. What we traditionally call vector retrieval is a query over dense vector data, known as Dense Embedding. There is another type of vector data, the sparse vector, known as Sparse Embedding, which can provide the precise queries necessary for RAG. Combining these two types of vector data yields multi-path recall (two paths). On this view, Sparse Embedding removes the need for full-text search, and BM25 can be completely replaced (BM25 is a common full-text indexing and ranking method, which can be seen as a variation of TF-IDF). Let's see if this is really possible. Dense Embedding refers to vectors whose dimensionality may not be very high, but in which every dimension carries a numeric weight.
Sparse Embedding refers to vectors in which most dimensions are zero and only a few carry values; the overall dimensionality can be very high.

"Is Infinity just another vector database? Since there are already many vector databases available, why bother creating another one from scratch?"
"Traditional databases can easily incorporate vector search capabilities, so why reinvent the wheel?"
"Elasticsearch already has decent support for what you refer to as multiple recall. Then, what sets Infinity apart?"

On January 4, 2024, CMU professor Andy Pavlo, known for his acclaimed database lectures, published his 2023 database review, which focused primarily on the rise of vector databases. 2023 saw notable advancements in this field, with significant investments made in April. By 2023Q3, vector databases were being used as external memory for large language models. In 2023Q4, this approach started to gain popularity and became widely known as Retrieval-Augmented Generation (RAG), with some even predicting that 2024 would be the "Year of RAG." Drawing on Andy's viewpoints and the challenges facing RAG, we would like to provide our own evaluation of the future prospects for vector databases.

After extensive development, the AI-native database Infinity was officially open-sourced on December 21, 2023. Infinity is specifically designed to cater to large models and is primarily used for Retrieval-Augmented Generation (RAG).

---

Quickstart

A quickstart guide.

Prerequisites

CPU: x86_64 with AVX2 support.
OS: Linux with glibc 2.17+; Windows 10+ with WSL/WSL2; macOS.
Python: Python 3.10+.

Deploy Infinity using Docker

This section provides guidance on deploying the Infinity database using Docker, with the client and server as separate processes.
Install Infinity server

Linux x86_64 & macOS x86_64:

    sudo mkdir -p /var/infinity && sudo chown -R $USER /var/infinity
    docker pull infiniflow/infinity:nightly
    docker run -d --name infinity -v /var/infinity/:/var/infinity --ulimit nofile=500000:500000 --network=host infiniflow/infinity:nightly

Windows:

If you are on Windows 10+, you must enable WSL or WSL2 to deploy Infinity using Docker. Suppose you've installed Ubuntu in WSL2:

1. Enable systemd inside WSL2.
2. Install docker-ce.
3. If you have installed Docker Desktop version 4.29+ for Windows: go to Settings > Features in development, then select Enable host networking.
4. Pull the Docker image and start Infinity:

    sudo mkdir -p /var/infinity && sudo chown -R $USER /var/infinity
    docker pull infiniflow/infinity:nightly
    docker run -d --name infinity -v /var/infinity/:/var/infinity --ulimit nofile=500000:500000 --network=host infiniflow/infinity:nightly

Install Infinity client

    pip install infinity-sdk==0.7.0.dev4

Run a vector search

    import infinity

    # Connect to the server; pass your Infinity server's address as the first argument.
    infinity_object = infinity.connect(infinity.NetworkAddress("", 23817))
    db_object = infinity_object.get_database("default_db")

    # Create a table with an integer, a string, and a 4-dimensional float vector column.
    table_object = db_object.create_table("my_table", {"num": {"type": "integer"}, "body": {"type": "varchar"}, "vec": {"type": "vector, 4, float"}})
    table_object.insert([{"num": 1, "body": "unnecessary and harmful", "vec": [1.0, 1.2, 0.8, 0.9]}])
    table_object.insert([{"num": 2, "body": "Office for Harmful Blooms", "vec": [4.0, 4.2, 4.3, 4.5]}])

    # Dense vector search on "vec" using inner product ("ip"), returning the top 2 rows.
    res = (
        table_object.output(["*"])
        .match_dense("vec", [3.0, 2.8, 2.7, 3.1], "float", "ip", 2)
        .to_pl()
    )
    print(res)

NOTE: For detailed information about the capabilities and usage of Infinity's Python API, see the Python API Reference.

NOTE: If you wish to deploy Infinity using binary with the server and client as separate processes, see the Deploy Infinity using binary guide.

Build from Source

If you wish to build Infinity from source, see the Build from Source guide.
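To make the quickstart's vector search more concrete, the sketch below recomputes an inner-product ("ip") top-k ranking by hand for the two rows inserted above. It is plain Python and does not call Infinity; the helper names `inner_product` and `top_k_by_inner_product` are made up for illustration, and it assumes that a larger inner product means a closer match.

```python
# Plain-Python illustration of inner-product top-k ranking.
# This does not use the Infinity SDK; helper names are hypothetical.
rows = [
    {"num": 1, "body": "unnecessary and harmful", "vec": [1.0, 1.2, 0.8, 0.9]},
    {"num": 2, "body": "Office for Harmful Blooms", "vec": [4.0, 4.2, 4.3, 4.5]},
]
query = [3.0, 2.8, 2.7, 3.1]


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_by_inner_product(rows, query, k):
    """Return up to k (score, num) pairs, highest inner product first."""
    scored = [(inner_product(row["vec"], query), row["num"]) for row in rows]
    scored.sort(reverse=True)
    return scored[:k]


ranking = top_k_by_inner_product(rows, query, 2)
for score, num in ranking:
    print(f"num={num} score={score:.2f}")
```

Under that assumption, the row with `num` 2 ranks first, because its vector is both larger in magnitude and pointed in roughly the same direction as the query.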
Try our Python examples

Explore the following practical examples of using Infinity in Python:

Create table, insert data, and search
Import file and export data
Delete or update data
Conduct a vector search
Conduct a full-text search
Conduct a hybrid search

Python API reference

For detailed information about the capabilities and usage of Infinity's Python API, see the Python API Reference.
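The overview lists RRF (reciprocal rank fusion) among the rerankers available for hybrid search. As a rough illustration of the idea, and not of Infinity's internal implementation, the sketch below fuses two ranked result lists in plain Python. The function name, the document IDs, and the constant `k=60` (a commonly used value for RRF) are assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of IDs: each list adds 1 / (k + rank) to an ID's score.

    Hypothetical helper for illustration; k=60 is an assumed constant,
    not a documented Infinity default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists from two retrieval paths, best match first.
dense_hits = ["doc_a", "doc_b", "doc_c"]
full_text_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, full_text_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that appear near the top of several lists accumulate the highest fused scores, which is why `doc_b` and `doc_a` lead here even though neither list agrees on the exact order.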