Vector database gets a high-performance deployment option for more demanding workloads

Vector database startup Pinecone Systems Inc. today announced a new, high-performance deployment option for customers that need to support the most demanding enterprise use cases.

It’s called Dedicated Read Nodes or DRN, and it’s now available in public preview, giving customers access to reserved capacity for low-latency queries with predictable performance and cost. The company explained that DRNs allow it to support a wider range of use cases that have extreme but variable performance requirements.

Pinecone is the creator of an advanced vector database that can dynamically store, transform and index billions of high-dimensional data points, enabling it to respond rapidly and accurately to queries such as nearest-neighbor search.

Unlike relational databases, which store data in rows and columns, vector databases represent unstructured data as high-dimensional data points, each representing a vector or an array of numbers. One of the primary functions of a vector database is to perform similarity searches, which can quickly find vectors that are most similar to a given query vector using measures such as cosine similarity or Euclidean distance. Vector databases are seen as essential for artificial intelligence workloads, as large language models need rapid access to vast amounts of unstructured data.
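
To make the similarity search described above concrete, here is a minimal brute-force sketch in plain Python. It is not how Pinecone implements search (production vector databases rely on approximate indexes); the function names, document keys and toy three-dimensional vectors are all hypothetical and chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, vectors, k=2):
    """Brute-force top-k search: score every stored vector against the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]

# Hypothetical toy data; real embeddings have hundreds or thousands of dimensions.
vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]

print(nearest_neighbors(query, vectors, k=2))
print(euclidean_distance(query, vectors["doc_a"]))
```

Scanning every vector like this costs time proportional to the size of the collection, which is why billion-vector workloads depend on purpose-built indexes rather than a linear scan.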

In a blog post, Pinecone explained that AI systems have complex requirements. Some applications, such as retrieval-augmented generation (RAG), AI agents, model prototypes and scheduled jobs, have “bursty” workloads: traffic stays low and steady most of the time, then surges when query volume spikes. In such cases, Pinecone’s standard on-demand database is ideal, providing a combination of simplicity, elasticity and usage-based pricing.

However, some applications require consistent high throughput, operate at larger scales and can be extremely sensitive to latency. For instance, billion-vector-scale semantic searches, real-time recommendation systems and user-facing assistants with tight service-level objectives demand a more consistent level of performance, along with predictable costs at scale.

Better performance without limits

This is why Pinecone is introducing DRNs, a new deployment option where queries run on isolated, provisioned nodes that are dedicated to these kinds of workloads. With these nodes, the data stays “warm” in the system’s memory and on a local solid-state drive.

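That means data can be accessed rapidly without any “cold starts,” which occur when information first has to be fetched from object storage. Because the nodes are dedicated to each workload, there are no problems with “noisy neighbors,” shared queues or query limits.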

DRNs scale along two dimensions: replicas add throughput and availability, which improves resilience, while shards expand storage capacity. Users can add as many replicas and shards as their workloads require. To keep costs predictable, pricing is based on an hourly rate per node.
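
As a rough illustration of how the two scaling dimensions and per-node hourly pricing might combine, the sketch below estimates a monthly bill. It rests on assumptions that the article does not confirm: that every shard is copied once per replica (so the node count is shards times replicas), that a month is about 730 hours, and that the hourly rate is a made-up placeholder rather than a published Pinecone price.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year divided by 12

def drn_node_count(shards, replicas):
    """Assumed layout: one node for each (shard, replica) pair."""
    return shards * replicas

def monthly_cost(shards, replicas, hourly_rate_per_node, hours=HOURS_PER_MONTH):
    """Flat cost of keeping all nodes provisioned, independent of query volume."""
    return drn_node_count(shards, replicas) * hourly_rate_per_node * hours

# Hypothetical deployment: 2 shards for capacity, 3 replicas for throughput,
# at a placeholder rate of 1.00 currency unit per node-hour.
print(drn_node_count(2, 3))
print(monthly_cost(2, 3, hourly_rate_per_node=1.0))
```

Under these assumptions the bill depends only on the provisioned node count and the hours elapsed, not on how many queries are served, which is what makes the cost predictable.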

Pinecone said customers will benefit from the lowest possible latency and guaranteed high throughput to ensure more consistent performance for high query-per-second workloads. DRNs can also scale indefinitely, and the company further claims that customers will see lower, more predictable costs compared to its on-demand nodes, which are based on a per-request pricing model.

DRNs are a deployment option for the most demanding use cases, where companies require performance isolation, predictable low latency under heavy loads and linear scaling as demand grows. In addition to billion-vector-scale search and recommendation systems, DRNs can also be useful for mission-critical AI applications, large enterprise or multitenant platforms that require isolation to prevent one workload from affecting another, and other applications that need performance at scale.

Pinecone said its DRNs have proven their reliability under real-world conditions for several early adopters. One customer is using DRN to support metadata-filtered real-time media search on its design platform, and was able to sustain 600 queries per second with latency of just 45 milliseconds across 135 million vectors. The same customer also pushed it to the limit, running a load test that saw its node reach an impressive 2,200 queries per second with a P50 latency of just 60 milliseconds.

In another example, a customer running a large e-commerce marketplace deployed its recommendation engine on Pinecone’s DRNs to support 5,700 queries per second with a P50 latency of just 26 milliseconds across a database of 1.4 billion vectors.
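
One way to put the reported throughput and latency figures in perspective is Little's law, which says the average number of requests in flight equals the arrival rate multiplied by the average time each request spends in the system. The sketch below applies it to the numbers quoted above. It is only a back-of-the-envelope estimate: the law calls for mean latency, while the article reports P50 (median) values and gives no percentile at all for the 45-millisecond figure, and the function and label names are hypothetical.

```python
def estimated_in_flight(qps, latency_ms):
    """Little's law: concurrency = arrival rate * time in system (in seconds)."""
    return qps * (latency_ms / 1000.0)

# (queries per second, latency in milliseconds) as reported in the article.
reported = {
    "design platform, sustained": (600, 45),
    "design platform, load test": (2200, 60),
    "e-commerce recommendations": (5700, 26),
}

for name, (qps, latency_ms) in reported.items():
    in_flight = estimated_in_flight(qps, latency_ms)
    print(f"{name}: about {in_flight:.1f} queries in flight")
```

Treating the quoted latencies as if they were means, the three deployments would be handling roughly 27, 132 and 148 concurrent queries respectively.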