news 2026/6/10 9:13:01

Vector database gets a high-performance deployment option to support more demanding workloads


张小明

Front-end development engineer


Vector database startup Pinecone Systems Inc. today announced a new, high-performance deployment option for customers that need to support the most demanding enterprise use cases.

It’s called Dedicated Read Nodes or DRN, and it’s now available in public preview, giving customers access to reserved capacity for low-latency queries with predictable performance and cost. The company explained that DRNs allow it to support a wider range of use cases that have extreme but variable performance requirements.

Pinecone is the creator of an advanced vector database that can dynamically store, transform and index billions of high-dimensional data points, enabling it to respond rapidly and accurately to queries such as nearest-neighbor search.

Unlike relational databases, which store data in rows and columns, vector databases represent unstructured data as high-dimensional data points, each representing a vector or an array of numbers. One of the primary functions of a vector database is to perform similarity searches, which can quickly find vectors that are most similar to a given query vector using measures such as cosine similarity or Euclidean distance. Vector databases are seen as essential for artificial intelligence workloads, as large language models need rapid access to vast amounts of unstructured data.
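
To make the similarity search described above concrete, here is a minimal sketch in plain Python. It is not Pinecone's implementation: it scores every stored vector by brute force, whereas production vector databases rely on specialized indexes. The function names, the document IDs and the three-dimensional vectors are all invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two vectors (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, vectors, k=2):
    """Brute force: score every stored vector, return the k most similar IDs."""
    scored = [(cosine_similarity(query, vec), vec_id) for vec_id, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [vec_id for _, vec_id in scored[:k]]

# Hypothetical toy data: real embeddings have hundreds or thousands of dimensions.
vectors = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]
top = nearest_neighbors(query, vectors, k=2)
print(top)
```

A brute-force scan costs time proportional to the number of stored vectors, which is why searches at the billion-vector scale discussed later in the article depend on indexing and on keeping data close to the compute.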

In a blog post, Pinecone explained that AI systems have complex requirements. Some applications, such as RAG, AI agents, model prototypes and scheduled jobs have “bursty” workloads, where they maintain a low and steady flow of traffic most of the time, before suddenly bursting into life when there are spikes in query volume. In such cases, Pinecone’s standard on-demand database is ideal, providing a combination of simplicity, elasticity and usage-based pricing.

However, some applications require consistent high throughput, operate at larger scales and can be extremely sensitive to latency. For instance, billion-vector-scale semantic searches, real-time recommendation systems and user-facing assistants with tight service-level objectives demand a more consistent level of performance, along with predictable costs at scale.

Better performance without limits

This is why Pinecone is introducing DRNs, a new deployment option where queries run on isolated, provisioned nodes that are dedicated to these kinds of workloads. With these nodes, the data stays “warm” in the system’s memory and on a local solid-state drive.

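This means data can be accessed quickly without a “cold start,” the delay caused by first having to fetch information from object storage. Because the nodes are dedicated to each workload, there are no “noisy neighbors,” shared queues or query limits.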

DRNs scale along two dimensions, with replicas ensuring maximum throughput and availability to improve resilience, and shards used to expand storage capacity. Users can add as many replicas and shards as they desire to ensure their workloads can scale. To ensure predictable costs, pricing is based on an hourly rate per node.

Pinecone said customers will benefit from the lowest possible latency and guaranteed high throughput to ensure more consistent performance for high query-per-second workloads. DRNs can also scale indefinitely, and the company further claims that customers will see lower, more predictable costs compared to its on-demand nodes, which are based on a per-request pricing model.
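
The trade-off between hourly per-node pricing and per-request pricing can be sketched with some simple arithmetic. The example below assumes that the total node count is shards multiplied by replicas, which the article does not state, and every price in it is a made-up placeholder rather than a Pinecone rate.

```python
def drn_node_count(shards, replicas):
    """Assumption: each replica holds a full copy of every shard."""
    return shards * replicas

def drn_hourly_cost(shards, replicas, rate_per_node_hour):
    """Hourly cost when pricing is a flat rate per node per hour."""
    return drn_node_count(shards, replicas) * rate_per_node_hour

def break_even_qps(hourly_cost, price_per_request):
    """Sustained queries per second at which per-request spend equals the hourly node cost."""
    requests_per_hour = hourly_cost / price_per_request
    return requests_per_hour / 3600

# Hypothetical figures, chosen only to make the arithmetic easy to follow.
shards, replicas = 2, 3
rate_per_node_hour = 1.50      # placeholder currency units per node-hour
price_per_request = 0.00001    # placeholder currency units per query

nodes = drn_node_count(shards, replicas)
hourly = drn_hourly_cost(shards, replicas, rate_per_node_hour)
threshold = break_even_qps(hourly, price_per_request)
print(nodes, hourly, round(threshold))
```

With these placeholder numbers, six nodes cost 9.00 per hour, and the flat rate becomes the cheaper option once sustained traffic exceeds roughly 250 queries per second. Below that level, usage-based pricing would cost less, which is consistent with the article's point that on-demand suits bursty workloads.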

DRNs are a deployment option for the most demanding use cases, where companies require performance isolation, predictable low latency under heavy load and linear scaling as demand grows. In addition to billion-vector-scale search and recommendation systems, DRNs can also be useful for mission-critical AI applications, large enterprise or multitenant platforms that require isolation to prevent one workload from affecting another, and other applications that need performance at scale.

Pinecone said its DRNs have proven their reliability under real-world conditions for several early adopters. One customer is using DRN to support metadata-filtered real-time media search on its design platform, and was able to sustain 600 queries per second with latency of just 45 milliseconds across 135 million vectors. The same customer also pushed it to the limit, running a load test that saw its node reach an impressive 2,200 queries per second with a P50 latency of just 60 milliseconds.

In another example, a customer running a large e-commerce marketplace deployed its recommendation engine on Pinecone’s DRNs to support 5,700 queries per second with a P50 latency of just 26 milliseconds across a database of 1.4 billion vectors.
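
For readers unfamiliar with the “P50 latency” figures quoted above, P50 is the median: half of all queries complete at least that fast. The sketch below shows one common way to compute it, the nearest-rank method, and is not how Pinecone or its customers measured their results. The latency samples are invented for illustration.

```python
import math

def latency_percentile(samples_ms, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def throughput_qps(num_queries, window_seconds):
    """Average queries per second over a measurement window."""
    return num_queries / window_seconds

# Hypothetical per-query latencies in milliseconds, including two slow outliers.
samples_ms = [22, 24, 25, 26, 26, 27, 29, 31, 48, 120]

p50 = latency_percentile(samples_ms, 50)
p99 = latency_percentile(samples_ms, 99)
mean = sum(samples_ms) / len(samples_ms)
print(p50, p99, mean)
```

In this made-up sample the median is 26 milliseconds while the slowest query takes 120, so a P50 figure on its own says little about tail latency. Teams with tight service-level objectives typically track higher percentiles such as P95 or P99 as well.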