
98% retrieval accuracy
48% increase in weekly question volume
49% reduction in average time-to-resolution
In the service and maintenance industry, solving problems quickly isn’t just a necessity; it’s a competitive advantage. From complex medical equipment to heavy machinery, organizations rely on fast, informed decisions to reduce downtime, cut costs, and keep customers happy. Aquant, a leader in AI-driven service optimization, helps companies do exactly that by delivering expert-level guidance across the entire service journey.
This guidance is delivered through Aquant AI, an agentic AI platform that empowers service teams to act with confidence by delivering fast, trusted, and context-aware answers across the entire service lifecycle—from field technicians to call center agents to service leaders and even the end customer. The knowledge is drawn from vast repositories of a company’s structured and unstructured service data, including service manuals, technician notes, and subject matter expertise, among many other sources.
But making domain-specific, high-quality answers available in real time, especially at scale, is no small feat. Aquant’s team knew that simply layering generative AI on top of traditional infrastructure wouldn’t get the job done. They needed a retrieval foundation built specifically for AI, and they found it in Pinecone.
Scaling real-time, context-aware intelligence
Aquant had already been tackling the core problems of modern service delivery: increasingly complex machines, a shrinking pool of skilled workers, and a deluge of fragmented service data. The company’s AI platform was built to surface the right answers at the right time, whether a technician was troubleshooting in the field or a customer support agent needed insight into past repair histories.
As the product evolved, it became clear that accurate, fast, and scalable retrieval of domain-specific content would be critical. Aquant’s early vector search infrastructure, built in-house on top of PostgreSQL extensions and blob storage, worked well for internal tools and offline analytics. But it struggled under the demands of real-time service applications.
Search was slow. Retrieval quality was inconsistent. And managing the infrastructure took valuable time away from product development. The team evaluated both newer vector database entrants and established providers with bolt-on vector features, comparing each against Pinecone:
Aquant Vector Database Evaluation
| Criteria | Newer Vector Database Entrants | Bolt-on Database Providers | Pinecone |
| --- | --- | --- | --- |
| Enterprise Maturity & Support | Lacked enterprise readiness; limited support | Immature vector features despite established parent company | Enterprise-ready with robust support and SLAs |
| Security & Performance | Faced data isolation risks, multi-tenancy performance issues | Too tightly integrated into existing systems, limiting flexibility and performance | Cloud-agnostic with customer-specific multi-tenancy that doesn’t degrade performance |
| Filtering Capabilities | Filtering mechanisms were underdeveloped or immature | Basic or inflexible filtering tied to host ecosystem | Advanced, rich metadata filtering suitable for complex RAG use cases |
| Ease of Use & Deployment | Hard to deploy and integrate into enterprise workflows | Deployment complexity due to tight coupling with larger ecosystems | Developer-friendly with streamlined, purpose-built deployment |
| Customization for RAG & Agentic Use Cases | Limited flexibility for use-case-specific customization like RAG | Not built for specialized needs; often generic or rigid | Built from the ground up for vector search, supporting nuanced use cases like RAG and agentic workflows |
| Infrastructure Model | Often lacked robust managed services or cloud independence | Tied to specific cloud or ecosystem platforms | Fully managed, cloud-agnostic infrastructure layer optimized for scale and reliability |
After a rigorous evaluation process involving the data science, engineering, and product R&D teams, Aquant chose Pinecone to power semantic retrieval across its AI platform. The decision was based on both technical performance and a clear path to scale with a managed, enterprise-ready service.
The retrieval backbone behind enterprise-grade AI intelligence
Pinecone is now an anchor of Aquant’s agentic toolchain. Content—including service manuals and documentation, voice recordings, videos, repair records, free-text technician notes, parts catalogs, schematic drawings, call transcripts, and machine logs—is embedded using Aquant’s domain-specific models and indexed in Pinecone for fast, semantic search.
The integration enabled Aquant to:
- Deliver sub-100ms latency for semantic search, improving response time for users
- Index tens of millions of vectors across customer-specific namespaces without degrading performance
- Implement rich metadata filtering to tailor responses by asset type, document source, or issue category
- Eliminate operational overhead, allowing teams to focus on product development rather than infrastructure tuning
- Scale multi-tenancy, serving customers securely and efficiently across diverse industries
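To make the namespace and filtering model above concrete, here is a minimal sketch of how per-customer namespaces and Pinecone-style metadata filters can combine to narrow a semantic search. The namespace, field names, and values (`acme-corp`, `asset_type`, and so on) are illustrative assumptions, not Aquant’s actual schema, and the filter matcher is a simplified pure-Python stand-in for what the database evaluates server-side.

```python
# Sketch: Pinecone-style metadata filtering scoped to a tenant namespace.
# All namespaces, fields, and values below are illustrative, not Aquant's schema.

def matches(metadata: dict, flt: dict) -> bool:
    """Apply a small subset of Pinecone's filter operators ($eq, $in) to one record."""
    for field, cond in flt.items():
        value = metadata.get(field)
        if isinstance(cond, dict):
            if "$eq" in cond and value != cond["$eq"]:
                return False
            if "$in" in cond and value not in cond["$in"]:
                return False
        elif value != cond:  # a bare value is shorthand for $eq
            return False
    return True

# Tenant isolation: each customer's vectors live in their own namespace.
namespaces = {
    "acme-corp": [
        {"id": "doc-1", "metadata": {"asset_type": "mri-scanner", "source": "service-manual"}},
        {"id": "doc-2", "metadata": {"asset_type": "forklift", "source": "technician-notes"}},
    ],
}

# Tailor the response by asset type and document source, as in the bullets above.
flt = {"asset_type": {"$eq": "mri-scanner"}, "source": {"$in": ["service-manual", "parts-catalog"]}}
hits = [r["id"] for r in namespaces["acme-corp"] if matches(r["metadata"], flt)]
print(hits)  # ['doc-1']
```

In the real client, the same filter dict would be passed to the index query alongside the query vector and the customer’s namespace, so tenant isolation and metadata narrowing happen in a single call.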
Pinecone also played a critical role in helping Aquant move from static knowledge delivery to dynamic, agentic AI. With vector search performance no longer a bottleneck, Aquant expanded its use of intelligent agents to surface next-best actions, generate workflows, and provide in-the-moment assistance based on evolving service conditions.
This agentic architecture, a core part of Aquant AI, enables real-time decisioning that adapts dynamically to both the asset’s condition and the user’s intent. This turns passive knowledge bases into proactive digital assistants.
“Pinecone is a critical part of our agentic architecture; it powers the retrieval backbone of Aquant AI, including our knowledge agent, which delivers real-time, context-aware guidance to service professionals. Its performance and scalability allow us to serve our customers in production at enterprise scale, without compromising speed or accuracy. That’s enabled us to move beyond static answers and toward dynamic, AI-driven service intelligence.” — Oded Sagie, vice president of product and R&D at Aquant
Better performance, higher engagement, real-world impact
Pinecone has enabled Aquant to scale its AI platform while delivering measurable gains across technical metrics, user experience, and customer outcomes.
Platform Performance & Technical Improvements:
- Response start time dropped to 2.89 seconds, more than 2x faster than before—making answers feel nearly instantaneous
- Full response delivery time decreased from ~24s to ~13.7s
- No-response rate (i.e., the share of queries that return no valid output) dropped by 53%, increasing answer reliability
- Retrieval accuracy now consistently exceeds 98% in internal benchmarks
These backend improvements led to meaningful shifts in user behavior. Weekly question volume grew by 48%, reflecting increased trust in and reliance on the AI system. The platform now operates reliably at tens-of-thousands scale, with growing adoption among new and returning users.
Business Impact for Aquant’s Customers:
- 19% reduction in cost per service case
- 62% reduction in parts replacement costs
- 10–20% improvement in remote resolution rates
- 49% reduction in average time-to-resolution
- 50% faster onboarding and knowledge transfer, cutting time-to-proficiency for new hires in half
The performance and quality improvements unlocked new capabilities within Aquant’s platform, such as dynamic document tagging. With Pinecone, content tagging at Aquant goes beyond static document classification. It is a context-aware, customer-specific layer that dynamically assigns semantic metadata to diverse content types (documents, technician notes, schematics, parts data) based on usage patterns, asset types, and service context. This level of customizable, fine-grained tagging is critical for enabling precise filtering and optimized retrieval in RAG pipelines, especially given the high variability in language and workflows across customers.
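A minimal sketch of what such a tagging layer could look like: semantic tags are attached to each content chunk before indexing, so they become filterable metadata. The tag names, keyword rules, and asset types here are invented for illustration; Aquant’s actual tagging is model-driven and customer-specific rather than keyword-based.

```python
# Sketch: context-aware tagging that attaches semantic metadata to content
# chunks before indexing. Tag names and rules are illustrative assumptions,
# not Aquant's production logic (which is model-driven, not keyword-based).

TAG_RULES = {
    "error-code": ["fault", "error", "code"],
    "calibration": ["calibrate", "alignment"],
    "parts": ["replace", "part number", "sku"],
}

def tag_chunk(text: str, asset_type: str) -> dict:
    """Return filterable metadata for one chunk: rule-based tags plus asset context."""
    lowered = text.lower()
    tags = [tag for tag, keywords in TAG_RULES.items()
            if any(kw in lowered for kw in keywords)]
    return {"asset_type": asset_type, "tags": tags}

meta = tag_chunk("Replace the coil if fault code E42 persists.", "mri-scanner")
print(meta)  # {'asset_type': 'mri-scanner', 'tags': ['error-code', 'parts']}
```

Because the tags land in the vector’s metadata, the same filtering mechanism used for asset type or document source can then narrow retrieval to, say, only error-code content for a given machine.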
Expanding retrieval intelligence across more languages and use cases
Aquant continues to evolve how it measures and improves retrieval. One area of innovation is a proprietary, preprocessing-agnostic evaluation framework that avoids reliance on brittle, annotated datasets. Instead, it uses structural text similarity techniques to assess whether retrieved chunks align meaningfully with target answers, regardless of document formatting, chunking, or language differences.
This approach has already helped Aquant improve retrieval performance in languages like German, Dutch, and Japanese without requiring custom evaluation pipelines or repeated re-annotation.
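To illustrate the idea of an annotation-free evaluation, here is a small sketch that judges whether a retrieved chunk aligns with a target answer using structural text similarity rather than labeled data. The similarity measure shown (Python’s `difflib.SequenceMatcher`) and the threshold are stand-ins; Aquant’s actual framework is proprietary and not described in detail in this article.

```python
# Sketch: annotation-free retrieval check via structural text similarity.
# SequenceMatcher and the 0.6 threshold are illustrative stand-ins, not
# the proprietary technique described in the article.
from difflib import SequenceMatcher

def chunk_supports_answer(chunk: str, target_answer: str, threshold: float = 0.6) -> bool:
    """Judge whether a retrieved chunk aligns with the target answer,
    ignoring formatting by normalizing whitespace and case first."""
    norm = lambda s: " ".join(s.lower().split())
    ratio = SequenceMatcher(None, norm(chunk), norm(target_answer)).ratio()
    return ratio >= threshold

# Formatting differences (line breaks, casing, spacing) should not matter.
chunk = "To reset the pump,  hold the\nPOWER button for 10 seconds."
answer = "Reset the pump by holding the power button for 10 seconds."
print(chunk_supports_answer(chunk, answer))
```

Because the check operates on normalized text rather than on fixed annotations, the same evaluation can run unchanged across different chunking strategies, document formats, or languages.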
Going forward with Pinecone, Aquant plans to further refine its use of metadata-based filtering, explore routing mechanisms for more personalized AI outputs, and expand its AI-powered workflows across a broader set of customer environments.