Nvidia and DataStax just made generative AI smarter and leaner — here’s how

Credit: VentureBeat made with Midjourney


Nvidia and DataStax launched new technology today that dramatically reduces storage requirements for companies deploying generative AI systems, while enabling faster and more accurate information retrieval across multiple languages.

The new Nvidia NeMo Retriever microservices, integrated with DataStax’s AI platform, cut data storage volume by 35 times compared with traditional approaches, a crucial capability as enterprise data is projected to reach more than 20 zettabytes by 2027.

“Today’s enterprise data is at 11 zettabytes, roughly equal to 800,000 copies of the Library of Congress, and 83% of that is unstructured with 50% being audio and video,” said Kari Briski, VP of product management for AI at Nvidia, in an interview with VentureBeat. “Significantly reducing these storage costs while enabling companies to effectively embed and retrieve information becomes a game changer.”

Nvidia’s NeMo Retriever technology delivers a 35x improvement in data storage efficiency, as illustrated in a comparison of raw text storage, baseline vector embeddings, and reduced embedding dimensions. This breakthrough underpins the scalability of generative AI across enterprise applications. (Credit: Nvidia)
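
As a rough illustration of where those savings come from, the back-of-the-envelope sketch below compares the footprint of a vector index built with high-dimensional float32 embeddings against one using reduced dimensions stored at int8 precision. The corpus size, dimensions, and precisions are assumptions chosen for illustration, not Nvidia’s published configuration, so the exact ratio differs from the 35x figure.

```python
# Back-of-the-envelope comparison of vector index storage footprints.
# The chunk count, dimensions, and precisions below are illustrative
# assumptions, not the actual NeMo Retriever / DataStax configuration.

def index_size_gb(num_chunks: int, dims: int, bytes_per_value: int) -> float:
    """Approximate storage for the embedding vectors alone (no metadata)."""
    return num_chunks * dims * bytes_per_value / 1e9

NUM_CHUNKS = 100_000_000  # e.g., 100M text chunks from an enterprise corpus

# Baseline: high-dimensional embeddings stored as float32.
baseline = index_size_gb(NUM_CHUNKS, dims=4096, bytes_per_value=4)

# Reduced: lower-dimensional embeddings stored at int8 precision.
reduced = index_size_gb(NUM_CHUNKS, dims=384, bytes_per_value=1)

print(f"baseline index: {baseline:,.0f} GB")
print(f"reduced index:  {reduced:,.0f} GB")
print(f"reduction:      {baseline / reduced:.0f}x")
```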

The technology is already proving transformative for the Wikimedia Foundation, which used the integrated solution to reduce processing time for 10 million Wikipedia entries from 30 days to under three days. The system handles real-time updates across hundreds of thousands of entries being edited daily by 24,000 global volunteers.

“You can’t just rely on large language models for content — you need context from your existing enterprise data,” explained Chet Kapoor, CEO of DataStax. “This is where our hybrid search capability comes in, combining both semantic search and traditional text search, then using Nvidia’s re-ranker technology to deliver the most relevant results in real time at global scale.”
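
For readers who want to see the shape of that pipeline, here is a minimal sketch of the hybrid search pattern Kapoor describes: lexical and vector scores are fused, and the top candidates are handed off to a re-ranker. The scoring functions, fusion weight, and toy corpus are simplified stand-ins, not DataStax’s or Nvidia’s actual implementation.

```python
# Minimal sketch of hybrid retrieval: lexical scoring plus vector
# similarity, fused and then passed to a re-ranking stage. All scorers
# and data here are toy stand-ins for illustration only.
from math import sqrt

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def lexical_score(query: str, doc: str) -> float:
    """Toy keyword overlap; a real system would use BM25 or similar."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_search(query, query_vec, corpus, alpha=0.5, top_k=3):
    """Fuse lexical and vector scores, keep top_k candidates for re-ranking."""
    scored = []
    for doc_id, (text, vec) in corpus.items():
        score = alpha * lexical_score(query, text) + (1 - alpha) * cosine(query_vec, vec)
        scored.append((score, doc_id, text))
    return sorted(scored, reverse=True)[:top_k]

# Toy corpus with made-up, pre-computed embeddings.
corpus = {
    "doc1": ("package delivery status and tracking", [0.9, 0.1, 0.0]),
    "doc2": ("quarterly financial results summary", [0.1, 0.8, 0.1]),
    "doc3": ("delivery exceptions for international packages", [0.8, 0.2, 0.1]),
}

candidates = hybrid_search("package delivery", [0.85, 0.15, 0.05], corpus)
# A re-ranker (e.g., a cross-encoder model served as a microservice) would
# then re-score these candidates against the full query before returning results.
for score, doc_id, text in candidates:
    print(f"{doc_id}: {score:.2f} {text}")
```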

Enterprise data security meets AI accessibility

The partnership addresses a critical challenge facing enterprises: how to make their vast stores of private data accessible to AI systems without exposing sensitive information to external language models.

“Take FedEx — 60% of their data sits in our products, including all package delivery information for the past 20 years with personal details. That’s not going to Gemini or OpenAI anytime soon, or ever,” Kapoor explained.

The technology is finding early adoption across industries, with financial services firms leading the charge despite regulatory constraints. “I’ve been blown away by how far ahead financial services firms are now,” said Kapoor, citing Commonwealth Bank of Australia and Capital One as examples.

The next frontier for AI: Multimodal document processing

Looking ahead, Nvidia plans to expand the technology’s capabilities to handle more complex document formats. “We’re seeing great results with multimodal PDF processing — understanding tables, graphs, charts and images and how they relate across pages,” Briski revealed. “It’s a really hard problem that we’re excited to tackle.”

For enterprises drowning in unstructured data while trying to deploy AI responsibly, the new offering provides a path to make their information assets AI-ready without compromising security or breaking the bank on storage costs. The solution is available immediately through the Nvidia API catalog with a 90-day free trial license.
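
As a hypothetical starting point, a call to a hosted retriever embedding model through the API catalog might look like the sketch below. The endpoint URL, model name, and request fields are assumptions for illustration; the catalog entry for each microservice documents the actual model IDs and request schema.

```python
# Hypothetical sketch of calling a hosted embedding microservice from the
# Nvidia API catalog. The endpoint URL, model name, and request fields are
# assumptions for illustration -- consult the catalog entry for the actual
# model ID and schema.
import os
import requests

API_KEY = os.environ["NVIDIA_API_KEY"]  # key issued by the API catalog (assumed env var)
URL = "https://integrate.api.nvidia.com/v1/embeddings"  # assumed OpenAI-style endpoint

payload = {
    "model": "nvidia/nv-embedqa-e5-v5",  # placeholder retriever embedding model
    "input": ["Where is my package?"],
    "input_type": "query",               # query vs. passage embeddings (assumed field)
}

resp = requests.post(
    URL,
    headers={"Authorization": f"Bearer {API_KEY}", "Accept": "application/json"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
embedding = resp.json()["data"][0]["embedding"]
print(len(embedding), "dimensions")
```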

The announcement underscores the growing focus on enterprise AI infrastructure as companies move beyond experimentation to large-scale deployment, with data management and cost efficiency becoming critical success factors.

