Storage Unpacked 259 – Sustainable Storage in the World of AI with Shawn Rosemarin (Sponsored)

Chris Evans | All-Flash, Guest Speakers, Pure Storage, Pure//Accelerate, Sponsored, Storage Unpacked Podcast

In this episode, Chris discusses building sustainable storage solutions with Shawn Rosemarin, Global VP of Customer Engineering at Pure Storage. AI, and specifically generative AI (GenAI), has become a hot topic over the past 12 months. Businesses are looking at projects that use AI internally for productivity gains, but also to drive additional business.

However, AI is still relatively expensive and requires huge volumes of training data. Training is an ongoing process that must react to changes in the data landscape, such as rights and permissions, and government regulation. With AI hardware being so expensive, it's important to get the storage piece right, and that means having a scalable and cost-effective solution. Shawn details how Pure Storage has focused on two aspects: first, the hardware, where DFMs (DirectFlash Modules) have reached 75TB, with commitments to deliver 150TB and 300TB drives in the next few years; second, the software management capability delivered through Purity, the operating system of Pure Storage hardware.

It's clear that building cost- and power-efficient flash devices will be a challenge for the wider industry, where the focus lies on consumer devices. Pure Storage believes it is well positioned to help customers, and potentially hyper-scalers, meet their goals of delivering efficient storage for AI.

As Shawn highlights, this topic and more will be discussed at Pure//Accelerate, to be held in Las Vegas from 18-21 June 2024. Check out the website to learn more.

Elapsed Time: 00:52:08

Timeline

  • 00:00:00 – Intros
  • 00:01:44 – We’ve been quiet on the topic of AI
  • 00:03:10 – AI has become cost-effective (sort of)
  • 00:04:00 – Efficient AI is a 10-15 year journey
  • 00:05:22 – AI technology needs to be efficient due to the resource demands
  • 00:06:41 – Data is currently growing at 30% per annum
  • 00:07:31 – Early mover may not be the best move with AI
  • 00:08:16 – 149 foundational models were released in 2023
  • 00:09:10 – Businesses will want to merge public and private data
  • 00:10:40 – Results accuracy is super-important
  • 00:13:30 – Trusted AI will be adopted in areas like security & vehicle evasive manoeuvres
  • 00:15:10 – Where will AI models be developed?
  • 00:16:37 – Model retraining will be required due to changing data ownership & permissions
  • 00:18:30 – Model training also needs to be resource efficient
  • 00:19:49 – $100 million to do the basic training of an AI model
  • 00:22:26 – How do you feed GPUs with adequate data to run at 100%?
  • 00:24:10 – Edge devices could be used for AI processing
  • 00:25:18 – How will data centres need to evolve for AI?
  • 00:28:08 – Sustainability, regulation and jobs will all be issues in AI deployment
  • 00:31:05 – With HPC, many users built bespoke systems and that’s a problem for AI
  • 00:33:45 – How will businesses “industrialise” their AI projects?
  • 00:36:46 – Storage density will help resolve the operational issues of AI storage
  • 00:38:19 – SSD vendors’ main market is 2TB consumer SSDs
  • 00:39:32 – 300TB drives are great, but how will software manage the hardware?
  • 00:41:51 – Pure Storage DFMs will grow exponentially in capacity
  • 00:42:43 – Hardware engineering is cool again!
  • 00:45:15 – How will the hyper-scalers deal with massive storage growth?
  • 00:51:30 – Wrap Up

Related Podcasts & Blogs


Copyright (c) 2016-2024 Unpacked Network. No reproduction or re-use without permission. Podcast episode #xs2w