Microsoft Announces Azure Data Lake, A Data Repository For Big Data Analytics

Microsoft today announced Azure Data Lake, a new data repository for big data analytics workloads, during its Build developer conference keynote.

The idea behind Data Lake is, as the name implies, to give developers a single place to store all of their structured and semi-structured data in its native format without having to worry about storage capacity or limits on the size of individual files.

Data Lake is compatible with the Hadoop Distributed File System (HDFS), so it will play nicely with all of the standard Hadoop big data tools like Spark, Storm and Kafka, as well as services from Hortonworks, Cloudera and Microsoft’s own Azure HDInsight. Indeed, Microsoft’s corporate vice president for its data platform, T.K. “Ranga” Rengarajan, described this as an “important commitment to the Hadoop ecosystem” when I talked to him about today’s announcements earlier this week.

Rengarajan also stressed that Microsoft wants to allow developers to use the tools they already use, whether that’s SQL and SQL Server or Hadoop. “We want developers to be able to use the tools and frameworks they are familiar with and still be able to do all of this data processing in a friendly way,” he noted.

The service, Rengarajan explained to me, was built on top of Azure’s hyperscale network and supports both single files that can be multiple petabytes in size and high volumes of small writes at very low latency. Because of this, the service should work well for real-time website, Internet of Things and sensor analytics, as well as for more batch-oriented big data services. Overall, though, the service is optimized for big data analytics workloads that require developers to run massively parallel queries.

“We’re living in the golden age of data,” Rengarajan told me. “I think of it as everyone in the world getting drunk on data. And for the first time, it’s economically feasible to get value out of the data we used to throw away.” Because it allows developers to work with a wide variety of data formats, Data Lake is Microsoft’s attempt to allow developers and data scientists to store all of this data in a central repository and then analyze it with tools they are already familiar with.

The service is now in private preview, and interested developers can sign up for access.

Frederic Lardinois

Editor

Frederic was with TechCrunch from 2012 through 2025. He also founded SiliconFilter and wrote for ReadWriteWeb (now ReadWrite). Frederic covers enterprise, cloud, developer tools, Google, Microsoft, gadgets, transportation and anything else he finds interesting.
