October 22nd 2025

Pinterest has big data, and it knows how to share

Pinterest is no National Security Agency, but the company, which identifies itself as a “visual discovery tool,” has grown into a collector of plenty of information. Like Twitter, Facebook, Google, and other web giants, Pinterest has developed sophisticated systems for storing that data, but it’s also built a tool that lets lots of employees get at it.


In a blog post today, Pinterest data engineer Mohammad Shahangian sheds light on the “self-serve platform” he and his colleagues have created for accessing data in Pinterest’s Hadoop clusters, which run in the Amazon Web Services public cloud.

That storage system “enables us to put the most relevant and recent content in front of users through features such as Related Pins, Guided Search, and image processing,” Shahangian wrote. “It also powers thousands of daily metrics and allows us to put every user-facing change through rigorous experimentation and analysis.”

But the team’s self-serve tool is much more than just the widely used Hadoop open-source technology for storing and analyzing lots of different kinds of data. It’s the sort of thing other companies might want to try out, so that more employees in more departments can use data to improve products and make smarter decisions. That concept has gained credence as startups like Platfora and Trifacta have gotten funding while seeking to simplify various stages of the Hadoop data analysis workflow.

Thanks to the efforts of Shahangian and his team, different people at Pinterest can create Hadoop clusters for different needs. That way, Pinterest’s precious data scientists can focus on things other than just getting data out of Hadoop for their colleagues.

Oh, and if you’re wondering how much data Pinterest is dealing with, the company currently throws in 20 TB of new data every day, and about 10 PB of data are in Amazon’s S3 service for persistent storage.

And the trend is clear: Pinners have been processing more and more data in Hadoop.

Source: venturebeat.com