What is Big Data?

Big data is a very large collection of structured and unstructured data, stored so that it can be accessed quickly to support analysis and understanding of business issues. It ranges from relational data typically stored in databases to textual data and log files stored in newer structures such as columnar databases, and it includes every bit of information an enterprise captures.



While big data is stored for fast access, it is not designed to be updated or changed rapidly. In other words, think of it as a data lake.

And that lake is growing rapidly. The amount of data available for analysis has exploded with the proliferation of technology and devices. That means companies that capture, store and analyze this data to gain business insights have a serious competitive advantage.

What is Data Science?

To properly analyze this wealth of information for your business, you need a data scientist: a strong mathematician or statistician who can create models and apply them to stored data. This means the data scientist needs to know how to set up data stores, access them, structure data and analyze it using sophisticated statistical modeling tools. Deploying these models can reveal new insights that drive the business forward in new ways. This is the crux of Data Science.

Who Needs a Data Scientist?

So why don’t all companies have a team of data scientists? It’s because most don’t even know they need them.

Most companies think traditional Business Intelligence (BI), in which data is collected in warehouses, models are created based on business criteria, and results are visualized through reports, is sufficient. While this is true if your only concern is to answer basic questions like which customers are more profitable, it is not enough to deliver the transformative business change that Data Science can.

Data Science takes a different approach from BI: Data Scientists derive insights and models from the data itself by applying statistical and mathematical techniques. The data drives the modeling and insights. When you let the data guide you, you are less likely to use it to support mistaken predispositions or conclusions.
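
To make the contrast concrete, here is a minimal Python sketch. It is only an illustration: the order records, customer names, and the function names `profit_by_customer` and `fit_line` are all invented for this example, and a simple least-squares line stands in for the far richer models a data scientist would actually build. The first function answers a predefined BI-style question (which customers are more profitable), while the second lets the data determine the relationship between two measurements.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration:
# (customer, monthly_visits, profit)
orders = [
    ("acme", 2, 120.0),
    ("acme", 4, 210.0),
    ("birch", 1, 80.0),
    ("birch", 3, 160.0),
    ("cedar", 5, 260.0),
]


def profit_by_customer(rows):
    """BI-style question: total profit per customer, highest first."""
    totals = defaultdict(float)
    for customer, _visits, profit in rows:
        totals[customer] += profit
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def fit_line(xs, ys):
    """Data-driven step: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # The BI report: a ranking defined in advance by business criteria.
    print(profit_by_customer(orders))

    # The model: slope and intercept are estimated from the data.
    visits = [row[1] for row in orders]
    profits = [row[2] for row in orders]
    print(fit_line(visits, profits))
```

The ranking only reports what the business already decided to ask. The fitted line, even in this toy form, estimates how profit changes with visits, which is the kind of relationship the data itself reveals.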