October 26th 2025

Hortonworks buys better Hadoop data flow management

Hortonworks' newest acquisition is a prelude to creating an open-source-based data flow management product

Hadoop vendor Hortonworks, fresh off releasing a new version of its distribution, has acquired a company with a framework Hortonworks wants for handling how data moves into, out of, and next to Hadoop.

The company is Onyara, and the framework (of which Onyara is a commercial supporter) is the Apache NiFi project, a system for graphically diagramming how data can move through a system.

Hortonworks sees NiFi as a way to create a new data platform for Hadoop

Read more ...

Automating Metadata

At the micro level, integration of Big Data and other sources of data involves a degree of Metadata management that is similarly automated. Metadata’s penchant for providing context to different types of data is invaluable when integrating time-sensitive Big Data with other data types. Metadata requirements pertain to regulations, specific business processes and application requirements including those across (and specific to) business units. As Cerrato noted, the point is to:

“…not just manage technical Metadata, but to place it in the larger context of the enterprise—of the different aspects of process, of organization, of governance, of metrics. That is a real true differentiator that is adding value to how people can manage that information.”

Read more ...

Integrating Big Data the Right Way

The true value in incorporating a Big Data initiative into an overall Enterprise Data Management scheme comes from integrating, and in some cases aggregating, external Big Data with more conventional sources of data.

Doing so correctly involves accounting for issues of Data Governance, Metadata Management, traceability, and Semantic consistency that frequently require more than simply dumping data into a single repository as a data lake—which incurs the risk of creating the proverbial data swamp.

The crux of the matter, due to Big Data’s ascending popularity and the efforts of vendors to capitalize on it, is that there are “…

Read more ...

Mesosphere's new big data solution: Add Spark, hold the Hadoop

A data-processing solution from Mesosphere leverages Spark, Kafka, and Cassandra -- but eschews Hadoop -- for enterprise-level real-time big-data needs

Mention big-data tools like Spark and Kafka to most enterprise users, and the other big-data tool that comes to mind along with them is Hadoop. But does it need to?

Mesosphere, the corporate backer of the Apache Mesos cluster-management project, is ginning up a big-data stack that eschews Hadoop, but embraces Spark (and Kafka, and Cassandra, and the Akka event framework) for real-time processing.

Read more ...

Hortonworks looks to 'Internet of Anything' with acquisition

The open source Hadoop distribution vendor aims to bring together streaming analytics and rich historical analytics with the acquisition of the key contributor to the Apache NiFi project and the creation of Hortonworks DataFlow.

Open source Hadoop distribution specialist Hortonworks wants to close the loop on predictive analytics, allowing it to turn what it calls the "Internet of Anything" into actionable insights. To get there, it announced today that it has signed a definitive agreement to acquire Onyara, creator and key contributor to the top-level Apache NiFi open source project.

Read more ...

MapR adds Apache HBase to free online training as big data booms

MapR has extended its free Hadoop on-demand training program to include the Apache HBase design and development curriculum.

Justin Bock, MapR ANZ manager, says the free training classes are open to IT professionals, including resellers, and can lead to HBase certification.

Bock says the company has seen ‘amazing’ momentum with the free Hadoop online on-demand training.

“Now we are adding a complete Apache HBase design and development curriculum that will provide courses that can lead to HBase certification.

Read more ...

CoolaData offers behavioral analytics as a cloud service

Based on the Google Cloud Platform, CoolaData provides a complete data warehouse as an online service

Want to harness the power of business intelligence but don't have a six-figure budget to build a data warehouse? Why not try a cloud service instead?

Israel-based CoolaData is promising to ease the headaches, and start-up costs, of running a data warehouse with a cloud-based analysis service, one based on components of the Google Cloud Platform.

The service is aimed chiefly at mid-sized organizations, ones too small to have a dedicated team of data scientists and administrators to run data warehouses in-house,

Read more ...

Microsoft's Cortana Analytics looks to democratize big data

Cortana Analytics will unite a number of different Microsoft data processing technologies, including machine learning and stream processing

Microsoft has fused a number of its new data processing technologies into a single package designed to help organizations get more value from their growing piles of data.

The new suite, called Cortana Analytics, is designed to "democratize big data," said Microsoft CEO Satya Nadella, during the opening keynote of the company's Worldwide Partner Conference in Orlando, Florida.

The package is designed to "allow any business to transform itself through the power of data," Nadella said.

Read more ...