Automating Metadata
At the micro level, integrating Big Data with other data sources involves a degree of Metadata management that is similarly automated. Metadata’s capacity to provide context for different types of data is invaluable when integrating time-sensitive Big Data with other data types. Metadata requirements pertain to regulations, specific business processes, and application requirements, including those that span business units and those specific to a single unit. As Cerrato noted, the point is to:
“…not just manage technical Metadata, but to place it in the larger context of the enterprise—of the different aspects of process, of organization, of governance, of metrics. That is a real true differentiator that is adding value to how people can manage that information.”
Accounting for Metadata when integrating Big Data across the enterprise enables organizations to take a holistic approach to that integrative process. Furthermore, contemporary governance solutions for Big Data can streamline that process by using Metadata as the basis on which policies are founded, and in turn automating those rules.
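To make the idea of Metadata-driven policy automation concrete, the following is a minimal sketch rather than a description of any particular governance product. The FieldMetadata class, the POLICIES table, and the tag names ("pii", "regulated") are all hypothetical, chosen only to illustrate how a rule can be looked up from Metadata instead of being hard-coded into each integration job.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    """Hypothetical Metadata entry describing one data element."""
    name: str
    business_unit: str
    tags: frozenset = frozenset()


# Hypothetical policy table: Metadata tag -> handling rule.
POLICIES = {
    "pii": "mask",
    "regulated": "audit",
}


def actions_for(meta):
    """Return the handling rules implied by a field's Metadata tags."""
    return sorted(POLICIES[tag] for tag in meta.tags if tag in POLICIES)


def apply_policies(record, catalog):
    """Apply Metadata-derived rules to one record.

    Fields with no catalog entry, or with no 'mask' rule, pass through unchanged.
    """
    result = {}
    for key, value in record.items():
        meta = catalog.get(key)
        actions = actions_for(meta) if meta is not None else []
        result[key] = "***" if "mask" in actions else value
    return result


# Usage: the catalog, not the integration code, decides what gets masked.
catalog = {
    "email": FieldMetadata("email", "marketing", frozenset({"pii", "regulated"})),
}
masked = apply_policies({"email": "user@example.com", "clicks": 3}, catalog)
```

In this sketch, changing a governance rule means editing the Metadata catalog or the policy table; the code that moves the data stays the same.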
Standards-Based Semantics
At the macro level, Big Data integration is based on the rules and responsibilities that are critical to Data Governance. At the micro level, those governance policies are largely determined by the Metadata that provides critical context for the integrated data. At a granular level, that integration is widely predicated upon standards-based Semantics, as are many other critical applications and technologies at the forefront of Data Management today. The Semantics approach taken by the more competitive Big Data Governance solutions is what allows Metadata from Big Data and from other data sources to be integrated in an orderly fashion that upholds the principles of governance. Additionally, Semantics creates a degree of visibility into data elements that allows IT personnel to see, at a granular level, the business terms and definitions that relate to a data element and affect its integration with others. From this perspective, one of the fundamental components of Big Data integration is a standards-based Semantics Metadata repository. Such a repository is essential for providing lineage and transparency for that integration, which helps to reinforce effective Governance.
Business Rules and Traceability
The relationship between business rules, Semantics, and effective integration is a pivotal one, especially when applied to huge quantities of Big Data. As the basis for the context and business terms that influence the integration of data elements, different aspects of Semantics provide a degree of visibility from the business down to IT, and even from IT back up to the business. The specificity that Semantics brings to data elements includes, according to Cerrato, answers to such questions as, “What taxonomy applies to that? Are there Semantics? Are there ontologies and concept models that are relevant, maybe industry standards?” The answers to all of these questions create additional ways of “representing business context and doing gap analysis against industry standards,” Cerrato noted.
The overall effect is increased traceability of data and transparency within a data lake or some other means of integrating Big Data. When it comes to determining the movement of data across any number of different systems, technologies, and applications, this sort of lineage is extremely useful for providing a structured means of keeping track of data that may itself be unstructured. Lineage becomes even more important when it is used to guide and ensure adherence to business rules and governance policies.
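As an illustration of the kind of lineage described here, the sketch below records each movement of data as an edge between two named datasets and then walks those edges backward to find everything a dataset was derived from. It is a simplified assumption of how such tracking might work, not a description of any specific tool; the LineageGraph class and the dataset and rule names are hypothetical.

```python
from collections import defaultdict


class LineageGraph:
    """Hypothetical in-memory record of how datasets feed one another."""

    def __init__(self):
        # target dataset -> set of (source dataset, business rule applied)
        self._parents = defaultdict(set)

    def record(self, source, target, rule):
        """Note that `target` was produced from `source` under `rule`."""
        self._parents[target].add((source, rule))

    def upstream(self, dataset):
        """Return every dataset that `dataset` depends on, directly or not."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for source, _rule in self._parents.get(current, ()):
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen

    def rules_applied(self, dataset):
        """Return the business rules applied directly when building `dataset`."""
        return {rule for _source, rule in self._parents.get(dataset, ())}


# Usage: trace an integrated view back to its raw sources.
graph = LineageGraph()
graph.record("clickstream_raw", "sessions", "deduplicate")
graph.record("sessions", "customer_360", "join_on_customer_id")
graph.record("crm", "customer_360", "join_on_customer_id")
sources = graph.upstream("customer_360")
```

Keeping the business rule on each edge is what ties lineage back to governance: the same structure that shows where data came from also shows which rule governed each step.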
The Benefit of Integration: Operational Data
The four different aspects of Big Data integration outlined in this article (and in Cerrato’s presentation) ultimately create the means for enterprises to combine external and internal data sources in a timeframe that enhances operational data. Those aspects of integration include Data Governance, Metadata Management, Semantics, and the traceability of business rules. Furthermore, they enhance operational data in a way that substantially adds to the meaning of Big Data and the value it can produce when leveraged with traditional data sources. As Cerrato observed:
“All of these things come into play in a way that starts to bring together both small and Big Data worlds…The different areas of focus around Metadata Management, around governance, around ontology management, the management of business rules and decision processing. And all of that brings into play whatever operational sources you have whether they are legacy mainframe, whether they are XML schemas, relational structures, Big Data structures…Being able to layer that whole governance framework on top of any kind operational data you might have through all the different technologies that you might be using.”
Source: http://www.dataversity.net/integrating-big-data-the-right-way/

