The “big data” phenomenon is driving transformational technological, scientific and economic change, and “information taming” technologies are driving down the cost of creating, capturing, managing and storing information.
We’ve all seen how organisations have an insatiable desire for more data as they believe that this information will radically change their businesses.
They are right – but it’s only the effective exploitation of that data, turning it into really useful information and then into knowledge & applied decision making that will realise the true potential of this vast mountain of data.
Incidentally, do you have any idea how much data 1.8 zettabytes really is? It’s about the same amount of data as if every person in the world sent twenty tweets an hour for the next 1200 years!
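As a rough sanity check on that comparison, the arithmetic can be sketched in a few lines of Python. The world population of 7 billion and the decimal definition of a zettabyte (10^21 bytes) are assumptions made here, not figures from the original comparison.

```python
# Back-of-the-envelope check of the "twenty tweets an hour for 1200 years" comparison.
ZETTABYTE = 10**21                    # decimal zettabyte, in bytes (assumption)
WORLD_POPULATION = 7_000_000_000      # assumed round figure, not from the article

total_bytes = 1.8 * ZETTABYTE

# Twenty tweets an hour, every hour, for 1200 years (ignoring leap days).
tweets_per_person = 20 * 24 * 365 * 1200
total_tweets = WORLD_POPULATION * tweets_per_person

# Size each tweet would need to be for the comparison to hold.
bytes_per_tweet = total_bytes / total_tweets

print(f"{total_tweets:.2e} tweets, about {bytes_per_tweet:.0f} bytes each")
```

Under these assumptions the comparison works out if each tweet, including its metadata, occupies a little over a kilobyte.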
Data by itself is useless; it has to be turned into useful information & then have effective business intelligence applied to realise its true potential.
The problem is that big data analytics pushes the limits of traditional data management. Allied to this, the most complex big data problems start with huge volumes of highly volatile data held in disparate stores. Big data problems aren’t just about volume, though; there’s also the volatility of the data sources & their rate of change, the variety of the data formats and the complexity of the individual data types themselves. So is it always the most appropriate route to pull all this data into yet another location for its analysis?
Unfortunately, though, many organisations are constrained by traditional data integration approaches that can slow the adoption of big data analytics.
Approaches which can provide high performance data integration to overcome data complexity & data silos will be those which win through.  These need to integrate the major types of “big data” into the enterprise.  The typical “big data” sources include:
- Key/value Data Stores such as Cassandra,
- Columnar/tabular NoSQL Data Stores such as HBase (on Hadoop) & Hypertable,
- Massively Parallel Processing Appliances such as Greenplum & Netezza, and
- Document Data Stores such as CouchDB (JSON) & MarkLogic (XML).
Fortunately, approaches such as Data Federation / Data Virtualisation, which query data where it lives rather than copying it into yet another store, are stepping up to meet this challenge.
Finally, & of utmost importance, is managing the quality of the data. What’s the use of this vast resource if its quality and trustworthiness are questionable? Thus, driving your data quality capability up the maturity levels is key.
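To make the federation idea concrete, here is a minimal sketch in Python. It is only an illustration: an in-memory SQLite database stands in for a relational warehouse, a plain dict stands in for a key/value store, and the names `warehouse`, `click_counts` and `federated_customer_view` are hypothetical. A real data virtualisation product would add query push-down, caching and security on top of this.

```python
import sqlite3

# Source 1: a relational store (in-memory SQLite stands in for a warehouse).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)

# Source 2: a key/value store (a plain dict stands in for it).
click_counts = {"customer:1": 42, "customer:2": 7}


def federated_customer_view():
    """Yield combined rows by reading each source in place at query time.

    Nothing is copied into a third store; the join happens on the fly.
    """
    query = "SELECT id, name FROM customers ORDER BY id"
    for cust_id, name in warehouse.execute(query):
        yield {
            "id": cust_id,
            "name": name,
            "clicks": click_counts.get(f"customer:{cust_id}", 0),
        }


rows = list(federated_customer_view())
print(rows)
```

The consumer sees one virtual view, while each source stays where it is and keeps its own format and rate of change.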
Data Quality Maturity – 5 levels of maturity
| Level 1 - Initial | Level 2 - Repeatable | Level 3 - Defined | Level 4 - Managed | Level 5 - Optimised |
| --- | --- | --- | --- | --- |
| Limited awareness within the enterprise of the importance of information quality. Very few, if any, processes in place to measure quality of information. Data is often not trusted by business users. | The quality of a few data sources is measured in an ad hoc manner. A number of different tools are used to measure quality. The activity is driven by individual projects or departments. Limited understanding of good versus bad quality. Identified issues are not consistently managed. | Quality measures have been defined for some key data sources. Specific tools adopted to measure quality, with some standards in place. The processes for measuring quality are applied at consistent intervals. Data issues are addressed where critical. | Data quality is measured for all key data sources on a regular basis. Quality metrics information is published via dashboards etc. Active management of data issues through the data ownership model ensures issues are often resolved. Quality considerations baked into the SDLC. | The measurement of data quality is embedded in many business processes across the enterprise. Data quality issues addressed through the data ownership model. Data quality issues fed back to be fixed at source. |
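As an illustration of what “defined quality measures” might look like in practice, here is a minimal sketch in Python. The three metrics (completeness, uniqueness, validity), the sample records and the crude email rule are all assumptions chosen for the example; they are not taken from any particular maturity model or tool.

```python
# Hypothetical sample records with typical defects: a missing value,
# a duplicated key and a malformed email address.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "not-an-email"},
]


def completeness(rows, field):
    """Share of rows where the field is populated."""
    return sum(r[field] is not None for r in rows) / len(rows)


def uniqueness(rows, field):
    """Share of rows that carry a distinct value for the field."""
    return len({r[field] for r in rows}) / len(rows)


def validity(rows, field, predicate):
    """Share of populated values that pass a business rule."""
    values = [r[field] for r in rows if r[field] is not None]
    return sum(predicate(v) for v in values) / len(values)


metrics = {
    "email_completeness": completeness(records, "email"),
    "id_uniqueness": uniqueness(records, "id"),
    "email_validity": validity(records, "email", lambda v: "@" in v),
}
print(metrics)
```

Numbers like these, computed at consistent intervals and published on a dashboard, are the kind of evidence that moves a data source from Level 2 towards Levels 3 and 4.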
 
 