Are Master Data Management and Hadoop a Good Match?

Master Data is the critical electronic information about the company we cannot afford to lose. Accordingly, we should sanitise it, look after it, and store it safely in several separate places that are independent of each other. The advent of Big Data introduced the current era of huge repositories ?in the clouds?. They are not, of course but at least they are remote. This short article includes a discussion about Hadoop, and whether this is a good platform to back up your Master Data.

About Hadoop

Hadoop is an open-source Apache software framework built on the assumption that hardware failure is so common that backups are unavoidable. It comprises a storage area and a management part that distributes the data to smaller nodes where it processes faster and more efficiently. Prominent users include Yahoo! and Facebook. In fact more than half Fortune 50 companies were using Hadoop in 2013.

Hadoop – initially launched in December 2011 ? has survived its baptism of fire and became a respected, reliable option. But is this something the average business owner can tackle on their own? Bear in mind that open source software generally comes with little implementation support from the vendor.

The Hadoop Strong Suite

  • Free to download, use and contribute to
  • Everything you need ?in the box? to get started
  • Distributed across multiple fire-walled computers
  • Fast processing of data held in efficient cluster nodes
  • Massive scaleable storage you are unlikely to run out of

Practical Constraints

There is more to Hadoop than writing to WordPress. The most straightforward solutions are uploading using Java commands, obtaining an interface mechanism, or using third party vendor connectors such as ACCESS or SAS. The system does not replace the need for IT support, although it is cheap and exceptionally powerful.

The Not-Free Safer Option

Smaller companies without in-depth in-house support are wise to engage with a technical intermediary. There are companies providing commercial implementations followed by support. Microsoft, Amazon and Google among others all have commercial versions in their catalogues, and support teams at the end of the line.

Check our similar posts

Migrating from CRM to Big Data

Big data moved to centre stage from being just another fad, and is being punted as the latest cure-all for information woes. It may well be, although like all transitions there are pitfalls. Denizon decided to highlight the major ones in the hope of fostering better understanding of what is involved.

Accurate data and interpretation of it have become increasingly critical. Ideas Laboratory reports that 84% of managers regard understanding their clients and predicting market trends essential, with accelerating demand for data savvy people the inevitable result. However Inc 5000 thinks many of them may have little idea of where to start. We should apply the lessons learned from when we implemented CRM because the dynamics are similar.

Be More Results Oriented

Denizon believes the key is focusing on the results we expect from Big Data first. Only then is it appropriate to apply our minds to the technology. By working the other way round we may end up with less than optimum solutions. We should understand the differences between options before committing to a choice, because it is expensive to switch software platforms in midstream. data lakes, hadoop, nosql, and graph databases all have their places, provided the solution you buy is scalable.

Clean Up Data First

The golden rule is not to automate anything before you understand it. Know the origin of your data, and if this is not reliable clean it up before you automate it. Big Data projects fail when executives become so enthused by results that they forget to ask themselves, ?Does this make sense in terms of what I expected??

Beware First Impressions

Big Data is just that. Many bits of information aggregated into averages and summaries. It does not make recommendations. It only prompts questions and what-if?s. Overlooking the need for the analytics that must follow can have you blindly relying on algorithms while setting your business sense aside.

Hire the Best Brains

Big Data?s competitive advantage depends on what human minds make with the processed information it spits out. This means tracing and affording creative talent able to make the shift from reactive analytics to proactive interaction with the data, and the customer decisions behind it.

If this provides a d?j? vu moment then you are not alone. Every iteration of the software revolution has seen vendors selling while the fish were running, and buyers clamouring for the opportunity. Decide what you want out first, use clean data, beware first impressions and get your analytics right. Then you are on the way to migrating successfully from CRM to Big Data.

Contact Us

  • (+353)(0)1-443-3807 – IRL
  • (+44)(0)20-7193-9751 – UK
The Rights of Individuals Under The General Data Protection Regulation

The General Data Protection Regulation or GDPR is a European Union law reinforcing the rights of citizens concerning the confidentiality of their information, and confirming that they own it. We thought it would be interesting to examine the GDPR effective 25 May 2018 from an Irish citizen?s perspective. This article is a summary of information on the Data Protection Commissioner?s website, but as viewed through a businessperson?s lens.

How the Office Defines Data Protection

The Office believes that organisations receiving personal details have a duty to keep them private and safe. This applies inter alia to information that individuals supply to government, financial institutions, insurance companies, medical providers, telecoms services, and lenders. It also applies to information provided when they open accounts.

This information may be on paper, on computers, or in video, voice, or photographic records. The true owners of this information, the individuals have a right:

  • To make sure that it is factually correct
  • To the assurance that it is shared responsibly
  • That all with access only use it for stated purposes

Any organisation requesting personal information must state who they are, what the information is for, why they need to have it, and to whom else they may provide it.

Consumer Rights to Access Their Personal Information

Private persons have a right under the GDPR to a copy of all their information held or processed by a business. The regulation refers to such businesses as ?data controllers? as opposed to owners, which is interesting. They have to provide both paper and digital data, and ‘related information?.

Data controller fees for this are discretionary within limits. The request may be denied under certain circumstances. The data controller may release information about children to parents and guardians, only if it considers a minor too young to understand its significance. Other third parties such as attorneys must prove they have consent.

Consumer Rights to Port Their Data to Different Services

Since the personal information belongs to the individual, they have a right not only to access it, but also to copy or move it from one digital environment to another. The GDPR requires this be ?in a safe way, without hindrance to usability?. An application could be a banking client that wants to upload their transaction history to a third party price comparison website.

However, the right to data portability only applies to data originally provided by the consumer. Moreover, an automated method must be available for porting. Data controllers must release the information in an open format, and may not charge for the porting service.

Consumer Rights to Complain About Personal Data Abuse

Individuals have a right under the General Data Protection Regulation to have their information rectified if they discover errors. This right extends to an assurance that third parties know about the changes – and who these third party entities are. Data controllers must respond within one month. If they decline the request, they must inform the complainant of their right to further remedial action.

If a data controller refuses to release personal information to the owner, or to correct errors, then the Data Protection Office has legal power to enforce the consumer?s rights. The complainant must make full disclosure of the history of their complaint, and the steps they have taken themselves to attempt to set things right.

Further Advice on Getting Things Ready for 25 May 2018

The General Data Protection Regulation has the full force of law from 25 May 2018 onward, and supersedes all applicable Irish laws, regulations, and policies from that date. We recommend incorporating rights of data owners who are also your customers into your immediate plans. We doubt that forgetting to do so will cut much sway with the Data Commissioner. Remember, you have one month to respond to consumer requests, and only one more month to close things out subject to the matter being complex.

Directions Hadoop is Moving In

Hadoop is a data system so big it is like a virtual jumbo where your PC is a flea. One of the developers named it after his kid?s toy elephant so there is no complicated acronym to stumble over. The system is actually conceptually simple. It has loads of storage capacity and an unusual way of processing data. It does not wait for big files to report in to its software. Instead, it takes the processing system to the data.

The next question is what to do with Hadoop. Perhaps the question would be better expressed as, what can we do with a wonderful opportunity that we could not do before. Certainly, Hadoop is not for storing videos when your laptop starts complaining. The interfaces are clumsy and Hadoop belongs in the realm of large organisations that have the money. Here are two examples to illustrate the point.

Hadoop in Healthcare

In the U.S., healthcare generates more than 150 gigabytes of data annually. Within this data there are important clues that online training provider DeZyre believes could lead to these solutions:

  • Personalised cancer treatments that relate to how individual genomes cause the disease to mutate uniquely
  • Intelligent online analysis of life signs (blood pressure, heart beat, breathing) in remote children?s hospitals treating multiple victims of catastrophes
  • Mining of patient information from health records, financial status and payroll data to understand how these variables impact on patient health
  • Understanding trends in healthcare claims to empower hospitals and health insurers to increase their competitive advantages.
  • New ways to prevent health insurance fraud by correlating it with claims histories, attorney costs and call centre notes.

Hadoop in Retail

The retail industry also generates a vast amount of data, due to consumer volumes and multiple touch points in the delivery funnel. Skillspeed business trainers report the following emerging trends:

  • Tracing individual consumers along the marketing trail to determine individual patterns for different demographics and understand consumers better.
  • Obtaining access to aggregated consumer feedback regarding advertising campaigns, product launches, competitor tactics and so on.
  • Staying with individual consumers as they move through retail outlets and personalising their experience by delivering contextual messages.
  • Understanding the routes that virtual shoppers follow, and adding handy popups with useful hints and tips to encourage them on.
  • Detecting trends in consumer preferences in order to forecast next season sales and stock up or down accordingly.

Where to From Here?

Big data mining is akin to deep space research in that we are exploring fresh frontiers and discovering new worlds of information. The future is as broad as our imagination.?

Contact Us

  • (+353)(0)1-443-3807 – IRL
  • (+44)(0)20-7193-9751 – UK

Ready to work with Denizon?