Are Master Data Management and Hadoop a Good Match?

Master Data is the critical electronic information about the company that we cannot afford to lose. Accordingly, we should sanitise it, look after it, and store it safely in several separate places that are independent of each other. The advent of Big Data introduced the current era of huge repositories “in the clouds”. They are not, of course, but at least they are remote. This short article discusses Hadoop and whether it is a good platform for backing up your Master Data.

About Hadoop

Hadoop is an open-source Apache software framework built on the assumption that hardware failure is common and must be handled automatically, which it does by keeping replicated copies of the data. It comprises a storage layer and a processing layer that distributes data and work across many smaller nodes, where they are processed in parallel. Prominent users include Yahoo! and Facebook. In fact, more than half of the Fortune 50 companies were using Hadoop in 2013.
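
To make the idea of distributing data to nodes and combining their results more concrete, here is a minimal sketch in plain Python. It does not use Hadoop itself; the function names and the sample records are hypothetical, and the "nodes" are simply lists held in memory.

```python
from collections import Counter
from functools import reduce

def split_into_blocks(records, num_nodes):
    """Deal records out round-robin, one list per (hypothetical) node."""
    return [records[i::num_nodes] for i in range(num_nodes)]

def map_block(block):
    """Work done locally on one node: count the words in its own block."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts

def reduce_counts(partials):
    """Combine the per-node partial counts into one overall total."""
    return reduce(lambda a, b: a + b, partials, Counter())

# Illustrative records only
records = ["customer alpha", "customer beta", "supplier alpha"]
blocks = split_into_blocks(records, num_nodes=2)
totals = reduce_counts(map_block(block) for block in blocks)
print(totals["customer"], totals["alpha"])
```

Each block is counted independently, which is what allows the work to be spread over many machines; only the small partial results need to be brought together at the end.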

Hadoop, which reached its 1.0 release in December 2011, has survived its baptism of fire and become a respected, reliable option. But is this something the average business owner can tackle on their own? Bear in mind that open source software generally comes with little implementation support from the vendor.

The Hadoop Strong Suit

  • Free to download, use and contribute to
  • Everything you need “in the box” to get started
  • Distributed across multiple firewalled computers
  • Fast processing of data held in efficient cluster nodes
  • Massive scalable storage you are unlikely to run out of

Practical Constraints

There is more to using Hadoop than, say, writing to WordPress. The most straightforward ways to load data are uploading it using Java commands, obtaining an interface mechanism, or using third-party vendor connectors such as ACCESS or SAS. The system is cheap and exceptionally powerful, but it does not replace the need for IT support.
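
As a rough illustration of what an upload can look like, the sketch below wraps Hadoop's file system shell command `hdfs dfs -put` in a small Python helper. This is a swapped-in route rather than the Java commands mentioned above, and it assumes a Hadoop client is installed and configured on the machine. The file name and target directory are hypothetical.

```python
import subprocess

def build_put_command(local_path, hdfs_dir):
    """Return the argument list that copies one local file into HDFS."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_dir]

def upload(local_path, hdfs_dir, dry_run=True):
    """Run the upload, or only return the command when dry_run is True."""
    cmd = build_put_command(local_path, hdfs_dir)
    if not dry_run:
        # Raises CalledProcessError if the hdfs client reports a failure
        subprocess.run(cmd, check=True)
    return cmd

# Hypothetical paths; dry_run=True means nothing is executed here
print(upload("customers.csv", "/backups/master"))
```

Keeping the dry run as the default lets someone review the exact command before any data leaves the local machine.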

The Not-Free Safer Option

Smaller companies without in-depth in-house support are wise to engage with a technical intermediary. Several companies provide commercial implementations backed by support. Microsoft, Amazon and Google, among others, all have commercial versions in their catalogues, and support teams at the end of the line.

IT Systems Implementation

Are you ready to find out how your newly accepted IT system fares in the real world? Although a rigorous Acceptance testing process can spot a wide spectrum of flaws in a newly constructed IT system, there is no way it can identify all possible defects. The moment the IT system is delivered into the hands of actual end users and other stakeholders, it is effectively stepping out of a controlled and secure environment.

Thus, it is during this phase that issues with a direct impact on the business can arise.

It is our duty to ensure that the Systems Implementation phase is carried out as thoroughly, professionally, and efficiently as possible.

Thoroughly, because we need to include all relevant data and other deliverables, eliminate hard-to-detect miscalculated results, and substantially reduce the probability of business- and mission-critical issues popping up in the future;

Professionally, because it is the best way to address the sensitive process of turning over a new system to users who have gotten used to the old one;

And efficiently, because we want to minimise the duration over which all stakeholders have to adapt to the new system and allow them to move on to the process of growing the business.

Preparation

Louis Pasteur once said, “Luck favours the mind that is prepared.”

While we certainly won’t leave anything to chance, we do put substantial weight on the Preparation stage of Systems Implementation. We’re so confident in the strategies we employ in Preparation that we can assure you of an utterly seamless Deployment and Transition phase.

By this we mean that issues that may arise during Deployment and Transition will be handled smoothly and efficiently because your people will know exactly what to do.

Here’s how we will prepare your organisation for Deployment:

  • Identify all key players for the Systems Implementation phase and orient them on their specific roles. We’ll make sure they know what possible hitches may come their way and how to deal with them.
  • Identify all end users and their corresponding functions, then assign appropriate access rights.
  • Draw multi-layered contingency plans to capture and address each possible concern that may crop up during Deployment.
  • Prepare a systematic step-by-step procedure and checklist for the entire Deployment stage. Both should be modelled on the procedure and checklist used in the Acceptance testing phase.
  • Make all stakeholders understand the conditions required before Deployment can commence.
  • Set the appropriate environment so that all stakeholders know what to expect and when to expect it the moment Deployment commences.
  • Prepare Technical Services and Technical Support personnel for the gruelling mission ahead.
  • Make sure all communication processes are well coordinated so that everyone affected will know who to contact and how to get in touch with them when a problem arises.
  • Plan and schedule training sessions so that they can be conducted “just in time”. Training sessions conducted way ahead of Deployment are often useless because the trainees tend to forget what they learned by the time they need to apply it. Similarly, training sessions conducted way after Deployment also become useless because trainees are seldom able to internalise instructions delivered during crash courses.

Deployment

There are two sets of issues to keep an eye on during Deployment:

  1. Issues directly related to the technology itself, e.g. application functionality and data integrity, and
  2. Issues emanating from the end users, i.e., their unwillingness to use the new system. One reason may be that they find the interface and procedures too confusing. Another may be the other inconveniences that come with adapting to a new set of procedures.

Despite all the meticulous scrutiny employed during Acceptance testing, some problems become obvious only during Deployment. Issues belonging to the first set are dealt with easily because of the plans and procedures we put in place during the Preparation stage. As an added measure, our team will be on hand to make sure contingency plans are executed accordingly.

While the second set of issues is often neglected by many IT consultancy companies, we choose to meet it head on.

We fully understand that end users are most sensitive to the major changes that accompany a new system. It is precisely for this reason that our training activities during Deployment are designed not only to educate them but also to make them fully appreciate the necessity of both the new system and the familiarisation phase they will need to go through.

The faster we can bring your end users to accept the new system, the faster they can refocus on your company’s business objectives.

Here’s what we’ll do to guarantee the smoothest Deployment process you’ve ever experienced.

  • Employ the procedure and checklist formulated during the Preparation stage.
  • Ensure all end users are well acquainted with any additional tasks they would need to perform (e.g. filling in manual logs).
  • Assess which legacy systems can still be used alongside the new technology and which ones have to be retired.
  • Supervise the installation and optimal configuration of all supporting hardware and software to make sure the likelihood of errors originating from them is brought to near-zero levels.
  • Supervise the installation and optimal configuration of the products themselves.
  • Carry out data migration tasks if necessary.
  • Organise and oversee parallel runs to check for data and report inconsistencies.
  • Conduct training sessions in a professional and well-timed manner to ease end users’ agitation and to make the most of how long they retain what they learn about their assigned duties and responsibilities.
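
The parallel runs mentioned in the list above can be checked along the following lines. This is a minimal sketch that assumes both the legacy and the new system can export the same period's figures as key/value pairs; the function name, the invoice keys and the amounts are all hypothetical.

```python
def compare_parallel_run(legacy, new):
    """Compare {key: value} outputs produced by both systems for one period."""
    legacy_keys, new_keys = set(legacy), set(new)
    return {
        # Records the legacy system produced but the new one did not
        "missing_in_new": sorted(legacy_keys - new_keys),
        # Records that exist only in the new system
        "unexpected_in_new": sorted(new_keys - legacy_keys),
        # Records present in both whose values disagree
        "mismatched": sorted(
            key for key in legacy_keys & new_keys if legacy[key] != new[key]
        ),
    }

# Illustrative figures only
legacy_totals = {"INV-001": 120.00, "INV-002": 75.50, "INV-003": 10.00}
new_totals = {"INV-001": 120.00, "INV-002": 75.05, "INV-004": 42.00}
report = compare_parallel_run(legacy_totals, new_totals)
print(report)
```

An empty report for every category is the signal that the new system reproduced the legacy results for that run.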

Transition

Do you often feel uneasy whenever the reins to a newly purchased IT system are handed over to you? Perhaps there are some issues that you feel haven’t been fully settled but, at the same time, you find it too late to back out, having already invested so much time and resources.

Alright, so maybe the thought of “backing out” never crossed your mind. However, the concern of being “not yet ready” is raised by many organisations towards the tail end of most Deployment stages. This usually drags the Deployment stage into a never-ending process.

Our team of highly experienced specialists will make sure you reach this point with the confidence to proceed on your own.

To wrap up our comprehensive IT Systems Implementation offering, we’ll take charge of the following:

  • Verify that all deliverables, including training materials and other technical documentation, are complete and that expected outcomes are realised.
  • Make sure all technical documentation is placed in a secure and accessible location.
  • Institute best practices to ensure the IT system becomes fully utilised and to reduce its exposure to avoidable risks.
  • Establish open communication lines with the Technical Support team to enable quick resolution of issues.
  • Ensure knowledge transfer is complete so that your people will spend less time calling Technical Support and more on operations that contribute to business growth.

Data Leakage Prevention – Protecting Sensitive Information

When DuPont lost $400 million in intellectual property, it wasn’t because a hacker from the other side of the world infiltrated their system. The information was simply stolen by a former employee. Alarmingly, data loss incidents are not always caused by deliberate actions.

A file containing personal information accidentally attached to an email and sent to multiple recipients; financial data stored in a USB pen drive, accidentally left in a restaurant; or bank account data of colleagues, inadvertently posted on a company website – these are also some of the everyday causes of data loss.

A report by research company Infowatch on global data leaks in 2010 showed that there were actually more accidental data leaks that year than intentional ones. Accidental leaks comprised 53%, while intentional leaks comprised 42% (the rest were unidentified).

But even if they “only” happened accidentally, breach incidents like these can still be very costly. The tens of thousands of dollars that you could end up paying in civil penalties (as when you lose other people’s personal information) can be just the beginning. More costly than this is the loss of customer and investor confidence. Once you lose those, you could consequently lose a considerable portion of your business.

Confidential information that may already be leaking out right under your nose

With all the data you collect, process, exchange, and store electronically every day, your IT system has surely now become a storehouse of sensitive information. Some of it you may even be taking for granted.

But imagine what would happen if any of the following trade secrets fell into the wrong hands: marketing plans, confidential customer information, pricing data, product development strategies, business plans, supplier information, source code, and employee salaries.

These are not the only kinds of data that you should be worried about. You could also get into trouble if sloppy IT security fails to protect employee or client personal information such as names; social security numbers; driver’s licence numbers; or bank account numbers and credit/debit card numbers along with their corresponding PINs.

In some countries, you could face onerous data breach notification requirements and heavy fines when these kinds of data are involved.

There are now more holes to plug

It’s not just the different varieties of sensitive electronic information that you have to worry about. Because these data can take on different forms, i.e. data-at-rest, data-in-motion, and data-at-the-endpoints, you also need to take aim at different areas in your IT system.

Sensitive information can be found “at rest” on each of your employees’ hard disks, on your servers and storage disks, and on off-site backup disks. It can also be found “in motion” in email, instant messaging, social networking messages, P2P file sharing, FTP, HTTP, and so on.

That’s not all. Your highly mobile workforce may have already introduced yet another high-risk area into your system: data-at-the-endpoints. This includes USB flash-disks, laptops, portable hard disks, CDs, and even smartphones.

The main challenge of data leak prevention

Having been made aware of the various aspects of data leakage, have you already come to grips with the extent of the task at hand?

There are two major things you need to do here to prevent data leakage.

One, you need to identify what data you have that can be considered as sensitive/confidential information. Of course you have financial information and employee salaries in your files. But do you also store personally identifiable information? Do you have trade secrets that are stored in electronic form?

Two, you need to pinpoint their locations. Are they only on your hard disks and laptops? Or have they made their way to flash drives, CDs/DVDs, or portable HDDs? Are they being transmitted through email or any other file transfer media?
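
The two steps above, identifying sensitive data and pinpointing where it lives, can be sketched as a simple pattern scan. The patterns, the locations and the sample text below are hypothetical, and real products use far richer detection rules than these two regular expressions.

```python
import re

# Hypothetical, deliberately simple patterns
PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_like": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def find_sensitive(documents):
    """documents maps a location to its text; returns (location, kind) hits."""
    hits = []
    for location, text in documents.items():
        for kind, pattern in PATTERNS.items():
            if pattern.search(text):
                hits.append((location, kind))
    return hits

# Illustrative content only
documents = {
    "laptop/hr/payroll.txt": "Employee 078-05-1120 salary 52000",
    "email/outbox/msg17.txt": "Card 4111 1111 1111 1111 exp 01/30",
    "server/docs/menu.txt": "Lunch is served at 12:30",
}
hits = find_sensitive(documents)
print(hits)
```

The result pairs each kind of sensitive data with the place it was found, which is the inventory the rest of the protection effort depends on.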

You need to know what your sensitive data are, as well as where they are, so that all efforts to secure them can be as efficient and unobtrusive as possible.

Let’s say, as a way of protecting your data, you decide to implement encryption. Since encryption adds processing overhead and can noticeably reduce performance, it may be impractical to encrypt your entire database or all your files. For the same reason, you wouldn’t want to encrypt every single email that you send.

Thus, the best way would be to encrypt only the data that really need encryption. But again, you need to know what data need to be encrypted and where those data can be found. That alone is no simple task.

Not only will you need to deal with the data you already have, you will also have to worry about the data that will go through your systems during the course of your day-to-day transactions.

Identifying sensitive data as it enters or leaves your system, goes through your network, or gets stored in your file system or database, and then applying the necessary security actions, should be done automatically and intelligently. Otherwise, you could end up spending a lot of man-hours or, worse, wasting them on false positives and false negatives.
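
One common way to cut down false positives when looking for payment card numbers is to apply the Luhn checksum, so that arbitrary long digit strings such as order numbers are not flagged. The text above does not prescribe any particular method; the sketch below is offered only as an example of checking "intelligently", and the sample numbers are well-known test values or made up.

```python
def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False  # outside the usual length range for card numbers
    total = 0
    # Walk from the rightmost digit, doubling every second one
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

# A widely used test card number, and a made-up order number
print(luhn_valid("4111 1111 1111 1111"))
print(luhn_valid("1234 5678 9012 3456"))
```

A candidate that fails the checksum can be dropped before any costly action, such as encryption or blocking an email, is taken on it.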
