Are Master Data Management and Hadoop a Good Match?

Master Data is the critical electronic information about the company we cannot afford to lose. Accordingly, we should sanitise it, look after it, and store it safely in several separate places that are independent of each other. The advent of Big Data introduced the current era of huge repositories ?in the clouds?. They are not, of course but at least they are remote. This short article includes a discussion about Hadoop, and whether this is a good platform to back up your Master Data.

About Hadoop

Hadoop is an open-source Apache software framework built on the assumption that hardware failure is so common that backups are unavoidable. It comprises a storage area and a management part that distributes the data to smaller nodes where it processes faster and more efficiently. Prominent users include Yahoo! and Facebook. In fact more than half Fortune 50 companies were using Hadoop in 2013.

Hadoop – initially launched in December 2011 ? has survived its baptism of fire and became a respected, reliable option. But is this something the average business owner can tackle on their own? Bear in mind that open source software generally comes with little implementation support from the vendor.

The Hadoop Strong Suite

  • Free to download, use and contribute to
  • Everything you need ?in the box? to get started
  • Distributed across multiple fire-walled computers
  • Fast processing of data held in efficient cluster nodes
  • Massive scaleable storage you are unlikely to run out of

Practical Constraints

There is more to Hadoop than writing to WordPress. The most straightforward solutions are uploading using Java commands, obtaining an interface mechanism, or using third party vendor connectors such as ACCESS or SAS. The system does not replace the need for IT support, although it is cheap and exceptionally powerful.

The Not-Free Safer Option

Smaller companies without in-depth in-house support are wise to engage with a technical intermediary. There are companies providing commercial implementations followed by support. Microsoft, Amazon and Google among others all have commercial versions in their catalogues, and support teams at the end of the line.

Check our similar posts

A Small External Enterprise Development Team is Cheaper than Your Own

Time is money in the application development business. We have to get to market sooner so someone else does not gazump us, and pip us at the post. We increase the likelihood of this with every delay. Moreover, the longer your in-house team takes to get you through the swamp, the higher the project cost to you.

Of course, in theory this should not be the case. Why bring in a team from outside, and pay more to support their corporate structure? Even going for a contract micro team ought not to make financial sense, because we have to fund their mark-up and their profit taking. Our common sense tells us that this is crazy. But, hold that thought for a minute. What would you say if a small external enterprise development team was actually cheaper? To achieve that, they would have to work faster too.

The costs of an Enterprise Internal Development Team

Even if you were able to keep your own team fully occupied ? which is unlikely in the long term ? having your own digital talent pool works out expensive when you factor in the total cost. Your difficulties begin with the hiring process, especially if you do not fully understand the project topic, and have to subcontract the hiring task.

If you decide to attempt this yourself, your learning curve could push out the project completion date. Whichever way you decide to go, you are up for paying advertising, orientation training, technical upskilling, travel expenses, and salaries all of which are going to rob your time. Moreover, a wrong recruitment decision would cost three times the new employee?s annual salary, and there is no sign of that changing.

But that is not all, not all by far. If want your in-house team to keep their work files in the office, then you are going to have to buy them laptops, plus extra screens so they can keep track of what they are doing. Those laptops are going to need desks, and those employees, chairs to sit in. Plus, you are going to need expensive workspace with good security for your team?s base.

If we really wanted to lay it on, we would add software / cloud costs, telephony, internet access, and ongoing technical training to the growing pile. We did a quick scan on PayScale. The median salary of a computer programmer in Ireland is ?38,000 per year and that is just the beginning. If you need a program manager for your computer software, their salary will be almost double that at ?65,000 annually.

Advantages of R&D outsourcing

The case for a small externally sourced enterprise development team revolves around the opportunity cost ? or loss to put in bluntly ? of hiring your own specialist staff for projects. If you own a smaller business with up to 100 people, you are going to have to find work for idle digital fingers, after you roll out your in-house enterprise project. If you do not, you head down the road towards owning a dysfunctional team lacking a core, shared objective to drive them forward.

Compared to this potential extravagance, hiring a small external enterprise development team on an as-needed basis makes far more sense. Using a good service provider as a ?convenience store? drives enterprise development costs down through the floor, relative to having your own permanent team. Moreover, the major savings that arise are in your hands and free to deploy as opportunities arise. A successful business is quick and nimble, with cash flow on tap for R & D.

8 Best Practices To Reduce Technical Debt

When past actions in software development return to haunt you…

Is your business being bogged down by technical debt? Let’s look at measures that you can take to reduce it and scale your operations without the weight pulling you back. 

 

Work with a flexible architecture.

Right from the word go, you want to use architecture whose design is malleable, especially with the rapid rate of software evolution witnessed today. Going with an architecture that keeps calling for too much refactoring, or whose design won’t accommodate future changes will leave you with costly technical debt. Use scalable architecture that allows you to modify or add new features in future releases. While on this, complex features required in the final product should be discussed at the planning stage, that way simplified solutions that will be easier to implement can be identified, as this will lead to less technical debt in the long run. 

 

The Deal with Refactoring 

This is basically cleaning up the code structure without changing its behaviour. With the updates, patches, and new functionalities that are added to the systems and applications, each change comes with the threat of more technical debt. Additionally, organisations are increasingly moving their IT infrastructure from on-premises facilities to colocation data centres and deploying them on the cloud. In such scenarios, some workarounds are often needed to enable the systems to function in the new environments, which they hadn’t been initially developed to accommodate. Here, you will need to take some time to refactor the existing system regularly, streamlining the code and optimizing its performance – and this will be key to pay down the tech debt. When working with a flexible architecture from the start, the amount of work that goes into this will be reduced, meaning there’ll be less tech debt involved. 

 

Run discovery tests

Discovery testing essentially takes place even before a line of code is written for the system or application. This takes place at the product definition stage, where human insight software is used to understand the needs of the customer and is particularly helpful in setting priorities for the development work that will be carried out. It gives your business the opportunity to minimize the technical debt by allowing customers to give you a roadmap of the most pertinent features desired from the product. 

 

Routine code review

Getting a fresh look at the product or application from different sets of eyes in the development team will improve the quality of the code, thus reducing technical debt. There’s a catch though – this should be planned in a convenient way that doesn’t end up becoming a burden for the developers. Here are suggestions:

Break down pull requests

Instead of having complex pull requests where numerous changes in the code are introduced at a go, have this broken down into smaller manageable pull requests, each with a brief title and description about it. This will be easier for the code reviewer to analyse. 

● Define preferred coding practices

Documenting the preferred coding style will result in cleaner code, meaning the developers will focus their effort on reviewing the code itself, not losing time on code format debates.

 

Test automation

Relying only on scheduled manual testing opens you up to the risk of technical debt accruing rapidly, and not having sufficient resources to deal with the accumulated problems when they are identified. Automated testing on the other hand enables issues to be uncovered quicker, and with more precision. For instance, you can have automated unit tests that look at the functioning of the individual components of a system, or regression testing where the focus is on whether the code changes that have been implemented have affected related components of the system. However, establishing and maintaining automated testing will require quite some effort – making it more feasible for the long-term projects.

 

Keep a repository that tracks changes made

Do you have a record of changes made in the software? Keeping one in a repository that is accessible by the development team will make it easy to pin-point problems at their source. For instance, when software is being migrated to a new environment, or legacy software is in the process of being modernised, you will want to have an accurate record of changes that are being introduced, that way if there is an undesired impact on the system this it will be easier to zero-down on the cause.

 

Bring non-technical stakeholders on board

Does this conversation sound familiar?

Development Team: “We need to refactor the messy code quickly”

Product Team: “We have no idea what you are saying”

On one hand, you have the management or product team defining the product requirements, creating a project roadmap, and setting its milestones. On the other hand, there’s the software development/engineering that’s primarily focused on the product functionality, technical operations and clearing the backlog in code fixes. Poor communication between the two teams is actually a leading cause of technical debt.

For you to take concrete steps in managing your technical debt, the decision-makers in the organisation should understand its significance, and the necessity of reducing it. Explain to them how the debt occurred and why steps need to be taken to pay it down – but you can’t just bombard them with tech phrases and expect them to follow your thought process. 

So how do you go about it? Reframe the issues involved with the technical debt and explain the business value or impact of the code changes. Basically, the development team should approach it from a business point of view, and educate the management or production team about the cost of the technical debt. This can include aspects such as expenses in changing the code, salaries for the software engineers especially when the development team will need to be increased due to the workload piling up, as well as the revenue that is lost when the technical debt is allowed to spiral. 

The goal here is to show the management or production team how issues like failing to properly define the product requirements will slow down future software development, or how rushing the code will affect the next releases. That way, there will be better collaboration between the teams involved in the project. 

 

Allocate time and resources specifically for reducing technical debt

With management understanding that working with low-quality code is just like incurring financial debt and it will slow down product development, insist on setting time to deal with the debt. 

For instance, when it comes to the timing of application releases, meetings can be conducted to review short- and longer-term priorities. These meetings – where the development team and product team or management are brought together, the developers point out the software issues that should be resolved as a priority as they may create more technical debt. Management then ensures that budgets and plans are put in place to explicitly deal with those ongoing maintenance costs.

 

Retire old platforms

While most of the resources are going into developing new applications and improving the systems being used, the organisation should also focus on retiring the old applications, libraries, platforms, and the code modules. It’s recommended that you factor this into the application release plans, complete with the dates, processes and costs for the systems involved. 

 

Total overhaul

When the cost and effort of dealing with the technical debt far outweighs the benefits, then you may have to replace the entire system. At this tipping point, you’re not getting value from the technical debt, and it has become a painful issue that’s causing your organisation lots of difficulties. For instance, you may be dealing with legacy software where fixing it to support future developments has simply become too complicated. The patches available may only resolve specific issues with the system, and still leave you with lots of technical debt. Here, the best way out is to replace the system in its entirety. 

 

Final thoughts

Every software company has some level of tech debt. Just like financial debt, it is useful when properly managed, and a problem when ignored or allowed to spiral out of control. It’s a tradeoff between design/development actions and business goals. By taking measures to pay down your organization’s debt and address its interest as it accrues, you will avoid situations where short term solutions undermine your long-term goals. This is also key to enable your business to transition to using complex IT solutions easier, and even make the migration between data centres much smoother. These 8 measures will enable you to manage your technical debt better to prevent it from being the bottleneck that stifles your growth.

Data Leakage Prevention – Protecting Sensitive Information

When DuPont lost $400 million in intellectual property, it wasn’t because a hacker from the other side of the world infiltrated their system. The information was simply stolen by a former employee. Alarmingly, data loss incidents are not always caused by deliberate actions.

A file containing personal information accidentally attached to an email and sent to multiple recipients; financial data stored in a USB pen drive, accidentally left in a restaurant; or bank account data of colleagues, inadvertently posted on a company website – these are also some of the everyday causes of data loss.

A report done by research company Infowatch regarding global data leaks in 2010 showed that there were actually more accidental data leaks in that year compared to intentional ones. Accidental leaks comprised 53%, while intentional leaks comprised 42% (the rest were unidentified).

But even if they ?only? happened accidentally, breach incidents like these can still be very costly. The tens of thousands of dollars that you could sometimes end up paying in civil penalties (as in the case when you lose other people?s personal information) can just be the beginning. More costly than this is the loss of customer and investor confidence. Once you lose those, you could consequently lose a considerable portion of your business.

Confidential information that may already be leaking out right under your nose

With all the data you collect, process, exchange, and store electronically every day, your IT system has surely now become a storehouse of sensitive information. Some of them, you may be even taking for granted.

But imagine what would happen if any of the following trade secrets fell into the wrong hands: marketing plans, confidential customer information, pricing data, product development strategies, business plans, supplier information, source codes, and employee salaries.

These are not the only kind of data that you should be worried about. You could also get into trouble if your sloppy IT security fails to protect employee or client personal information such as their names; social security numbers; drivers license numbers; or bank account numbers and credit/debit card numbers along with their corresponding PINs.

In some countries, you could face onerous data breach notification requirements and heavy fines when these kind of data are involved.

There are now more holes to plug

It’s not just the different varieties of sensitive electronic information that you have to worry about. Because these data can take on different forms, i.e. data-at-rest, data-in-motion, and data-at-the-endpoints, you also need to take aim at different areas in your IT system.

Sensitive information can be found ?at rest? in each of your employees? hard disks, in your servers, storage disks, and in off-site backup disks. They can also be found ?in motion? in email, instant messaging, social networking messaging, P2P file sharing, ftp, http, and so on.

That’s not all. Your highly mobile workforce may have already introduced yet another high-risk area into your system: data-at-the-endpoints. This includes USB flash-disks, laptops, portable hard disks, CDs, and even smartphones.

The main challenge of data leak prevention

Having been made aware of the various aspects of data leakage, have you already come to grips with the extent of the task at hand?

There are two major things you need to do here to prevent data leakage.

One, you need to identify what data you have that can be considered as sensitive/confidential information. Of course you have financial information and employee salaries in your files. But do you also store personally identifiable information? Do you have trade secrets that are stored in electronic form?

Two, you need to pinpoint their locations. Are they only on your hard disks and laptops? Or have they made their way to flash drives, CDs/DVDs, or portable HDDs? Are they being transmitted through email or any other file transfer media?

The reason why you need to know what your sensitive data are as well as where they are is because you would like all efforts of securing them to be as efficient and unobtrusive as possible.

Let’s say, as a way of protecting your data, you decide to implement encryption. Since encryption can consume a lot of storage space and significantly reduce performance, it may be impractical to encrypt your entire database or all your files. For the same reason, you wouldn’t want to encrypt every single email that you send.

Thus, the best way would be to encrypt only the data that really need encryption. But again, you need to know what data needs to be encrypted and where those data can be found. That alone is no simple task.

Not only will you need to deal with the data you already have, you will also have to worry about the data that will go through your systems during the course of your day-to-day transactions.

Identifying sensitive data as it enters or leaves your system, goes through your network, or gets stored in your file system or database, and then applying the necessary security actions should be done automatically and intelligently. Otherwise, you could end up spending on a lot of man-hours or, worse, wasting them on a lot of false positives and negatives.

Contact Us

  • (+353)(0)1-443-3807 – IRL
  • (+44)(0)20-7193-9751 – UK

Ready to work with Denizon?