Are Master Data Management and Hadoop a Good Match?

Master Data is the critical electronic information about the company we cannot afford to lose. Accordingly, we should sanitise it, look after it, and store it safely in several separate places that are independent of each other. The advent of Big Data introduced the current era of huge repositories ?in the clouds?. They are not, of course but at least they are remote. This short article includes a discussion about Hadoop, and whether this is a good platform to back up your Master Data.

About Hadoop

Hadoop is an open-source Apache software framework built on the assumption that hardware failure is so common that backups are unavoidable. It comprises a storage area and a management part that distributes the data to smaller nodes where it processes faster and more efficiently. Prominent users include Yahoo! and Facebook. In fact more than half Fortune 50 companies were using Hadoop in 2013.

Hadoop – initially launched in December 2011 ? has survived its baptism of fire and became a respected, reliable option. But is this something the average business owner can tackle on their own? Bear in mind that open source software generally comes with little implementation support from the vendor.

The Hadoop Strong Suite

  • Free to download, use and contribute to
  • Everything you need ?in the box? to get started
  • Distributed across multiple fire-walled computers
  • Fast processing of data held in efficient cluster nodes
  • Massive scaleable storage you are unlikely to run out of

Practical Constraints

There is more to Hadoop than writing to WordPress. The most straightforward solutions are uploading using Java commands, obtaining an interface mechanism, or using third party vendor connectors such as ACCESS or SAS. The system does not replace the need for IT support, although it is cheap and exceptionally powerful.

The Not-Free Safer Option

Smaller companies without in-depth in-house support are wise to engage with a technical intermediary. There are companies providing commercial implementations followed by support. Microsoft, Amazon and Google among others all have commercial versions in their catalogues, and support teams at the end of the line.

Check our similar posts

Network Security

The easiest way for an external threat to get to your private data is through your network. The easiest way to eliminate that threat? Get your data out of the network. Of course, we know you wouldn’t want to do that. We also know that while you may want to sniff every packet for anything suspicious, you wouldn’t want your network to crawl either.

That’s why we’re offering to put up the most efficient checkpoints on every route that leads into and out of your system.

So what can you expect from our brand of network security?

  • Review of your policies and processes for weaknesses – If we see a loophole, we’ll recommend modifications wherever necessary.
  • Protection for your applications and infrastructure – Since we’re familiar with both software and hardware-based protection systems, we can recommend which type is best suited for your setup.
  • Automated identification of business and mission critical applications – They’ll be given priority in your network to ensure bandwidth allocation is optimised.
  • Automated network audits and vulnerability management – Tired of getting prompted by pesky vulnerability notices and don’t know what to do with them? Well, that’s why we’re here.
  • Customisable security reports that contain only relevant and accurate data.

We can also help you with the following:

The Better Way of Applying Benford’s Law for Fraud Detection

Applying Benford’s Law on large collections of data is an effective way of detecting fraud. In this article, we?ll introduce you to Benford’s Law, talk about how auditors are employing it in fraud detection, and introduce you to a more effective way of integrating it into an IT solution.

Benford’s Law in a nutshell

Benford’s Law states that certain data sets – including certain accounting numbers – exhibit a non-uniform distribution of first digits. Simply put, if you gather all the first digits (e.g. 8 is the first digit of ?814 and 1 is the first digit of ?1768) of all the numbers that make up one of these data sets, the smallest digits will appear more frequently than the larger ones.

That is, according to Benford’s Law,

1 should comprise roughly 30.1% of all first digits;
2 should be 17.6%;
3 should be 12.5%;
4 should be 9.7%, and so on.

Notice that the 1s (ones) occur far more frequently than the rest. Those who are not familiar with Benford’s Law tend to assume that all digits should be distributed uniformly. So when fraudulent individuals tinker with accounting data, they may end up putting in more 9s or 8s than there actually should be.

Once an accounting data set is found to show a large deviation from this distribution, then auditors move in to make a closer inspection.

Benford’s Law spreadsheets and templates

Because Benford’s Law has been proven to be effective in discovering unnaturally-behaving data sets (such as those manipulated by fraudsters), many auditors have created simple software solutions that apply this law. Most of these solutions, owing to the fact that a large majority of accounting departments use spreadsheets, come in the form of spreadsheet templates.

You can easily find free downloadable spreadsheet templates that apply Benford’s Law as well as simple How-To articles that can help you to implement the law on your own existing spreadsheets. Just Google “Benford’s law template” or “Benford’s law spreadsheet”.

I suggest you try out some of them yourself to get a feel on how they work.

The problem with Benford’s Law when used on spreadsheets

There’s actually another reason why I wanted you to try those spreadsheet templates and How-To’s yourself. I wanted you to see how susceptible these solutions are to trivial errors. Whenever you work on these spreadsheet templates – or your own spreadsheets for that matter – when implementing Benford’s Law, you can commit mistakes when copy-pasting values, specifying ranges, entering formulas, and so on.

Furthermore, some of the data might be located in different spreadsheets, which can likewise by found in different departments and have to be emailed for consolidation. The departments who own this data will have to extract the needed data from their own spreadsheets, transfer them to another spreadsheet, and send them to the person in-charge of consolidation.

These activities can introduce errors as well. That’s why we think that, while Benford’s Law can be an effective tool for detecting fraud, spreadsheet-based working environments can taint the entire fraud detection process.

There?s actually a better IT solution where you can use Benford’s Law.

Why a server-based solution works better

In order to apply Benford’s Law more effectively, you need to use it in an environment that implements better controls than what spreadsheets can offer. What we propose is a server-based system.

In a server-based system, your data is placed in a secure database. People who want to input data or access existing data will have to go through access controls such as login procedures. These systems also have features that log access history so that you can trace who accessed which and when.

If Benford’s Law is integrated into such a system, there would be no need for any error-prone copy-pasting activities because all the data is stored in one place. Thus, fraud detection initiatives can be much faster and more reliable.

You can get more information on this site regarding the disadvantages of spreadsheets. We can also tell you more about the advantages of server application solutions.

Why Spreadsheets can send the Pillars of Solvency II Crashing Down


Solvency II is now fast approaching and while it may provide added protection to policy holders, its impact on the insurance industry is not all a bed of roses. Expect insurance companies to restructure, increase manpower, and raise spending on actuarial operations and risk management initiatives. Those that cannot, will have to go. But what have spreadsheets got to do with all these?

Well, spreadsheets aren’t really the main casts in this blockbuster of a regulatory exercise but they certainly have a significant supporting role to play. Pillar I of Solvency II, which calls for improved supervision on internal control, risk management, and corporate governance, and Pillar II, which tackles supervisory reporting and public disclosure of financial and other relevant information, both affect systems that have high-reliance on spreadsheets.

A little background about spreadsheets might help.

Who needs an IT solution when you can have spreadsheets?

Everyone in any organisation just love spreadsheets; from the office clerk to the CEO. Because they’re so easy to use (not to mention they’re a staple in office computers), people employ them for processing numbers and as an all-around tool for planning, forecasting, reporting, complex modelling, market data analysis, and so on. They make such tasks faster and easier. Really?

You probably haven’t heard of spreadsheet hell

Unfortunately, spreadsheets do have certain shortcomings. Due to their inherent structure and lack of controls, it is so easy to commit simple errors like an accidental copy paste, an omission of a negative sign, an incorrect data input, or an unintentional deletion. Such shortcomings may seem harmless until your shareholders discover a multi-million discrepancy in your financial report.

And because spreadsheet errors can go undetected for a long time, they are constant targets of fraudsters. In other words, spreadsheets are high risk applications.

Solvency II Impact on Spreadsheet-based Financial and IT Systems

Regulations like Solvency II, are aimed at reducing risks to manageable levels. Basically, Solvency II is a risk-based system wherein a company?s capital requirements will depend on its measured riskiness. If companies want to avoid facing onerous capital requirements, they have to comply.

The three pillars of Solvency II have to be in place. Now, since spreadsheets (also known as User Developed Applications or UDAs) are high-risk applications with weak control features and prone to produce inaccurate reports, companies will have a lot of work to do to establish Pillars II and III.

There are at least 8 articles that impact spreadsheets in the directive. Article 82, for example, which requires firms to ensure a high level of data quality and accuracy, strikes at the very core of spreadsheets? weakness.

A whitepaper by Raymond Panko entitled ?Spreadsheets and Sarbanes-Oxley: Regulations, Risks, and Control Frameworks? mentioned that 94% of audited real world operational spreadsheets that were included in his study were found to have errors and that an average of 5.2% of all cells in the audited spreadsheets had errors.

Furthermore, many articles in the directive call for the enforcement of better documentation. This is one thing that’s very tedious and almost unrealistic to do with spreadsheets because just about anyone uses them. Besides, with different ‘versions? of the same data existing in different workstations throughout the organisation, it would be extremely difficult to keep track of them all.

Because of spreadsheets you now need an IT solution

It is clear that, with the growing number of regulations and the mounting complexity of tasks needed for compliance, spreadsheets no longer belong in this era. What you need is a server-based solution that allows for seamless collaboration, data reliability, data consistency, increased security, automatic consolidation, and all the other features that make regulation compliance more doable.

One important ingredient for achieving Solvency II compliance is sound data risk management. Sad to say, the ubiquitous spreadsheet will only expose your data to more risks.

More Spreadsheet Blogs


Spreadsheet Risks in Banks


Top 10 Disadvantages of Spreadsheets


Disadvantages of Spreadsheets – obstacles to compliance in the Healthcare Industry


How Internal Auditors can win the War against Spreadsheet Fraud


Spreadsheet Reporting – No Room in your company in an age of Business Intelligence


Still looking for a Way to Consolidate Excel Spreadsheets?


Disadvantages of Spreadsheets


Spreadsheet woes – ill equipped for an Agile Business Environment


Spreadsheet Fraud


Spreadsheet Woes – Limited features for easy adoption of a control framework


Spreadsheet woes – Burden in SOX Compliance and other Regulations


Spreadsheet Risk Issues


Server Application Solutions – Don’t let Spreadsheets hold your Business back


Why Spreadsheets can send the pillars of Solvency II crashing down

Advert-Book-UK

amazon.co.uk

Advert-Book-USA

amazon.com

Contact Us

  • (+353)(0)1-443-3807 – IRL
  • (+44)(0)20-7193-9751 – UK

Ready to work with Denizon?