Why ETL Tools Might Hobble Your Real-Time Reporting & Analytics

Companies report large investments in their data warehousing (DW) and business intelligence (BI) systems, with a sizable portion of the software budget spent on extract, transform and load (ETL) tools that ease the population of such warehouses and provide the ability to manipulate data so it maps onto the new schemas and data structures.

Companies do need DW or BI for analytics on sales, forecasting, stock management and much more, and they certainly wouldn’t want to run those additional analytics workloads on top of already taxed core systems, so keeping them isolated is a solid architectural decision. Consider, however, the extended infrastructure and support costs involved in managing a new ETL layer: it’s not just the building of the jobs that demands effort. There’s the ongoing scheduling, the on-call support when jobs fail, the challenge of an ever-shrinking batch window in which those jobs can run without affecting other production workloads, and other such considerations that make the initial warehouse implementation expense look like penny candy.

So not only do such systems come with extravagant cost, but to make matters worse, the vast majority of jobs run overnight. That means, in most cases, the best possible scenario is that you’re looking at yesterday’s data, not today’s. All your expensive reports and analytics are at least a day behind what you require. What happened to the near-realtime information you were promised?

Contrast the typical BI/DW architecture with the better option: building out your analytics and report processing with realtime routing and transformation using tools such as IBM MQ, DataPower and Integration Bus. Most of the application stacks that process this data in realtime already have all the related keys in memory – customer numbers or IDs, order numbers, account details, etc. – and are using them to create or update the core systems. Why duplicate all of that again in your BI/DW ETL layer? If you do, you’re dependent on ETL jobs going into the core systems to find what happened during that period and extracting all that data again just to put it somewhere else.

Alongside this, most organizations already run application messaging and notifications between applications. If you have all the data keys in memory, use a DW object, method, function or macro to drop the data as an application message into your messaging layer. The message can then be routed to your DW or BI environment for Transformation and Loading there, with no Extraction needed, and you can retire your ETL tools.
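As a rough sketch of that publish step, here’s what it might look like using the IBM MQ classes for JMS. The host, channel, queue manager (QM1) and queue (DW.EVENTS) names are hypothetical placeholders, and the payload simply carries the keys the application already holds in memory while it updates the core systems:

```java
import javax.jms.Connection;
import javax.jms.Destination;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

import com.ibm.msg.client.jms.JmsConnectionFactory;
import com.ibm.msg.client.jms.JmsFactoryFactory;
import com.ibm.msg.client.wmq.WMQConstants;

public class DwEventPublisher {

    // Hypothetical connection details -- substitute your own queue manager, channel and queue.
    private static final String HOST = "mq.example.com";
    private static final int PORT = 1414;
    private static final String CHANNEL = "APP.SVRCONN";
    private static final String QMGR = "QM1";
    private static final String QUEUE = "DW.EVENTS";

    public static void main(String[] args) throws JMSException {
        JmsFactoryFactory ff = JmsFactoryFactory.getInstance(WMQConstants.WMQ_PROVIDER);
        JmsConnectionFactory cf = ff.createConnectionFactory();
        cf.setStringProperty(WMQConstants.WMQ_HOST_NAME, HOST);
        cf.setIntProperty(WMQConstants.WMQ_PORT, PORT);
        cf.setStringProperty(WMQConstants.WMQ_CHANNEL, CHANNEL);
        cf.setIntProperty(WMQConstants.WMQ_CONNECTION_MODE, WMQConstants.WMQ_CM_CLIENT);
        cf.setStringProperty(WMQConstants.WMQ_QUEUE_MANAGER, QMGR);

        Connection connection = cf.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Destination destination = session.createQueue("queue:///" + QUEUE);
        MessageProducer producer = session.createProducer(destination);

        // These keys are already in memory while the application updates the core systems;
        // sending them now means no extraction job has to dig them out again later.
        String payload = "{\"customerId\":\"C100234\",\"orderNumber\":\"O-998877\",\"amount\":149.95}";
        TextMessage message = session.createTextMessage(payload);
        producer.send(message);

        connection.close();
    }
}
```

On the other side, the DW or BI environment consumes from that queue and applies its transformation and load logic as messages arrive, rather than waiting for an overnight extract.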

Simplify your environments and lower the cost of operation. If you have multiple DW or BI environments, use the Pub/Sub capabilities of IBM MQ to distribute the message to all of them. You may be trading a nominal increase in CPU for the elimination of problems, headaches and costs in your DW or BI environment.
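Here’s an equally hypothetical sketch of that Pub/Sub variant: the core application publishes once to a topic, and each DW or BI environment holds its own durable subscription, so every environment receives a copy of the same message. The topic string, client IDs and connection details are placeholders:

```java
import javax.jms.Connection;
import javax.jms.JMSException;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

import com.ibm.msg.client.jms.JmsConnectionFactory;
import com.ibm.msg.client.jms.JmsFactoryFactory;
import com.ibm.msg.client.wmq.WMQConstants;

public class DwEventPubSub {

    // Hypothetical topic string; each DW/BI environment subscribes to it independently.
    private static final String TOPIC = "dw/events";

    private static JmsConnectionFactory connectionFactory() throws JMSException {
        JmsFactoryFactory ff = JmsFactoryFactory.getInstance(WMQConstants.WMQ_PROVIDER);
        JmsConnectionFactory cf = ff.createConnectionFactory();
        cf.setStringProperty(WMQConstants.WMQ_HOST_NAME, "mq.example.com");
        cf.setIntProperty(WMQConstants.WMQ_PORT, 1414);
        cf.setStringProperty(WMQConstants.WMQ_CHANNEL, "APP.SVRCONN");
        cf.setIntProperty(WMQConstants.WMQ_CONNECTION_MODE, WMQConstants.WMQ_CM_CLIENT);
        cf.setStringProperty(WMQConstants.WMQ_QUEUE_MANAGER, "QM1");
        return cf;
    }

    // Publisher side: the core application publishes the message once.
    public static void publish(String payload) throws JMSException {
        Connection connection = connectionFactory().createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("topic://" + TOPIC);
        MessageProducer producer = session.createProducer(topic);
        producer.send(session.createTextMessage(payload));
        connection.close();
    }

    // Subscriber side: each DW or BI environment runs its own durable subscription,
    // so every environment receives its own copy of the same message.
    public static void subscribe(String clientId, String subscriptionName) throws JMSException {
        Connection connection = connectionFactory().createConnection();
        connection.setClientID(clientId);
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("topic://" + TOPIC);
        MessageConsumer consumer = session.createDurableSubscriber(topic, subscriptionName);
        connection.start();
        TextMessage received = (TextMessage) consumer.receive(30_000);
        if (received != null) {
            System.out.println("Received for " + subscriptionName + ": " + received.getText());
        }
        connection.close();
    }
}
```

Because the subscriptions are durable, a DW or BI environment that is down for maintenance still picks up its copy of the message when it reconnects.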

Rethinking your strategy in terms of enterprise application integration (EAI), while removing the whole process and overhead of ETL, may indeed bring your business analytics to the near-realtime reporting you expected. Consider that your strategic payoff. Best regards in your architecture endeavors!
Image by Mark Morgan.

Breach Etiquette: Target's Responsibility

Just as retailers were in the throes of the holiday madhouse, Target – the second-largest retailer in the US – was breached. Forbes recently posted an article outlining seven lessons that could be learned from the way Target handled the situation.
The link to the Forbes article is here – Target’s Worst PR Nightmare: 7 Lessons From Target’s Well-Meant But Flawed Crisis Response – but what do you think?
What I always find surprising in these cases, in which consumer portal sites are breached or hacked, is that there’s so much talk about how to handle the consequences. But what about an explanation of what will be done to prevent this from happening again? The same issue arose last year with the PlayStation Network, when millions of credit-card numbers and a trove of customer information were exposed. Another scenario was the ObamaCare website: The site went down because it wasn’t properly architected and stress tested. We heard a lot about the “why” but not a lot about what is being done to prevent it from happening all over again.
Obviously, when you open your business to the world, you’re exposed to a world of attacks, and you can only do your best to prevent them. That best, however, must include an ongoing and robust test plan, executed by an experienced team that keeps up with the latest technologies, methods of attack, and the ever-changing demographics of user communities and methods of access.
TxMQ has expert infrastructure architects, portal architects and load-testing expertise to help companies address these issues through cost-effective consulting engagements.
Find out more: Email our consulting leaders in confidence at [email protected].