Information Management


EAI (Enterprise Application Integration): Enterprise Service Bus (ESB) vs Service Oriented Architecture (SOA)

Let’s evaluate the following question: Is an Enterprise Service Bus (ESB) a technical tool implementation that aids in delivering a Service Oriented Architecture (SOA)?

This article looks at ESB and SOA from a conceptual perspective as opposed to practical implementation. Determining the right solution requires these concepts to be fully investigated and applied to the specific enterprise and its landscape. The article is therefore by no means the final say on the subject and does not prescribe a specific tool.

In order to evaluate this question, one needs to consider what ESB and SOA each are, given a proper definition of the overarching concept of Enterprise Application Integration (EAI).

Enterprise Application Integration (EAI) – Useful for clearing up the confusion of integration semantics:

Let’s first clarify that Sirius does not consider EAI to be an approach like SOA or ESB. We consider Enterprise Application Integration (EAI) to be the broader concept since, by definition, EAI is the use of technologies and services across an enterprise to enable the integration of software applications and hardware systems. EAI can therefore be related to middleware as well as distributed technologies. Despite our definition, many vendors offer ‘EAI suites’ that provide cross-platform, cross-language integration solutions. The primary purpose of these solutions is the sharing of data and business processes between applications, while they also prescribe a set of principles for integrating multiple systems over communication architectures such as message-oriented middleware (MOM). In our opinion, this blurs the line between EAI (a concept/framework), which spans various layers of application, business process, data, information and integration architectures, and ESB, which is essentially a tool to enable EAI. Hence, such EAI suites should really be called ESB tools.

Developing EAI technologies involves web service integration, service-oriented architecture, content integration and business process workflow integration. EAI is usually challenged from a technology perspective by different operating systems, database architectures and/or computer languages, as well as other situations where legacy systems are no longer supported by the original manufacturers. Many types of business software, such as supply chain management applications, ERP systems, CRM applications for managing customers, business intelligence applications, and payroll and human resources systems, typically cannot communicate with one another in order to share data or business rules.

EAI aims to establish the required communication by meeting these challenges through the fulfillment of three purposes, as follows:

  • Data Integration: Ensures consistent information across different systems.
  • Vendor Independence: Business policies or rules regarding specific business applications do not have to be re-implemented when those applications are replaced with a different vendor's product.
  • Common Facade: Developers and users are not required to learn new or different applications because a consistent software application access interface is provided.

The advantages of EAI are clear:

  • Enabling real-time information access
  • Streamlining processes
  • Accessing information more efficiently
  • Transferring data and information across multiple platforms
  • Simplifying development and maintenance

Management strategies for enterprise integration must address the four different domains depicted below. Each of these domains presents different challenges.

Relation between the architecture models, viewpoints and integration distribution models from an Enterprise Architecture (EA) perspective.

 

Service Oriented Architecture (SOA):

A service-oriented architecture is essentially a collection of services that communicate with each other. The communication can involve either simple data passing or two or more services coordinating some activity. Some means of connecting services to each other is needed. At a more technical level, SOA is an evolution of distributed computing based on the request/reply design paradigm for synchronous and asynchronous applications, which use these services to communicate and to share information and functions. SOA can apply to a single technology (e.g. SAP), whose various modules communicate via prescribed types of services on fixed standards, or it can span an enterprise so that various applications integrate via services. SOA developed as an approach to allow scalability following the early mainframe computing era, where integration challenges resulted in barely supportable ‘spaghetti’ code that performed non-real-time, flat-file processing in order to accomplish integration.

Hence, SOA is one approach to enabling EAI. Clearly, different types of frameworks can be developed to enable EAI. In theory, perfect EAI can only be enabled by tightly controlled frameworks. Frameworks imply governance, which is a big factor in SOA but is usually missing altogether from SOA development. There are different definitions of SOA; however, we maintain that no single SOA operates with complete guidance for integration unless this is clearly stipulated, and SOA therefore does not meet the definition of a fully fledged framework that caters for all the integration that might realistically be required to enable fully fledged EAI. We therefore view SOA as a principle-based approach to enabling EAI requirements. Various integration technologies and integration methods may still be at play to enable SOA.

One may argue that service orientation is based on the software development concept of Object Orientation (OO). Just as OO can enable re-usability and scalability of code, SOA can enable re-usability and scalability of live, compiled services to remove duplication in information flow across the enterprise. However, just as OO cannot guarantee that code gets re-used, SOA cannot guarantee that services get re-used, since SOA’s set of services could potentially all still just be enabling point-to-point integration as opposed to agile integration (plug-and-play). This is where an Enterprise Service Bus (ESB) might come in handy to either complement or take over from the distributed, standardised services composing your SOA model.

 

Enterprise Service Bus (ESB):

An Enterprise Service Bus (ESB) is fundamentally an architecture. However, numerous tools have now been developed to practically enable it as a freestanding technology to solve EAI challenges. It is a set of rules and principles for integrating numerous applications over a bus-like infrastructure, which can also house the various services contained in SOA together with other integration mechanisms, in order to centralise and standardise integration across the organisation. Essentially, ESB provides Application Programming Interfaces (APIs) for developers to create services and send messages between services with real-time integration. Various ESB products enable users to build this type of architecture, but they vary in the way that they do it and in the capabilities that they offer.

The core concept of the ESB architecture is that you integrate different applications by putting a communication bus between them and then enabling each application to talk to the bus. This decouples systems from each other, allowing them to communicate without dependency on or knowledge of other systems on the bus. The concept of ESB was born out of the need to move away from point-to-point integration, which can become brittle and hard to manage over time, as well as out of improvements to SOA. Point-to-point integration results in custom integration code being spread among applications with no central way to monitor or troubleshoot. This is often referred to as “spaghetti code” and does not scale because it creates tight dependencies between applications. ESB solves this by forcing centralised control for integration testing. Essentially, ESB should be intended to form the backbone of any SOA to enable centralised and standardised control, in turn meeting the challenges and gaining the advantages of properly governed EAI. ESB properly spans the various domains of integration as opposed to only grouping services.
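
The bus idea described above can be illustrated with a toy, in-process sketch. This is not how any particular ESB product works; the class `MessageBus`, the topic name `customer.created` and the three applications are hypothetical and exist only to show that publishers and consumers know about the bus, not about each other.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], None]


class MessageBus:
    """Toy in-process bus: applications talk to the bus, never to each other."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)
        # A single log gives one central place to monitor all traffic.
        self.log: List[Tuple[str, dict]] = []

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler that receives every message on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver a message to all subscribers of a topic; return how many."""
        self.log.append((topic, message))
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


# Hypothetical applications: a CRM publishes, billing and a warehouse consume.
bus = MessageBus()
billing_inbox: List[dict] = []
warehouse_inbox: List[dict] = []
bus.subscribe("customer.created", billing_inbox.append)
bus.subscribe("customer.created", warehouse_inbox.append)

# The CRM only needs to know the bus and the topic name.
delivered = bus.publish("customer.created", {"id": 42, "name": "Acme Ltd"})
```

Adding a further consumer is a single `subscribe` call and requires no change to the publisher, which is the decoupling the paragraph above refers to.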
We need to place a cautionary note here: no ESB will cater for every aspect of integration across all technologies, and at best an ESB offers control over the majority of integration points within the ring-fenced control of an organisation. Some forms of batch processing might still be required internally due to complexity, skill or speed constraints. Let’s briefly discuss the overlap of ESB and ETL for big data in order to eliminate confusion about what this now means in the cloud and big data context.


 

ESB and SOA in the new Big Data Context: APIs and ESBs can be used to Simplify Data Integration for ETL in the cloud computing domain

The increase in popularity of Application Programming Interfaces (APIs) has made it much easier to create connectivity. With APIs, developers can access endpoints and build connections without having in-depth knowledge of the system itself, simplifying processes tremendously. As Extract-Transform-Load (ETL) data tools remain focused on BI and big data solutions, and as traditional operational data integration methods become outdated with the rise in popularity of cloud computing, ESBs become better options for solving connectivity requirements.

An enterprise service bus (ESB) provides API-based connectivity with real-time integration. Unlike traditional ETL tools used for data integration, an ESB isolates applications and databases from one another by providing a middle service layer. This abstraction layer reduces dependencies by decoupling systems and provides flexibility. Developers can utilize pre-built connectors to easily create integrations without extensive knowledge of specific application and database internals, and can very quickly make changes without fear of the entire integrated system falling apart. Shielded by APIs, applications and databases can be modified and upgraded without unexpected consequences. In comparison to utilizing ETL tools for operational integration, an ESB provides a much more logical and well-defined approach to such an initiative. On the downside, batch data transfers via ESB can become painful, and ETL developers such as Business Intelligence (BI) developers, who are used to accessing large quantities of data directly, may be prone to circumvent the ESB, forcing the integration architecture back to being only conceptual and responsive rather than physical and in control. Collaboration between BI and integration team leads therefore remains imperative, since most large-scale ESB tools now include options to connect to relational databases as well as emerging Big Data platforms. For instance, take a look at the following comparison between BizTalk and Informatica.

In Summary

  • SOA is an architectural approach where we expose services in a coarse-grained manner, whereas ESB is a technical implementation that aids in delivering a SOA.
  • SOA brings cost effective, reusable and low lead time solutions to an organization whereas ESB enables low cost integration and benefits companies with limited IT resources.
  • SOA is a way of building the next generation of applications from ‘lego blocks’ called services whereas ESB is a piece of infrastructure software that provides APIs for developers to create services and send messages between services.
  • SOA is just like a car and ESB is like a road on which this car runs.
  • SOA is an architectural model for implementing loosely coupled service based applications whereas ESB is a piece of infrastructure software that helps developers to develop services, and communicate between services through suitable APIs.
  • Certain ESB tools now allow SOA to be implemented while also enabling big data ETL via the necessary exposed layers for data enquiry.

 

A senior advisor will contact you as soon as possible. Contact: advisory@siriussa.com


Data Science: Topological Data Analysis – Slaying the big data dragon for analytics

Topological Data Analysis (TDA) is a recent field that emerged from various works in applied (algebraic) topology and computational geometry during the past 15 years. This article aims to simplify TDA for the business intelligence and data science community. Although it assumes some level of mathematical and analytical insight from the reader, it provides both an oversimplified technical explanation and a completely oversimplified non-technical explanation of what TDA is and what it can achieve in business terms. It aims to be useful at an executive and technical level, including but not limited to CIOs, CDOs, actuaries, data analysts, data scientists, business intelligence developers and any business role player interested in deriving value from big data.

TDA – The problem statement

Although one can trace geometric approaches to data analysis quite far back in the past, TDA started with the pioneering works of Edelsbrunner et al. (2002) and Zomorodian and Carlsson (2005) in persistent homology and was popularized in a landmark paper by Carlsson (2009). Carlsson noted that an important feature of modern science and engineering is that data of various kinds is being produced at an unprecedented rate (see Moore’s law applied to big data) and that our ability to analyze this data, both in terms of quantity and the nature of the data, is clearly not keeping pace with the data being produced.

Topology as a real solution to deriving knowledge from big data?

“On Guard!” Alternative analytical methods to traditional data analysis have since been sought, given the need to derive and infer knowledge effectively from the real world (where the flexibility and growth rate of data is unprecedented) and to develop other mechanisms by which the behavior of data invariants or constructions under a change of parameters can be summarized effectively and continuously without looking only at the raw data itself. Topology was selected by Carlsson to further the development of solutions to these problems, since topology is exactly that branch of mathematics which deals with qualitative geometric information. Topology studies geometric properties (in short, the notion of shape) in a way which is much less sensitive to the actual choice of metrics than straightforward geometric methods (technical note: coordinates in TDA may not carry intrinsic meaning). This provides the ideal type of architectural approach in data modelling to derive knowledge from the ‘shapes’ within large, hugely complex datasets as opposed to only analysing the raw data itself. It allows us to consider the ‘gaps’ and relationships between the data underlying various types of information in order to derive knowledge. Still confusing? Let’s get practical!

Completely oversimplified explanation – What is TDA?

An oversimplified way to explain this to the non-mathematical academic community is to say that TDA is like deriving knowledge from the structures underlying a piece of constantly changing origami. If the points on the origami describe the origami itself, then the relationships between those points tell us something about what those points are and why they are where they are. The ‘what’ and the ‘why’ can be seen as knowledge, which we can use to generate new, creative, and meaningful origami pieces or to learn why things are how they are. This ‘origami piece’ can present itself in numerous practical problems of knowledge, including DNA, chemistry, human interaction and many more. Associated technologies are therefore being developed for analysing and deriving knowledge on the basis of TDA for specific industries. This enables data scientists to create intelligent, non-linear computational models to effectively analyse massive datasets in order to produce well-informed recommendations for business decision-makers.

Essentially, TDA has the ability to completely transform what your actuary used to do from a data-drives-value perspective. It allows data science to generate value from the key aspect that intrinsically differentiates data science from traditional data analysis: utilizing erratic and sporadic as well as consistent, pattern-based data input to produce learning that is usable for well-informed, robust decision-making. This means, for example, that we can also analyse data generated outside of an organisation to find relationships affecting the organisation itself. The ability to analyse seemingly random datasets is a valuable tool when seeking innovative opportunity.

Simplified technical explanation – What is TDA?

On a technical note, TDA is mainly motivated by the idea that topology and geometry provide a powerful approach to infer robust qualitative, and sometimes quantitative, information about the structure of data. It aims at providing well-founded mathematical, statistical and algorithmic methods to infer, analyze and exploit the complex topological and geometric structures underlying data that are often represented as point clouds in Euclidean or more general metric spaces. We classify the problems to be solved as multi-agent stochastic optimization problems. In simple terms, these are ‘search problems’ where one needs to isolate randomness from pattern in order to perform proper analysis. ‘Big data’, as a generalized term, now lends itself well to the associated characteristics.

From a computer programming and architectural perspective, multi-agent cooperative decision making in TDA can be modeled as a cyclic (decentralized) optimization, where the joint decision vector is optimized by sequentially optimizing each individual agent’s decision vector while holding the others fixed. Simply stated, one object (vector) in the network can be optimised for decision-making while the other objects are left unchanged, with the repeated cycle working towards the best outcome for the entire network model. Moreover, because of uncertainty in knowledge of the target and knowledge of the state of the other agents, the problem is a stochastic optimization problem where only noisy measurements of the objective function are available to each agent.

The underlying design and associated programming for TDA can therefore potentially be utilised to simulate various types of neural systems for analysis, such as cellular interaction, neural / brain interaction and numerous other forms of previously unexplained systems which exhibit patterns based on objects ‘constructed’ with non-fixed coordinates (e.g. specific solar systems, new molecular models, weightless mass computation models). TDA could provide ground-breaking analytical capability to many endeavors of scientific research across disciplines, yet be simplified to be practically applied to the scenarios of current business industries.
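
The cyclic optimization described above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: the objective `joint_cost`, the search bounds and the helper names are all hypothetical, and the noisy measurements mentioned above are left out, so each agent sees the exact objective.

```python
from typing import Callable, List, Sequence, Tuple


def minimise_1d(g: Callable[[float], float], lo: float, hi: float,
                iters: int = 100) -> float:
    """Ternary search for the minimiser of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2  # the minimum cannot lie to the right of m2
        else:
            lo = m1  # the minimum cannot lie to the left of m1
    return (lo + hi) / 2


def cyclic_optimise(objective: Callable[[Sequence[float]], float],
                    start: Sequence[float],
                    bounds: Tuple[float, float] = (-10.0, 10.0),
                    sweeps: int = 50) -> List[float]:
    """Optimise one agent's decision at a time while holding the others fixed."""
    decisions = list(start)
    for _ in range(sweeps):
        for i in range(len(decisions)):
            def along_agent_i(value: float, i: int = i) -> float:
                trial = decisions.copy()
                trial[i] = value  # only agent i's decision changes
                return objective(trial)
            decisions[i] = minimise_1d(along_agent_i, *bounds)
    return decisions


def joint_cost(d: Sequence[float]) -> float:
    """Hypothetical shared objective coupling two agents' decisions."""
    x, y = d
    return (x - 1) ** 2 + (y - 2) ** 2 + x * y


result = cyclic_optimise(joint_cost, [5.0, -3.0])
```

On this simple convex objective the sweeps settle near the joint optimum (decisions close to 0 and 2). In general, a scheme of this kind may stop at a point that no single agent can improve on its own, which need not be the best joint outcome.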

Potential – What TDA is achieving on the ground level and may do in the future

The mathematical branch of topology studies only properties of geometric objects which do not depend on the chosen coordinates, but rather on intrinsic geometric properties of the objects. One can imagine any 3-D shape (e.g. a tetrahedron) behaving as a function which constantly changes its length parameters (the lengths of its sides). The points of the shape are therefore coordinate-free, and hence TDA, utilising topology, provides a flexible method to ‘plug-and-play’ various scenarios for analysis. The relationships which are useful involve continuous maps between the different geometric objects and therefore become a manifestation of the notion of functoriality. (Functoriality means that something is a functor: when someone asks about the functoriality of some construction, they are asking whether or not it can be upgraded to a functor.) This is leading to the development of multiple variants of clustering algorithms which, functionally, can be used across a range of fields including neuroscience, mathematical analysis itself, information technology decision support, corporate big data analysis, medical sciences and human behavioral sciences, to mention only a few. The current ideal application of TDA stretches way beyond the fairly simple analytical requirements of businesses and should be extended into advanced machine learning technologies to further the development of artificial intelligence (AI).
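
One of the simplest topological summaries related to the clustering mentioned above is the number of connected components in a point cloud when any two points closer than a chosen scale are joined. The sketch below is a simplified stand-in for the dimension-zero part of persistent homology, not a full TDA pipeline; the function name `count_components` and the sample `cloud` are made up for illustration.

```python
from itertools import combinations
from math import dist
from typing import List, Sequence


def count_components(points: Sequence[Sequence[float]], epsilon: float) -> int:
    """Count connected components when points within epsilon are linked."""
    parent: List[int] = list(range(len(points)))

    def find(i: int) -> int:
        # Follow parent links to the representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = len(points)
    for a, b in combinations(range(len(points)), 2):
        if dist(points[a], points[b]) <= epsilon:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b  # merge the two components
                components -= 1
    return components


# Hypothetical point cloud: two well-separated groups of points.
cloud = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
counts = {eps: count_components(cloud, eps) for eps in (0.5, 1.0, 5.0, 20.0)}
```

Here the count stays at two over a wide range of scales before collapsing to one, and shifting every point by the same offset leaves the counts unchanged. Features that persist across scales and do not depend on where the coordinate origin sits are the kind of ‘shape’ information the text refers to.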

Sounds too complex? Well, despite the fact that you might soon be replacing your actuary with a data scientist at a much higher salary, TDA might just be the single most innovative approach yet to properly slaying your big data analytics dragon.

– Johan Smith (Executive: Business Solutions)



Data Architecture VS Information Architecture

Please note that our philosophy on the subject enables the TOGAF 9 Capability-Based Planning approach. This equips the CIO / CDO with proper governance from an enterprise architecture and IT perspective, since capability-based planning is a powerful mechanism to ensure that the strategic business plan drives the enterprise from the top down. It is also adaptable, with capability engineering, to leverage emerging bottom-up innovations. This aligns with the Sirius approach of planning, engineering and delivering solutions to enable strategic business capabilities for the enterprise.

Data vs Information


We are often tasked with this question: what is the difference between data and information architecture? This seemingly simple question (from a technical perspective) might be met with complicated, conflicting answers. Let’s simplify this into something that aligns well with both common sense and industry norm.

Most professionals who started their careers in the data field will agree that two decades ago this was a much simpler question. Not only have the lines between terms such as data and information been blurred due to differing business cases and applications, but the value association of ‘data’ and ‘information’ may differ between businesses. This makes the question relevant within the current ‘big-data’ paradigm.

Let’s first, therefore, clarify our Sirius definition of the difference between data and information. This definition aligns well with all the major, generally accepted frameworks on the subject:

“Information is data put into action. Where a specific data element may exist in one format or another, it has lesser business value until it is integrated with other data elements into a package of data elements. This package can be described as an information package. Hence, information is data with context. Knowledge and insights are further derivations of information, yet, in the IT context, they are still based on data.”

The two forms of ‘01011000…’-Architectures

If ‘data’ refers to individual values without context, then Data Architecture is necessary to contain and organize the various data resources into a manageable data environment.

Likewise, if ‘information’ refers to contextualized data, then Information Architecture is necessary to combine those resources into a structure that allows that information to be created, shared, analyzed, utilized, and governed throughout an enterprise, across all lines of business, within all departments, with confidence and reliability.
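
The distinction can be made concrete with a small sketch. The values, the `InvoiceLine` type and its field names below are hypothetical; the point is only that the same three values carry little meaning until they are packaged with context.

```python
from dataclasses import dataclass

# Isolated data elements: values without context.
raw_values = ["42", "2023-03-01", "199.99"]


@dataclass(frozen=True)
class InvoiceLine:
    """Hypothetical information package: the same values, now with context."""
    customer_id: int
    invoice_date: str
    amount_due: float


package = InvoiceLine(
    customer_id=int(raw_values[0]),
    invoice_date=raw_values[1],
    amount_due=float(raw_values[2]),
)
```

In these terms, data architecture is concerned with how values like `raw_values` are stored and moved, while information architecture is concerned with how packages like `InvoiceLine` are defined, shared and governed.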

Practical application (Data Architecture VS Information Architecture)

The Data Architect focuses on the technical depth implied by and associated with the design, physical structuring, storage, movement, planning, backup, retention, dissemination and classification of all data elements within the enterprise and may or may not be required to engage across business functions. Such functions would typically result in technical specifications during business application design.

The Information Architect focuses on the technical, business and governance matters implied by and associated with the design, overlap, contextualization, usage, security, sharing, discontinuation, valuation and publication of all information sets across the enterprise and across business functions. The resulting architecture should form part of the greater data governance strategic pack as well as the enterprise architecture, and should provide the data architects with the clearly contextualized application requirements arising from the resulting information sets.

These two roles have clearly segregated areas of control as well as points of engagement for overlap. If segregated, it is not recommended for one role to report to the other, but it is highly recommended that both report to the ‘Chief Data Officer’, ‘Information Management Executive’ or ‘Enterprise Architect’.

Need assistance? Get in touch!

A senior advisor will contact you as soon as possible. Alternative: 0861 222 982 / advisory@siriussa.com


MDM – Financial Ride or No Ride at all

Context

One of the goals of master data management (MDM) is to maintain a single access point to specified data sets, thereby creating a single view of information. Whether customers, products, or other entities, organisations may struggle to measure the actual benefits of consolidating and centralising data to have a single viewpoint of data.

Beyond the ability to use a centralised access point to gain a better perspective of what is going on in the organisation, organisations are always required to provide an account of the financial benefits of new IT initiatives. Within MDM, business intelligence (BI), business process management (BPM), etc. the immediate return on investment is not always seen. Unfortunately, some organisations choose not to look beyond immediate financial gains when looking to implement new solutions or change the way they currently do business.

The management of master data touches more than just the information that is maintained. By creating data repositories that reflect business functions, organisations can develop views of data that give users access to the information required to do their jobs effectively and efficiently.

To better understand the business benefits of MDM, it is important to recognise an organisation’s general information infrastructure and how silos of data have been formed within organisations creating a splintered view of what is occurring within the organisation. Then it is possible to understand the benefits of the creation of a centralised repository of data – regarding both the information accessed and the business benefits that go beyond data.

Many business executives have come to realise that with large distributed ERP and CRM systems, MDM is not only pivotal to operational success, but also to delivering business insight. It enables business intelligence and reporting solutions with accurate hierarchical and trusted master data. Without it, business intelligence is often lost for numbers.

So, what qualifies the investment in an MDM solution if the financial justification doesn’t add up? It’s simple: an MDM implementation is risk-motivated. I bought my first motorcycle knowing that it cost more than the savings on fuel. Buying and riding in itself implied risk, but what would the time spent commuting by car in South African city life cost me in the long run if I didn’t? Yet you might still ask, what does this have to do with motorcycling?

Fractured Information Views

Due to the nature and development of information architecture and the development of enterprise information systems, the general structure of data is that it exists in silos across the organisation. Before the concept of centralised data stores, many systems were developed to meet a business function without taking into account the big picture. Consequently, with the addition of new systems, data storage volumes and the number of transactions grow exponentially. This means that in many organisations several disparate systems exist that contain similar information but that don’t interact with other relevant systems within the organisation. In addition to this lack of interaction, duplicate data may be processed in different systems, creating duplicate work for end users and different data structures. This adds to the difficulty of looking for and identifying like data across the organisation.

The bottom line is that each system only gives a fractured view of what is occurring within the organisation. For instance, a customer relationship management (CRM) system may not have all of the account information of its customers, from buying habits and payment histories to product preferences. Different bits of information that reside in disparate systems, when brought together, create a full view of the customer. Add to this data quality efforts to standardise data views and to create a consistent view of data across the organisation, and the beginnings of MDM are created.
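
As a rough sketch of the consolidation described above, the example below merges records from three hypothetical systems into one view per customer. Matching on a normalised email address and letting the first non-empty value win are simplifying assumptions made for illustration; real MDM tools apply far richer matching and survivorship rules.

```python
from typing import Dict, Iterable


def normalise_key(email: str) -> str:
    """Standardise the matching key so the same customer lines up everywhere."""
    return email.strip().lower()


def consolidate(*sources: Iterable[dict]) -> Dict[str, dict]:
    """Merge records from several systems into one record per customer."""
    master: Dict[str, dict] = {}
    for source in sources:
        for record in source:
            key = normalise_key(record["email"])
            merged = master.setdefault(key, {"email": key})
            for field, value in record.items():
                if field != "email" and value not in (None, ""):
                    # Assumed survivorship rule: first non-empty value wins.
                    merged.setdefault(field, value)
    return master


# Hypothetical silos, each holding a fragment of the same customer.
crm = [{"email": "Jane@Example.com ", "name": "Jane Doe", "phone": ""}]
billing = [{"email": "jane@example.com", "payment_terms": "30 days",
            "phone": "555-0100"}]
orders = [{"email": "JANE@example.com", "preferred_product": "Widget"}]

master = consolidate(crm, billing, orders)
```

Each source contributes the fields it knows about, and the standardised key is what allows the three fragments to be recognised as one customer.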

Although MDM solutions give organisations central access points and a 360-degree view of data and entities, actual ROI is not always easy to quantify. After all, quantifying better customer service or identifying a decrease in product cycle times and tying that to an MDM initiative may not be intuitive for your organisation.

In this case, it’s like telling my wife to buy herself a Kawasaki because it’s so much faster in traffic. The problem is, she’s in insurance, and from that perspective the numbers don’t add up despite the obvious benefits and the pressing need to get around town quicker.

MDM Benefits Explored

There are many benefits for organisations choosing to implement an MDM solution that result from creating a consolidated view of data. To identify the ROI associated with MDM, some key benefits should be highlighted that go beyond the consolidation of data towards increasing efficiencies within organisations and lowering the cost of doing business.

The most obvious benefit of customer data integration (CDI)/MDM solutions is the creation of a single customer view. By creating a centralised point of access to customer, product, or other forms of data, not only are organisations able to gain a better overall view of business entities and what is occurring with accounts, but data quality efforts also become consistent. For organisations trying to attain one view of entities, the ability to ensure data quality throughout the organisation means that consistency of business rules, data cleansing and the standardisation of business processes will become an ingrained process within the overall environment.

Despite the fact that there is a constant focus on business pains and a push to focus on business rather than data, an organisation’s healthy business environment depends upon the data that supports it. Therefore, having strong data quality efforts in place helps organisations maintain healthy information systems, identify how systems across the organisation interconnect and enable better business process management. Although IT benefits are important, business benefits are also relevant. Using customer data as an example, end users can provide better service quicker thereby increasing customer retention and satisfaction.

Because of compliance and the need to maintain specific data sets and output on a regular basis, MDM enables organisations to manage the processes essential to meet compliance requirements. This in turn translates into the ability to plan and budget more efficiently. With a broader understanding of what is occurring within the organisation, financial processes can offer a more complete view of the organisation.

Finally, building an MDM solution enables organisations to develop and maintain a strong data governance program. Because all of the mechanisms have been put in place, data governance becomes a natural extension of an MDM implementation.

Once my wife bought her first Kawasaki, we had to ensure she bought all the gear too – in pink.

Focusing on Value

Whether one focuses on business or IT, the values of MDM are numerous. Actual implementation of an MDM solution is process- and resource-intensive, and it is continual because of the need for constant data quality and business process improvements. However, these efforts enable organisations to have a synchronised, 360-degree view of the data that helps drive their business success.

So, these days we buy the best fuel for our motorcycles. You’ve probably figured out by now that the fuel is like data… the cleaner, the better the engine runs.

Conclusion

Not having MDM in an organisation with duplicated, heterogeneous sources of information in expensive enterprise resource planning and customer relationship management systems is like buying an Aprilia MV but running out of clean fuel in the long run. Having spent all that money on an efficient engine but not being willing to invest in running cleaner fuel might just result in no ride at all, despite the seemingly hard financial ride in the short term.

-Jacob George
Solutions Architect