Digitalization is one thing that "touches" all our lives. How can we use it to grow and live better lives?
In the corporate world we need to intelligently connect people, things, and businesses. Bringing them together implies that machine learning should be linked with design thinking approaches based on natural language, combined with IoT tooling, and so on.
This is no easy task: it means creating connections between different disciplines, enabling data flow, and delivering results such as real-time analytics.
Achieving this requires resolving some key problems that can be tackled with SAP Data Hub.
Data generation just keeps exploding, and integrating data is a big issue, especially when you think about a heterogeneous landscape distributed across the cloud, different systems, applications, and data stores. Getting all this data, which exists in different formats, to "come together" and work in an end-to-end scenario is no easy task.
Creating one-to-one connections and maintaining them would create an even bigger mess.
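To see why one-to-one connections get messy so quickly, a rough count helps. The sketch below assumes, purely for illustration, that every system needs to exchange data with every other system; the function names are hypothetical, and this is a counting argument, not a description of how SAP Data Hub works internally.

```python
def point_to_point_links(n_systems: int) -> int:
    """Connections needed if every pair of systems is linked directly."""
    return n_systems * (n_systems - 1) // 2


def hub_links(n_systems: int) -> int:
    """Connections needed if every system connects once to a central hub."""
    return n_systems


for n in (5, 10, 20):
    print(f"{n} systems: {point_to_point_links(n)} direct links vs {hub_links(n)} hub links")
```

Under that assumption, 20 systems would need 190 pairwise connections, against 20 connections to a central hub.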
Data that is used for analysis should be trusted, complete, and relevant.
Metadata management is another important factor. Especially with numerous different data silos, you need to understand which data is where and whether it belongs to the same category. What is needed is to bring all this together and process the data so that it produces coherent results across the landscape.
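As a minimal sketch of what "which data is where, and in which category" can look like, the example below groups catalog entries by category. The dataset names, locations, and the `locations_by_category` helper are all hypothetical and chosen only for illustration; a real metadata catalog would hold far more detail.

```python
from collections import defaultdict

# Hypothetical catalog entries: (dataset name, where it lives, business category).
CATALOG = [
    ("customers_erp", "ERP system", "customer"),
    ("crm_contacts", "cloud object store", "customer"),
    ("sensor_readings", "Hadoop cluster", "iot"),
]


def locations_by_category(entries):
    """Group datasets by category so related data in different silos is visible together."""
    grouped = defaultdict(list)
    for name, location, category in entries:
        grouped[category].append((name, location))
    return dict(grouped)


index = locations_by_category(CATALOG)
print(index["customer"])
```

Even this toy index shows customer data sitting in two different silos, which is the kind of overlap metadata management is meant to surface.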
Governance strategies are another factor when you have numerous silos/islands everywhere. SAP Data Hub helps in understanding these challenges in today's landscapes.
Challenges in Today's Enterprise Landscapes
Understand the increased complexity in today's Big Data scenarios. What are the challenges in today's data landscapes? How do we build our infrastructure?
Traditionally we had a centralized, on-premise approach, usually with a data warehouse in the middle. The data warehouse was fed by the ERP system, files, and databases, through real-time replication, batch loads, ETL, and so on. On top of this we had some analytics, dashboards, and visualizations.
But Big Data sources bring in more distributed data platforms, often cloud-based, so now you have to think about how to fit all this together. You may have a layer with on-site data and different cloud data centers, and all of these have to "come together". Most enterprises have between six and eight clouds running in their environments.
This means data has become less accessible: users can't simply go in and have a look, and different departments have different methods of access. As a result, companies no longer understand the details of their customers, suppliers, and products. In effect, understanding of data has declined because companies do not have complete access to their data.
Legal risk due to a lack of governance is a big issue, especially when we think about GDPR regulations.
To gain a competitive advantage, organizations should give their data science personnel (scientists, analysts, etc.) complete data access to work with. These people want more trusted data, want to better connect it with intelligent data coming from research projects, and want more flexibility when data comes from the cloud.
Pain points for organizations:
The missing link between Big Data and enterprise data - different user groups don't have access to the data they need. Data is siloed, for example in the cloud or in the ERP system - and there's no way to bring this together.
The lack of enterprise readiness - organizations don't know how to operate Big Data, they don't know how a lifecycle could work, they just don't have the knowledge right now.
Limited tools equal high effort - there are a lot of Big Data scenarios you can achieve with open source tools. So the problem is not getting a project done; it is bringing it into a productive state. At the end of the day you need full access to every tool, and when something breaks you need to understand where and how it breaks so you can fix it ASAP.
There should be a tool in place that can orchestrate and manage the whole lifecycle of such a product.
The technology pieces we have today include existing systems such as ERP, BW, and S/4HANA, where operations and communication with the customer take place. Then we have Hadoop, Spark, etc.
Then there is data in the cloud - S3, Azure, Google Cloud Platform, and so on. We need to bring this data together with home-grown data (ERP, BW, etc.) and apply machine learning algorithms. These could be self-written algorithms in Python or come from libraries such as TensorFlow.
Scalability is key, and here we would like to use newer container technologies such as Docker and Kubernetes. How does one bring all this together? Data Hub does all this.
In current landscapes there are a number of individual solutions: one-to-one point connections where data is moved from A to B. Replication may be in real time, but there are a lot of different siloed solutions, and one then has to deal with duplicates of data. All this leads to a waste of time, money, and resources.
And then there are different groups within the organization. You might have R&D, manufacturing, sales, marketing, and some partners. They all have different needs with regard to what they want to achieve and the data they want to access, leading to silos once again.
Bringing all these different worlds together is not easy because the solutions are siloed and there's no connectivity. For IT especially, it is painful to make tooling, products, and data accessible to all the different groups within an organization.
Governance faces similar challenges: gaining visibility of data (where does it exist?), then building applications and pipelines that run across the different silos and execute processes. Sharing data, and leaving data as is at the end of every analytic result, are part of the same challenge.
Big Data will add complexity to one's landscape. We have siloed solutions everywhere. Bringing all this together smoothly and working with the data is the need of the hour. Take a look at your current landscape, identify the weak spots, and think about how you can adjust your corporate strategy and get ready for data implosion within your different silos, so that you can mine gold from your data.
I am just a few clicks away ...
< Contact > Dominic Fernandez
defining, designing and delivering solutions to help businesses achieve results - efficiently and effectively!