From Source to Target: Data targets

In this blog series I want to give you insights and tips on tight and efficient integration between SAP and cloud-based data warehouses or other related technologies.

9/25/2023  |  4 min

Author

Goetz Lessmann

Managing Director & CTO DEV OWN

Topics

  • Cloud Data Integration
  • Data Integration
  • Data Analytics & Data Lakes
  • SAP Data for Data Science
  • SNP Glue

In the previous articles I explained a bit about the general purpose of such an integration (e.g. which applications or scenarios you can build for your business users when you have SAP data available, but also the power of the cloud together with other data sources and modern AI algorithms).

A lot of the concepts and even the tips and tricks are relevant no matter which technology you choose for your integration. However, at SNP we do have an out-of-the-box solution called SNP Glue, which covers all these features and requirements.

 

Which SAP data targets do you need for your cloud data integration?

This article is about data targets which you may want to use for your SAP to anyCloud data integration. Basically, there are only three data targets (in real life it will get much more complicated than this):

  • anyDB
  • anyCloud
  • files

“anyDB” is a term coined by SAP to cover all databases which SAP supported before SAP HANA was conceived, basically your good old Oracle, IBM DB2 (or DB4, or DB6), Microsoft SQL Server and similar databases. Even in the age of cloud we see a lot of such databases, sometimes running as a sidecar to an SAP database, sometimes running in the cloud. With SNP Glue this is one of the easier integration scenarios to accomplish: most SAP data sources are all about structured data, so the complexity of a technical data replication (including schema creation and delta capture) is fairly low.

I have proudly adapted SAP’s term “anyDB” to “anyCloud”, which covers a very broad field. It ranges from simple cloud-based databases and cloud SQL databases such as Azure SQL to full-fledged cloud-based data warehouses such as Snowflake (which in turn can run in multiple cloud environments). To cloud engineers or programmers this may all be very simple and transparent, but if you’re in a different role (e.g. you have the budget and the task to implement the cloud strategy for your company) then this may be very confusing.
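To make "schema creation and delta capture" a bit more tangible, here is a minimal sketch in Python. It is not how SNP Glue works internally. It assumes a hypothetical list of field definitions, a made-up CHANGED_AT timestamp column used as a watermark, and an in-memory SQLite database standing in for an anyDB target. Real delta capture usually relies on more robust mechanisms than a timestamp comparison.

```python
import sqlite3

# Hypothetical field metadata for a source table. The type mapping and the
# CHANGED_AT column are assumptions made for this illustration only.
FIELDS = [("MANDT", "TEXT"), ("MATNR", "TEXT"), ("MTART", "TEXT"), ("CHANGED_AT", "TEXT")]
KEY = ("MANDT", "MATNR")


def create_table_sql(table, fields, key):
    """Build a CREATE TABLE statement from the structured field metadata."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in fields)
    return f"CREATE TABLE IF NOT EXISTS {table} ({cols}, PRIMARY KEY ({', '.join(key)}))"


def upsert_rows(conn, table, fields, rows):
    """Insert new rows and overwrite existing ones with the same key (SQLite syntax)."""
    names = [name for name, _ in fields]
    marks = ", ".join("?" for _ in names)
    conn.executemany(
        f"INSERT OR REPLACE INTO {table} ({', '.join(names)}) VALUES ({marks})",
        rows,
    )
    conn.commit()


def select_delta(rows, fields, watermark):
    """Return only the rows changed after the watermark (ISO timestamps sort as text)."""
    ts = [name for name, _ in fields].index("CHANGED_AT")
    return [row for row in rows if row[ts] > watermark]


# Initial full load into the target, then remember the highest timestamp seen.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("MARA_COPY", FIELDS, KEY))
source = [
    ("100", "M-1", "FERT", "2023-09-01T10:00:00"),
    ("100", "M-2", "ROH", "2023-09-01T10:00:00"),
]
upsert_rows(conn, "MARA_COPY", FIELDS, source)
watermark = max(row[3] for row in source)
```

A later delta run would call select_delta on the current source rows with the stored watermark and pass the result to upsert_rows, so only changed records travel to the target.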

 

Our goal for SNP Glue

The ambitious goal we have at SNP for our integration solution SNP Glue is to support each and every cloud technology out there. Today, we already cover a lot of ground, and keep adding adaptors for new technologies as they emerge and as they are being adopted by our customers. We support all relevant technologies for the usual suspects (in alphabetical order):

  • AWS, e.g. S3 and Redshift
  • Databricks
  • Google Cloud (GCP), e.g. BigQuery
  • Microsoft Azure, e.g. ADLS Gen1 and Gen2 and Azure SQL
  • Snowflake, where we see huge demand from our customers based on the scalability and possibilities this data warehouse technology offers

For the sake of completeness, I should mention that we still support Hadoop, even though we see that cloud platforms and Snowflake are more modern and more in demand among our customers today.

 

Final considerations before choosing a data integration solution

I want to finish this article with two very different comments or side notes. The first one is very technical and is about the underlying technology for integration. The second is more functional and scenario driven. Both of these side notes have one thing in common, though: you will want to pick a solution which nicely wraps and covers all these different technologies and features. You will not want to go down the rabbit hole of having to implement all this from the ground up for your data integration, and you definitely want a complete solution which is maintained in a good, robust way to keep up with new technologies which arise as cloud technologies evolve (for example, authentication works *very* differently for ADLS Gen1 and ADLS Gen2!).

Under the hood the integration technologies in SNP Glue range from simple ODBC / JDBC and CSV files (yes, after the age of mainframes this is indeed still a thing!) to native adaptors (e.g. via REST APIs). What is important is that as a company implementing some SAP to cloud integration (or, in fact, anything-to-anything integration) you want to pick a technology which nicely wraps all these technologies, and which also covers the authentication and security which very often is native to each cloud provider.
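What "wrapping" different target technologies can look like is easiest to show with a small sketch. The following Python example is purely illustrative and not taken from SNP Glue: the names DataTarget, CsvTarget, SqlTarget and replicate are invented here, SQLite stands in for a database reached via ODBC / JDBC, and authentication is left out entirely.

```python
import csv
import io
import sqlite3
from typing import Iterable, Protocol, Sequence

Row = Sequence[str]


class DataTarget(Protocol):
    """Hypothetical common interface that every target adaptor implements."""

    def write(self, columns: Sequence[str], rows: Iterable[Row]) -> int: ...


class CsvTarget:
    """File target: writes a header line followed by one CSV line per row."""

    def __init__(self, stream):
        self.stream = stream

    def write(self, columns, rows):
        writer = csv.writer(self.stream)
        writer.writerow(columns)
        count = 0
        for row in rows:
            writer.writerow(row)
            count += 1
        return count


class SqlTarget:
    """Database target: creates the table if needed and inserts the rows."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table

    def write(self, columns, rows):
        data = list(rows)
        cols = ", ".join(columns)
        marks = ", ".join("?" for _ in columns)
        self.conn.execute(f"CREATE TABLE IF NOT EXISTS {self.table} ({cols})")
        self.conn.executemany(
            f"INSERT INTO {self.table} ({cols}) VALUES ({marks})", data
        )
        self.conn.commit()
        return len(data)


def replicate(target: DataTarget, columns, rows) -> int:
    """The calling code stays the same, whichever target sits behind the interface."""
    return target.write(columns, rows)


columns = ["BUKRS", "BUTXT"]
rows = [("1000", "Example Co")]
csv_buffer = io.StringIO()
replicate(CsvTarget(csv_buffer), columns, rows)
replicate(SqlTarget(sqlite3.connect(":memory:"), "T001_COPY"), columns, rows)
```

The point of the design is that the extraction logic calls replicate once and does not care whether the data lands in a file or a table. A production adaptor would additionally have to deal with provider-specific authentication, retries and data types.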

Finally, the data targets you choose should be suited to your use case. We see basically two types of integration:

  • Message-based integration (or streaming). Here the data doesn’t end up in a table or storage, but is passed to a message broker, e.g. Apache Kafka, in the form of messages. Such messages typically carry hierarchical, nested data in JSON structures.
  • The more classical method: full table loads followed by delta capture (a.k.a. CDC) and near-real-time data replication. After the initial full load, only rather small packages of changed data are transferred. This is all typically based on classical database tables rather than on business objects.
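The difference between flat table rows and a nested message can be sketched in a few lines of Python. This is an illustration under assumptions, not SNP Glue code: the message layout, the topic name and the send callback are invented, and the callback simply collects messages in a list where a real setup would hand them to a broker client.

```python
import json


def build_order_message(header, items):
    """Combine one header row and its item rows into a single nested document."""
    return {
        "order": header["VBELN"],
        "type": header["AUART"],
        "items": [
            {"position": item["POSNR"], "material": item["MATNR"], "quantity": item["KWMENG"]}
            for item in items
            if item["VBELN"] == header["VBELN"]
        ],
    }


def publish(send, topic, message):
    """Serialize the message as UTF-8 JSON and pass it to the given send callback."""
    send(topic, json.dumps(message).encode("utf-8"))


# Flat rows as they would appear in classical database tables.
header = {"VBELN": "0000004711", "AUART": "OR"}
items = [
    {"VBELN": "0000004711", "POSNR": "000010", "MATNR": "M-1", "KWMENG": 5},
    {"VBELN": "0000004711", "POSNR": "000020", "MATNR": "M-2", "KWMENG": 2},
    {"VBELN": "0000004712", "POSNR": "000010", "MATNR": "M-9", "KWMENG": 1},
]

# Stand-in for a message broker: collect (topic, payload) pairs in a list.
outbox = []
publish(lambda topic, payload: outbox.append((topic, payload)), "sales-orders", build_order_message(header, items))
```

In the table-based approach the header and item rows would be replicated as they are, table by table. In the message-based approach one business object travels as one self-contained JSON document.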

 

Your data integration solution must empower your business

As a closing remark I want to repeat what I consider a crucial point: any integration solution must empower and enable, not restrict. You do not want a technology lock-in, but one single solution which covers each case. SNP Glue is such a solution: it allows customers to integrate from SAP to any cloud or database technology, while wrapping different technologies and even approaches to integration in one cockpit and one technology.
