From source to target: SAP data sources and SNP Glue
When integrating SAP with cloud-based data warehouses (e.g. Snowflake), most architects think of tables first when they look at SAP’s business applications. That is not wrong, but it is worth looking at other data sources that may be better suited. In this blog series I want to give you an overview of these data sources and point out some of the limitations and opportunities that come with each of them.
Managing Director & CTO DEV OWN
Tips and tricks about database tables
Before we look at data sources other than plain old tables, here are some tips and tricks about database tables. (And before I forget: for all the strings attached and special cases, our integration solution “SNP Glue” comes equipped with an answer. So much for advertisement, wink wink.)
Here's a helpful tip: depending on the SAP release, certain tables may offer easier access than others. For instance, in older releases SAP stores some tables in compressed form to save database space. This is no longer the case with S/4HANA, where compression comes naturally with the column-based approach to table storage. For example, business data such as finance document line items in SAP R/3 and ECC is stored in table BSEG. On the database, this table simply does not exist as such: it is stored together with some other finance tables in a BLOB-style table called RFBLG. The same is true for change documents and some other major SAP tables.
Things to consider when looking at tables
Second, even when looking at tables, there are some things to consider:
- Some tables contain unstructured data, and you would not want to expose them 1:1 to any data lake. Long texts, for example, and also file attachments can be stored directly in SAP.
- Reserved keywords: during schema creation, i.e. when you create SAP-copycat tables in your data warehouse, you may need to rename some columns because their names are reserved keywords in the target system. This may lead to mapping “fun” between SAP and your data lake / data warehouse / data mesh.
- Currency amount fields in SAP tables do not carry the real decimal point. The decimal point needs to be placed based on the customizing of currency keys. This may not be an issue if all your currencies are USD, EUR, and other currencies with two decimal places. However, if you are truly global, then you will have currencies with more or fewer decimals as well.
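- SAP ERP was originally built as a kind of multi-tenant system. The first key field, “client” (MANDT), is basically a tenant field, so for client-dependent tables you need to decide whether to filter on it or carry it along.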
- You definitely want to use an integration solution which provides true delta capture with minimum overhead and performance impact. Periodic full table loads from SAP to your cloud data store put load on SAP, mean long waits while large tables are copied over, and leave you with outdated data in between.
- Forget about “virtual tables” in your data warehouse pulling data from SAP as needed: while technically feasible, this may result in unexpectedly heavy load on the SAP database.
- Some data in SAP may be archived, i.e. it resides in archive files (potentially on an archive server) instead of directly in SAP tables. For such a case it may be worthwhile to look into your integration solution’s capabilities to tap into SAP archives (SNP Glue can do that, while most classic ETL tools will fail brutally, which is not surprising considering the data is archived and does not reside in SAP anymore).
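Three of the points above (reserved keywords, amount decimals and the client field) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how SNP Glue does it: the reserved-word list, the underscore suffix rule, the decimals lookup and the sample row (including its GROUP column) are made up for the example. In a real system you would read the decimals from the currency customizing and the reserved words from your warehouse's documentation.

```python
from decimal import Decimal

# Assumption: a tiny sample of reserved words, not a complete list for any warehouse.
RESERVED_WORDS = {"ORDER", "GROUP", "TABLE", "SELECT"}

# Assumption: decimals per currency key as read from currency customizing.
# Currencies that are not listed fall back to two decimals.
CURRENCY_DECIMALS = {"JPY": 0, "KWD": 3}


def safe_column_name(sap_field):
    """Suffix a column name that collides with a reserved word (hypothetical rule)."""
    name = sap_field.upper()
    return name + "_" if name in RESERVED_WORDS else name


def external_amount(stored, currency):
    """Shift an amount stored with two decimals to the currency's real decimals."""
    decimals = CURRENCY_DECIMALS.get(currency, 2)
    return stored * (Decimal(10) ** (2 - decimals))


def prepare_row(row, client, amount_fields):
    """Return a warehouse-ready row, or None if the row belongs to another client.

    amount_fields maps each amount column to the column holding its currency key.
    """
    if row.get("MANDT") != client:
        return None
    out = {}
    for field, value in row.items():
        if field in amount_fields:
            value = external_amount(value, row[amount_fields[field]])
        out[safe_column_name(field)] = value
    return out


# Made-up sample row: a stored 100.00 with currency JPY stands for 10000 yen.
sample = {"MANDT": "100", "BELNR": "4711", "GROUP": "A1",
          "WRBTR": Decimal("100.00"), "WAERS": "JPY"}
print(prepare_row(sample, "100", {"WRBTR": "WAERS"}))
```

Using Decimal instead of float keeps the shifted amounts exact, which matters for financial data.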
Other data sources in SAP to consider
So much for tables, but what else? There is a multitude of other data sources in SAP which may be worthwhile to investigate:
- CDS views on HANA are technically just what the name says: “views”. However, the data model (and sometimes also the content) may be much easier to consume for data scientists or architects without a PhD in SAP.
- In SAP ERP, you actually have a multitude of data sources above tables. These include the Business Object Repository and BAPIs. You may want to use function modules as data sources as well, or even ABAP code. Using SNP Glue you can do all this and more – you can even capture the output of (most) SAP transactions and use that as input for a data pipeline to any cloud-based data target.
- In SAP BW you have InfoCubes and queries, which may provide data in a nicely prepared and aggregated format to any cloud data store. There are also other objects in SAP BW which may be worthwhile to investigate, maybe even transaction LISTCUBE.
- The master data tables, especially the hierarchy tables, may be a good source. In SAP ERP, hierarchies such as profit center hierarchy, customer hierarchy or material hierarchy are all stored in different ways and formats. In SAP BW, however, you will find ALL hierarchies in the same technical format. Some of our customers therefore use SAP BW as a pass-through data store on the way to a cloud data store or data warehouse.
- REST APIs and interfaces of all kinds are becoming more and more important, especially with SAP’s strategy to expand the product portfolio to SaaS and true cloud applications.
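To illustrate why a single technical format for hierarchies is convenient: once every hierarchy arrives as parent/child rows, one small routine can flatten all of them into root-to-node paths for the warehouse. The sketch below assumes simplified rows with the hypothetical fields node_id, parent_id and name. Real SAP BW hierarchy tables carry more columns and different names.

```python
def node_paths(nodes):
    """Map each node id to the list of names from the root down to that node."""
    by_id = {node["node_id"]: node for node in nodes}
    paths = {}
    for node_id in by_id:
        path = []
        seen = set()
        current = node_id
        # Walk up the parent chain until the root (parent_id is None) is reached.
        while current is not None:
            if current in seen:
                raise ValueError(f"cycle detected at node {current}")
            seen.add(current)
            node = by_id[current]
            path.append(node["name"])
            current = node["parent_id"]
        paths[node_id] = list(reversed(path))
    return paths


# Made-up profit center hierarchy with three levels.
hierarchy = [
    {"node_id": 1, "parent_id": None, "name": "Company"},
    {"node_id": 2, "parent_id": 1, "name": "Europe"},
    {"node_id": 3, "parent_id": 2, "name": "Germany"},
]
print(node_paths(hierarchy)[3])
```

The same function would work for a customer or material hierarchy, as long as the rows follow the same parent/child shape.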
SNP Glue makes it easier
SNP Glue provides out-of-the-box technology for all these data sources. Of course, some may be better suited than others for a certain purpose. For example, direct table access with “heavy lifting” may be better for large data volumes and is very powerful in combination with SNP Glue’s CDC for near-real-time data replication (or even streaming).
However, if you prefer to work at the business object level, you may use a business event as a trigger for data replication, e.g. to send data to a message broker when an SAP user creates an outbound delivery in the SD module. In this example, the LIKP/LIPS data would be sent to (for example) Kafka directly after the posting is saved in SAP.
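As a rough sketch of what such an event message could look like, the Python snippet below bundles one delivery header (LIKP) with its items (LIPS) into a keyed JSON message. The topic name, the payload layout and the sample values are assumptions made for this example and are not SNP Glue's actual message format. The snippet only builds the message and leaves the actual send to whichever Kafka client you use.

```python
import json


def delivery_event(header, items, topic="sap.outbound-delivery"):
    """Build (topic, key, value) for one outbound delivery.

    The topic name and payload layout are hypothetical. The delivery number
    (VBELN) is used as the message key so that all events for one delivery
    end up in the same partition and stay in order.
    """
    delivery_number = header["VBELN"]
    payload = {
        "event": "outbound_delivery_created",
        "header": header,
        # Keep only the items that belong to this delivery.
        "items": [item for item in items if item["VBELN"] == delivery_number],
    }
    key = delivery_number.encode("utf-8")
    value = json.dumps(payload, sort_keys=True).encode("utf-8")
    return topic, key, value


# Made-up sample data: one header and three items, one of them from another delivery.
likp = {"VBELN": "0080000001", "KUNNR": "0000001000"}
lips = [
    {"VBELN": "0080000001", "POSNR": "000010", "MATNR": "MAT-1", "LFIMG": "5"},
    {"VBELN": "0080000001", "POSNR": "000020", "MATNR": "MAT-2", "LFIMG": "2"},
    {"VBELN": "0080000002", "POSNR": "000010", "MATNR": "MAT-3", "LFIMG": "1"},
]
topic, key, value = delivery_event(likp, lips)
print(topic, key, len(json.loads(value)["items"]))
```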
A final question we hear quite often is: does this also work with SAP RISE? The short answer is yes, it does. The longer one is that the setup and configuration of SNP Glue will differ slightly from a classical on-premises installation.