Data drives the business processes we all rely upon, yet in many organizations poor quality data causes inefficiencies that could be avoided. SAP Data Hub includes various data quality operators that allow data pipelines to be created that can improve the quality of data in motion, or at rest.

In Data Hub 2.5.1 the set of data quality operators includes anonymization, data masking, location services (address cleansing, geocoding, and reverse geocoding), and validation.

This article shows how to use the validation operator to apply a data quality rule and to trigger a business process for data that fails the rule. It demonstrates how to use the SAP HANA Monitor and Validation Rule operators, how to trace data using the Wiretap and Terminal operators, and how to extend a blank JavaScript operator.

Creating the data source

We will use a HANA table as the data source for this example, although the data could be read from anywhere that Data Hub supports, which includes databases, cloud storage, streams, applications and APIs.

To create our source HANA table, we executed the following DDL via HANA Studio.

CREATE COLUMN TABLE SVCROPRODUCT.JOURNEY (
ID INT PRIMARY KEY,
SOURCE NVARCHAR(10),
CUSTOMER NVARCHAR(30),
TIME_START TIMESTAMP,
TIME_END TIMESTAMP,
DISTANCE INT
);

In this example, new rows inserted into this table will be checked to ensure that the journey start time is before the end time; if it is not, a remediation process will be started to correct the data.
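For example, a row like the following (purely illustrative values, in the style of the inserts used later in this article) would violate the rule, because TIME_START is after TIME_END:

INSERT INTO JOURNEY VALUES (99, 'openCAB', 'Example',
TO_TIMESTAMP('18-02-2019 14:00:00', 'DD-MM-YYYY HH24:MI:SS'),
TO_TIMESTAMP('18-02-2019 13:00:00', 'DD-MM-YYYY HH24:MI:SS'), 5);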

Configure a connection from Data Hub to HANA
To log on to the Data Hub Launch Pad, you will require your tenant name, username, and password.


Once logged in, the Launch Pad displays tiles for the tenant applications.


In this demonstration we will use Connection Management to configure a connection to the source HANA system. While this is not strictly necessary, as connections can also be configured in the Modeler, connections created via Connection Management are reusable in the Modeler and can also be used in the Metadata Explorer to browse and catalog metadata, and to view and profile data.

Click on the Connection Management tile to open the application, then click Create.


Enter your HANA system connection information.


Once saved, click the ellipsis against the new connection and select Check Status to ensure Data Hub can connect.


Creating the Graph

Switch back to the Launch Pad and click on the Modeler tile to launch the Modeler.


Click the + (in the top left) to create a new graph.


A new graph will be created, and rather conveniently the operators tab is selected.


Operators can be added to a graph by double-clicking on them or dragging them into the graph editor. If you know the operator name you can use the search box. Locate the SAP HANA Monitor operator and add it to the graph.


The HANA Monitor operator continuously captures new data inserted into a table. It works by creating a trigger on the source table that copies inserted data into a shadow table, and by polling that shadow table; we'll look at both objects later.
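Conceptually, the generated objects look something like the sketch below. This is an illustration only; the operator generates (and later drops) its own object names and definitions, so you do not create these yourself:

-- Illustrative sketch only: names and exact DDL are chosen by the operator
CREATE COLUMN TABLE SVCROPRODUCT.JOURNEY_SHADOW LIKE SVCROPRODUCT.JOURNEY WITH NO DATA;

CREATE TRIGGER SVCROPRODUCT.JOURNEY_INSERT_TRIGGER
  AFTER INSERT ON SVCROPRODUCT.JOURNEY
  REFERENCING NEW ROW newrow FOR EACH ROW
BEGIN
  INSERT INTO SVCROPRODUCT.JOURNEY_SHADOW
    VALUES (:newrow.ID, :newrow.SOURCE, :newrow.CUSTOMER,
            :newrow.TIME_START, :newrow.TIME_END, :newrow.DISTANCE);
END;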

Open the operator configuration panel by clicking on the Open Configuration button.


We must configure the connection, schema name, table columns and table name. We can reuse the connection we created in the Connection Management application. Click on the Connection property to open the property editor.


Select Configuration Manager as the Configuration Type and the Connection ID of the previously created connection.


Enter the schema name and table name of the source table. Table columns is a comma-separated list of the column names and data types of the source table columns you want to monitor, so for the source table we created we will use:

ID INT,SOURCE NVARCHAR(10),CUSTOMER NVARCHAR(30),TIME_START TIMESTAMP, TIME_END TIMESTAMP,DISTANCE INT

The completed monitor configuration.


A great feature of Data Hub is that you can execute a graph before it is complete; this allows for incremental development and instant gratification! Save the graph using the Save button on the editor toolbar.


You must provide a name; a Description is optional, and Category allows graphs to be grouped in the navigation pane.


Now that the graph is saved, it can be run using the run button on the editor toolbar.


The bottom pane shows that the graph is running.


Switching to HANA Studio, we can see that a shadow table and a trigger have been created to capture new rows. These are created and deleted by the operator during startup and shutdown of the graph.


If we now inserted new data into the source table, the trigger would copy it to the shadow table and the graph would process the new data. However, as our graph is very simple, we have no way of seeing what's happening; that's not gratifying at all!

Tracing and Debugging

When testing and debugging graphs it is often useful to trace the data output by operators. There are two operators that can be used for this: Wiretap and Terminal. We'll use Wiretap to display the data output by the HANA Monitor operator.

Add the Wiretap operator to the graph and connect the output port of the HANA Monitor to the input port of the Wiretap.


Save and run the graph. Once the graph is running, another option appears for the Wiretap instance: an Open UI button.


Click on the Open UI button; a new browser window will open.


Back in HANA Studio, we insert a new row into the source table.

INSERT INTO JOURNEY VALUES (1, 'openCAB', 'Michael', 
TO_TIMESTAMP('18-02-2019 10:00:00', 'DD-MM-YYYY HH24:MI:SS'),
TO_TIMESTAMP('18-02-2019 11:00:00', 'DD-MM-YYYY HH24:MI:SS'), 10);

The wiretap window now shows the inserted data; gratification at last!


The data output is

[{"CUSTOMER":"Michael","DISTANCE":10,"ID":1,"SOURCE":"openCAB","TIME_END":"2019-02-18 11:00:00","TIME_START":"2019-02-18 10:00:00"}]

The {} notation shows that the operator output format is JSON, and the [] hints that multiple objects may be output. Let’s insert multiple rows to see what happens.

INSERT INTO JOURNEY VALUES (2, 'openCAB', 'Matt', 
TO_TIMESTAMP('18-02-2019 10:12:00', 'DD-MM-YYYY HH24:MI:SS'),
TO_TIMESTAMP('18-02-2019 11:11:00', 'DD-MM-YYYY HH24:MI:SS'), 10);
INSERT INTO JOURNEY VALUES (3, 'openCAB', 'Tyler', 
TO_TIMESTAMP('18-02-2019 12:00:00', 'DD-MM-YYYY HH24:MI:SS'),
TO_TIMESTAMP('18-02-2019 12:30:00', 'DD-MM-YYYY HH24:MI:SS'), 10);


The data output is

[{"CUSTOMER":"Matt","DISTANCE":10,"ID":2,"SOURCE":"openCAB","TIME_END":"2019-02-18 11:11:00","TIME_START":"2019-02-18 10:12:00"},{"CUSTOMER":"Tyler","DISTANCE":10,"ID":3,"SOURCE":"openCAB","TIME_END":"2019-02-18 12:30:00","TIME_START":"2019-02-18 12:00:00"}]

While this is not easily human readable, it's an array of JSON objects, showing that the HANA Monitor captures multiple rows during each poll and that the operator's output is a single message containing multiple rows.

Using JavaScript to format data

The JSON output from the HANA Monitor must be converted to CSV before being sent to the Validation Rule operator. The Format Converter operator can be used in some cases, but we’ll demonstrate manual conversion which would be required for complex formatting.

Stop the graph, then add a Blank JS Operator. We'll use JavaScript, but we could equally use the Go or Python operators to format the data.

In general, operators are data processors. They accept data on input ports, process it, then produce data on their output ports. Operators can have multiple input and output ports, and ports support various data types. The delivered operators have predefined ports for their specific purpose, but when creating or extending an operator, we must configure its ports. In our example we’ll match the input port to the output port of the HANA Monitor, and use a string datatype for the output CSV, as this is the input port type on the Validation Rule operator.

Click on the Add Port button on the Blank JS Operator.


Add an input port called inMessage.


Add an output port called outString.


Click on the Script button to open the script editor.


Paste in the following JavaScript.

var csv = "";

$.setPortCallback("inMessage", onInput)

function onInput(ctx, s) {
    var b = s.Body;
    csv = "";
    
    b.forEach(arrFunc);
    
    $.outString(csv);
}

function arrFunc(item, index) {
    if (index > 0) 
        csv = csv + "\n";
        
    csv = csv + item.ID + ",\"" + item.SOURCE + "\",\"" + item.CUSTOMER + "\"," + 
        item.TIME_START + "," + item.TIME_END + "," + item.DISTANCE;
}
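For the two rows inserted earlier, tracing through this script gives a single output string along these lines (derived from the logic above rather than captured from a running graph):

2,"openCAB","Matt",2019-02-18 10:12:00,2019-02-18 11:11:00,10
3,"openCAB","Tyler",2019-02-18 12:00:00,2019-02-18 12:30:00,10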
Introduction
In this blog post, I will explain how to access HANA tables from a Node.js application.
In particular, I will explain how to issue a certificate (service key) on SCP and how to implement conn_params in app.js.

IT Architecture
I will show the IT architecture outline below.
During development, your Node.js application on your laptop PC connects to the HANA tables.
You then deploy your application to the Cloud Foundry environment, and the deployed application connects to the HANA tables.


Table of Contents
1. Preparation
2. HANA table Creation
3. Node.js Application Creation
4. Certification
5. app.js implementation
6. HANA and Node.js Application Linkage

Environment
You need to prepare a Cloud Foundry environment and HANA as a Service.
Cloud Environment: Cloud Foundry environment
Database: HANA as a Service (CF)
PC/Service: 1 PC (※ In my case, I used a Windows PC)

Version
WindowsPC Edition :Windows 10 Enterprise 2016 LTSB
CPU :Intel(R)Core(TM)i5-7300U CPU @ 2.60GHz 2.71GHz
Memory(RAM) :8.00GB
HANA version(CF) :4.10.5

1. Preparation
You will probably want a development tool to implement the code, so download a development tool such as IntelliJ, Visual Studio Code, or WebStorm if necessary.

2. HANA table Creation
This is how to create the HANA table.

1. After logging in to the Cloud Foundry environment, select your Space
2. Select the ‘SAP HANA Service’ card in the Service Marketplace and select your instance
3. Open your HANA Dashboard
(※ Remember the HANA endpoint)
4. Click ‘Go to the HANA cockpit’ in the upper right corner of the window
Click ‘SQL console’ in the upper right corner of the window
You can now create your HANA schema and tables
(※ Remember your schema, user and password)

3. Node.js Application Creation
First, create a simple Node.js application; any kind of application is fine.
Open the project in the development tool you installed and select app.js in the src folder.
Implement the connection settings to the HANA tables in app.js. I put a sample code below.

// Assumption: the @sap/hana-client driver and Express are used (the post does not name the modules)
var express = require("express");
var app = express();
var conn = require("@sap/hana-client").createConnection();

var conn_params = {
  serverNode: "zeus.hana.prod.eu-central-1.whitney.dbaas.ondemand.com:[Your Port Number]",
  encrypt: true,
  schema: "[Your Schema]",
  uid: "[Your User]",
  pwd: "[Your Password]"
};

In the next step, implement the SQL statement in the same file.
Define a route that connects to the HANA tables.
I will show a sample code below.

app.use("/conn/xxx", function (req, res, next) {
  conn.connect(conn_params, function (err) {
    if (err) {
      var msg = [{
        msg: "A connection error has occurred"
      }];
      res.json({
        searchResult: msg
      });
      return;
    }
    var sql = 'SELECT * FROM "[Your Schema]"."[Your Table]";';
    conn.exec(sql, function (err, result) {
      if (err) {
        console.log("DB Error: SQL Execution --- ", err);
      }
      conn.disconnect();
      if (Object.keys(result).length == 0) {
        var msg = [{
          msg: "The result is not found."
        }];
        res.json({
          searchResult: msg
        });
        return;
      }
      res.json({
        searchResult: result
      });
    });
  });
});

After implementation, run your Node.js application.
You can now successfully connect to your HANA tables.

4. Certification Creation
Go back to the SAP Cloud Platform Cockpit to generate the HANA service keys.
Follow the steps below.

1. Select Instance Service
2. Select Your HANA Instance
3. Select Service Keys
4. Create Service key

Remember the keys, as you will use them later.

5. app.js implementation
Add two SSL-related parameters to your source code:
Put an ‘sslCryptoProvider’ key with the value ‘openssl’
Put an ‘sslTrustStore’ key with the ‘BEGIN CERTIFICATE’ value that you noted from the service key

var conn_params = {
  serverNode: "zeus.hana.prod.eu-central-1.whitney.dbaas.ondemand.com:[Your Port Number]",
  sslCryptoProvider: "openssl",
  sslTrustStore: "-----BEGIN CERTIFICATE-----\n...(omitted)...\n-----END CERTIFICATE-----\n",
  encrypt: true,
  schema: "[Your Schema]",
  uid: "[Your User]",
  pwd: "[Your Password]"
};

6. HANA and Node.js Application Linkage
After you build the source code and deploy it, you can see your application in the SAP Cloud Platform Cockpit.

I will show you the instructions step by step.

1. Open windows command prompt
2. Change current directory to your application context root
3. Type ‘cf login -a https://api.cf.eu10.hana.ondemand.com’
4. Type your e-mail and password
5. Select Org
6. Type command ‘cf push [Your application name]’
7. Open browser
8. Access ‘https://[Your Application Name].cfapps.eu10.hana.ondemand.com/conn/xxx’
You can see table records.
Robotic Test Automation Brings an End to Scripting
Scripts: a word that can strike fear into the hearts of the most fearless team of developers, QA experts and project managers as they try to deliver changes into an SAP system on time and under budget.

Well, possibly not so much ‘fear’ as a discouraging sense of acknowledgement.

The fact is, test scripts are an essential part of pretty much all the traditional approaches to SAP testing. After all, someone has to define what needs to be tested before a test can be run, right?  But nobody really wants to be the one who has to make them.
This is particularly true for SAP testing, where much of what you’re trying to test is ultimately related to business processes and their outcomes. The best of you will have those processes represented somehow, but to what level?  I’m willing to bet not many process maps define which screen should appear, which button should be pressed, which value should be entered, etc., at every step.  So, you need some scripts to make sure your changes aren’t going to break things.


The problem is not only do you need technical input from the SAP team, you need to get the functional teams involved since they’re the ones who understand real processes and can validate what the outcomes should be.  And those people already have day jobs.

On top of that, if you’ve ‘simplified’ your testing process by using some sort of traditional tooling you might need developers to write the code needed for your automated test execution.

So, yeah, those scripts are hard to make.

But that’s not the worst of it. You’re going to need a lot of them. A LOT. SAP often touches every facet of an enterprise, from front office to back, connecting with multiple SAP and non-SAP external systems, so to make changes with a sufficient level of confidence you’re going to have to test a lot of stuff.  I spoke to one of our customers recently who said they were reasonably happy with their test coverage. Great. How many scripts? Oh, only around 12,000!

I’m pretty sure you’ve figured out the equation here. Taking a traditional approach to regression testing, and therefore relying on test scripts, means

Confidence = (SAP + functional + technical team input) X (LOTS of processes)

If we do a bit of pseudo-math and substitute time for people, that means

Confidence = A long lead time

Substitute time for money (since it’s not news that the two are often equivalent) and we get

Confidence = A lot of money

There’s no getting away from it.  When you change your SAP systems you need to be confident that you’re not introducing unexpected negative impact on your business.
Until now you’ve needed to regression test using test scripts. Which makes your project slow, and costs a huge amount.  To put it very simply, when using traditional regression test methods

Keeping your business safe = A lot of time + A lot of money

Oh, and wait.  The maintenance.  Did I mention the maintenance?  Who wants to put their hand up for the job of analyzing which of those 12,000 scripts are impacted by change X and need to be updated?  Not me.

And then all this gets even more tricky when we start talking about more frequent delivery through modern development methodologies like DevOps.  Do you have time in a two-week sprint to define and create all the new scripts you might need, as well as analyze your test library and make all the necessary updates? I’m going to say ‘unlikely’ to that one.

Sure, I’m painting a bit of a picture of the apocalypse here.  Large organizations with complex landscapes and fully manual processes are going to find this situation harder than most.  But even today’s automation tools aren’t much more than band-aids.  They still rely on some form of test script that has to be created at the client-side of the system and maintained afterwards.

What if I told you there was a better way?  A way that removes the need for test scripts as we know them, which automatically creates a comprehensive library that can be refreshed on demand?


Robotic Test Automation: A Better Way to Test

Robotic Test Automation (RTA) is a pioneering technology that robotically generates, executes and maintains regression test records based on real-world use, automatically learning and validating SAP system behavior without user involvement. It can fully replicate a day in the life of your regular business.

RTA eliminates inefficient end-user recording, business process discovery, test script creation and maintenance chores – along with all the test data management associated with the process – to perform system-wide SAP regression tests in days, rather than months. Set-up time is negligible. No manual intervention is required. And, with it, you can begin to approach 90 percent test coverage – without any scripting required.


The benefits of RTA are clear. With it, you suddenly have the freedom to accelerate major SAP transformation projects, including cloud and HANA adoption, and safely experience the benefits of DevOps.

RTA gives you the ability to test more of what’s beneath the surface, so you can proceed confidently along your business transformation journey. Because the system updates itself, without any manual effort, testing can be shifted left, speeding and sustaining delivery. And, by automating the testing process, RTA removes the testing burden from DevOps teams better suited to creating and supporting a competitive, agile and relevant business.
In this blog post I want to demonstrate the power of the HANA-optimized Business Content and how it can help to speed up your BW/4HANA implementation.

For this, let us have a closer look together into the business content area of Fixed Asset Accounting – as an example.

Background
Providing Business Content as part of the EDW solution has been a differentiator for SAP since the early releases of SAP Business Warehouse (BW). Business Content enables quick and cost-effective implementation, as it provides a model that can be used as a guideline during implementation.

The delivered standard data models are intended as a foundation for designing, modeling, and building your EDW, and while often adapted to customer needs, they provide a jump start that facilitates faster and more consistent delivery of EDW projects.

The SAP BW/4HANA Content was specifically designed to leverage the SAP HANA capabilities in SAP BW/4HANA following a consistent EDW reference architecture called LSA++ (Layered Scalable Architecture).

The power of the HANA-optimized Content compared to the classic BW content in the area of Fixed Asset Accounting

The graphic below shows the ‘classic’ BI Content data flow for the component FI-AA (Fixed Asset Accounting) – as it exists for SAP BW:


Let’s compare it now with the data flow provided by the HANA-optimized business content that best runs in a SAP BW/4HANA system:


At first glance you might already see the immense simplification:

◈ Fewer persistencies
◈ Fewer data load processes
◈ No dependencies between data loads
◈ Only ONE InfoProvider for reporting

This not only makes the solution easier to understand and to use; it also drastically reduces the load times inside the BW system (we will see later in this blog what that means in minutes).

Thanks to some features that were injected into the data model, the report performance was further improved – and this is in addition to the performance boost you will get out of the box just by having HANA as the underlying database.

To get a better understanding, here is a detailed list of the changes and benefits compared to the classic version of the content:


Comparison of data load duration

Finally, let us compare what the data model simplification means with regard to the data load times inside the data warehouse system.

Before this, it is important to point out that Fixed Asset Accounting is all about calculating depreciation and appreciation values, and while some of the calculations happen in the source (SAP ERP or S/4HANA), the larger part of the calculation logic is done inside the data warehouse as part of a BW transformation (start routine). This makes FI-AA one of the application components with the highest load times inside the BW system – at least once a month during the period-end closing processes. Therefore, reducing the FI-AA data load time can be key to managing the overall load cycle within the given time window.

Now let us have a closer look at the load times. We have measured it in systems without any performance tuning and based on the same amount of data that we received from the extractors using the same SAP ERP source system.

0FI_AA_11: ~5 million records

0FI_AA_12: ~12 million records.

BW system (Non-HANA) with classic content model

First we processed the data load in a non-HANA SAP BW system into the classic BI content model.

Here the entire load cycle inside BW (!) took 1 hour 21 minutes.

BW on HANA system with classic model

As part of the data load process, some steps become obsolete when running the same load in a BW system on HANA, without the need to change anything in the model, such as the processes to calculate indexes and statistics and to roll up aggregates. Additionally, the activation job is now processed in HANA, which reduces the activation time.

At this point we are still using the classic model, but just by switching the underlying database to HANA we already get a great reduction of the total load time by almost 25 minutes, down to 57 minutes.


BW/4HANA with the HANA-optimized content (BW/4HANA Content)

Now we are loading the same amount of data into the new HANA-optimized business content. Here we see a further massive reduction of the data load duration by another 38 minutes, down to 19 minutes.


The total load time was reduced from 1 hour 21 minutes down to 19 minutes. This includes both types of optimization: the push-down of data load processes into HANA and the leaner HANA-optimized data model.

It is important to understand that we did not sacrifice any functionality or flexibility in the reporting layer. The opposite is true. The report consumer now has more flexibility, e.g. drill-down to account level for all metrics – something that could not be provided in the classic business content due to the use of aggregates.
Overview
You have an existing ERP system, and you want to leverage your previous investment in the business processes that you have already implemented in ERP. You want to bring them to the new world of SAP S/4HANA. Then S/4HANA system conversion is the perfect option for you!

The overall journey for a conversion to S/4HANA looks like this:


(reference from SAP-Press book “ADM328 SAP S/4HANA Conversion and SAP System Upgrade”)

The S/4HANA conversion journey is divided into pre-conversion, conversion, and post-conversion phases.

In short, we should stay aligned with these phases throughout the S/4HANA journey. In the pre-conversion phase, we perform all the preparatory steps required to start the conversion; this helps us find any issues early in the project. In the conversion phase, we convert the system to the new world of SAP S/4HANA; this phase includes adaptation to new business processes, validation, and so on. The final phase is post-conversion, where we perform follow-up activities to make sure we get the maximum benefit from SAP S/4HANA for our business.

Pre-Conversion

These checks help to find out which mandatory steps we must carry out before converting the system to S/4HANA. The checks are divided into categories such as:

◈ System Requirement
◈ Maintenance Planner
◈ SI-Check
◈ Custom-code analysis

System Requirement

We need to evaluate the system and check its compatibility regarding OS, DB, and stack (single/dual), etc.

◈ Unicode is required, due to a technical restriction of the S/4 kernel. If the legacy system is not Unicode, perform a Unicode conversion first
◈ Dual stacks are not supported on SAP HANA, so perform a dual-stack split if required
◈ The ECC system should be >= 6.0
◈ Check the minimum required DB/OS versions for the conversion to S/4HANA


(reference from SAP-Press book “S4H100 SAP S/4HANA Implementation Scenarios”)

Maintenance Planner

The Maintenance Planner checks the system with regards to business functions, industry solutions, and add-ons. If there is no valid path for the conversion (for example, the add-on is not released yet), the Maintenance Planner prevents the conversion.


Business function

A business function can have one of the following statuses: always_on, customer_switchable, or always_off.

◈ If a business function was switched on in the start release system, but is defined as always_off in the SAP S/4HANA target release, then a system conversion is not possible with this release

◈ If a business function was switched off in the start release system, but is defined as always_on in the SAP S/4HANA target release, then the business function will be activated during the conversion

◈ If a business function is defined as customer_switchable in the SAP S/4HANA target release, it can be activated depending on the needs of the customer. Business functions that were active in the source release remain active in the target release.


(reference from SAP Note – 2240360, 2240359)

Add-ons

All add-ons must be certified for S/4HANA in order to run on S/4HANA. If you are converting from ECC to S/4HANA and have add-ons that are not certified, the conversion tool will simply stop in its tracks when Maintenance Planner identifies the uncertified add-on.


(reference from SAP Note – 2214409)

SI-Check

The SI-Check identifies the simplification items relevant to the conversion; it checks for data inconsistencies and missing mandatory preparation activities.


Tip - Do run this report early in the project.

Procedure

To perform the SI-Check, follow the steps below:

◈ Start report /SDF/RC_START_CHECKS in transaction SA38
◈ In the section Simplification Item Check Options, choose the target SAP S/4HANA version
◈ Choose the mode in which you want to run the checks
     ◈ Online mode: the results are displayed immediately after the check is finished
     ◈ Background job: if the check needs a long running time, use background mode
◈ Run the checks to get an overview of all relevant simplification items and then check system consistency with Check Consistency for All
◈ Check the results and take appropriate action based on the logs.

SI-Check messages and their meanings


(reference from SAP Note – 2399707)

To successfully use the Simplification Item Check as preparation for a system conversion/upgrade project:

Start running the Simplification Item Check and fixing the issues it finds as early as possible in the project
Stay up to date with the latest check and check content versions early in the project, but don't forget to freeze the check notes and check content versions after converting your DEV system
Archive any data which you don't strictly need in your SAP S/4HANA system in order to minimize the check runtime and the data cleanup effort.

Custom Code Analysis

The Custom Code Migration process describes the tools and necessary activities that help you to migrate custom code. The process consists of preparatory analysis (Custom Code Analysis) and the adaptation of the custom code (Custom Code Adaptation) after the technical conversion.

Custom Code Process –


(reference from SAP-Press book “ADM328 SAP S/4HANA Conversion and SAP System Upgrade”)

Custom Code Evaluation

The ECC system contains a large number of custom development objects (Z objects, enhancements, and modifications) that are not used productively. Therefore, monitor the system for a longer period, do some housekeeping, and eliminate the code that is no longer used within your productive business applications.

For this purpose, use either the UPL (Usage Procedure Log) or the ABAP Call Monitor (SCMON) in the productive system to find out which custom ABAP objects are really used within the running business processes.

Tool – Solution Manager 7.2

SAP HANA Checks and SAP S/4HANA Checks

This is the most important step for custom ABAP code on the way to a system conversion to SAP S/4HANA.

◈ Native SQL of the predecessor database and other database-vendor-specific dependencies must be eliminated
◈ In some custom code implementations, SELECT statements without ORDER BY are used. This can lead to unexpected behavior when the database is changed (e.g. to SAP HANA), because without ORDER BY the results may be returned in a different order. Therefore, you need to check whether your SQL SELECTs without an ORDER BY statement are still correct (see the sketch after this list)
◈ Pool/cluster tables were removed with SAP HANA, so the database operations on these tables need to be removed from custom ABAP code
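A minimal sketch of the ordering issue, written as plain SQL for brevity (the table and column names are made up; in a real system these would be Open SQL SELECTs in your custom ABAP code):

-- Without ORDER BY the database is free to return rows in any order,
-- and that order may change after the move to SAP HANA
SELECT MATNR, WERKS, MENGE FROM ZSTOCK;

-- Adding an explicit ORDER BY makes the result order deterministic
SELECT MATNR, WERKS, MENGE FROM ZSTOCK ORDER BY MATNR, WERKS;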

Tools – ABAP Test Cockpit (ATC)

HANA Sizing

Right-size the SAP HANA hardware to realize the maximum benefit from the investment and reduce long-term total cost of ownership. The SAP sizing report will provide the estimated amount of RAM, disk size, etc.

The following workflow will help to size HANA hardware.


Conversion

The actual conversion is started with the SUM tool, which performs the execution part, including the database migration (DMO) and the SAP S/4HANA conversion.


(reference from SAP-Press book “ADM328 SAP S/4HANA Conversion and SAP System Upgrade”)

The SUM tool segregates activities into uptime processing and downtime processing.

Uptime processing is the migration of the shadow repository. It can be migrated during uptime because no changes are made any longer, since the repository was locked at the beginning of the SUM procedure (dev lock). The shadow repository is built up from the upgrade media (upgrade DVDs), the Maintenance Planner download, and additional files (files in the download directory, transport requests, add-ons, etc.).

Downtime processing is the migration of the application data, customizing data, and user master records. These can be migrated during downtime only, because they would be changed continuously during uptime. New standard customizing and new data delivered by SAP are imported.

The migration to the SAP HANA DB takes place partially in uptime (UT) processing and partially in downtime (DT) processing of SAPup.


(reference from SAP-Press book “ADM328 SAP S/4HANA Conversion and SAP System Upgrade”)

Overall Procedure at a Glance

The SUM tool has to be started on the primary application server. After some basic configuration settings, such as the stack.xml, SAPup starts to create the shadow system. It contains the basic tables and some customizing tables. The system is still running, and end users may work in the system and use functionality that may change application data in the database. In this phase, the system is running and available for end users (uptime processing), but the development environment is locked. Then the ramp-down phase starts, which includes locking users, cleaning up the queue, etc.

Then the technical downtime phase starts. It migrates data from the source database into the target database. After this, the application tables are updated to the target release. Then validation takes place, followed by a go/no-go decision. After a go decision from the business owner, the ramp-up phase starts, which includes reconnecting the landscape, unlocking users, etc., and finally the system is live on the target release and available for end users.

The system is now migrated to the target database and updated to the target release.

Post – Conversion

(reference from SAP-Press book “ADM328 SAP S/4HANA Conversion and SAP System Upgrade”)

Custom Code Adaptation

◈ Adapt any modifications and enhancements using the standard transactions SPDD, SPAU, and SPAU_ENH
◈ Fix any issues caused by the SAP S/4HANA simplifications

Functional Adaptation

Once the system conversion to SAP S/4HANA is completed, you need to carry out functional adaptation based on the ATC results.

Adjust modifications and enhancements – To adapt the modifications and enhancements, use the standard transactions SPDD, SPAU, SPAU_ENH

Fix SAP HANA and SAP S/4HANA findings – Adapt custom ABAP code, using the findings of the SAP HANA and SAP S/4HANA checks (ATC results). Findings of SAP S/4HANA checks are related to S/4HANA Simplifications. Each simplification requires a different approach to solve findings. Therefore, findings of SAP S/4HANA checks refer to a SAP Note which describes how you can solve them.

Tool – ABAP Test Cockpit (ATC)

Performance Tuning

Once the system conversion to SAP S/4HANA is completed, you need to look at which business processes can be optimized on the SAP HANA database, so that you can make use of the full power of SAP HANA regarding performance. Therefore, we need to look at which SQL statements can be optimized.

The ABAP SQL Monitor allows us to get performance data for all SQL statements executed in production and helps us understand which are the most expensive and most frequently executed SQL statements. The SQL Monitor also allows you to link the monitored SQL statements to the corresponding business processes, including various aggregation and drill-down options.

Tool – ABAP SQL Monitor

Cleaning up Obsolete Data after the Conversion

Delete obsolete data that may remain after the conversion of your SAP ERP system to SAP S/4HANA.

Cross Application Follow-on Activities

◈ Adapting Database Extension to SAP S/4HANA
◈ Adapting the User Interface
◈ Output Management

Application – Specific Follow-on Activities

◈ Finance
◈ SAP Credit Management
◈ Human Resources
◈ Sales – Order and Contract Management
◈ Retail
◈ Environment, Health and Safety
◈ Product Safety and Stewardship
◈ Integration
This blog post covers setting up SSL on the S/4HANA application server for a successful connection with SAC (SAP Analytics Cloud).

Background –
When we connect SAC (SAP Analytics Cloud) to an SAP S/4HANA system with a direct live connection, we need to establish a trusted connection.

Otherwise, an error like the following can be seen:


Setting Up SSL

Check CommonCryptoLib version

Log in to <Application Server Host> as <sid>adm

server: <sid>adm > cdexe

server: <sid>adm > pwd
/sapmnt/<SID>/exe/uc/linuxx86_64

server: <sid>adm > sapgenpse -l /sapmnt/<SID>/exe/uc/linuxx86_64/libsapcrypto.so

Using -l parameter to load CommonCryptoLib
   -l "/sapmnt/<SID>/exe/uc/linuxx86_64/libsapcrypto.so"

  Platform:   linux-gcc-4.3-x86-64   (linux-gcc-4.3-x86-64)

  Versions:   SAPGENPSE       8.5.28 (May  8 2019)
              CommonCryptoLib 8.5.28 (May  8 2019) [AES-NI,CLMUL,SSE3,SSSE3]
                Build change list: 238087

  USER="<sid>adm"

  Environment variable $SECUDIR is defined:
  "/usr/sap/<SID>/DVEBMGS00/sec"

Update SAP Crypto library

1. Download the latest crypto library from the SAP Marketplace:

SAP Download > Support Packages & Patches > By Category > SAP CRYPTOGRAPHIC SOFTWARE > SAPCRYPTOLIB > COMMONCRYPTOLIB 8 > <Select appropriate OS version> > Download latest SAR file

SAPCRYPTOLIBP_8528-20011697.SAR -- for Linux x86_64

2. Move the SAR file from the download basket to the application server

Use WinSCP to move it to the application server.

3. Extract the SAR file (logged in to the application server as <SID>adm):

SAPCAR -xvf SAPCRYPTOLIBP_8528-20011697.SAR

4. Move all extracted content to the kernel directory:

mv * /sapmnt/<SID>/exe/uc/linuxx86_64

Profile Parameters

Log in to <Application Server Host> as <sid>adm and remove the profile parameters below:

ssf/name

ssf/ssfapi_lib

sec/libsapsecu

ssl/ssl_lib

Define HTTPS parameter

Add the entry below to the instance profile (52$$ resolves to 52 followed by the two-digit instance number, e.g. 5200 for instance 00):

icm/server_port_1 = PROT=HTTPS,PORT=52$$,TIMEOUT=30,PROCTIMEOUT=60

and restart the application server

Generate Certificate

1. Go to transaction STRUST (/nSTRUST) and click on Edit.


2. Right click on SSL Server Standard and Select Create


3. Click on OK


4. Update entry as mentioned in the screenshot


5. Make sure Algorithm Overview as below –


6. Once you click on OK, you can see that the entry has been created.


7. Now create a Certificate Request by clicking the corresponding button


8. Select algorithm as SHA256


And click on OK

9. Download certificate locally.


10. Save to your local machine.

If I’m doing something for the first time and I know I will need to replicate it multiple times, I tend to take screenshots and notes for my future self. Since I got this question a couple of times in the last weeks, I’m publishing those notes.

This post is about setting up Smart Data Integration from an on-premise SAP HANA instance (HANA Express in my case) to my tenant of SAP Cloud Platform, SAP HANA Service on Cloud Foundry.


(as usual, check your license agreements…)

I used the FileAdapter there; here I will use the HanaAdapter, so I will only cover the differences.

Prereqs:

◈ You have an instance of SAP HANA Service in Cloud Foundry (the trial HDI containers will unfortunately not work… this makes me sad too…)
◈ You have access to the cockpit for your SAP HANA Service
◈ You have a HANA instance that can be accessed from the outside world
◈ You have whitelisted the IP address of your HANA express or on-premise instance
◈ You have admin rights to create users in both HANA instances

Download the Data Provisioning Agent

I got it from the Software Downloads but you can also get it from the SAP Development Tools. Just make sure you have version 2.3.5.2 or higher so you can connect through Web Sockets.

At the time of writing this blog post, the version of the SDI adapter in HXE’s download manager was not high enough, but that would have been the best option:

cd /usr/sap/HXE/home/bin
./HXEDownloadManager_linux.bin -X

You may already have it in your /Downloads folder in HXE too.

If your file ends with extension “.SAR” or “.CAR”, you will need to use sapcar to inflate the file first. Else, good old tar -xvzf on your HXE machine command will do.

Check the ports

The DP Agent will ask for two ports to listen on. If you are running in the cloud like me, make sure you open those ports. The defaults are 5050 and 5051.

Install the Data Provisioning Agent

My on-premise SAP HANA, express edition instance lives on Google Cloud, so I uploaded the inflated dpagent file first and these commands did the trick:

sudo su - hxeadm
cd /extracted_folder/HANA_DP_AGENT_20_LIN_X86_64
./hdbinst


I hit enter on everything as I wanted the default directory and ports, but of course you can change these.

Note the installation path is /usr/sap/dataprovagent .

Configure the Data Provisioning Agent

Set the environment variables first and navigate into the directory where you installed DPAgent:

export DPA_INSTANCE=/usr/sap/dataprovagent
cd $DPA_INSTANCE
cd bin


And start the configuration of the Data Provisioning Agent

 ./agentcli.sh --configAgent

Go with option 2 first, Start or Stop Agent:


Don’t forget! You will need to have whitelisted the public IP address of your source machine for access to your HANA service.

Go back to the main menu, choose 5. SAP HANA Connection to log in to your SAP HANA Service:


You will need your instance ID and the websocket endpoint. You get these from the Cockpit of your database instance:


Once you have connected successfully, register the agent with option 7. Agent Registration


You should now see the agent in the Database Explorer in HANA as a Service


Finally, register the HanaAdapter with 8. Adapter Registration:


Create a user in the source database

You will need a user with the right access in the source database (HANA Express in my case). Something like:

create user remousr password P455w0Rd NO FORCE_FIRST_PASSWORD_CHANGE;
grant select, trigger on schema FOOD to REMOUSR;


I also want to replicate data from HDI containers, so I open the SQL console as Admin:


And run something like:

set schema "ANALYTICS_1#DI";
CREATE LOCAL TEMPORARY TABLE #PRIVILEGES LIKE _SYS_DI.TT_SCHEMA_PRIVILEGES;
INSERT INTO #PRIVILEGES (PRIVILEGE_NAME, PRINCIPAL_SCHEMA_NAME,  PRINCIPAL_NAME) values ('SELECT','','REMOUSR');
CALL "ANALYTICS_1#DI".GRANT_CONTAINER_SCHEMA_PRIVILEGES(#PRIVILEGES, _SYS_DI.T_NO_PARAMETERS, ?, ?, ?);
DROP TABLE #PRIVILEGES;


Some notes:

◈ ANALYTICS_1#DI: Comes from doing right-click on a table in the HDI container, choosing “Generate SELECT statement” and adding “#DI” to the schema.
◈ You can also grant roles, which is preferable; a sketch follows below.
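A sketch of the role-based variant, assuming the container API offers a GRANT_CONTAINER_SCHEMA_ROLES procedure with a matching _SYS_DI.TT_SCHEMA_ROLES table type (the role name is made up; verify the procedure and table type in your container before using this):

set schema "ANALYTICS_1#DI";
CREATE LOCAL TEMPORARY TABLE #ROLES LIKE _SYS_DI.TT_SCHEMA_ROLES;
INSERT INTO #ROLES (ROLE_NAME, PRINCIPAL_SCHEMA_NAME, PRINCIPAL_NAME) VALUES ('MyReadRole', '', 'REMOUSR');
CALL "ANALYTICS_1#DI".GRANT_CONTAINER_SCHEMA_ROLES(#ROLES, _SYS_DI.T_NO_PARAMETERS, ?, ?, ?);
DROP TABLE #ROLES;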

Create a remote source

Finally, on the database explorer connected to you SAP HANA Service on Cloud Foundry, right-click on Remote Sources –> Add Remote Source:


You will see the agent and adapter you registered before.

Enter the host of the source machine (in my case, the public static IP address of my HXE machine).

Enter the port number on which the indexserver of your source tenant database is listening (39015 for the default first tenant on HXE, 39013 for the SYSTEMDB, which you should not be using anyway).

If you don’t know the port for your database, querying the table M_SERVICES will tell you the SQL_PORT of your indexserver.
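For example, something like this on the source tenant will show it (SYS.M_SERVICES is a standard monitoring view):

SELECT SERVICE_NAME, SQL_PORT FROM SYS.M_SERVICES WHERE SERVICE_NAME = 'indexserver';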

Set the credentials to Technical User and enter the user and password for the user you created in your source database:


And you are ready to roll!


You can now consume the remote source you have just created in Replication Tasks, Flowgraphs, and Virtual Tables from an HDI container in Web IDE.
In this blog, I will show how to evaluate the business value from a machine learning model built in SAP HANA using R.

Combining SAP HANA & R allows the data scientist or advanced business analyst to leverage the power of SAP HANA and at the same time get the vast functionality from the use of R packages. The main benefit of using SAP HANA in combination with R is that it allows you to utilize the power of SAP HANA, both in terms of scalability and performance, through R scripting. In a nutshell, there is no movement of training data from SAP HANA to the R server/client, and the training of machine learning models is performed in exactly the same place where the data resides. In short: algorithms and data are on the same in-memory SAP HANA server.

In this blog, I will leverage an R package called modelplotr that provides easy debriefing of your machine learning models. Including a debriefing method that helps determine the actual business value that can be expected if the project is implemented and deployed.

During this blog, you will see how to build the visualization below. This visualization provides you with the expected business value ($83.750) if this machine learning project is built and put into production.


From my own experience as an SAP Data Science consultant, I am often faced with the question of how to show the actual business value that machine learning is bringing to a company. Using an example, I will show how to evaluate the built machine learning model from a business value perspective. This approach can be used to determine the business value and as such act as a communication device when prioritizing projects within a company.

Moreover, the approach can also calculate the actual benefit of machine learning and, specifically, the optimum threshold in a machine learning model – for instance, the optimum number of potential customers to offer a product to, to target for churn prevention, and so on.

Firstly I want to showcase how to build a simple machine learning model in SAP HANA – but using the data science language R. SAP HANA comes with a wide set of machine learning algorithms to deal with regression, classification, forecasting, etc. This library is called PAL – short for Predictive Analysis Library (see more in the references below). Some of the algorithms in PAL can now be accessed through the use of R. In the following, I am using Rstudio as my preferred graphical user interface to R.

1. Machine learning with SAP HANA – all Interaction performed directly from R

The use case presented in this blog is a rather simple dataset. I have chosen this dataset so that you potentially can re-do the steps I am showing in your own environment. The data science challenge is a classic supervised binary classification. Given a number of features all with certain characteristics, our goal is to build a machine learning model to identify customers interested in buying a financial service (term deposit). A term deposit is a cash investment held at a financial institution. Your money is invested for an agreed rate of interest over a fixed amount of time or term. Term deposits can be invested into a bank, building society or credit union.

This dataset is a subset of the dataset made available by the University of California, Irvine. The complete dataset is available here: https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank-additional.zip.

Let’s look at the data:

Data description
y: has the client subscribed a term deposit?
Duration: last contact duration, in seconds (numeric)
Campaign: number of contacts performed during this campaign and for this client
Pdays: number of days that passed by after the client was last contacted from a previous campaign
Previous: number of contacts performed before this campaign and for this client (numeric)
Euribor3m: euribor 3 month rate (The Euro Interbank Offered Rate (Euribor)).

First I connect to my SAP HANA Server from R:


Exploring the data (see appendix for an explanation of how I got the data into SAP HANA from R):


2. Realistic business problem solved with machine learning.

Creating and training many machine learning models from scratch. As shown below, I don't just create one machine learning model but many. The purpose is to evaluate the models and choose the one that outperforms the other algorithms on this specific dataset.


Evaluating the trained models. Here I am using the score function in SAP HANA. The result from each trained model is stored in a comparison table:


As shown above, the model accuracy for logistic regression outperforms the decision trees, support vector machine and random forest (ensembles of trees). We are most interested in the model that performs best on the test data – to avoid overfitting (models could potentially “remember” all points in a training dataset).

3. Illustrating the actual business value.

In my previous blog, I showed how to build the Receiver Operating Characteristic (ROC) curve, which is used to assess the accuracy of a continuous measurement for predicting a 2-class outcome (binary classification). In marketing, sales, medicine, etc., ROC curves have a long history of use for evaluating the usability of machine learning models. In a nutshell, the ROC curves help identify whether the built machine learning model performs better than random selection; they can also help identify the optimum threshold of true positives and false negatives. This can be very valuable in use cases where, for instance, there is a cost associated with contacting customers.

However, explaining the sensitivity and specificity of ROC curves in a business meeting can be challenging. This is where the gains chart can be used to simplify things. The gains chart is, in short, a representation where on the X-axis we have the percentage of the customer base we want to target with the campaign, and the Y-axis tells us what percentage of all positively responding customers is found in the targeted sample.

Now that we have a model, it is time to evaluate the business potential with a visualization of the cumulative gains; this is where we start using the R package modelplotr with the predicted results from the SAP HANA-trained model.

Plotting the Cumulative gains curve:

As shown, modelplotr has a function named plotting_scope. If you are interested in how I build the “scores_and_ntiles” dataframe, check the appendix below.

Plotting the lift curve:
After having prepared the plot_input data it is really easy to plot the additional curves:

plot_cumlift(data = plot_input, highlight_ntile = 20)


Plotting the profit curve:

Finally, we come to my favorite plot and one of the main reasons for this blog. It works just like the other plotting functions; however, you can declare the fixed cost of initiating the project, the variable cost of contacting a customer, and the profit per customer that accepts the offer.


The variables I entered are solely based on my own imagination and for illustrative purposes only.


Perfect, we have now built a number of machine learning models and chosen the model with the highest accuracy on the test data (never seen by the algorithm). Furthermore, we have used the built model to predict the probabilities of a customer being interested in the product we are offering (term.deposit) and have even shown what the expected business value could be if we were to implement the solution.

Lastly, we have shown that if we contact the 17% of customers that have the highest probability of buying, we will earn $83.750.

That concludes the blog. Thanks for reading.

Appendix 1 – Preparing data for the package modelplotr:

The modelplotr requires a dataframe with this exact shape:

column type definition
model_label Factor Name of the model object
dataset_label Factor Datasets to include in the plot as factor levels
y_true Factor Target with actual values
prob_[tv1] Decimal Probability according to model for target value 1
prob_[tv2] Decimal Probability according to model for target value 2
… … …
prob_[tvn] Decimal Probability according to model for target value n
ntl_[tv1] Integer Ntile based on probability according to model for target value 1
ntl_[tv2] Integer Ntile based on probability according to model for target value 2
… … …
ntl_[tvn] Integer Ntile based on probability according to model for target value n

First step to build this dataframe is to execute the SAP HANA trained Model with train and test data.


This produces two small tables with an ID, Predicted class and the probability of the prediction.

The second step is to build two dataframes that contain the probability of class = term.deposit. Remember that the gains chart and ROC curves need the probability of the correct class.


Final step is to stitch the data from the two previous steps:


This dataframe can now be consumed in the modelplotr functions.

plot_input <- plotting_scope(prepared_input = scores_and_ntiles,
select_targetclass = "term.deposit", scope = "compare_datasets")
plot_cumgains(data = plot_input)

Appendix 2 – loading data from R into SAP HANA:

If you would like to reproduce the example that I have shared in this blog, here is how you can load the data shown into SAP HANA in a very easy way. This code could also be modified to fit your own data. You can either go with the sample dataset or go to the link with the full dataset https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank-additional.zip

The approach with the sample dataset is shown in the following:


Creating the tables in SAP HANA:
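The DDL itself is in the screenshot, which is not reproduced here. A minimal sketch based on the columns described earlier might look like the following; the table name and data types are assumptions, and the blog actually creates two such tables (training and test data):

CREATE COLUMN TABLE BANK_MARKETING_TRAIN (
  ID        INTEGER,
  Y         NVARCHAR(20),   -- has the client subscribed a term deposit?
  DURATION  INTEGER,
  CAMPAIGN  INTEGER,
  PDAYS     INTEGER,
  PREVIOUS  INTEGER,
  EURIBOR3M DOUBLE
);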


Filling in the data and checking the results:


All done.

You have now created two new tables in SAP HANA and inserted the bank term deposit data.
Let’s first go over an example of a HANA DB role request and then show that role provisioned in HANA DB through GRC Access Control.  First screenshot below is an access request for HANA DB role ABAP_ADMIN. Once the request is submitted and goes through the approvals, the role gets assigned to user ID.  The second screenshot shows the assignment in HANA DB.



In this blog, I will cover the configuration steps to connect GRC Access Control to the HANA DB. I am outlining only the steps that are required for the HANA DB connection. There are common configuration steps to set up Access Request Management (ARM), such as config parameters, request type definitions, EUP definitions, provisioning settings, data source setup, etc., that are not in scope for this blog. You also need to have the MSMP workflow set up specific to your requirements for approval routing(s).

Both repository and catalog HANA DB role types along with ‘Analytic privileges’ can be provisioned through GRC Access Control.
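For background, the two role types are assigned differently at the SQL level. Conceptually, the assignment that GRC performs for you corresponds to statements like the following (the user and role names are illustrative only):

-- Catalog role: granted directly
GRANT SOME_CATALOG_ROLE TO JSMITH;

-- Repository (design-time) role: granted via the _SYS_REPO procedure
CALL "_SYS_REPO"."GRANT_ACTIVATED_ROLE"('com.acme.roles::Reporting', 'JSMITH');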

Now, let me cover the primary HANA DB-specific configuration steps. The first two are Basis tasks and are usually performed by a Basis consultant; the rest are performed by a GRC consultant.

1. The first step is to download the SAP GRC plug-in for HANA DB from the SAP Marketplace.
2. Then you will need to deploy the delivery unit with the content for the SAP HANA plug-in for GRC integration with HANA by using HANA Studio:

SAP note 1969912 provides the steps on how to deploy the delivery unit and then install the HANA Integration API into the system catalog.

https://launchpad.support.sap.com/#/notes/1869912

Note: This step is to be performed in HANA DB that is to be connected to GRC and where access will be provisioned.

3. Next, configure the SAP GRC Access Control system for HANA integration. The steps below need to be performed in the HANA DB:

◈ Create a user GRC_DBCO_PI. This user must have the role sap.grc.pi.ac.roles::SAP_GRC_PI_ADMIN. Then go ahead and deploy (activate) this user.
◈ Now log in with this new user and change the password; this sets a permanent password for the user.
◈ User GRC_DBCO_PI will be used in the next step to create the DBCO connection.

4. Create a HANA connector using transaction DBCO and configure it to connect to the HANA DB. Below are the steps; all of them are performed in the GRC system:

◈ Create a HANA connector using DBCO


◈ Create a logical connector. Note that this logical connector must have the same name as the DBCO connection created in the step above. You can use either transaction SM59 or the SPRO path SPRO > SAP Customizing Implementation Guide > Governance, Risk, and Compliance > Common Component Settings > Integration Framework > Create Connector



◈ Now, you need to integrate HANA connector to GRC using Integration Framework as shown below. Please follow SPRO path SPRO > Governance, Risk, and Compliance > Common Component Settings > Integration Framework > Maintain Connectors and Connection Types. Note that Connection type is ‘HDB’ for HANA DB.



◈ This connector must be assigned to the Integration Scenario ‘PROV’ through the SPRO path SPRO > Governance, Risk, and Compliance > Common Component Settings > Maintain Connection Settings. Please also assign it to other Integration Scenarios as necessary, such as AUTH, ROLMG, SUPMG.



5. Import Roles and assign approvers

The last step in this process is to import the HANA DB roles and analytic privileges into GRC Access Control. This is necessary to make the roles available to select and request in an access request. The role import steps are like those for ABAP roles, provided that you have selected the correct application type, landscape, and source system, and they need to be specific to the HANA DB. The role type for analytic privileges is ‘HAP’ and it is ‘SIN’ for roles. When importing an analytic privilege, you need to use the File on Desktop or File on Server option for ‘Import Source’ and change the role type to ‘HAP’ in the attribute file before importing it.



With the successful completion of the above steps, you have connected GRC Access Control to the HANA DB for access provisioning, and the roles and analytic privileges are available to request.
When batch-loading data from a source system into your Data Mart or Data Warehouse, you usually look for ways to reduce the to-be extracted data set to minimize extraction time and resource consumption. Ideally, there is a way to only extract changed records, for example, using a change date column. However, this does not always fly. For example, change date columns are not always reliable or deletions are not always tracked in the source table. An alternative would be to switch from batch to log or trigger replication on the source, but you don’t always have this luxury. So, now you have to load the entire source data set, but still you want to identify the change data, to only process delta’s in downstream data flows. How can you identify such change data using batch loading?

In this blog, I will explain how you can identify the delta data set using Smart Data Integration (SDI) flowgraphs inside the SAP Web IDE for HANA. A flowgraph writes the detected changes to a so-called changelog table; only this changelog table is filled here. Once you have the change data, it’s up to you how to use it in downstream data flows.

For this blog, version SPS04 Patch 0 of the Web IDE for HANA was used. In earlier or later versions, or in the Web IDE Full Stack for CF, UI or behaviour might differ from what is described here.

Use case description
The following use case will be used, which was recently described to me by a customer:
◈ The source table is
     ◈ in an external database, connected to the SAP HANA SQL DW via SDI.
     ◈ rather large, but there is no field that tells us if data has been changed or deleted.
◈ The comparison table (the table that the source table needs to be compared with)
     ◈ tracks the history of all changes using validity timestamps
     ◈ has different column names than the source
     ◈ is populated by a tool that expects only changed records as input

You might wonder whether the Native Data Store Object (NDSO) would be a candidate to solve this issue; after all, the NDSO provides a delta mechanism out of the box. However, the three characteristics of the comparison table listed above rule that option out, so we solve this use case with SDI flowgraph transforms, which take more time to build but allow for more flexibility.

Source and comparison table
The source table is defined as follows:


Figure 1: Source table

The comparison table is defined as follows. As you can see, column “ValidityFrom” is part of the key and, together with the “ValidityTo” column, serves to track the history of the data set. Also, there is a field “SalesRename”, which should be compared with field “Sales” of the source table.


Figure 2: Comparison table

Creating the changelog table and the flowgraph

First, we need to define a changelog table that stores the change data. Initially, only a table is needed that identifies the changed data for each individual run, so we do not need to store changelog history.

The changelog table is defined as follows. It has the same structure as the source, except for the renamed field and two added fields: ChangeType (I for Insert, U for Update, D for Delete) and ChangeTime (a timestamp of when the flowgraph detected the change). Note that the flowgraph expects such fields if you want to construct a changelog, although you are free to change the ChangeTime column to a date format, or leave it out completely and stick with only the ChangeType column. I prefer to have the ChangeTime column so that I know for sure which results I am looking at in the changelog table. (A DDL sketch of this table follows Figure 3.)


Figure 3: The changelog table that should contain “Change Data”
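
As a reference, the structure can be sketched in plain SQL as shown below. In the Web IDE the table would be defined as a design-time artifact (for example an .hdbtable or .hdbcds file); also note that only SalesOrderID, SalesRename, ChangeType, and ChangeTime are taken from the text, while the table name and the remaining column names and data types are illustrative assumptions.

-- Sketch only: changelog table mirroring the (renamed) source structure,
-- plus the ChangeType and ChangeTime columns expected by the flowgraph.
CREATE COLUMN TABLE "deltas_for_fulls::SalesOrderHeader_Changelog" (
    "SalesOrderID" INTEGER,
    "OrderDate"    NVARCHAR(20),   -- assumed name and type
    "CustomerName" NVARCHAR(100),  -- assumed name and type
    "ProductName"  NVARCHAR(100),  -- assumed name and type
    "Quantity"     NVARCHAR(10),   -- assumed name and type
    "SalesRename"  NVARCHAR(20),
    "Country"      NVARCHAR(60),   -- assumed name and type
    "ChangeType"   NVARCHAR(1),    -- I, U or D
    "ChangeTime"   TIMESTAMP
);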

The flowgraph needs four nodes, as you can see in the below illustration.


Figure 4: Main view of the created flowgraph

The first node is the data source, which does not need further explanation. The second node renames the incoming fields to match the table we want to compare with. Since our source has a field “Sales” and the comparison table has a field “SalesRename”, the “Sales” field is renamed to “SalesRename”. Please note that any renames must be done at this point: you cannot rename fields in the Table Comparison node itself, and that node expects the field names of the source and the comparison table to match (a SQL sketch of the rename follows Figure 5).


Figure 5: Rename of a field in the Projection Node
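
Expressed in SQL, the projection essentially performs the rename shown below. Only the two fields mentioned in the text are listed; the remaining columns are mapped one-to-one and omitted here for brevity, and case-sensitive column names are assumed.

-- Sketch only: the rename performed by the projection node
SELECT "SalesOrderID",
       "Sales" AS "SalesRename"  -- rename so the field matches the comparison table
  FROM "deltas_for_fulls::SalesOrderHeader";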

In the Table Comparison node, on the source screen, first select the table you want to compare with. Once you do this, the fields of that table are displayed to the right (this serves no purpose other than to inform you of that table’s definition). The “Generated Key Attribute” is left empty; this option is only needed if you expect the comparison table to have duplicate entries on the compare criteria and it has a surrogate key. In that case, the record with the largest surrogate key is chosen for the comparison.

Since the comparison table tracks the history of the data set, the comparison should only take place against records that are “current”. In the example case, these are the records whose ValidityTo timestamp equals ‘9999-12-31 23:59:59’, which is expressed as a filter on the comparison table (see the sketch after Figure 6). Also, since records can be deleted from the source and these deletions should be tracked, the “Deleted rows detection” checkbox is checked. Note that this option slows down the operation, so only use it if necessary.


Figure 6: Comparison table settings in the Table Comparison node
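
A quick way to see which records the node will compare against is to query the current versions directly; the filter value below is the one mentioned in the text.

-- Sketch only: the "current" records of the comparison table
SELECT *
  FROM "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget"
 WHERE "ValidityTo" = '9999-12-31 23:59:59';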

In the attributes section, no fields are shown by default, which means that the comparison will take place on all fields of the source table. If you do not want to compare all of those fields, you can add only the ones needed for the comparison. The fewer fields to compare, the faster the operation will be.

The Primary Key checkbox represents the fields on which the comparison is based. In the example use case, this is SalesOrderID, so the default selection is correct. (A SQL sketch of the comparison logic follows Figure 7.)


Figure 7: Source table settings in the Table Comparison node
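
To make the behaviour of the Table Comparison node more tangible, the following SQL sketch shows conceptually how the three change types are derived in this example. It is an illustration only, not what the flowgraph generates: it compares on the key SalesOrderID, checks only the Sales/SalesRename attribute for brevity, and assumes case-sensitive column names as shown in the figures.

-- Sketch only: conceptual derivation of the change types
-- Inserts: source keys that have no current row in the comparison table
SELECT s."SalesOrderID", 'I' AS "ChangeType"
  FROM "deltas_for_fulls::SalesOrderHeader" s
 WHERE NOT EXISTS (
        SELECT 1 FROM "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget" c
         WHERE c."SalesOrderID" = s."SalesOrderID"
           AND c."ValidityTo" = '9999-12-31 23:59:59')
UNION ALL
-- Deletes: current comparison keys that no longer exist in the source
SELECT c."SalesOrderID", 'D'
  FROM "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget" c
 WHERE c."ValidityTo" = '9999-12-31 23:59:59'
   AND NOT EXISTS (
        SELECT 1 FROM "deltas_for_fulls::SalesOrderHeader" s
         WHERE s."SalesOrderID" = c."SalesOrderID")
UNION ALL
-- Updates: matching keys where a compared attribute differs
SELECT s."SalesOrderID", 'U'
  FROM "deltas_for_fulls::SalesOrderHeader" s
  JOIN "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget" c
    ON c."SalesOrderID" = s."SalesOrderID"
   AND c."ValidityTo" = '9999-12-31 23:59:59'
 WHERE s."Sales" <> c."SalesRename";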

In the target node, the mappings for the ChangeType and ChangeTime columns are left empty; they are configured in the Settings panel.


Figure 8: Changelog table mapping in the Target node

In the Settings panel, the “Truncate Table” checkbox is checked, because the changelog should only store the records of a single flowgraph execution. The Writer Type should be set to “Insert”, because updates and deletes should be stored as individual records, not executed as actual updates or deletes on the changelog table.

The Change Type Attribute and Change Time Attribute are mapped to the ChangeType and ChangeTime columns, respectively.

In this example, we are not keeping the full changelog history. If you want to, make sure the changelog table contains a surrogate key column and map it to the “Key Generation Attribute” field. Also create a sequence using an hdbsequence file and map it in the “Key Generation Sequence” field, as sketched below. Don’t forget to uncheck the “Truncate Table” checkbox.
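
For example, such a sequence could be defined in a design-time file like ChangelogKey.hdbsequence. The file name and sequence name below are made up for illustration, and the exact syntax may differ between HDI versions, so check the documentation for your release.

SEQUENCE "deltas_for_fulls::ChangelogKey" INCREMENT BY 1 START WITH 1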


Figure 9: Writer settings in the Target node

Test the flowgraph – example input and output

To establish a starting point for testing the flowgraph, records are inserted into the source table as well as the comparison table. In the comparison table, one business record has two versions, of which only one is current.

TRUNCATE TABLE "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget";
TRUNCATE TABLE "deltas_for_fulls::SalesOrderHeader";
INSERT INTO "deltas_for_fulls::SalesOrderHeader" VALUES (1,'13.10.2010','Muhammed MacIntyre','Eldon Simple','6','261.54','United States of America');
INSERT INTO "deltas_for_fulls::SalesOrderHeader" VALUES (2,'01.10.2012','Barry French','Cardinal Regular','2','199.99','Germany');
INSERT INTO "deltas_for_fulls::SalesOrderHeader" VALUES (3,'21.06.2019','Zeph Skater','SomeBoard','30','300.00','Spain');
INSERT INTO "deltas_for_fulls::SalesOrderHeader" VALUES (4,'10.07.2011','Clay Rozendal','R380','30','4965.76','Luxembourg');

INSERT INTO "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget" VALUES (1,'12.06.2019 00:00:00','31.12.9999 23:59:59','13.05.2019','Muhammed MacIntyre','Eldon Base for stackable storage shelf platinum','6','261.54','United States of America');
INSERT INTO "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget" VALUES (2,'12.06.2019 00:00:00','31.12.9999 23:59:59','14.05.2019','Barry French','Cardinal Slant-D Ring Binder, Heavy Gauge Vinyl','27','244.57','Brazil');
INSERT INTO "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget" VALUES (3,'01.05.2018 00:00:00','11.06.2019 23:59:59','11.06.2019','Zeph Skater','Shortboard','40','400.00','Colombia');
INSERT INTO "deltas_for_fulls::SalesOrderHeader_TableCompare_HistoryTarget" VALUES (3,'12.06.2019 00:00:00','31.12.9999 23:59:59','11.06.2019','Zeph Skater','Longboard','30','300.00','Colombia');

The source table now has the following content:


Figure 10: Source table contents for test run

The comparison table now has the following content, and comparing it with the source, you should see that the difference consists of one insert, two updates, and one deletion.


Figure 11: Comparison table contents for test run

In the results after running the flowgraph, you can see the captured change data.


Figure 12: Changelog table contents as result of test run
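
After the run, you can also verify the result directly on the changelog table, for example with a query like the one below. The table name is the illustrative one from the earlier DDL sketch; substitute your own.

-- Sketch only: count the captured changes per change type
SELECT "ChangeType", COUNT(*) AS "Rows"
  FROM "deltas_for_fulls::SalesOrderHeader_Changelog"
 GROUP BY "ChangeType";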
