SAP HANA Fast-Restart
Published Dec 11 2019 09:39 AM 14.7K Views
Microsoft

Objective

In recent years, SAP HANA databases have grown rapidly, which dramatically lengthens the time needed to load massive amounts of data into server memory whenever the SAP HANA services or the server itself must be restarted. Persistent memory (PMEM) reduces the data load time on servers that support the new memory technology. For those who have not yet adopted PMEM, SAP HANA Fast Restart (FR) is a software-based compromise that avoids the data load time when the HANA services are the only component that needs restarting.

 

Starting with HANA 2.0 SPS 04, SAP introduced the FR feature, which preserves memory content in temporary file systems (tmpfs) that are backed directly by Linux memory. A Linux tmpfs behaves like a RAM drive: its content disappears if the server is rebooted. If the host server (or VM) stays online while the SAP HANA services restart and recover, the services can reattach to these memory segments, which still hold the content from before the services were recycled, and thereby skip the reloading of HANA data entirely.

Test environment
We tested the shutdown and startup behavior of a 3+ TB SAP HANA database on an Azure Mv2 VM. The VM is the M208ms-v2 SKU with 208 vCPUs and 5700 GiB of RAM. It runs SLES 12 SP4 with SAP BW/4HANA on HANA 2.0 SPS 4 rev 40. The data set is the collection used for the BW/4 benchmarks. Its size on disk and its memory footprint are shown below.

 

1.database_footprint.png

HANA Startup Process overview

Describing the HANA startup phases helps explain the markers used to identify HANA's state of readiness during the boot process. Below is a synopsis of the HANA startup process; this blog provides more details.

  1. Opening the Volumes – services with persistence validate the filesystem mounts and open the volumes
  2. Loading and Initializing Persistence Structures – initialize the persistence structures and load statistics
  3. Loading or Reattaching the Row Store – this time is saved by the hdbrsutil OS processes, which preserve process memory across HANA restarts (not VM reboots)
  4. Garbage-Collecting Versions – the garbage collector cleans up all versions except the most recent one for every column store table
  5. Replaying the Logs – replay the redo logs for both row and column stores. This requires the row store to be in memory, while the required column tables can be loaded on the fly
  6. Transaction Management – all services of a database synchronize with each other to ensure transactional consistency
  7. Savepoint – all changes performed in steps 3 – 5 are now persisted to the DATA volumes by a savepoint
  8. Checking the Row Store Consistency – a row store consistency check is performed as the last step during startup. This step can be configured to be skipped, saving 10 minutes or so, if one is certain of row store consistency at run time
  9. Open SQL Port – as soon as the SQL port is open, the application can access the HANA DB

The opening of the SQL port is the key timestamp that marks HANA service availability, while the rest of the column store data continues loading. If an incoming query requests data from a column table that is not yet in memory, the table is loaded on demand, so query performance may be suboptimal compared to normal HANA operation. The column table load duration varies depending on the data size and the storage system performance.

SAP HANA DB trace

To capture SAP HANA DB startup and shutdown activities, the database traces need to be configured at the ‘debug’ level so that all actions are available in the various trace files for investigation. To learn about the available trace levels, please see this SAP help article.

Test read-outs

The SAP HANA FR option uses memory-mapped storage structures in the file system to preserve and reuse MAIN data fragments, which speeds up SAP HANA restarts. It is effective only in cases where the operating system is not restarted. For FR setup details, see SAP help.
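As a rough illustration of what the setup involves, the sketch below prints the shell commands and the global.ini entry for one tmpfs per NUMA node. It reflects our reading of the SAP documentation: a tmpfs mounted per node with the `mpol=prefer:<node>` option, and the mount points listed in `basepath_persistent_memory_volumes` under `[persistence]`. The `/hana/tmpfs<N>/<SID>` layout, the helper names, and the node count in the usage example are assumptions; verify everything against SAP help before use.

```python
from typing import List


def tmpfs_path(sid: str, node: int, base: str = "/hana") -> str:
    """Mount point for one NUMA node (the directory layout is an assumption)."""
    return f"{base}/tmpfs{node}/{sid}"


def mount_commands(sid: str, numa_nodes: int) -> List[str]:
    """Shell commands that create and mount one tmpfs per NUMA node."""
    commands = []
    for node in range(numa_nodes):
        path = tmpfs_path(sid, node)
        commands.append(f"mkdir -p {path}")
        # mpol=prefer:<node> asks the kernel to prefer that NUMA node for pages.
        commands.append(
            f"mount tmpfs{sid}{node} -t tmpfs -o mpol=prefer:{node} {path}"
        )
    return commands


def global_ini_entry(sid: str, numa_nodes: int) -> str:
    """global.ini fragment that points HANA at the tmpfs mount points."""
    paths = ";".join(tmpfs_path(sid, n) for n in range(numa_nodes))
    return f"[persistence]\nbasepath_persistent_memory_volumes = {paths}"


if __name__ == "__main__":
    # B32 is the test database SID; 4 NUMA nodes is an assumed value.
    print("\n".join(mount_commands("B32", 4)))
    print(global_ini_entry("B32", 4))
```

The script only prints text; the commands still have to be run as root, and the mounts must be recreated after every OS reboot because tmpfs content does not survive one.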

Before configuring FR, we recorded the SAP HANA DB startup timing as a baseline. We then observed the same metrics on another restart after enabling FR for comparison. This report includes the timestamps of significant events as the basis for the test conclusion.

Non-FR Startup load time

The following records were extracted from the SYSTEMDB trace file nameserver_<hostname>.30001.000.trc. The first record of the trace file registers the start of the name server:

[57074]{-1}[-1/-1] 2019-08-27 22:25:48.228382 i Basis TraceStream.cpp(00708) : ==== Starting hdbnameserver, version 2.00.041.00.1560320256 (fa/hana2sp04), build linuxx86_64 b178e03892acbdd031bc1a7824a3cd17c7db3ae3 2019-06-12 08:26:49 ld4550 gcc (SAP release 20181205, based on SUSE gcc7-7.3.1+r258812-2.15) 7.3.1 20180323 [gcc-7-branch revision 258812]

[57213]{-1}[-1/-1] 2019-08-27 22:25:59.489436 i Service_Startup tcp_listener_callback.cc(00074) : start the SQL listening port: 30013 with backlog size 128

Observation: It took 11 seconds from the start of the name server until the SQL port opened.

 

Right after this, the xsengine starts, and the indexserver begins to unload and then load tables. The load period runs from the first to the last entry of the indexserver load trace file:

  • The beginning of the indexserver load trace file: 0;3;2019-08-27T23:33:16.521000+00:00;SAPBHB;0;ESENDCONTROLT;8916;0;0;en;;3;300;0;transaction_id=30;
  • The last statement of the trace file: 0;3;2019-08-28T01:41:10.482070+00:00;SAPBHB;0;ENHSPOTCOMPSPOT;6973;0;0;en;$trexexternalkey$;3;3;0;statement_id=1290621387667374, statement_hash=6a188027dffefeee5a0fafa8b24552db, transaction_id=240, statement_execution_id=1125912791771611, connection_id=300496, db_user=MUELLERCARS, application_name=ABAP:BHB, app_user=SAPBHB;

Observation: It took 2 hours and 8 minutes to complete loading the column store.
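The two baseline durations can be reproduced from the timestamps quoted above. The following is a minimal sketch: the helper names are ours, and it assumes the name server trace uses the `YYYY-MM-DD HH:MM:SS.ffffff` layout and the load trace uses ISO 8601 timestamps, as in the records shown.

```python
from datetime import datetime, timedelta


def trc_time(stamp: str) -> datetime:
    """Parse a timestamp as it appears in the nameserver trace file."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


def load_trace_time(stamp: str) -> datetime:
    """Parse the ISO 8601 timestamp field of a load trace record."""
    return datetime.fromisoformat(stamp)


def elapsed(start: datetime, end: datetime) -> timedelta:
    """Time between two events taken from the same trace file."""
    return end - start


# Name server start until the SQL listening port is opened.
sql_port_delay = elapsed(
    trc_time("2019-08-27 22:25:48.228382"),
    trc_time("2019-08-27 22:25:59.489436"),
)

# First until last record of the indexserver load trace.
column_load = elapsed(
    load_trace_time("2019-08-27T23:33:16.521000+00:00"),
    load_trace_time("2019-08-28T01:41:10.482070+00:00"),
)

if __name__ == "__main__":
    print(f"SQL port opened after {sql_port_delay.total_seconds():.1f} s")
    print(f"Column store load window: {column_load}")
```

The same two helpers can be pointed at the trace records of any later restart to compare it against this baseline.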

FR enabled startup load time

Name server trace

[131307]{-1}[-1/-1] 2019-08-29 04:15:10.304153 i Basis   TraceStream.cpp(00708) : ==== Starting hdbnameserver,

[131340]{-1}[-1/-1] 2019-08-29 04:15:16.590652 i Service_Startup tcp_listener_callback.cc(00074) : start the SQL listening port: 30013 with backlog size 128   

Observation: It took 6 seconds from the start of the name server until the SQL port opened.

Index server trace

First entry of the load trace file

0;3;2019-08-30T18:19:32.760000+00:00;SAPBHB;0;0BW:BIA:BI0_0C00014222;450069;0;0;en;;3;79;0;statement_id=1291262830569130, statement_hash=5a9012b2349c8e356c328bd696bbe9e9, transaction_id=109, statement_execution_id=1125912791746560, connection_id=300645, db_user=_SYS_STATISTICS, application_name=Embedded Statistics Server;

Last entry of the load trace file

0;3;2019-08-30T18:19:36.753000+00:00;SAPBHB;0;/IWBEP/L_ST;26660;0;0;en;;3;100;0;statement_id=1291262830569130, statement_hash=5a9012b2349c8e356c328bd696bbe9e9, transaction_id=95, statement_execution_id=1125912791746560, connection_id=300645, db_user=_SYS_STATISTICS, application_name=Embedded Statistics Server;

Observation: Load-related activities were recorded for 4 minutes.

 

Based simply on the time span covered by the indexserver load trace file, multiple HDB restart trials with FR enabled showed significantly faster startups than HANA restarts without FR. In this test configuration, FR reduced the load time from 2 hours and 8 minutes to merely a few minutes.

 

A view from hdbsql

The conclusion above was drawn from the database activity traces, but what does reading a big table with hdbsql look like during the early stage of the SAP HANA data load? We ran a small experiment to satisfy that curiosity. Admittedly, this is not the most precise measurement, because we can't manually trigger the hdbsql connection request at an exact point in the startup process, but it is good enough to establish a ballpark reference.
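One way to make the timing of such an experiment more repeatable is to poll the SQL port and start the query as soon as the port accepts a connection. This is not how we ran the test; it is a minimal sketch in which the function name, the polling interval, and the timeout are all assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 3600.0,
                  interval_s: float = 0.5) -> float:
    """Block until a TCP connection to host:port succeeds.

    Returns the number of seconds waited, or raises TimeoutError if the
    port is still closed after timeout_s seconds.
    """
    start = time.monotonic()
    while True:
        try:
            # A successful connect means the listener is up; close right away.
            with socket.create_connection((host, port), timeout=interval_s):
                return time.monotonic() - start
        except OSError:
            if time.monotonic() - start >= timeout_s:
                raise TimeoutError(
                    f"{host}:{port} not reachable after {timeout_s} s"
                ) from None
            time.sleep(interval_s)


# Example (not executed here): 30013 is the SQL port seen in the name server
# trace; the tenant database may listen on a different port.
#   waited = wait_for_port("localhost", 30013)
```

An open port only shows that the listener accepts connections, so the hdbsql statement should be issued immediately after the function returns.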

Test run with FR configuration active

For each of these test runs, the B32 database is restarted and the SQL command below is executed as soon as an hdbsql connection to HANA is possible. In this scenario, the target table is presumably already in memory. In the next case, without FR, the same approach is used to query the table before it gets a chance to load.

 

For the test, we summed a key figure column of the largest table in this system, which contains more than 10 billion rows.

hdbsql B32=> select sum("/BA7/S_CURKYF01") from "SAPBHB"."/BA7/AB4CORPM1"

1 row selected (overall time 22.164005 sec; server time 14.166588 sec)

Test run with FR disabled

hdbsql B32=> select count(*) from "SAPBHB"."/BA7/AB4CORPM1"

1 row selected (overall time 92.801180 sec; server time 84.679599 sec)

 

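Observation: without FR, the query against the same table took roughly 4.2 times longer overall (92.8 sec versus 22.2 sec) and about 6 times longer in server time (84.7 sec versus 14.2 sec).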

 

To sum up, the SAP HANA Fast Restart configuration eliminates the data load time associated with a restart of the database services and thereby improves data availability for SAP business applications. To preserve data in memory across VM or host reboots, persistent memory is required.

 

 

7 Comments
Copper Contributor

Hi,

 

Edit: answered my initial question on next read :)

 

Is this boost also observable on/in different system types (S4, etc.)?

 

Br,

Vanja

Microsoft

Vanja777,

Although I didn't test applications like S/4HANA, Fast Restart is a database feature, so I would expect it to behave the same way as it does with BW/4HANA.

Ben

Copper Contributor

Thanks, that is a very interesting test and I'd like to get clarification on one aspect: Table data in memory after reboot yes or no?

 

The behavior of Hana by default is to not(!) load the column store table-partitions into memory as part of the boot, unless needed. It is needed if the table-partition is marked as cached/preloaded, if it contains data required for the log replay and similar things.

 

What were the settings in your test case in that regard?

 

I would have expected the boot to finish in just a few minutes instead of 2 hours also. And of course, the first query reading all columns of all partitions, that will take a long time so that at the end, it does take 2 hours until Hana really is usable. Meaning Hana is up and running plus all data is in-memory.

 

But I get the impression I am wrong in that assumption, hence would like to validate.

 

Thanks in advance for your time!

Microsoft

Hi Werner,

My setup was the BW/4HANA benchmark. Anyone who runs a benchmark definitely doesn't want any loading during query execution. With that in mind, no selective unloads were configured. By default, all row and column stores are in memory in my case.

Ben 

Copper Contributor

Thank you. Would it be possible to validate that? Two points come to mind:

  • select table_name, is_preload, is_partial_preload from tables;
    Are all tables really set to preload=true?
  • alter table <table_name> preload none;
    Create a script to change all BW tables to preload=false and then reboot the system again. What is the startup time then?

I would think that this information is very important for readers, because we are looking at two extremes.

Extreme #1 is all data preloaded in-memory.

Extreme #2 is an empty database that is loading data at access.

 

The reality will be somewhere in the middle. The main finance table/partitions for the current and previous year will be used from the get-go. The 20 years of historical data will not be used, and if so, rather by accident. These partitions might even be marked to be removed from memory after a certain amount of time without any access.

So at the end, all databases will have table/partitions constantly accessed versus table/partitions seldom used. Why wait for the seldom used tables to be loaded into memory during boot time if the data is not required?

Hence knowing the two extremes would help the reader (and myself) to quantify the situation. Currently people are saying a 2TB Hana takes 2hours to boot. Without any further qualifiers and footnotes. 

 

Appreciate your time!

 

-Werner

Microsoft

Hi Werner,

You raised some interesting points.  I ran the SELECT query against the "sys"."tables" system view and found ALL of the tables in my BW schema have 'preload=false'.  I looked further to learn why my HANA DB is preloading at every restart. I think I found the answer in SAP Note 2127458 - FAQ: SAP HANA Loads and Unloads, point #3. When do loads happen?  In my indexserver.ini -> [sql] -> 'reload_tables' is set to 'true'. The answer hence is the first scenario 'Reload after startup (pre-warming based on columns previously loaded)'.  All pertinent data in the benchmark runs have been reloaded at my HANA restarts.   

Hope that helps.

Ben

Copper Contributor

Interesting. So it loads those tables that had been in-memory right before the shutdown. We are getting closer to understanding all the details, it seems....
