
Operational problems at the Danske Bank Group

Original announcement entitled "Statement on operational problems at the Danske Bank Group"

April 3, 2003

Press release
On Monday, March 10, IBM’s technical division was performing a routine replacement of a defective electrical unit in an IBM disk system, type RVA (the disk system used for storage of data in DB2 database software), at DMdata’s operating installation in Ejby. During the repair process, there was an electrical outage in the disk system, and the result was that operations stopped at one of the Bank’s two operating centres (Ejby) as of 16:08 CET.

A combination of circumstances turned the replacement, which began as a routine operation, into a situation with wide-ranging effects. These circumstances revealed several never-before-identified errors in IBM’s DB2 database software, errors that led to a critical operational situation.

The consequences were significant for the Danske Bank Group’s customers, particularly involving payments and the trading and settlement of currencies and securities. Realkredit Danmark and Danica Pension were affected only on a very limited basis.

Technical description of the difficulties
The RVA disk and the business data stored on it were not accessible to the Group's systems after the breakdown. This meant that the part of the systems running at the operating centre in Ejby (currency and securities trading, as well as foreign systems and payments) was not able to function. Operations continued for the systems running at the operating centre in Brabrand (cash dispensers and self-service systems).

When the problematic RVA disk unit was given the all-clear on Monday, March 10 at 22:00 CET, the Bank’s systems in Ejby were restarted as usual after an operational disturbance. The Group’s batch runs (processing of large amounts of collected data from product systems for book-keeping, printing, clearing and data warehousing) were started Tuesday at 01:00 CET, a delay of six hours. However, during the morning of Tuesday, March 11, it became clear that the batch runs were not running correctly.

Even though the re-start of DB2 database software went normally, a combination of circumstances was creating inconsistencies in the data. This first software error in DB2 database software had existed in all similar installations since 1997, without IBM’s knowledge.

In the following process of data recovery, which lasted from the morning of Tuesday, March 11, until Friday, March 14, three more errors were discovered in IBM’s DB2 database software – faults uncovered due to the original unusual events at the restart of DB2 database software.

The second software error meant that the recovery process on several DB2 tables could not be started, which delayed the recovery.

The third software error prevented recovery jobs from being run simultaneously, which further delayed the recovery.

The fourth software error kept recovery jobs from re-establishing all the data in the tables. This last error, which appeared on Thursday, March 13, resulted in new episodes of inconsistent data that had to be recreated by other methods. This made the process longer and more complicated. In order to avoid longer delays, the Bank decided not to wait for its IT supplier’s patch on the software. Instead, it relied on a process the Bank developed with DMdata, building on the Bank’s own back-up data from its operating centre in Brabrand.

System operations at the Bank’s centre in Brabrand were largely unaffected, although the functioning of several systems was affected by the lack of data from the Ejby centre.

On the afternoon of Thursday, March 13, several online systems were restarted, including the currency and securities trading system as well as the Group’s foreign systems. In the early hours of Friday, March 14, data was finally recovered.

On the morning of Friday, March 14, all online systems were successfully re-started, and batch runs from Monday, March 10, were begun. Several extraordinary checkpoints were set up to make sure all data was recovered correctly.

Batch runs from Monday, March 10, until Friday, March 14, were carried out during the weekend. Early in the morning on Monday, March 17, all data from the problematic disk was transferred to other systems, and the disk was taken out of operation.

On Monday, March 17, Danske Bank settled all accumulated transactions with counter-parties outside the Bank and all operations were normal.

None of the errors in DB2 were previously known to IBM, and therefore no patches could immediately be provided. Patches for these errors are now available for all users of DB2 database software, including Danske Bank and DMdata. Danske Bank and DMdata have now installed these patches on the Bank’s systems.

During the system problems, Danske Bank was not at risk of losing data, since all data could be recovered from the operating centre in Brabrand or from the DB2 database software logs, which are stored on alternative operating systems. The greatest difficulty was that the recovery process was extended by the appearance of new software errors. If further software errors had appeared beyond the four named, the only consequence would have been that recovery would have taken longer.

New measures
Danske Bank and DMdata have reviewed the entire course of events and are now discussing with IBM both measures to improve IBM's DB2 software and more extensive and efficient emergency procedures for such critical events.

It is not unusual for software errors to be found on an ongoing basis and for the supplier to provide patches to fix them. DB2 is under constant development, and the Group receives new versions on an ongoing basis. Such errors are not unique to Danske Bank but a problem all users of DB2 database software encounter. DB2 is the most popular database software for mainframe systems.

Since 1991, the Danske Bank Group has had two IT operating centres, which to a certain extent duplicate data and back up system operations. This duplication meant that the Bank was able to recover data.

In the autumn of 2002, the Danske Bank Group began implementing the so-called GDPS contingency system (Geographically Dispersed Parallel Sysplex), which supplements the existing two-centre operating set-up with mirrored disks.

The GDPS system ensures that the Group’s central systems will have a safer emergency system if hardware fails, because daily operations run on two identical hardware complexes with mirrored data. If hardware fails, the mirroring of data ends immediately and the other complex takes over operations. This will shorten the time it takes to restore data and reduce the effect of breakdowns. The press has suggested that installation of a mirrored disk would have allowed Danske Bank to avoid the recent technical problems. However, mirrored disks, with or without the current version of GDPS, would not have prevented the recent breakdown of Danske Bank’s IT systems. The Group is of the opinion that the next version of GDPS, to be released in May 2003, would have prevented the effect on operations caused by the hardware error.

To guard against the effects of complex software errors, Danske Bank is considering further investment in its two-centre operations to protect some of the Group's vital IT systems.

The business community and the public have requested more open communication, a request which the Bank will do its best to fulfil. The request has now been incorporated into the Group’s emergency procedures, including the procedures for major operation breakdowns.

As the course of events shows, new software errors during the period from Tuesday morning, March 11, to Friday, March 14, made events impossible to predict. During this period, Danske Bank believed at several points that operations had been normalised. Regrettably, this turned out not to be the case due to the new software errors.

Contact: Peter Schleidt, Executive Vice President, tel. +45 43 27 85 00.