Remember data warehouses? Yes, they are still relevant

Most clients I speak with tell me they have no intention of shutting down their on-premises data warehouse. Others have migrated to the cloud, but they still run a cloud data warehouse platform.
Sometimes I get wary of the latest thing. Today I want to take a trip a few years down memory lane and talk about data warehouses. The term “data warehouse” can create the impression that it is merely a place to store lots of data, which raises the question: why not keep the data in a data lake or native object store?
An active data warehouse performs many more essential functions. It represents the single source of integrated, curated data, both historical and current. It supports and often provides complex analytics and processing of models for data scientists. Because of its relational nature, it stores and processes data at the atomic level. By contrast, the Hadoop Distributed File System (HDFS), though I don’t know if anyone still uses it, stores data as files, so operating on the contents of those files requires more work to decompose them and pass results off to other modules.
The prevailing view of data warehouses is that data is ingested, either cleaned up beforehand via ETL or staged in the data warehouse and cleaned there (ELT). Either way, the warehouse is positioned as a source of data (and metadata) for queries. But in many cases, it is not a one-way flow. The data warehouse can be used to create models that, in turn, generate results.
For example, I’m going to describe a data warehouse that curated source data from twenty different internal systems in a life insurance company. Extracts were pulled from the data warehouse to feed a valuation system, which generated large volumes of cash flow information based on interest rate assumptions, mortality and lapses. In one case, the fine-grained time-series cash flows from hundreds of valuation runs were analyzed and used to generate solvency reports and other statutory requirements.
Regulators of life insurance companies are primarily concerned with solvency. Since life insurers typically issue contracts with liabilities not due for years or decades, it is essential that regulators carefully watch insurers’ business practices, investment portfolios and experience. This particular company caused some concern with the regulators for several reasons I won’t go into, but management chaos was at the top of the list.
The basis of a solvency study is a portfolio valuation. Most of the time, the valuation is done at a slightly aggregated level rather than for each policy: for example, ten-year level term grouped by 10-year age bracket and possibly some other variables. However, for this company, the regulators demanded a full seriatim valuation, covering not just each policy but each policy coverage, such as death benefit, accidental death benefit and disability waiver benefit. In addition, instead of accepting the valuation and all associated studies on an annual basis, they demanded them quarterly. They gave the company until the end of the following year to comply, about eighteen months.
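To make the difference in granularity concrete, here is a minimal Python sketch, not the company's actual system. The `Coverage` record, its field names, the sample policies and the choice to sum face amounts are all assumptions made for illustration; a real valuation would project cash flows rather than add up face amounts. The point is only that the aggregated view collapses many coverages into a few cells, while the seriatim view keeps one record per policy coverage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Coverage:
    """One coverage on one policy (hypothetical layout, for illustration)."""
    policy_id: str
    plan: str          # e.g. "10-year level term"
    coverage: str      # e.g. "death benefit"
    issue_age: int
    face_amount: float


def age_bracket(age: int, width: int = 10) -> str:
    """Label the bracket an age falls into, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def aggregated_cells(coverages):
    """Aggregated view: one cell per (plan, age bracket), amounts summed."""
    cells = defaultdict(float)
    for c in coverages:
        cells[(c.plan, age_bracket(c.issue_age))] += c.face_amount
    return dict(cells)


def seriatim_records(coverages):
    """Seriatim view: one record per (policy, coverage), nothing collapsed."""
    return [((c.policy_id, c.coverage), c.face_amount) for c in coverages]


# Made-up sample data, not taken from any real portfolio.
SAMPLE = [
    Coverage("P001", "10-year level term", "death benefit", 34, 250_000.0),
    Coverage("P001", "10-year level term", "accidental death benefit", 34, 100_000.0),
    Coverage("P002", "10-year level term", "death benefit", 38, 500_000.0),
    Coverage("P003", "10-year level term", "death benefit", 45, 150_000.0),
]

if __name__ == "__main__":
    print(len(aggregated_cells(SAMPLE)), "aggregated cells")
    print(len(seriatim_records(SAMPLE)), "seriatim records")
```

Even in this toy portfolio the seriatim view has twice as many records as the aggregated one, and the gap widens quickly with real policy counts and quarterly runs.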


