“Attention reader! The following project description is written for the technically literate. Proceed with caution or with a development thesaurus within reach.”
Now that we have made the proper introduction, let’s proceed to the actual project details. Data Mart was designed for a major insurance company with a desire to up its game in big data analysis. Two things should be immediately evident: (1) the amount of data owned by such a company is colossal, and (2) the tools for analyzing that data are expected to be computationally elaborate and demanding. Yes, we dig into these kinds of projects like crazy!
So here is our development strategy!
Data Warehousing & Data Analysis for a major insurance company
Mobile Technology Supported: Web application
Web version of the App: Yes
Cloud Service used: Private Cloud owned by the client
Supported Bandwidth Requirements: 30 TB
Supported concurrent User Requirements: 1,000
The Development
It is obvious that we had two major issues to handle with Data Mart: first, the storage issue; second, the analysis issue. So, the project was split into two separate but interconnected workstreams.
The Two Pillars
The Data Mart project has two major pillars – Data Warehousing and Data Analysis. To be more precise:
- The Data Warehousing layer, which is responsible for storing the organization’s historical data in the form of Facts and Dimensions. Huge volumes of data are extracted from various data sources such as the AS400, CRM, and Insurance Management systems (a minimal schema sketch follows this list).
- The Data Analysis layer, which abstracts the underlying data warehouse tables into analytical models that reflect how business users perceive the business entities and measures, regardless of the data warehouse table schema.
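To make the Facts-and-Dimensions idea concrete, here is a minimal T-SQL sketch of the kind of star schema such a warehousing layer is built around. The table and column names (DimDate, DimPolicy, FactClaim) are purely illustrative assumptions, not the client’s actual schema.

```sql
-- Hypothetical star-schema fragment: two dimension tables and one fact table.
CREATE TABLE dbo.DimDate (
    DateKey      INT  NOT NULL PRIMARY KEY,   -- e.g. 20240131
    FullDate     DATE NOT NULL,
    CalendarYear INT  NOT NULL,
    MonthNumber  INT  NOT NULL
);

CREATE TABLE dbo.DimPolicy (
    PolicyKey    INT IDENTITY NOT NULL PRIMARY KEY,  -- surrogate key
    PolicyNumber NVARCHAR(20) NOT NULL,              -- business key from the source system
    ProductLine  NVARCHAR(50) NOT NULL
);

CREATE TABLE dbo.FactClaim (
    DateKey     INT            NOT NULL REFERENCES dbo.DimDate (DateKey),
    PolicyKey   INT            NOT NULL REFERENCES dbo.DimPolicy (PolicyKey),
    ClaimAmount DECIMAL(18, 2) NOT NULL,             -- additive measure
    ClaimCount  INT            NOT NULL DEFAULT 1
);
```

Facts (claims, premiums, payments) grow into enormous row counts over time; dimensions (dates, policies, customers) stay comparatively small and describe how those facts are sliced.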
The Trick
As we stated earlier, the amount of data stored is colossal. Because the Data Warehouse contains such a massive volume of data, the processing overhead of complex queries can result in unacceptably slow response times. If, in addition, many users access the data concurrently, the problem gets even worse. And now for the trick…
In our case, the data model pre-aggregates the data!
This not only provides vastly superior performance for analytical queries but also ensures the stability of the system regardless of the number of users working simultaneously.
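The pre-aggregation itself lives inside the Analysis Services cubes, but the principle is easy to illustrate in plain T-SQL: instead of scanning the raw fact table for every query, analytical queries read a small, pre-computed summary that is rebuilt during the load process. The sketch below reuses the hypothetical FactClaim, DimDate, and DimPolicy tables from earlier and is only an illustration of the idea, not the cubes’ actual aggregation design.

```sql
-- Illustration only: a pre-aggregated monthly summary, refreshed as part of the load process.
CREATE TABLE dbo.AggClaimByMonth (
    CalendarYear INT            NOT NULL,
    MonthNumber  INT            NOT NULL,
    ProductLine  NVARCHAR(50)   NOT NULL,
    TotalClaims  DECIMAL(18, 2) NOT NULL,
    ClaimCount   INT            NOT NULL,
    PRIMARY KEY (CalendarYear, MonthNumber, ProductLine)
);

INSERT INTO dbo.AggClaimByMonth (CalendarYear, MonthNumber, ProductLine, TotalClaims, ClaimCount)
SELECT d.CalendarYear,
       d.MonthNumber,
       p.ProductLine,
       SUM(f.ClaimAmount),        -- the expensive scan happens once, at load time...
       SUM(f.ClaimCount)
FROM dbo.FactClaim  AS f
JOIN dbo.DimDate    AS d ON d.DateKey   = f.DateKey
JOIN dbo.DimPolicy  AS p ON p.PolicyKey = f.PolicyKey
GROUP BY d.CalendarYear, d.MonthNumber, p.ProductLine;
-- ...so user queries touch a few thousand summary rows instead of the full fact table.
```

Because the heavy lifting happens at load time rather than query time, response times stay flat no matter how many analysts are querying at once.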
The Technology
As for the technical part, these are the technologies we implemented.
On the Server Side
- SQL Server Database Engine
This is the basic SQL Server service, needed for hosting the Data Warehouse (DW) database.
- SQL Server Analysis Services (SSAS)
This is the required SQL Service for hosting the Analytical Data Models (Cubes).
- SQL Server Reporting Services (SSRS)
This is the required SQL Service for hosting and sharing the User Reports.
- SQL Server Integration Services (SSIS)
This is the required SQL Service for data extraction, transformation, and loading (ETL). The tool may also be used to automate updates to multidimensional cube data. Although not strictly necessary, it is strongly recommended to use SQL Server Data Tools through the Visual Studio environment. A hedged sketch of the kind of load step an SSIS package orchestrates follows this list.
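For a feel of what a single ETL step inside such a package does, here is a hedged T-SQL sketch of an incremental dimension load from a staging table. The StagingPolicy table and the matching rules are assumptions for illustration; in the actual project this kind of logic is orchestrated by SSIS data flows rather than hand-written scripts.

```sql
-- Hypothetical incremental load: merge freshly extracted policies into the dimension.
MERGE dbo.DimPolicy AS target
USING dbo.StagingPolicy AS source
    ON target.PolicyNumber = source.PolicyNumber               -- match on the business key
WHEN MATCHED AND target.ProductLine <> source.ProductLine
    THEN UPDATE SET target.ProductLine = source.ProductLine    -- refresh changed attributes
WHEN NOT MATCHED BY TARGET
    THEN INSERT (PolicyNumber, ProductLine)
         VALUES (source.PolicyNumber, source.ProductLine);     -- add brand-new policies
```

An SSIS package would typically wrap steps like this in a control flow that also truncates the staging tables, loads the fact tables, and reprocesses the cubes.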
On the Client Side
- SQL Report Builder
Report Builder provides a productive report-authoring environment for IT professionals and power users.
- PowerBI Desktop & Excel
Both environments can be used to connect to Analysis Services and use the predefined Models for further Analysis & Visualization. Excel automatically recognizes the Analytical Models and imports them using the Power Pivot add-in. This is an incredibly powerful option for Excel Power Users, because they can:
- Take advantage of familiar Excel tools and features
- Process massive amounts of data in seconds
- Use powerful new analytical capabilities, such as Data Analysis Expressions (DAX)