

In the above table, since policy number itself is a unique identifier I was left with the task of next eight variables. I noted down each variable from the data and gave description to each of them.įind unique values for each individual variable from the data. The IBM Dictionary of Computing defines Data Dictionary as a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format. The moment I received the data from my Boss along with the scary deadlines, my initial job was creating a data dictionary.Ī data dictionary contains the descriptions of data variables in a data model. Task 1 of Multivariate Linear Regression: Get the Data Dictionary The basic objective was to identify and quantify losses in the insurance portfolio.Īlthough I did have theoretical understanding of regression, yet it took me 2 sleepless nights applying it practically. The data was related to Auto Insurance and it was not about identifying and defining variable types, rather it was all about creating a data dictionary and doing a Bivariate Regression Analysis by a way of multivariate linear regression.

My Boss had given me an assignment where I had to work on the data from the Banks General Insurance unit. I remember those early days of my career when I was a part of Risk Analytics team at a bank.
