ETL

Data Catalog: Measures

233 views September 10, 2018 September 14, 2018 admin 0

Each value stored in the database belongs to one, and only one Series. As of September 2018, UDM database stores over 34,000 series. How should one go about looking for a series of interest? To facilitate this process, we recently introduced our own classification of series into meaningful Categories and Measures. Consider for example these series: “Boterproductie” (Netherlands), “Elaboración de mantequilla” (Chile), “MRiRW Masło Ekstra, 82% tl., w blokach, średnia ważona cena” (Poland) and “Butter, Stocks in All Warehouses” (United States). Each of these series contain either production, price or inventory of butter. Butter is the concept common to each of these series, and Butter is the category all of them are associated with. We may wish to further reduce the list of series of interest to include only prices, and exclude stocks, production or similar. That is the purpose of the dimension called Measures.

Each new Series we add to the database belongs to at least one Measure, but may, where appropriate, belong to multiple Measures. Classification of Series into Measures was introduced in June 2018, and series added to the database before that date may not yet be classified, but our plan is to eventually have all series classified into one or more measures.

The data catalog endpoint with the full list of all Measures is /UDM_Catalog_Measures. The fields are:

  • MeasureID (int, not null)
  • MeasureName (nvarchar(400), not null)
  • MeasureDescription (nvarchar(900), not null)
  • LastModified (datetime2(6), not null)
  • Active (bit, not null)

MeasureID is the unique identifying number assigned to each Measure, i.e. the primary key for the table where this data resides. MeasureName is the brief name for the Measure. MeasureDescription is the verbose description of the Measure. The feed UDM_Catalog_Measures contains the timestamp field LastModified which can be used to facilitate incremental pulls. No record is ever hard-deleted from the underlying UDM database table. Instead, soft-deleted records have Active field set to 0. Soft-delete principle in conjunction with the timestamp of last change allows the users to optimize their data pulls by only requesting those fields that have been modified since the last time the data feed was pulled. For example, the request to the server may be:

http://udmdirect.dairymarkets.com/Universal/UDM_Catalog_Measures?$filter=LastModified gt DateTime’2018-06-01T04:30:00.0000000′

or more simply:

http://udmdirect.dairymarkets.com/Universal/UDM_Catalog_Categories?$filter=LastModified gt DateTime’2018-06-01′

Our Excel users can see which Series belong to a particular Measure in the tab Series, by filtering the column Measure:

ETL users should pull in the cross-reference catalog feed /UDM_Catalog_XRef to obtain information on all relevant many-to-many relationships in the UDM Database.

See related Knowledge Base articles: 

 

Was this helpful?