ETL

Data Catalog: Publishers

285 views June 10, 2018 May 29, 2019 admin 0

UDM team maintains a carefully curated structure of metadata categories for each value we store in our databases. Publisher is the top-level metadata category. Publishers can have multiple Reports, but each Report belongs to one, and only one Publisher.

Excel users can find the Publishers collection in the UDM Data Catalog Excel file, under tab Publishers.

Data Feed for ETL automated connections is /UDM_Catalog_Publishers.

The fields in this data feed are:

  • PublisherID (int, not null)
  • PublisherCountryID (int, not null)
  • PublisherName (nvarchar(400), not null)
  • PublisherNameTranslated (nvarchar(400), not null)
  • PublisherNameShort (nvarchar(30), null)
  • PublisherUrl (nvarchar(400), null)
  • LastModified (datetime2(6), not null)
  • Active (bit, not null)

PublisherID is the unique identifying number assigned to each publisher. PublisherName is the full official name of the agency which publishes the data we collect. For example, “L’Établissement national des produits de l”agriculture et de la mer (FranceAgriMer)“. UDM standard protocols are to store time series, report names and publisher names using the language in which they were originally published, in this case French. To facilitate data searches we also provide translated names (PublisherNameTranslated), as well as short names/acronyms (PublisherNameShort). PublisherUrl field contains the link to the homepage of the publishing agency. PublisherCountryID identifies the country where the Publisher is headquartered.

The feed UDM_Catalog_Publishers contains the timestamp field LastModified which can be used to facilitate incremental pulls. No record is ever hard-deleted from the underlying UDM database table. Instead, soft-deleted records have Active field set to 0. Soft-delete principle in conjunction with the timestamp of last change allows the users to optimize their data pulls by only requesting those fields that have been modified since the last time the data feed was pulled. For example, the request to the server may be:

http://udmdirect.dairymarkets.com/Universal/UDM_Catalog_Publishers?$filter=LastModified gt DateTime’2018-06-01T04:04:00.0000000′

or more simply:

http://udmdirect.dairymarkets.com/Universal/UDM_Catalog_Publishers?$filter=LastModified gt DateTime’2018-09-01′

 

To set up this table in your database, you may use the following code:

CREATE TABLE [UDM].[Catalog.Publishers]( 
		[PublisherID] [int] NOT NULL,
		[PublisherName] [nvarchar](400) NOT NULL,
		[PublisherNameTranslated] [nvarchar](400) NOT NULL,
		[PublisherNameShort] [nvarchar](50) NULL,
		[PublisherCountryID] [int] NOT NULL,
		[PublisherUrl] [nvarchar] (400) NULL,
		[LastModified] DATETIME2(6) NOT NULL, 
		[Active] BIT NOT NULL, 
    CONSTRAINT [PK_Catalog.Publishers] PRIMARY KEY ([PublisherID])
) 
GO

Was this helpful?