In some development environments, you dont have all the required dimension data and as a result, your fact loading mapping's test runs go for a toss. The mapping wont be able to load anything (reject everything) since some or other foreign key would be missing for each row.
In other words, only those records would be loaded for which ALL the foreign key constraints would be satisfied. However, in Production environment, this would almost never happen. Or even if its the case, we'd actually want those rows to be rejected.
This can work out to be a serious impediment to development/unit testing. It prevents the developer from seeing whether or not his his mapping is behaving appropriately for the happy flow functionality.
One way of achieving this can be to work using a Mapping Variable indicating the Environment. The developer can run using a value like 'D' or something, indicating a different environment than Production, with 'P' (production) being the default value.
Now in the mapping, just about where you decide to reject a record based on different type of conditions, you could put an AND condition involving this mapping variable, e.g. ....AND $$MAPP_ENV = 'P'
Now, the expression would return true only in the Production environment, and therefore would work as expected. In Dev though, this expression would return false and would not reject that row.
Now, to be able to satisfy the db constraints for the fact table so that the row is actually inserted, you'd need to use some placeholder convention. One of the approaches can be to use an outlier value as the foreign key value. For Example, for customer id, keep a -1 in dimension table, meaning "Undefined". And, in all such dev cases, send -1 to the fact.
It would serve both the purposes, tell your fact table that there is something diff about that row, and still inserting a row in there, so that the testing for the rest of the columns is not stopped because of one foreign key missing out.
No comments:
Post a Comment