Overview
At UTA the core entities in this authoritative data product are curated by a central team called the data management office, or DMO. The resulting clean data created by the DMO is then made available to application developers via the service APIs described elsewhere in this documentation.
This means that the DMO is a "gatekeeper" that stewards the creation and update of parties, roles, and relationships before they are made available to applications.
That word -- gatekeeper -- is a red flag to some readers. To understand why we made this choice instead of letting users self-serve new entity creation, it may help to know some background.
User-managed authoritative data without controls will necessarily degrade
Extensive experience with authoritative data management in the talent and sports agency verticals has shown us that allowing the "crowdsourcing" of core data entities leads to rapid degradation of data quality. As duplicates, incomplete data, and errors are introduced, confidence in the data decreases, which causes frustrated and hurried users to create what they believe to be new, "cleaner" entities -- which in turn introduces more duplicates, incomplete data, and errors!
How do we solve this? Before we outline solutions, we must first understand our customer.
Key customers: Sellers and Aspiring Sellers
We must take an objective and non-judgemental view of the customers we are serving.
The bulk of the user population at a company like UTA can be divided into two groups:
- those with a selling responsibility (agents), and
- those who aspire to a selling responsibility (assistants).
This population earns the revenue for the company; it therefore makes sense that they outnumber everyone else at UTA.
Sellers are experts at optimizing their time around serving their clients and buyers and thus creating revenue. Sellers who do not manage time well don't last long. They have learned to be fanatical about not getting involved in "backoffice" (non-selling) activity unless absolutely forced.
Aspiring sellers (assistants) are judged on how effective they are at assisting their direct managers -- the sellers -- in doing their sales and service jobs.
Also, aspiring sellers tend to have a short tenure: they have no intention of staying in an assistant role for very long. The high churn of this population increases the training burden dramatically.
It is exceedingly uncommon for technical professionals to understand a population of sellers -- so you get to break the mold! Technical professionals tend to picture end users as analysts or knowledge workers. Those people definitely exist at UTA, but they are a small minority of the population.
Given this background about our key customers, let's look at possible solutions to the entropy problem created when we allow them to create authoritative data:
Solution 1: Train users to create good data
Data management is a specialized skill. To do the job well, one must not only understand the nature of data and connections between data, but also the domain in which the data is created and used. To do data management at UTA means one must have knowledge of the entertainment and sports domain.
We discussed seller time management earlier. Sellers will not participate in learning the arcane nuances of data management, as this is a poor use of their time. They will in every case delegate the job to the assistant.
Remember the incentives driving assistants -- particularly, the focus on supporting the direct manager. In practice, if the creation of a duplicate database record gets the job done for their manager, then the assistant will create the duplicate record without hesitation. Given the choice between disappointing either the technology group or the person in control of their career, it should be clear who loses.
With no incentive structure around data management, and a high turnover in the population, data management training will be ineffective for most of our customers.
Solution 2: Allow users to create new entities as "pre-stewarded", then merge later
On the surface this appears to be a "win-win" -- allow the users to create data at will (thus letting them move quickly), and behind the scenes tag it "pre-stewarded," queuing the record(s) up for review by the DMO. Should the users create bad data, it will eventually be cleaned up.
Using this approach, the DMO renders one of three decisions:
- Allow the record to be mastered as-requested, re-classifying it as "Stewarded"
- Modify the record to clean up inaccuracies or incomplete values, then re-classify it as "Stewarded"
- Reject the record as a duplicate, then decide how much to merge between the invalid and valid records
These steps are straightforward to implement inside the authoritative data service itself, and indeed, we have a built-in mechanism for recording this.
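As a rough illustration, the three outcomes might be modeled as a small discriminated union like the sketch below; the type names and fields are assumptions for this illustration, not the service's actual schema.

```typescript
// Hypothetical representation of a DMO stewardship decision; names and fields
// are illustrative, not the actual service schema.
type StewardshipDecision =
  | { kind: "ACCEPTED"; partyId: string }                        // mastered as-requested
  | { kind: "MODIFIED"; partyId: string; correctedFields: Record<string, string> }
  | { kind: "REJECTED_DUPLICATE"; duplicateOfPartyId: string };  // IS_DUPLICATE_OF recorded

// Whatever the outcome, downstream consumers only ever need the single
// surviving, stewarded partyId.
function survivingPartyId(decision: StewardshipDecision): string {
  return decision.kind === "REJECTED_DUPLICATE"
    ? decision.duplicateOfPartyId
    : decision.partyId;
}
```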
However, changes in authoritative data are not propagated out to other systems directly. Indeed, the authoritative data product should have no knowledge of what sort of downstream system is consuming its data; only that the authorization rules are being met.
What the authoritative data service does is publish endpoints and events that allow downstream systems to see the changes and dupe decisions. It is the responsibility of each downstream system to listen for those changes, then use the information to update its local transactional data if needed.
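Here is a minimal sketch of what that might look like from a downstream system's point of view, assuming a hypothetical event shape and handler names; the real endpoints and events are defined by the service APIs described elsewhere in this documentation.

```typescript
// Hypothetical shape of a change event published by the authoritative data
// service; the event names and payload fields are assumptions for this sketch.
interface PartyChangeEvent {
  type: "PARTY_UPDATED" | "PARTY_MARKED_DUPLICATE";
  partyId: string;            // the record the event is about
  survivingPartyId?: string;  // present when the record was rejected as a duplicate
}

// A downstream system (touring, finance, etc.) subscribes and decides what,
// if anything, to do with its own transactional data.
function onPartyChange(event: PartyChangeEvent): void {
  if (event.type === "PARTY_MARKED_DUPLICATE" && event.survivingPartyId) {
    // Re-point local transactions from the invalid partyId to the surviving one.
    repointLocalTransactions(event.partyId, event.survivingPartyId);
  } else if (event.type === "PARTY_UPDATED") {
    // A simple update might only require refreshing a locally cached copy.
    refreshCachedParty(event.partyId);
  }
}

// Placeholders for whatever local remediation the downstream team decides on.
function repointLocalTransactions(fromPartyId: string, toPartyId: string): void { /* ... */ }
function refreshCachedParty(partyId: string): void { /* ... */ }
```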
Experience has shown that should we allow the end user to create a pre-stewarded record, there is a high likelihood that the DMO will later reject the record as a duplicate. This may lead to some unpredictable effects, particularly if the user has gone ahead and created a transaction with the pre-stewarded record. For example:
- User requests a buyer called "Fabrikam", gets the "pre-stewarded" partyId, then creates a transaction in a touring system, and sends out a contract.
- At the same time, a booking is created in the financial system under the pre-stewarded entity "Fabrikam."
- Later, the DMO verifies that the "Fabrikam" the user wants is a duplicate of "Fabrikam, Inc.", which already has other transactions attached to it. The DMO rejects the new record by recording an IS_DUPLICATE_OF relationship between the invalid and valid party records.
- The touring system gets word via the change mechanism that its transactions with "Fabrikam" should now use the data and id from "Fabrikam, Inc." It must update its transactions, since the original party data is invalid. But the buyer and client have already received a contract with a different buyer name. The music team will need to decide what happens here.
- Similarly, the booking is now recorded in the financial system under a different partyId and name. The finance team will need to decide what should happen here.
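To make the cascading effect concrete, the sketch below shows one way the touring system in this example might react to the IS_DUPLICATE_OF decision. The types and function are hypothetical; the point is that re-pointing the data is mechanical, while an already-sent contract still requires a human decision.

```typescript
// Hypothetical local data and remediation logic for the touring system.
interface TouringTransaction {
  id: string;
  buyerPartyId: string;
  contractSent: boolean;
}

function applyDuplicateDecision(
  transactions: TouringTransaction[],
  invalidPartyId: string,   // e.g. the pre-stewarded "Fabrikam"
  validPartyId: string      // e.g. the stewarded "Fabrikam, Inc."
): { updated: TouringTransaction[]; needsHumanReview: TouringTransaction[] } {
  const updated: TouringTransaction[] = [];
  const needsHumanReview: TouringTransaction[] = [];
  for (const txn of transactions) {
    if (txn.buyerPartyId !== invalidPartyId) {
      updated.push(txn);
      continue;
    }
    // Re-point the transaction to the surviving party record.
    const repointed = { ...txn, buyerPartyId: validPartyId };
    updated.push(repointed);
    // A contract already sent under the old buyer name cannot be fixed by a
    // data change alone; the music team has to decide what happens next.
    if (txn.contractSent) {
      needsHumanReview.push(repointed);
    }
  }
  return { updated, needsHumanReview };
}
```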
This is all technically possible. But the question is:
Would it be simpler and less error-prone to avoid cascading stewardship decisions to downstream systems by simply giving the DMO a reasonable opportunity to create the correct record once at the outset? We would be trading a small amount of user convenience for a large reduction in downstream system complexity.
This leads us to the third solution:
Solution 3: Require most authoritative data entities and relationships be created and updated by the DMO before transactions can be created
This is the approach we intend to take at UTA. It will require a great deal of focus on the customer experience, but the data quality benefits will pay off.
This means that all systems must start their authoritative data creates and updates with the authoritative data product and must not allow their users to create that data on their own in the local system.
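A minimal sketch of that flow, assuming a hypothetical client for the authoritative data product (the interface and function names below are illustrative, not the actual API):

```typescript
// Hypothetical client interfaces; the real APIs are described elsewhere in
// this documentation, so these are placeholders for the sketch.
interface StewardedParty { partyId: string; name: string; }

declare const authoritativeDataClient: {
  requestPartyCreation(input: { name: string }): Promise<StewardedParty>;
};
declare const touringSystem: {
  createTransaction(input: { buyerPartyId: string }): Promise<void>;
};

// The Solution 3 flow: authoritative data first, local transaction second.
async function createTransactionForNewBuyer(buyerName: string): Promise<void> {
  // 1. The local system asks the authoritative data product to create (or find)
  //    the party; it never mints its own party records.
  const party = await authoritativeDataClient.requestPartyCreation({ name: buyerName });

  // 2. Only once a stewarded partyId exists is the local transaction created.
  await touringSystem.createTransaction({ buyerPartyId: party.partyId });
}
```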
To understand the approach, read on about the transaction hub.