One of the most important data quality functions is identifying data records that refer to the same business object, e.g. the same business partner or the same product. This requirement arises in a wide range of contexts:
- Identification of duplicates in the user database
- Overlaps between different third-party lists in customer acquisition
- Overlaps when merging information from different source systems for the purposes of data integration
- Assignment of information from reference databases
In addition, there are business requirements that aim to create "specific" groups of business objects and primarily serve the needs of information acquisition and data enhancement.
- A typical example in the B2C environment is grouping according to household.
Furthermore, data and information gaps can be closed through data enhancement. This can concern demographic or geographic data made available by third-party suppliers. For example:
- All residents in a household can be grouped together within the framework of geographic information systems.
With the mailBatch function, Uniserv offers a productive and high-performance tool for finding duplicates and for forming "specific" groups.
In many cases, however, identifying similar records and creating groups from them is only half the battle. Ultimately, the task often involves consolidating a variety of information from the members of a found group into aggregate information, which is then merged, e.g. into a master record representing the combined information from all the records of the group.
The consolidate function serves this purpose. It allows rules to be defined, which are then applied to each duplicate group in an automated process.
The consolidate function supports two modes, "enhance" and "aggregate":
- Enhance means that information from the identified master record is transferred to all members of the group. This can concern e.g. a reference number, an e-mail address or other information which can be used to enhance the records.
- Aggregate means that information from the group members is combined and stored in a master record. This permits the following:
  - Sales that are distributed over several records can be accumulated.
  - The most complete information can be selected from several records, e.g. the full first name instead of the abbreviated first name.
  - Partial information can be linked to aggregate information.
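The two modes can be illustrated with a minimal Python sketch. It is not the consolidate rule syntax or any Uniserv API. The function names `enhance` and `aggregate`, the field names (`sales`, `first_name`, `email`, `customer_no`) and the three rules are assumptions chosen for illustration only. In particular, "fill a gap with the first non-empty value" is just one possible reading of linking partial information.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def aggregate(group: List[Record]) -> Record:
    """Combine information from all group members into one master record.

    The three rules below are hypothetical examples, not consolidate rules.
    """
    return {
        # Accumulate sales that are distributed over several records.
        "sales": sum(r.get("sales", 0) for r in group),
        # Prefer the most complete value: here, the longest first name.
        "first_name": max((r.get("first_name", "") for r in group), key=len),
        # Fill gaps: take the first non-empty e-mail address in the group.
        "email": next((r["email"] for r in group if r.get("email")), ""),
    }


def enhance(group: List[Record], master: Record, fields: List[str]) -> List[Record]:
    """Transfer selected fields from the master record to every group member.

    Returns new records; the input records are left unchanged.
    """
    transferred = {f: master[f] for f in fields if f in master}
    return [{**member, **transferred} for member in group]


# One hypothetical duplicate group with two records for the same person.
group = [
    {"first_name": "J.", "sales": 120.0, "email": ""},
    {"first_name": "John", "sales": 80.0, "email": "john@example.com"},
]

master = aggregate(group)
master_with_key = {**master, "customer_no": "C-1001"}
enhanced = enhance(group, master_with_key, ["customer_no", "email"])
```

In this sketch, aggregate flows from the members to the master record, while enhance flows in the opposite direction, from the master record to the members.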
consolidate is therefore an ideal add-on for mailBatch. However, using mailBatch is not a prerequisite for consolidate. In simple cases, where the records referring to the same business object can be identified by a key value, consolidate can be configured to create the groups itself on the basis of key values.
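Key-based grouping of this kind can be sketched as follows. This is an illustration of the idea in Python, not the consolidate configuration itself. The helper `group_by_key` and the field names `customer_no`, `source` and `phone` are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List

Record = Dict[str, Any]


def group_by_key(records: List[Record], key: str) -> Dict[Any, List[Record]]:
    """Put all records that share the same key value into one group."""
    groups: Dict[Any, List[Record]] = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# Hypothetical records from two source systems, linked by a customer number.
records = [
    {"customer_no": "C-1", "source": "crm", "phone": "030 1234"},
    {"customer_no": "C-2", "source": "crm", "phone": ""},
    {"customer_no": "C-1", "source": "shop", "phone": ""},
]

groups = group_by_key(records, "customer_no")
```

Each resulting group could then be processed by the same kind of rules that would otherwise be applied to duplicate groups found by a matching step.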
consolidate has a wide range of applications. These range from transferring information from a reference database, such as communication data or risk assessments, through linking records from different systems via key relationships, to creating a master record from several records according to complex rules.