UNISERV GmbH

   
Rastatter Straße 13   
75179 Pforzheim
Germany   
Tel. +49 (0) 7231 / 936 - 0   
Fax +49 (0) 7231 / 936 - 2500

FAQs about Merge-Purge and Duplicate Check

 

What is the maximum volume of addresses that can be processed with mailBatch?

From a technical standpoint, the maximum address volume is higher than any volume that occurs in practice. The achievable volume does, of course, also depend on the available resources (disk drive and motherboard).

In practice, the total number of addresses is not the decisive figure for a comparative check, since the addresses may be spread across several countries. What matters is that the addresses of each country are processed without segmentation.

As a point of reference: some mailBatch users regularly process more than 100 million addresses from a single country, without segmentation, in one comparative check run.

 

Do larger address files have to be segmented for processing in mailBatch?

No! Segmentation is a technique that systems with simpler technology usually need in order to achieve acceptable performance on larger address stocks in an n:n comparison. Its great disadvantage is that duplicates are usually not recognized across segment boundaries. This creates so-called "blind spots", in which a considerable number of duplicates usually remain unrecognized.
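
The blind-spot effect can be illustrated with a small Python sketch. It does not show how mailBatch works internally: the normalize function is only a crude, assumed stand-in for error-tolerant matching, and the sample addresses are invented for illustration.

```python
from itertools import combinations

def normalize(address):
    # Crude stand-in for error-tolerant matching (an assumption, not
    # mailBatch's method): lowercase and keep only letters and digits.
    return "".join(ch for ch in address.lower() if ch.isalnum())

def duplicate_pairs(addresses):
    # n:n comparison: every address is compared with every other one.
    return {(a, b) for a, b in combinations(addresses, 2)
            if normalize(a) == normalize(b)}

# Invented sample data.
addresses = [
    "Anna Meier, Hauptstr. 5, Pforzheim",
    "Bernd Schulz, Ringweg 12, Pforzheim",
    "anna meier, hauptstr 5, pforzheim",
    "Bernd Schulz, Ringweg 12 Pforzheim",
]

# Unsegmented run: all addresses in one comparison.
full = duplicate_pairs(addresses)

# Segmented run: each segment is compared only within itself.
segments = [addresses[:2], addresses[2:]]
segmented = set().union(*(duplicate_pairs(s) for s in segments))

# Duplicates that fall into the blind spot between segments.
missed = full - segmented
print(len(full), len(segmented), len(missed))
```

In this sample both duplicate pairs straddle the segment boundary, so the segmented run finds none of them.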
 

Does mailBatch always have to be run combined with a postal address check?

No! In general, you can run a comparative check entirely without a postal address check. You can even compare data sets that contain no postal address information. Nevertheless, in many cases it is advisable to run a postal address check and correction together with the comparative check (see also postal address check). This uncovers duplicates that arise from municipal incorporations or the renaming of streets and that would otherwise not be reliably recognized.

 

How can the extraordinary performance of mailBatch and, at the same time, the acknowledged higher result quality in contrast to other systems be explained?

In contrast to the usual offerings on the market, Uniserv has developed two technologically completely different methods for the two tasks: the n:n mass comparative check on the one hand, and the 1:n online individual comparative check on the other. Both methods pursue the same goal (error-tolerant recognition) and deliver comparable results, but each is designed specifically for its application environment. Most other providers supply the same method for both fields of application, merely in separate "packaging". For small address volumes this matters little, but the larger the volume of addresses, the more relevant performance becomes.
 

The parametrization of mailBatch allows very great flexibility. Doesn't this inevitably lead to a certain complexity in the application?

Standard parameters for typical use cases are delivered along with the product. You can use these directly or as a model for your own parameters. Furthermore, we provide interactive tools that support you in creating comparative parameters. With them you can develop your own ideas for the comparative rules and immediately check which address constellations are found as duplicates and which are definitely not.

In addition, we offer intensive product training seminars, and of course our experts will be glad to advise you on "customizing" this product to meet your special requirements.

 

Can data sets that only contain names but not addresses be checked with mailBatch?

Of course. You define the conditions under which two data sets count as duplicates. Nevertheless, additional information besides the name should be available in order to recognize duplicates reliably, e.g. on the personal level. In the insurance field, for example, comparative checks on the basis of last name, first name and date of birth are not unusual.
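
A rule of this kind might be sketched in Python as follows. This is only an illustration under stated assumptions: the Person type, the similarity threshold of 0.75 and the use of difflib are choices made for this sketch and say nothing about mailBatch's actual comparison rules or parameter format.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher

@dataclass(frozen=True)
class Person:
    # Hypothetical record layout: names plus date of birth, no address.
    last_name: str
    first_name: str
    birth_date: date

def similar(a, b, threshold=0.75):
    # Error-tolerant string comparison; the threshold is an assumed value.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def is_duplicate(p, q):
    # Two records count as duplicates if the date of birth is identical
    # and both name parts are sufficiently similar.
    return (p.birth_date == q.birth_date
            and similar(p.last_name, q.last_name)
            and similar(p.first_name, q.first_name))

a = Person("Meyer", "Katharina", date(1970, 3, 14))
b = Person("Meier", "Katarina", date(1970, 3, 14))
c = Person("Meier", "Katarina", date(1971, 3, 14))
d = Person("Schulz", "Katarina", date(1970, 3, 14))

print(is_duplicate(a, b), is_duplicate(a, c), is_duplicate(a, d))
```

The date of birth acts as the additional criterion: without it, the spelling variants in a and c would be indistinguishable from a and b.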
 

What is meant by the clustering that is possible with mailBatch?

This method is often used in data warehouse projects or for building up corporate address databases. With clustering, the duplicates found are not eliminated. Instead, different "cluster views" of the address database are formed automatically, and the corresponding "cluster identifications" are entered into the address database.

Typical clusters for consumer addresses are the person, the household, or all addresses in one building. For business addresses, clusters are often formed from the department/contact person of a company or from all contact persons within a company.
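
The idea of cluster identifications can be sketched in Python. The field names, the exact-match keys and the definitions of person, household and building below are assumptions made for this illustration; a real run would use error-tolerant matching rather than exact keys.

```python
def assign_cluster_ids(records, key_func, field):
    """Write a cluster ID into every record; records sharing a key share an ID."""
    ids = {}
    for record in records:
        key = key_func(record)
        # The first record with a new key opens a new cluster,
        # later records with the same key reuse its ID.
        record[field] = ids.setdefault(key, len(ids) + 1)
    return records

# Invented sample data with hypothetical field names.
records = [
    {"first": "Anna", "last": "Meier", "street": "Hauptstr. 5", "zip": "75179"},
    {"first": "Anna", "last": "Meier", "street": "Hauptstr. 5", "zip": "75179"},
    {"first": "Paul", "last": "Meier", "street": "Hauptstr. 5", "zip": "75179"},
    {"first": "Bernd", "last": "Schulz", "street": "Hauptstr. 5", "zip": "75179"},
    {"first": "Clara", "last": "Vogt", "street": "Ringweg 12", "zip": "75179"},
]

# Assumed cluster definitions, from coarse to fine.
def building(r):
    return (r["zip"], r["street"])

def household(r):
    return building(r) + (r["last"],)

def person(r):
    return household(r) + (r["first"],)

assign_cluster_ids(records, person, "person_id")
assign_cluster_ids(records, household, "household_id")
assign_cluster_ids(records, building, "building_id")

person_ids = [r["person_id"] for r in records]
household_ids = [r["household_id"] for r in records]
building_ids = [r["building_id"] for r in records]
print(person_ids, household_ids, building_ids)
```

No record is removed: all five records remain, and each carries one identification per cluster view.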
 

Why is it recommended that an online managed address database be checked periodically with mailBatch, even when a tool in the online application is used to recognize duplicates?

In most online systems, a user who enters or changes an address receives a message when a duplicate is suspected and can then accept or reject it. Through operating errors, or because conclusive clarification is not possible at short notice, duplicate or multiple customer addresses are accepted even in such systems, whether unknowingly or deliberately. These cases can be tracked down with a periodic mailBatch run and clarified asynchronously.
 

Are there any special functions for effectively conducting the periodical mailBatch comparative check of an address database that is maintained online?

Yes! For this there are two essential functions:

First, the index database of mailRetrieval can be read and processed directly in mailBatch, which is an important argument for higher performance and integrated processing.
Second, it is possible to focus the duplicate recognition solely on the new or changed addresses that have been added to the database since the last mailBatch run. Duplicates that have already been checked and deliberately left in the file will therefore not be shown again with every comparison. That makes manual follow-up processing considerably easier.
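
The second function can be pictured with the following Python sketch. It is a simplified illustration, not mailBatch's implementation: the record layout, the modified field and the normalize function are assumptions made for the example.

```python
from datetime import date

def normalize(address):
    # Crude stand-in for error-tolerant matching (an assumption).
    return "".join(ch for ch in address.lower() if ch.isalnum())

def incremental_candidates(records, last_run):
    # Only records added or changed since the last run start a comparison,
    # but each of them is compared against the whole database.
    changed = [r for r in records if r["modified"] > last_run]
    found = set()
    for c in changed:
        for other in records:
            if other["id"] == c["id"]:
                continue
            if normalize(c["address"]) == normalize(other["address"]):
                found.add(tuple(sorted((c["id"], other["id"]))))
    return sorted(found)

# Invented sample data with hypothetical field names.
records = [
    {"id": 1, "address": "Anna Meier, Hauptstr. 5", "modified": date(2007, 6, 1)},
    {"id": 2, "address": "anna meier hauptstr 5", "modified": date(2007, 9, 1)},
    {"id": 3, "address": "Bernd Schulz, Ringweg 12", "modified": date(2007, 6, 1)},
    {"id": 4, "address": "Bernd Schulz Ringweg 12", "modified": date(2008, 5, 2)},
]
last_run = date(2008, 4, 1)

candidates = incremental_candidates(records, last_run)
print(candidates)
```

Records 1 and 2 also match each other, but neither has changed since the last run, so this already-known pair is not reported again.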

 




www.uniserv.com | 12.05.2008
© 2008 Uniserv GmbH