Normalize E-mail Addresses

Documents can contain e-mail addresses in many formats, and often the name of the sender or recipient is contained in the same metadata field as their e-mail address.

The EmailAddressNormalisation task searches metadata fields for the names and e-mail addresses of e-mail senders and recipients, and then writes the information back to the document in a standard format. For named e-mail addresses ("Name" <name@domain.com>), the task separates the name from the address. The task also converts all e-mail addresses to lowercase.

For example, a document might include the following field:

<To>"One, Some" <Someone@Somewhere.com>, <user.name@domain.com>, "Else, Someone" < SomeoneElse@Somewhere.com ></To>

The EmailAddressNormalisation task reads this information and adds the following fields to the document:

<to_email>someone@somewhere.com</to_email>
<to_email>user.name@domain.com</to_email>
<to_email>someoneelse@somewhere.com</to_email>
<to_name>One, Some</to_name>
<to_name/>
<to_name>Else, Someone</to_name>

As shown in the previous example, when an e-mail address does not have an associated name, an empty name field is added to the document. This is necessary because the order of the fields in the document is the only way to determine which name belongs with which e-mail address. The first e-mail address is associated with the first name, the second e-mail address with the second name, and so on.
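
Because the association is purely positional, an application that reads the normalised fields has to pair them by index. The following Python sketch illustrates this. It assumes that the field values have already been read from the document into two lists, with an empty string standing for an empty name field. The function name pair_addresses is hypothetical and is not part of CFS.

```python
from typing import List, Optional, Tuple


def pair_addresses(emails: List[str], names: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each e-mail value with the name value at the same position."""
    if len(emails) != len(names):
        # The task adds one name field (possibly empty) per e-mail field,
        # so a length mismatch means the fields cannot be paired reliably.
        raise ValueError("expected one name field per e-mail field")
    # An empty name field means the address had no associated name.
    return [(email, name or None) for email, name in zip(emails, names)]


# Values taken from the example fields shown earlier.
to_email = ["someone@somewhere.com", "user.name@domain.com", "someoneelse@somewhere.com"]
to_name = ["One, Some", "", "Else, Someone"]
pairs = pair_addresses(to_email, to_name)
```

In this sketch, the second pair has no name because the second name field is empty.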

This means that the task adds empty name fields even if the source field does not contain any names. For example, a document might include the following field:

<To>Someone@Somewhere.com, SomeoneElse@Somewhere.com</To>

The task writes the following fields to the document:

<to_email>someone@somewhere.com</to_email>
<to_email>someoneelse@somewhere.com</to_email>
<to_name/>
<to_name/>
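
The behaviour shown in these examples can be approximated with a short Python sketch. This is not the CFS implementation. It is an illustration that assumes entries are separated by commas, that names containing commas are enclosed in double quotes, and that addresses are either bare or enclosed in angle brackets. The names _ENTRY and normalise_addresses are hypothetical.

```python
import re
from typing import List, Tuple

# One entry: an optional display name followed by <address>, or a bare address.
_ENTRY = re.compile(r'''
    \s*
    (?:
        (?:"(?P<qname>[^"]*)"|(?P<name>[^<>,"]*?))   # optional display name
        \s*<\s*(?P<angle>[^<>\s]+)\s*>               # address in angle brackets
      |
        (?P<bare>[^<>,\s"]+@[^<>,\s"]+)              # bare address
    )
    \s*(?:,|$)
''', re.VERBOSE)


def normalise_addresses(field_value: str) -> Tuple[List[str], List[str]]:
    """Return parallel lists of lowercase addresses and display names."""
    emails: List[str] = []
    names: List[str] = []
    for match in _ENTRY.finditer(field_value):
        address = match.group("angle") or match.group("bare")
        name = match.group("qname") or match.group("name") or ""
        emails.append(address.lower())
        # Keep an empty string when there is no name, so positions stay aligned.
        names.append(name.strip())
    return emails, names


emails, names = normalise_addresses(
    '"One, Some" <Someone@Somewhere.com>, <user.name@domain.com>, '
    '"Else, Someone" < SomeoneElse@Somewhere.com >'
)
```

The sketch always appends to both lists together, so the two lists have the same length and the positional association between names and addresses is preserved.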

You can configure EmailAddressNormalisation as a Pre or Post task. For example:

[ImportTasks]
Post0=EmailAddressNormalisation:EmailAddressNormalisationSettings

[EmailAddressNormalisationSettings]
FieldNameRegex="To","From","Cc","Bcc"
AddresseeFieldName="to_name","from_name","cc_name","bcc_name"
EmailFieldName="to_email","from_email","cc_email","bcc_email"

The Post0 task runs e-mail address normalisation using the settings in the [EmailAddressNormalisationSettings] section. The FieldNameRegex parameter specifies a list of regular expressions that identify the fields to process. The AddresseeFieldName and EmailFieldName parameters specify the names of the fields to add to the document. CFS adds the name of the sender or recipient to the addressee field and their e-mail address to the e-mail field.
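
The following Python sketch shows one way to read the example configuration. It rests on two assumptions that the text above does not state explicitly: that each FieldNameRegex expression must match the whole field name, and that the three parameters are paired by position (the first expression with the first addressee field name and the first e-mail field name, and so on). The function output_fields_for is hypothetical and only illustrates the mapping.

```python
import re
from typing import Optional, Tuple

# Values copied from the example [EmailAddressNormalisationSettings] section.
FIELD_NAME_REGEX = ["To", "From", "Cc", "Bcc"]
ADDRESSEE_FIELD_NAME = ["to_name", "from_name", "cc_name", "bcc_name"]
EMAIL_FIELD_NAME = ["to_email", "from_email", "cc_email", "bcc_email"]


def output_fields_for(source_field: str) -> Optional[Tuple[str, str]]:
    """Return (addressee field, e-mail field) for a source field, or None.

    ASSUMPTION: patterns match the whole field name and the three
    lists are paired by position. Neither is confirmed by the text.
    """
    for pattern, name_field, email_field in zip(
        FIELD_NAME_REGEX, ADDRESSEE_FIELD_NAME, EMAIL_FIELD_NAME
    ):
        if re.fullmatch(pattern, source_field):
            return name_field, email_field
    return None  # the field is not selected for normalisation


cc_fields = output_fields_for("Cc")
```

Under these assumptions, a Cc field is written to cc_name and cc_email, and a field such as Subject is left untouched.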