Eduction Grammar Reference

The following tables describe the grammar files that are available in the IDOL PCI Package, and the entities that each provides.

In the entity names, the abbreviation CC refers to a two-letter country code. For a list of available country codes, see Country Codes.

TIP: You can use the Eduction parameter EntityN to specify which entities you want to extract. This parameter accepts wildcards, so you can extract entities of a specific type for all supported countries or languages. For example, to match names for all countries, specify a value of pci/name/??.
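
To illustrate how a wildcard value such as pci/name/?? narrows the set of entities, the following Python sketch filters a list of entity names against that pattern. It assumes that ? matches exactly one character, as in shell-style globbing. The country codes in the list are placeholders, select_entities is a hypothetical helper, and fnmatchcase only stands in for whatever matching Eduction performs internally, which this reference does not describe.

```python
from fnmatch import fnmatchcase

# Illustrative entity names; the country codes are placeholders, not a
# statement of which countries the package supports.
entities = [
    "pci/name/gb",
    "pci/name/us",
    "pci/name/landmark/gb",
    "pci/name/given_name/context/gb",
    "pci/date/nocontext/eng",
]


def select_entities(pattern, names):
    """Return the names that match a glob-style pattern (assumed semantics)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# "??" must match exactly two characters, so only the full-name entities
# with a two-letter country code are selected.
selected = select_entities("pci/name/??", entities)
print(selected)
```

Under this assumption, the pattern selects only the full-name entities and leaves out the landmark and given-name entities, whose paths are longer.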

NOTE: Many entities return components in addition to the full match. For more information and examples, see Components.

date.ecr

Entity Description
pci/date/nocontext/eng

A calendar date, written numerically or using words, without context. For example "01.03.1918" or "01/01/2020".

This entity returns dates in the normalized ISO-8601 format YYYY-MM-DD. Partial dates without a day are formatted YYYY-MM.

You can turn off normalization by setting normalize_dates=false in the pci_postprocessing.lua script. This option can improve performance when you do not need normalization.

pci/date/paymentcard/context/eng A card date, with context. For example "Expires end: 01/20".
pci/date/paymentcard/nocontext/eng A card date without context. For example "01/20".
pci/date/paymentcard/landmark/eng A card date landmark. For example "Expires end".
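
As an illustration of the normalized output described for pci/date/nocontext/eng, the following Python sketch converts numeric dates to YYYY-MM-DD and partial dates to YYYY-MM. It is not the logic of pci_postprocessing.lua. The function name normalize_date, the day-first reading of numeric dates, and the MM/YYYY form for partial dates are assumptions made for the example.

```python
import re


def normalize_date(text):
    """Return an ISO-8601 string for a numeric date, or None if not recognized.

    Assumes day-first ordering (DD.MM.YYYY or DD/MM/YYYY) for full dates and
    MM/YYYY for partial dates. Range checks (for example, month <= 12) are
    omitted to keep the sketch short.
    """
    full = re.fullmatch(r"(\d{1,2})[./-](\d{1,2})[./-](\d{4})", text)
    if full:
        day, month, year = (int(group) for group in full.groups())
        return f"{year:04d}-{month:02d}-{day:02d}"

    partial = re.fullmatch(r"(\d{1,2})[./-](\d{4})", text)
    if partial:
        month, year = (int(group) for group in partial.groups())
        return f"{year:04d}-{month:02d}"

    return None


print(normalize_date("01.03.1918"))  # full date, day-first assumption
print(normalize_date("01/01/2020"))  # full date
print(normalize_date("03/1918"))     # partial date without a day
```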

name.ecr

Entity Description
pci/name/CC

A full personal name, in title case or upper case.

This entity returns the names in a normalized format, in the form GIVEN NAME SURNAME, for example JOHN SMITH.

This entity returns components. See Components.

pci/name/landmark/CC A full name landmark. For example "name".
pci/name/given_name/context/CC A given name, with context. For example "Forename: John".
pci/name/given_name/nocontext/CC A given name, without context. For example "John".
pci/name/given_name/landmark/CC A given name landmark. For example "Forename".
pci/name/surname/context/CC A surname with context. For example "Surname: Smith".
pci/name/surname/nocontext/CC A surname without context. For example "Smith".
pci/name/surname/landmark/CC A surname landmark. For example "Surname".
pci/name/pre_title/CC A title that precedes a name. For example "Ms".
pci/name/post_title/CC A title that follows a name. For example "Esq".
pci/name/title_surname/CC A title and surname. For example "Mr. Smith".

name_cjkvt.ecr

Entity Description
pci/name/CC

A full personal name, in romanized text or CJKVT native script. Romanized names can be in title case or upper case, and can be in the order given name surname or surname given name. CJKVT native script names must be in the order surname given name. For Japanese, either form can include honorifics.

This entity returns the names in a normalized format, in the form GIVEN NAME SURNAME, for example KEIKO NAKAMURA.

You can turn off normalization by setting normalize_names=false in the name_stoplist.lua script. You can also turn off score adjustment by setting rescore_names=false in the same script. Turning off these options can improve performance when you do not need the normalization or score refinement.

This entity returns components. See Components.

pci/name/cjkvt/CC

A full personal name in CJKVT native script. For example "山田直樹".

This entity returns components. See Components.

pci/name/latin/CC

A romanized full personal name. For example "Hiroshi Tanaka-san".

This entity returns components. See Components.

pci/name/foreign/jp A full foreign personal name in Katakana. For example "エミー・ネーター" (Emmy Noether) and "フランクリン・D・ルーズベルト" (Franklin D. Roosevelt).
pci/name/landmark/CC A full name landmark. For example "名前".
pci/name/given_name/context/cjkvt/CC A given name in CJKVT native script, with context. For example "名前: 恵".
pci/name/given_name/nocontext/cjkvt/CC A given name in CJKVT native script, without context. For example "恵".
pci/name/given_name/nocontext/cjkvt_spaced/CC A given name in CJKVT native script, separated by spaces, and without context. For example "建 國". This entity is primarily to allow you to create patterns that match alternative name formats.
pci/name/given_name/context/latin/CC A romanized given name, with a context landmark in CJKVT native script. For example "名前: Keiko".
pci/name/given_name/nocontext/latin/CC A romanized given name, without context. For example "Keiko".
pci/name/given_name/context/cjkvt/foreign/jp A non-Japanese given name in Katakana, with context. For example "名称ジャン" (Name Jean).
pci/name/given_name/nocontext/cjkvt/foreign/jp A non-Japanese given name in Katakana, without context. For example "アレクサンドラ" (Alexandra).
pci/name/given_name/nocontext/cjkvt_spaced/foreign/jp A non-Japanese given name in spaced Katakana, without context. For example "ア ン ト ニ オ" (Antonio).
pci/name/given_name/context/CC A given name in romanized text or CJKVT native script, with a context landmark in CJKVT native script. For example "名前: 恵".
pci/name/given_name/nocontext/CC A given name in romanized text or CJKVT native script, without context. For example "直樹".
pci/name/given_name/landmark/CC A given name landmark in CJKVT native script. For example "名前".
pci/name/surname/context/cjkvt/CC A surname in CJKVT native script, with context. For example "名字: 山田".
pci/name/surname/nocontext/cjkvt/CC A surname in CJKVT native script, without context. For example "山田".
pci/name/surname/nocontext/cjkvt_spaced/CC A surname in CJKVT native script, separated by spaces, and without context. For example "欧 阳". This entity is primarily to allow you to create patterns that match alternative name formats.
pci/name/surname/context/latin/CC A romanized surname, with a context landmark in CJKVT native script. For example "名字: Nakamura".
pci/name/surname/nocontext/latin/CC A romanized surname, without context. For example "Nakamura".
pci/name/surname/context/cjkvt/foreign/jp A non-Japanese surname in Katakana, with context. For example "姓: シン" (Surname: Singh).
pci/name/surname/nocontext/cjkvt/foreign/jp A non-Japanese surname in Katakana, without context. For example "アッ=サカフィー" (al-Saqafi).
pci/name/surname/nocontext/cjkvt_spaced/foreign/jp A non-Japanese surname in spaced Katakana, without context. For example "ア ッ ツ ォ ー リ" (Azzoli).
pci/name/surname/context/CC A surname in romanized text or CJKVT native script, with a context landmark in CJKVT native script. For example "名字: 山田".
pci/name/surname/nocontext/CC A surname in romanized text or CJKVT native script, without context. For example "山田".
pci/name/surname/landmark/CC A surname landmark in CJKVT native script. For example "名字".
pci/name/pre_title/nocontext/CC A title that precedes a name in romanized text. For example "Ms".
pci/name/post_title/nocontext/latin/CC A title that follows a name in romanized text. For example "Junior".
pci/name/post_title/nocontext/cjkvt/CC A title that follows a name in CJKVT native script. For example "さん".
pci/name/post_title/nocontext/CC A title that follows a name in romanized text or CJKVT native script. For example "Junior" or "さん".
pci/name/title_surname/latin/CC A title and surname in romanized text. For example "Dr Tan".
pci/name/title_surname/cjkvt/CC A title and surname in CJKVT native script. For example "譚医生".
pci/name/title_surname/cjkvt/foreign/jp A non-Japanese title and surname in Katakana. For example "ドルフマイスター君" (Dorfmeister-kun).
pci/name/title_surname/CC A title and surname in romanized text, or Japanese script for Japan (jp). For example "Dr Tan" or "譚医生".

pci_numbers.ecr

Entity Description
pci/magstripe/context/magstripe

Magnetic stripe data with context. For example "Magstripe: %B5641821234567890122^SMITH/JOHN A. ^2011126000000000000000000000?c".

NOTE: To ensure that the entities in this grammar perform correctly, set your TangibleCharacters configuration to include the percent sign (%) and the semicolon (;). For more information, see Configure Tangible Characters.

pci/magstripe/nocontext/magstripe

Magnetic stripe data without context. For example "%B5641821234567890122^SMITH/JOHN A. ^2011126000000000000000000000?c".

NOTE: To ensure that the entities in this grammar perform correctly, set your TangibleCharacters configuration to include the percent sign (%) and the semicolon (;). For more information, see Configure Tangible Characters.

pci/magstripe/landmark/magstripe A magnetic stripe landmark. For example "Magstripe".
pci/pan/context/pan A Primary Account Number with context. For example "PAN: 4485221211756505".
pci/pan/nocontext/pan A Primary Account Number without context. For example "4485 2212 1175 6505".
pci/pan/landmark/pan A Primary Account Number landmark. For example "PAN".
pci/pin/context A card Personal Identification Number with context. For example "PIN: 1234".
pci/pin/nocontext A card Personal Identification Number without context. For example "1234".
pci/pin/landmark A card Personal Identification Number landmark. For example "PIN".
pci/pin_block/context An encrypted or unencrypted PIN block (in base-64, base-16, or base-2), with context. For example "PIN block: BABCDEFGHIJ=".
pci/pin_block/nocontext An encrypted or unencrypted PIN block (in base-64, base-16, or base-2), without context. For example "BABCDEFGHIJ=".
pci/pin_block/landmark A PIN block landmark. For example "PIN block".
pci/printed_security_code/context/cav2 A CAV2 security code with context. For example "CAV2: 123".
pci/printed_security_code/landmark/cav2 A CAV2 security code landmark. For example "CAV2".
pci/printed_security_code/context/cid A CID security code with context. For example "CID: 1234".
pci/printed_security_code/landmark/cid A CID security code landmark. For example "CID".
pci/printed_security_code/context/cvc2 A CVC2 security code with context. For example "CVC2: 123".
pci/printed_security_code/landmark/cvc2 A CVC2 security code landmark. For example "CVC2".
pci/printed_security_code/context/cvv2 A CVV2 security code with context. For example "CVV2: 123".
pci/printed_security_code/landmark/cvv2 A CVV2 security code landmark. For example "CVV2".
pci/printed_security_code/nocontext A CAV2, CID, CVC2, or CVV2 security code, without context. For example "123".
pci/securities/cusip_internal/context

A CUSIP number from the range reserved for internal use, with context.

A post-processing script is available to set the score for this entity.

pci/securities/cusip_internal/nocontext

A CUSIP number from the range reserved for internal use, without context.

A post-processing script is available to set the score for this entity.

pci/securities/cusip_internal/landmark A landmark for CUSIP numbers from the range reserved for internal use.
pci/securities/cusip_ppn/context

A CUSIP number that contains characters from the insurance industry that denote private placement numbers, with context. This entity includes all the values from pci/securities/cusip/.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/cusip_ppn/nocontext

A CUSIP number that contains characters from the insurance industry that denote private placement numbers, without context. This entity includes all the values from pci/securities/cusip/.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/cusip_ppn/landmark A landmark for a CUSIP number that contains characters from the insurance industry that denote private placement numbers.
pci/securities/cusip/context

A United States Committee on Uniform Securities Identification Procedures (CUSIP) number, with context.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/cusip/nocontext

A United States Committee on Uniform Securities Identification Procedures (CUSIP) number, without context.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/cusip/landmark A landmark for a United States Committee on Uniform Securities Identification Procedures (CUSIP) number.
pci/securities/figi/context

A Financial Instrument Global Identifier (FIGI) number, with context.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/figi/nocontext

A Financial Instrument Global Identifier (FIGI) number, without context.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/figi/landmark A Financial Instrument Global Identifier (FIGI) number landmark.
pci/securities/isin/context

An International Securities Identification Number (ISIN), with context.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/isin/nocontext

An International Securities Identification Number (ISIN), without context.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/isin/landmark An International Securities Identification Number (ISIN) landmark.
pci/securities/sedol/context

A Stock Exchange Daily Official List (SEDOL) number, with context.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/sedol/nocontext

A Stock Exchange Daily Official List (SEDOL) number, without context.

A post-processing script is available to validate the checksum and set the score for this entity.

pci/securities/sedol/landmark A Stock Exchange Daily Official List (SEDOL) number landmark.
pci/securities/wkn/context A Wertpapierkennnummer (WKN), with context.
pci/securities/wkn/nocontext A Wertpapierkennnummer (WKN), without context.
pci/securities/wkn/landmark A Wertpapierkennnummer (WKN) landmark.
pci/service_code/context A service code with context. For example "Service code: 123".
pci/service_code/nocontext A service code without context. For example "123".
pci/service_code/landmark A service code landmark. For example "Service code".
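
Several of the securities entities above mention a post-processing script that validates the checksum. As a rough illustration of what such a check involves, the following Python sketch implements the publicly documented check-digit rules for CUSIP, ISIN, and SEDOL numbers. It is not the script supplied with the package, and it does not set scores. All of the function names are hypothetical, and the sample identifiers in the usage lines are well-known public values rather than examples taken from this reference.

```python
import string

ALNUM = string.digits + string.ascii_uppercase


def _char_value(ch):
    """Map '0'-'9' to 0-9 and 'A'-'Z' to 10-35."""
    return ALNUM.index(ch)


def cusip_is_valid(cusip):
    """Check the ninth (check) digit of a CUSIP number."""
    if len(cusip) != 9 or cusip[8] not in string.digits:
        return False
    total = 0
    for i, ch in enumerate(cusip[:8]):
        if ch in "*@#":
            value = 36 + "*@#".index(ch)  # special characters: 36, 37, 38
        elif ch in ALNUM:
            value = _char_value(ch)
        else:
            return False
        if i % 2 == 1:  # double every second character
            value *= 2
        total += value // 10 + value % 10
    return (10 - total % 10) % 10 == int(cusip[8])


def isin_is_valid(isin):
    """Check an ISIN: expand letters to numbers, then apply the Luhn test."""
    if len(isin) != 12 or any(ch not in ALNUM for ch in isin):
        return False
    digits = "".join(str(_char_value(ch)) for ch in isin)
    total = 0
    for i, digit in enumerate(reversed(digits)):
        n = int(digit)
        if i % 2 == 1:  # double every second digit, counting from the right
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


SEDOL_WEIGHTS = (1, 3, 1, 7, 3, 9, 1)


def sedol_is_valid(sedol):
    """Check a SEDOL: the weighted sum of all seven characters is 0 mod 10."""
    # Note: this sketch does not reject vowels, which real SEDOLs do not use.
    if len(sedol) != 7 or any(ch not in ALNUM for ch in sedol):
        return False
    total = sum(_char_value(ch) * w for ch, w in zip(sedol, SEDOL_WEIGHTS))
    return total % 10 == 0


print(cusip_is_valid("037833100"))
print(isin_is_valid("US0378331005"))
print(sedol_is_valid("0263494"))
```

A check like this only confirms that the check digit is consistent with the other characters. It cannot tell you whether the identifier has actually been issued.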