Bidirectional data normalization
On various platforms and in relational database management systems (RDBMS), data can be stored in different bidirectional layouts. When data is transferred from one system to another or is used in data comparison, it must first be transformed to one common layout. Otherwise, logic that manipulates the data might produce incorrect results.
When moving data from one database or platform to another, transformations are performed using the following five attributes:
- Ordering Scheme
The ordering scheme determines the order in which the text is stored.
- Text Orientation (also known as Base Text Direction)
Text orientation specifies the direction governing most of the text. After the text is segmented into directional runs, the runs are laid out for presentation using the text orientation specified for the data.
The text orientation attribute also sets the default alignment.
- Symmetric Swapping
Certain characters have an implied directional meaning, such as less-than and greater-than signs, or various forms of parentheses. Symmetric swapping specifies whether such a character is displayed with the glyph of its symmetrical equivalent when this character appears in a right-to-left directional run.
- Text Shaping
Text shaping applies to Arabic text. Because Arabic is written cursively, the shape of an individual character is sometimes determined by the characters that precede or follow it. This attribute specifies how each Arabic letter is stored: as either an intrinsic code point or a specific code point. An intrinsic code point represents all possible shapes of the letter, leaving determination of the proper shape for later. A specific code point represents the shape that is used for presentation of the letter at that place in the text.
- Numerals (also known as Numeric Shaping)
This attribute specifies which form of digits to use when presenting regular digits (encoded as 0x30 to 0x39 in ASCII).
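Two of the attributes above, symmetric swapping and text shaping, can be illustrated with a minimal Python sketch. This is not the Integration Composer implementation. The names `MIRROR_PAIRS`, `swap_symmetric`, and `to_intrinsic` are hypothetical, the mirror table is a small hand-picked subset, and only the Unicode Arabic Presentation Forms-B block is treated as holding specific code points.

```python
import unicodedata

# Hypothetical, hand-picked subset of characters that have a symmetric equivalent.
MIRROR_PAIRS = {
    "(": ")", ")": "(",
    "<": ">", ">": "<",
    "[": "]", "]": "[",
    "{": "}", "}": "{",
}


def swap_symmetric(rtl_run: str) -> str:
    """Replace each listed character in a right-to-left run with its symmetric equivalent."""
    return "".join(MIRROR_PAIRS.get(ch, ch) for ch in rtl_run)


def to_intrinsic(text: str) -> str:
    """Map specific (presentation-form) Arabic code points to intrinsic code points.

    Assumption: only Arabic Presentation Forms-B (U+FE70..U+FEFC) is handled,
    using the compatibility decomposition applied by NFKC normalization.
    """
    out = []
    for ch in text:
        if "\uFE70" <= ch <= "\uFEFC":
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            out.append(ch)
    return "".join(out)


# The isolated, final, initial, and medial forms of the letter BEH
# all map back to the single intrinsic code point U+0628.
beh_forms = "\uFE8F\uFE90\uFE91\uFE92"
print([f"U+{ord(to_intrinsic(c)):04X}" for c in beh_forms])
print(swap_symmetric("(a<b)"))
```

Going in the other direction, from intrinsic to specific code points, requires looking at neighboring letters to choose the right shape, so it is not a simple per-character mapping.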
A combination of specific values for the preceding five attributes constitutes a bidirectional layout.
Each time data is retrieved from an external RDBMS that uses a layout different from the Integration Composer Bidi layout, it must be transformed to the Integration Composer Bidi layout. This transformation must be completed before the user can work with the data in Integration Composer.
Consequently, when you define a new data source or data schema, you must define the attributes for transforming the data in the data source to the formats needed in Integration Composer.
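As an illustration of a layout being a combination of five attribute values, the following Python sketch models a layout as a record and lists which attributes differ between a source and a target. The class `BidiLayout`, the function `differing_attributes`, and the string values are hypothetical and chosen only for this example; they do not reflect how Integration Composer stores its configuration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BidiLayout:
    """Hypothetical record holding one value for each of the five attributes."""
    ordering_scheme: str
    text_orientation: str
    symmetric_swapping: bool
    text_shaping: str
    numerals: str


def differing_attributes(source: BidiLayout, target: BidiLayout) -> list[str]:
    """Return the names of the attributes whose values differ between two layouts."""
    return [
        f.name
        for f in fields(source)
        if getattr(source, f.name) != getattr(target, f.name)
    ]


# Example values are assumptions made for this sketch.
source = BidiLayout("Visual", "LTR", True, "Nominal", "National")
target = BidiLayout("Implicit", "LTR", True, "Nominal", "Nominal")
print(differing_attributes(source, target))
```

An empty result would mean that the two layouts are identical and no transformation is needed.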
The following table describes the default values specified for the bidirectional attributes and the options available for Integration Composer.
Attribute | Default | Possible Values |
---|---|---|
Ordering Scheme | Implicit | Implicit: (also known as Logical) The text is stored in the same order as it is spoken and, usually, entered. Visual: The text is stored ready for presentation. |
Text Orientation | LTR | (This feature is also known as Base Direction.) Left-to-Right: The directional runs are laid out from left to right. This direction is appropriate for text that is mostly written with left-to-right scripts but might contain words or phrases written in right-to-left scripts. The default alignment is set to left. Right-to-Left: The directional runs are laid out from right to left. This direction is appropriate for text that is mostly written with right-to-left scripts but might contain numbers, words, or phrases written in left-to-right scripts. The default alignment is set to right. Contextual Left-to-Right: The required direction of the text is determined based on the text itself. If the first strong character belongs to a left-to-right script, the direction resolves to Left-to-Right. If the first strong character belongs to a right-to-left script, the direction resolves to Right-to-Left. If there is no strong character in the text, the direction resolves to Left-to-Right. Contextual Right-to-Left: The required direction of the text is determined based on the text itself as described for the preceding Contextual Left-to-Right value. However, in this case, if there is no strong character in the text, the direction resolves to Right-to-Left. |
Symmetric Swapping | Yes | Yes: Replace characters with their symmetric equivalent in right-to-left runs. No: Do not replace characters with their symmetric equivalent in right-to-left runs. |
Text Shaping | Nominal | Nominal: Arabic letters are encoded with intrinsic code points (in the "06xx" range for Unicode). Shaped: Arabic letters are encoded as presentation forms, which can be Initial, Middle, Final, or Isolated. Initial Shaping: Arabic letters encoded with intrinsic code points must be transformed to Initial shapes for presentation. Middle Shaping: Arabic letters encoded with intrinsic code points must be transformed to Middle shapes for presentation. Final Shaping: Arabic letters encoded with intrinsic code points must be transformed to Final shapes for presentation. Isolated Shaping: Arabic letters encoded with intrinsic code points must be transformed to Isolated shapes for presentation. |
Numerals | Nominal | (This feature is also known as Numeric Shaping.) Nominal: Display digits as Arabic-European digits. National: Display regular digits as Arabic-Indic digits (National format). Contextual: Display regular digits as Arabic-Indic digits if following Arabic letters. Otherwise, display as Arabic-European digits. |
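
The contextual values in the table can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the product's algorithm: the function names `resolve_orientation` and `shape_numerals` are hypothetical, a strong character is taken to be one whose Unicode bidirectional class is `L`, `R`, or `AL`, and "following Arabic letters" is interpreted as "the most recent strong character is an Arabic letter".

```python
import unicodedata

# ASCII digits 0-9 mapped to Arabic-Indic digits U+0660..U+0669.
_TO_ARABIC_INDIC = str.maketrans(
    "0123456789",
    "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669",
)


def resolve_orientation(text: str, fallback: str = "LTR") -> str:
    """Resolve a contextual orientation from the first strong character.

    Use fallback="LTR" for Contextual Left-to-Right and
    fallback="RTL" for Contextual Right-to-Left.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return fallback  # no strong character found


def shape_numerals(text: str, mode: str = "Nominal") -> str:
    """Choose the digit form for regular digits (0x30 to 0x39)."""
    if mode == "Nominal":
        return text
    if mode == "National":
        return text.translate(_TO_ARABIC_INDIC)
    if mode != "Contextual":
        raise ValueError(f"unknown numerals mode: {mode}")
    out = []
    after_arabic = False  # assumption: tracks the most recent strong character
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "AL":
            after_arabic = True
        elif bidi_class in ("L", "R"):
            after_arabic = False
        if after_arabic and "0" <= ch <= "9":
            out.append(ch.translate(_TO_ARABIC_INDIC))
        else:
            out.append(ch)
    return "".join(out)


print(resolve_orientation("123", fallback="RTL"))
print(shape_numerals("\u0628 12 abc 34", "Contextual"))
```

In the last call, the digits after the Arabic letter are converted to Arabic-Indic digits, while the digits after the Latin letters stay as Arabic-European digits.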