The Martian presents,
Bidirectional text embedding and override
Howdy Earthlings!
“-----” here again! *sigh* Yeah, some of you Earth people still mistake me for a Martian. For the record, AGAIN, I come from the planet “-----” (sorry, our names just don’t translate into English or any other earthly language). I have now joined the Office Global Experience team, because people on your planet need to know how to take advantage of the great global features that Office has to offer.
I know, it has only been two short Earth weeks since I last shared my wisdom with you in my post ‘What happens when the same font is used to display multi-language text?’. But do you know how many years have gone by on some other planet, in some other galaxy far, far away?! There is no time to waste. Besides, we “------”ians just have so much brain power to share. If I don’t offload my knowledge frequently enough, it could literally overflow my neural system. (A few seconds for you to grasp this…) Alright, I am being a bit dramatic here. I am just too excited to share the technical insights of Office 2010 with you. Let me fly you through how Office handles bidirectional text embedding and override. Don’t worry if you have no clue what this means right now; I’ll wirelessly beam the concept into your brain. Fasten your seatbelt. Houston, we are ready for takeoff…
Microsoft is a leader in the software industry when it comes to offering comprehensive bidirectional text support for its products. Today, with every new feature that we add to Microsoft Office applications, we continue to ensure that bidirectional support is built into the feature at design time. In addition, we continue to look for opportunities to enhance this support further to meet customers’ needs. If you are a Microsoft Word user who often creates and edits text in a right-to-left language (like Arabic or Hebrew), you’re probably aware of the explicit directional tagging of text: the application relies on the language of the keyboard to determine the direction of the text. This design gives users a simple way to control the layout of neutral characters (such as SPACE) and numbers regardless of the surrounding characters.
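To see why spaces and numbers need this kind of help, it can be useful to look at the directional class that Unicode assigns to each character. The snippet below is only an illustrative sketch using Python's standard `unicodedata` module; the sample characters and the helper name `bidi_class` are my own choices, not something taken from Office.

```python
import unicodedata

def bidi_class(ch: str) -> str:
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)

# Hypothetical sample set: two strong right-to-left letters, one strong
# left-to-right letter, a number, and two characters with no direction of their own.
samples = {
    "Latin letter a": "a",
    "Hebrew letter alef": "\u05D0",
    "Arabic letter alef": "\u0627",
    "digit 1": "1",
    "SPACE": " ",
    "exclamation mark": "!",
}

for label, ch in samples.items():
    print(f"{label:20} {bidi_class(ch)}")
```

Letters come back with a strong class (`L`, `R` or `AL`), while the space (`WS`) and the exclamation mark (`ON`) have no direction of their own and the digit (`EN`) is only weakly directional. That is why their placement depends on what surrounds them.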
In Word 2010 we added two new features that will extend this control of Bidirectional text. Are you ready to super activate your brain for these exciting new advanced features called Bidirectional Text Embedding and Bidirectional Text Override?
Embedding is an advanced feature specified in the Unicode Bidirectional Algorithm. This feature solves the problem of inserting a sentence into a paragraph that has a different directionality. For example, you might insert an English quote in the middle of a right-to-left paragraph.
For clarity, in the examples below lower case represents English text characters and UPPER CASE represents right-to-left characters.
Assume that you would like to write the following line in a left-to-right paragraph according to the keyboard input sequence:
= he said “I WILL CALL sandy TODAY”
If you try to write this sentence today without the help of embedding, the text will display as follows:
he said “LLAC LLIW I sandy YADOT”
●──────► ◄─────────● ●───► ◄────●
   1          2        3     4
What you really would like to see is the following:
he said “YADOT sandy LLAC LLIW I”
●──────► ◄───● ●───► ◄─────────●
   1       4     3        2
Using Office 2010, this can easily be done by inserting two special hidden control characters, one before and one after the text you would like to embed; in the example above, that text is the quote. These control characters are the “Right-to-Left Embedding” (RLE) character at the beginning of the quote and the “Pop Directional Formatting” (PDF) character at the end of the quote. Your keyboard input text will be:
= he said [RLE]“I WILL CALL sandy TODAY”[PDF]
*To insert these special characters, please refer to the instructions at the end of this document.
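Outside Word, the same wrapping can be done on an ordinary string. The sketch below is a minimal illustration, assuming Python; the helper name `embed_rtl` is hypothetical, and the upper-case Latin text only stands in for real right-to-left letters, so a real renderer would not actually reverse it.

```python
import unicodedata

RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

def embed_rtl(text: str) -> str:
    """Wrap text in RLE ... PDF so a bidi-aware renderer treats it as one right-to-left unit."""
    return f"{RLE}{text}{PDF}"

# Stand-in for the quote from the example; real input would use Arabic or Hebrew letters.
quote = "“I WILL CALL sandy TODAY”"
line = "he said " + embed_rtl(quote)

# The wrapped quote is exactly two code points longer than the original.
print(len(quote), len(embed_rtl(quote)))

# Both control characters are invisible format characters (general category Cf).
print([unicodedata.category(c) for c in (RLE, PDF)])
```

Because the two characters are hidden, the only way to confirm they are present is to inspect the code points, as the last two lines do.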
Override is another advanced feature that is specified in the Unicode Bidirectional Algorithm. This feature allows users to force the layout of a group of characters to a specific direction regardless of their classification. For example, you want to write a part number and you want to ensure that all the characters flow left-to-right. This number could consist of numerals and right-to-left characters. Without using the overrides, the right-to-left letters will flow right-to-left, while the numbers will flow left-to-right and both will influence the layout of the surrounding text.
For clarity, in the examples below lower case represents English text characters and UPPER CASE represents right-to-left characters.
For example, you want to write the following sentence according to the keyboard input sequence in a right-to-left paragraph:
= PRODUCT NUMBER IS ABC632XPS
Without the override feature, the display will be as follows:
SPX632CBA SI REBMUN TCUDORP
◄─●●─►◄───────────────────●
 3  2           1
What you would like to see is:
ABC632XPS SI REBMUN TCUDORP
●───────► ◄───────────────●
    2             1
Using Word 2010, this can easily be done by inserting two special hidden control characters, one before and one after the part number. In the example above, you need to precede the part number with the “Left-to-Right Override” (LRO) control character and follow it with the “Pop Directional Formatting” (PDF) character. Your input text will be:
= PRODUCT NUMBER IS [LRO]ABC632XPS[PDF]
*To insert these special characters, please refer to the instructions below.
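The displays shown in the embedding and override examples can be reproduced with a small toy model. Treat this strictly as a sketch under loud assumptions: it is not the full Unicode Bidirectional Algorithm and not how Word works internally. It follows this article's convention (UPPER CASE is right-to-left, lower case is left-to-right), treats digits like left-to-right letters, handles only RLE, LRO and PDF, and uses straight quotes to sidestep quote handling. The names `toy_type` and `toy_display` are hypothetical.

```python
RLE, LRO, PDF = "\u202B", "\u202D", "\u202C"

def toy_type(ch):
    """Article convention: UPPER CASE = right-to-left, lower case = left-to-right."""
    if ch.isupper():
        return "R"
    if ch.islower() or ch.isdigit():  # assumption: digits behave like LTR letters
        return "L"
    return "N"  # spaces, quotes and other neutral characters

def toy_display(logical, paragraph_rtl=False):
    """Return the display order of `logical` (keyboard order) in the toy model."""
    base = 1 if paragraph_rtl else 0
    stack = [(base, None)]  # (embedding level, forced direction or None)
    chars, emb, types = [], [], []
    for ch in logical:
        level, forced = stack[-1]
        if ch == RLE:    # push the next odd (right-to-left) level
            stack.append((level + 1 if level % 2 == 0 else level + 2, None))
        elif ch == LRO:  # push the next even level and force every character to L
            stack.append((level + 2 if level % 2 == 0 else level + 1, "L"))
        elif ch == PDF:  # pop back to the enclosing level
            if len(stack) > 1:
                stack.pop()
        else:
            chars.append(ch)
            emb.append(level)
            types.append(forced or toy_type(ch))

    def strong_neighbor(i, step):
        """Nearest non-neutral type on one side, staying inside the same embedding."""
        j = i + step
        while 0 <= j < len(chars) and emb[j] == emb[i]:
            if types[j] != "N":
                return types[j]
            j += step
        return None

    levels = []
    for i, t in enumerate(types):
        if t == "N":
            # A neutral follows its neighbours when they agree, else the embedding direction.
            left, right = strong_neighbor(i, -1), strong_neighbor(i, +1)
            t = left if left is not None and left == right else ("R" if emb[i] % 2 else "L")
        # A character running against the direction of its level goes one level deeper.
        levels.append(emb[i] if (t == "R") == (emb[i] % 2 == 1) else emb[i] + 1)

    visual = list(chars)
    for threshold in range(max(levels, default=0), 0, -1):
        i = 0
        while i < len(visual):
            j = i
            while j < len(visual) and levels[j] >= threshold:
                j += 1
            if j > i:  # reverse each run at or above the threshold
                visual[i:j] = visual[i:j][::-1]
                levels[i:j] = levels[i:j][::-1]
                i = j
            else:
                i += 1
    return "".join(visual)

print(toy_display('he said "I WILL CALL sandy TODAY"'))
print(toy_display('he said ' + RLE + '"I WILL CALL sandy TODAY"' + PDF))
print(toy_display("PRODUCT NUMBER IS ABC632XPS", paragraph_rtl=True))
print(toy_display("PRODUCT NUMBER IS " + LRO + "ABC632XPS" + PDF, paragraph_rtl=True))
```

The four calls correspond, in order, to the embedding example without and with RLE/PDF, and the override example without and with LRO/PDF.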
Control character name          Unicode abbreviation    Character code
LEFT-TO-RIGHT EMBEDDING         LRE                     202A
RIGHT-TO-LEFT EMBEDDING         RLE                     202B
LEFT-TO-RIGHT OVERRIDE          LRO                     202D
RIGHT-TO-LEFT OVERRIDE          RLO                     202E
POP DIRECTIONAL FORMATTING      PDF                     202C
To insert one of these control characters in Word, type its character code and then press ALT+X. The same method inserts any Unicode character from its code. You can also do the opposite: position the cursor after any character in your document and press ALT+X to display its Unicode code.
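The conversion between a character code and the character itself, in both directions, can also be sketched in a few lines of Python. This is only an illustration of the mapping, not of how Word implements ALT+X; the function names `code_to_char` and `char_to_code` are hypothetical, and the codes are copied from the table above.

```python
import unicodedata

# Codes copied from the table above.
CONTROL_CODES = {
    "LRE": "202A",
    "RLE": "202B",
    "LRO": "202D",
    "RLO": "202E",
    "PDF": "202C",
}

def code_to_char(code: str) -> str:
    """Hexadecimal character code -> the character it names."""
    return chr(int(code, 16))

def char_to_code(ch: str) -> str:
    """Character -> its code as upper-case hexadecimal, at least four digits."""
    return format(ord(ch), "04X")

for abbreviation, code in CONTROL_CODES.items():
    ch = code_to_char(code)
    # unicodedata.name gives the official character name for comparison with the table.
    print(abbreviation, char_to_code(ch), unicodedata.name(ch))
```

Printing the official names is a handy sanity check that each code in the table points at the control character you expect.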
Well, I hope you enjoyed our journey into the bidirectional text world, and that your brain neurons are firing at full blast…
Martian
I would have talked to you about these features in binary, but thanks to Ayman Aldahleh for helping me translate it into English! He is truly a multilingual Earthling! Ayman is the Development Manager of the Office Global Experience Platform (GXP) team at Microsoft. His team specifically focuses on making sure the Office applications are ‘world-ready’! Ayman is originally from Palestine, but he and his team work in Redmond, Washington, USA, The Earth, The Solar System, The Milky Way Galaxy, The Universe. He has been working at Microsoft since 1991. Prior to GXP, Ayman led and worked on the development teams that enabled several Microsoft products in multiple languages, including complex scripts. In his spare time, he enjoys parenting, photography, travel, cooking and various outdoor activities.
Assisting Ayman with this article were Ziad Khalidi, Gwyneth Marshall and Murtuza Shakir.