NeoOffice :: View topic - Arabic
Shalkar
Red Pill


Joined: Nov 03, 2004
Posts: 5

Posted: Thu Nov 04, 2004 3:17 am    Post subject: Arabic

Howdy,
Downloaded NeoOffice/J yesterday and started using it right away on my PowerBook G4 (400 MHz) running 10.2.8, with custom Arabic keyboards installed.
When writing in Arabic in the document, the letters wouldn't join. I tried Riwaj, TITUS Cyberbit, Arabic Typesetting, the PakType fonts, and a host of others, including the Devalipi fonts. Funny thing is, only the Devalipi letters would join properly.

I saved to a PDF to see, and got the same issue. :'(

HOWEVER, I "saved as" to HTML, got some notice saying certain characters weren't processed, but an HTML file was output. When I opened that in Firefox, well whaddayaknow. The letters for all of the fonts were now joined properly, except the Devalipi ones! :?

Any input as to what this issue is, and how I might reliably write in Arabic?
Thanks!
sardisson
Town Crier

Joined: Feb 01, 2004
Posts: 4588

Posted: Thu Nov 04, 2004 4:15 am    Post subject:

The problem with the letters not being joined is that (AFAIK) NeoJ uses native Mac OS X font-rendering routines, which seem not to support joining Arabic characters in Windows TTFs. Alternatively, I have seen an explanation somewhere which says Windows TTFs are missing some "resource" or marker to enable Apple's routines to properly join them. The exact cause, or which argument is correct, is beyond my knowledge. :( [Edit: there's a bit more info at this thread in Apple Discussions.]

Using Apple's Arabic fonts (Geeza Pro, DecoType Naskh, Baghdad, etc.), NeoJ has no problem properly joining the letters (at least on 10.3.x).

(OOo X11 does join the letters in the Windows TTFs because it uses FreeType for rendering instead; I imagine something similar is true for Firefox: its cross-platform/Unix background enables it to render Windows Arabic TTFs properly, although Firefox only seems to support Geeza Pro and not any of the other Mac Arabic fonts! Aside from these two apps, I've yet to find a Mac app which properly joins the letters in Windows TTFs--and these two do it because they're only "partially" native. The price of a nice, OS X-integrated native app is apparently loss of compatibility with Windows Arabic TTFs. :( )

Shalkar wrote:
HOWEVER, I "saved as" to HTML, got some notice saying certain characters weren't processed, but an HTML file was output.


This is because the default charset for HTML export is set to ISO-8859-1 (at least if the default OS language/locale is US English), and of course Arabic requires ISO-8859-6 (or the DOS or Windows codepages) or UTF-8. In most cases the export will be successful and be rendered by most browsers, but I did once have to change to UTF-8 to make a few of the Turkish letters in an otherwise English document display.

The default charset can be changed in Tools>Options>Load/Save>HTML Compatibility.
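
To make the charset point concrete, here is a minimal Python sketch (not anything NeoJ itself runs; the helper name `can_encode` is made up for illustration). It only checks whether a character exists in a given charset, which is roughly the question an exporter has to answer before writing the file:

```python
# Alif (U+0627) and a Turkish letter (g with breve, U+011F) as test characters.
ALIF = "\u0627"
TURKISH_G_BREVE = "\u011f"

def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

for charset in ("iso-8859-1", "iso-8859-6", "utf-8"):
    print(charset, "alif:", can_encode(ALIF, charset),
          "g-breve:", can_encode(TURKISH_G_BREVE, charset))
```

ISO-8859-1 has no slot for either character, ISO-8859-6 has alif but not the Turkish letter, and UTF-8 covers both, which is why UTF-8 is the safer choice for mixed-script documents.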

I'm not terribly familiar with the Devalipi fonts (I recognize the name, but don't have them--are they free/shareware?); it's possible that they have a "Mac only" encoding that NeoJ can handle since it uses Apple's routines but Firefox cannot. What happens when you load the page in Safari? (You might want to try both with the current page and after re-exporting it with the proper charset selected.)

Hope this helps a little. We have some other members here who, unlike me, are native speakers and are likely more familiar with the quirks of Arabic under Mac OS X....

Smokey


Last edited by sardisson on Thu Nov 04, 2004 4:31 pm; edited 1 time in total
pluby
The Architect

Joined: Jun 16, 2003
Posts: 11949

Posted: Thu Nov 04, 2004 9:07 am    Post subject:

Thanks Smokey for the detailed explanation. I was struggling to explain this.

FYI: I did notice, however, that I introduced a bug in "patch-3" that causes some beginning-form Arabic characters to get swapped with the space immediately preceding them. I have fixed this particular bug and the fix will be in the next Neo/J Alpha 2 patch.

Patrick
Shalkar
Red Pill

Joined: Nov 03, 2004
Posts: 5

Posted: Fri Nov 05, 2004 3:17 pm    Post subject:

Thanks for the feedback. The HTML bit makes sense. So does the section on font-rendering routines. I would use the Mac Arabic fonts but, as far as I know, they do not have certain characters required in the language I work in, Kazakh (similar to Uighur and Kirghiz).

I am also working with OO 1.1.2, using XFree86 and OroboOSX v 0.9, but not Apple's X11 since that doesn't work with 10.2.8. I haven't been successful with OO, though, in getting Arabic fonts to work correctly. I'll head over to those forums to cast for help.

The Devalipi fonts are part of the shareware apps found at www.devalipi.com

I will try using Safari and get back to you. (I trash my test files rather quickly otherwise I'd have hundreds in a short period of time).

Where else would you recommend I read more about charsets? I am just getting familiar with Unicode, and this term keeps popping up. Are there NeoOffice/J-specific sites I could try?

Thanks again.
sardisson
Town Crier

Joined: Feb 01, 2004
Posts: 4588

Posted: Sat Nov 06, 2004 8:55 am    Post subject:

Shalkar wrote:
I would use the Mac Arabic fonts but, as far as I know, they do not have certain characters required in the language I work in, Kazakh (similar to Uighur and Kirghiz).


Check Geeza Pro if you haven't already, using either the Character Palette or something like Unicode Font Info or UnicodeChecker. I know it includes some more obscure Arabic letters/hamza combos than the rest of the Arabic fonts, so it has a larger set in general, but I don't know what's needed for Kazakh. :(

Shalkar wrote:
I am also working with OO 1.1.2, using XFree86 and OroboOSX v 0.9, but not Apple's X11 since that doesn't work with 10.2.8. I haven't been successful with OO, though, in getting Arabic fonts to work correctly.


It shouldn't be OOo that's the problem--at least, it certainly works under 10.3 and Apple X11 (with a number of little quirks, mostly keyboard shortcuts and some deadkeys not working). I spent some time during the testing cycle making sure Arabic worked to at least a usable level.

Shalkar wrote:
The Devalipi fonts are part of the shareware apps found at www.devalipi.com


Ah, yes, the Arabic Genie folks. Now I remember. Well, I was unable to get those fonts to work correctly in NeoJ (or anywhere outside of their own Devalipi Pro application)--not even in other Carbon apps. Actually, text I wrote in Devalipi Pro and exported to HTML displayed properly in Safari and Firefox, but that's because DP exported the file as "MacRoman" encoding and referenced the DevalipiA font. They explicitly call the positions where the appropriate forms of the letters are in their font (if you were to add a space in the middle of a word in TextEdit and then load again in Safari, there would be medial forms on either side of the space).

The problem with the Devalipi fonts is that they are based on some pre-Unicode, non-standard encoding. Basically, all the glyphs are in the wrong positions. Pre-Unicode fonts were limited to 256 glyphs/characters, so when someone wanted a font for a non-Roman script on any OS, they (generally) "replaced" all the Roman glyphs with the ones needed for their script. Sometimes they followed an existing encoding standard--aka charset when we arrive at the internet--(there were ASMO encodings that were early Arabic standards), but more often than not they followed some logic only they knew.

[A charset is roughly synonymous with encoding and is the group of characters needed by a particular script (Roman, Arabic, Cyrillic, etc.) and the order in which they appear in a font and the decimal or hex code for the character that gets inserted into a document on a "behind-the-scenes" level.]

For example, "A" is at position 65 (decimal) in a standard Roman font (pre/post-Unicode)--for sake of argument, the 65th character in the font. When making an Arabic font, someone might put the alif in at 65, the baa' at 66, and so forth. When something written in this font was sent to another computer, the Arabic text would be Roman gibberish without the special font. For all intents and purposes, it was not an Arabic text.

ISO-8859-6, the international standard charset/encoding that predated Unicode, and on which Apple's Arabic fonts were based, assigned alif position 199 in a font (because this encoding also included the ASCII Roman characters in their normal positions--bilingual/bi-script fonts!). The problem with the ISO-8859 encodings is that they only allowed two scripts at a time, Western European Roman (missing chars for Polish, for instance) and the non-Roman script. One couldn't easily write in Arabic and Hebrew, for instance, in the same document.
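
As a rough illustration of "the same position means different characters in different charsets", here is a small Python sketch using the standard codecs. The helper `describe_byte` is a made-up name, and Latin-1 and Greek (ISO-8859-7) are chosen only as convenient comparison charsets:

```python
import unicodedata

def describe_byte(value: int, charset: str) -> str:
    """Decode one byte under the given charset and name the resulting character."""
    ch = bytes([value]).decode(charset)
    return f"{charset}: byte {value} -> U+{ord(ch):04X} {unicodedata.name(ch)}"

CHARSETS = ("iso-8859-1", "iso-8859-6", "iso-8859-7")

# Position 199 is in the script-specific upper half, so it differs per charset.
for charset in CHARSETS:
    print(describe_byte(199, charset))

# Position 65 is in the shared ASCII half, so it is "A" everywhere.
for charset in CHARSETS:
    print(describe_byte(65, charset))
```

Byte 199 comes out as alif only when the reader knows the text is ISO-8859-6; under the other two charsets the very same byte is a Latin or Greek letter, while byte 65 stays "A" in all three.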

[On the Mac one could, by switching keyboard layouts (which switched scripts and fonts and put an invisible marker in the document saying 199 in this section is alif but in this other section it means hebrew-character-x), but the document couldn't be exchanged widely.]

Enter Unicode, which put each character in "every" language and script in a separate position in the encoding table and thus the fonts. No more possible confusion. Alif is 1575 (U+0627); no other character can have spot 1575.

Under Mac OS X, Apple's letter-joining routines expect letters to be in their Unicode positions in order for the OS to do the contextual analysis and pick the right form, etc. That's why even the old ISO-8859-6-based Arabic fonts no longer work; they don't have a 1575 (which is what an Arabic keyboard layout looks for when one presses "h"). One can still "type" the Arabic letters from the old font using a Roman keyboard, but the OS doesn't think it's Arabic, doesn't do contextual analysis, and doesn't join the letters. It's like the guy who made the Arabic font by putting alif in position 65 in the old, old days--the text is essentially gibberish.
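
For anyone who wants to poke at this, a short Python sketch using the standard `unicodedata` module shows alif's Unicode position (U+0627, i.e. 1575 decimal) and the four contextual shapes of baa'. Note the assumption: Unicode's "presentation forms" block is used here only as a convenient way to list the shapes; the OS picks shapes through tables in the font, not through these code points.

```python
import unicodedata

ALIF = "\u0627"
BEH = "\u0628"

# The stored character is always the base letter, whatever shape gets drawn.
print(ord(ALIF), unicodedata.name(ALIF))

# U+FE8F..U+FE92 are the isolated, final, initial and medial shapes of beh.
for cp in range(0xFE8F, 0xFE93):
    form = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(form),
          "| decomposes to:", unicodedata.decomposition(form))

# Compatibility normalization folds every shape back to the one base letter.
assert all(unicodedata.normalize("NFKC", chr(cp)) == BEH
           for cp in range(0xFE8F, 0xFE93))
```

A document stores only the base letter; choosing among the four shapes is the "contextual analysis" step, and it can only happen if the text is recognized as Arabic in the first place.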

In DevalipiA, alif is 71! If they had designed their fonts (and input system) to follow ISO-8859-6, there would at least have been a migration path (Text Encoding Converter) which could make the texts into usable Arabic on which the OS can perform contextual analysis and display the correct letter forms....
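
The difference between the two migration paths can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented, and the DevalipiA table contains just the one position mentioned above (alif at 71); every other entry would have to be worked out by hand from the font, so unknown bytes are left as the replacement character.

```python
def iso_8859_6_to_unicode(raw: bytes) -> str:
    """A standard charset converts mechanically with an off-the-shelf codec."""
    return raw.decode("iso-8859-6")

# Hypothetical, deliberately incomplete table for a custom font encoding.
# Only the alif entry comes from this thread; the rest is unknown here.
DEVALIPI_A_TO_UNICODE = {71: "\u0627"}

def custom_font_bytes_to_unicode(raw: bytes, table: dict[int, str]) -> str:
    """Map each byte through a hand-built table; unknown bytes become U+FFFD."""
    return "".join(table.get(b, "\ufffd") for b in raw)

# Bytes 199 and 200 are alif and baa' in ISO-8859-6.
print(iso_8859_6_to_unicode(bytes([199, 200])))
# Only byte 71 is known in the custom table, so the second byte is unresolved.
print(custom_font_bytes_to_unicode(bytes([71, 72]), DEVALIPI_A_TO_UNICODE))
```

A real table for such a font would also have to fold the separate initial/medial/final glyph positions back to a single base letter, which is part of why there is no ready-made converter.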

Unfortunately, on the long road to Unicode (and multi-lingual computing before that), these non-standard fonts, and applications based around them, gained a lot of traction. Microsoft used to make people buy a special version of Windows and a special version of Word to work in Arabic. Apple was a bit better, with WorldScript on the system level that every application could link to and gain multilingual, multi-script support, and then only the Arabic Language Kit for end-users to add the Arabic-specific pieces (i.e., to "turn on" the latent Arabic support); ALK was $100 but it lasted, no new purchase needed, from System 7 until Mac OS 9 included the language kits by default with the OS for free. But these were still obstacles of varying degrees, and some companies thought they could do better for cheaper by re-inventing the wheel in a non-standard way.... (Sorry, this became a pet peeve of mine in the 90s. :( )

Shalkar wrote:
Where else would you recommend I read more about charsets? I am just getting familiar with Unicode, and this term keeps popping up. Are there NeoOffice/J-specific sites I could try?


Oops. My explanation got a little long-winded and went into a fair amount of that. Aside from reading the OOo/NeoJ helpfiles, if they have anything, there's nothing too specific to NeoJ/OOo--they support documents in Unicode and some of the legacy ISO encodings/charsets. But there are a couple of sites for more general info: Alan Wood's Unicode Resources on Unicode in general (although he's also a Mac user) and Tom Gewecke's Unleash Your Multilingual Mac; Tom's the dean of multilingual Mac knowledge these days. And of course the original, Knut Vikør's The Arabic Mac, which is not quite as up-to-date as it once was.

Going back to the Windows TTFs, there are two avenues where we need to lobby in order to get them working on Mac OS X: Apple for full OpenType support and more Arabic glyphs (Mac OS X feedback) and the makers of the fonts for AAT instructions in addition to their OpenType ones. I'm not sure how effective it will be, particularly regarding the fontmakers, but otherwise we're stuck with Apple's five fonts. :(

Hope this was helpful and not too confusing!
Smokey
Shalkar
Red Pill

Joined: Nov 03, 2004
Posts: 5

Posted: Sat Nov 06, 2004 10:52 am    Post subject:

Thank you so much. I have been searching and searching for info, reading posts like this, then realized no one had my questions. Your answer helps more than anything I've read the last months in helping me understand.
I shall start the lobbying!

One last question: can we not re-encode these TTF fonts using VOLT, or PfaEdit, or even Fontographer? (I'd mention FontLab but I can't afford that baby.) Where can I go to discuss these questions?

Again, thank you.
sardisson
Town Crier

Joined: Feb 01, 2004
Posts: 4588

Posted: Sat Nov 06, 2004 7:43 pm    Post subject:

Shalkar wrote:
Thank you so much. I have been searching and searching for info, reading posts like this, then realized no one had my questions. Your answer helps more than anything I've read the last months in helping me understand.


Glad my vast store of arcane and "useless" knowledge has been helpful! :)

Shalkar wrote:
One last question: can we not re-encode these TTF fonts using VOLT, or PfaEdit, or even Fontographer? (I'd mention FontLab but I can't afford that baby.) Where can I go to discuss these questions?


In theory, yes. In practice, it might be more difficult than it seems. At the end of the thread in Apple Discussions I mentioned in my first post, the conclusion seems to be that pfaedit (now FontForge) currently won't install on the Mac and that it's pretty difficult to add the needed AAT tables (so the newly re-encoded fonts will have their letters joined by the OS routines). :( Fontographer apparently only runs in OS 9/Classic on the Mac, which means it hasn't been updated in a while, so while it can probably re-encode the fonts to Unicode, it might not be able to do AAT tables either.... Creating/re-encoding fonts is unfortunately beyond my realm of knowledge, so who knows for sure. It seems that FontForge does have several mailing lists, so that's one place to go: http://fontforge.sourceforge.net/#Mail.

Shalkar wrote:
Again, thank you.


My pleasure. Please keep me posted if you find any solutions. :)

Smokey
Terry Teague
Guest

Posted: Sun Nov 07, 2004 2:56 pm    Post subject: Re: Arabic Fonts

sardisson wrote:
In theory, yes. In practice, it might be more difficult than it seems. At the end of the thread in Apple Discussions I mentioned in my first post, the conclusion seems to be that pfaedit (now FontForge) currently won't install on the Mac and that it's pretty difficult to add the needed AAT tables (so the newly re-encoded fonts will have their letters joined by the OS routines). :( Fontographer apparently only runs in OS 9/Classic on the Mac, which means it hasn't been updated in a while, so while it can probably re-encode the fonts to Unicode, it might not be able to do AAT tables either.... Creating/re-encoding fonts is unfortunately beyond my realm of knowledge, so who knows for sure. It seems that FontForge does have several mailing lists, so that's one place to go: http://fontforge.sourceforge.net/#Mail.
Smokey

Although I just skimmed over the discussions here, and I am NOT a fonts guru, I thought it worth mentioning free font tools that Apple supplies - maybe something in there will be of use.

http://developer.apple.com/fonts/

Regards, Terry
Reza
Guest

Posted: Thu Nov 11, 2004 9:20 am    Post subject:

I would like an Arabic translation of NeoOffice, please. OpenOffice already exists in Arabic, so please include it in NeoOffice.

Thank you.
sardisson
Town Crier

Joined: Feb 01, 2004
Posts: 4588

Posted: Thu Nov 11, 2004 7:34 pm    Post subject:

Reza wrote:
I would like an Arabic translation of NeoOffice, please. OpenOffice already exists in Arabic, so please include it in NeoOffice.


As I mentioned in the Hebrew language thread, it doesn't seem like the Arabic and Hebrew translations/localizations exist in the official OOo source that is used to build NeoOffice/J. I will PM M-Rick, who planned to try to merge the unofficial Arabic OOo Linux files with NeoJ, and see if he can report how he did it if he was successful.

What really needs to be done, though, is for the people who did the Arabic translation to "donate" it to OOo and get it checked in to the OOo CVS so it will be available to everyone....

Smokey
OPENSTEP
The One

Joined: May 25, 2003
Posts: 4752
Location: Santa Barbara, CA

Posted: Thu Nov 11, 2004 9:22 pm    Post subject:

Also though, don't think that translations need to be within OpenOffice.org itself to be incorporated.

OpenOffice.org has its own restrictions as to what can and what can't be incorporated into mainline OOo, particularly licensing restrictions and copyright assignment limitations. NeoOffice, OTOH, is full GPL. Many of the dictionaries and a couple localizations are GPL licensed only and aren't able to be incorporated into OOo, but we can incorporate them without problem into Neo/J.

So if there's even a GPL-licensed one we can use it. All of the Mac OS X Hebrew work I know of, at least, is not OOo mainline and I've never been able to find any source code or the like. I'm not sure about the 'unofficial' Arabic files mentioned, but if they're GPL we can put them in.

ed
pluby
The Architect

Joined: Jun 16, 2003
Posts: 11949

Posted: Thu Nov 11, 2004 9:54 pm    Post subject:

There are Hebrew and Arabic localizations in the OOo 1.1.2 codebase. However, when I was about to release Neo/J 1.1 Alpha 1, I found that all of the new localizations made the Neo/J download almost 200 MB!

Since I am always bending and twisting the Neo/J binaries to fit within the various disk and bandwidth limitations on my download sites, I trimmed out many of the localizations that were added in OOo 1.1.2 that were not in OOo 1.0.3.

In an ideal world, I could release downloads that have all localizations and help files in all OOo-supported languages. However, this takes an astronomical amount of disk space and bandwidth.

Instead, I have to figure out how to build "language packs" or build separate binaries for each language group (e.g. "Neo/J Asia edition", "Neo/J Europe, North American, and South American edition", etc.). The problem is that this will require an order of magnitude increase in both my time spent doing releases as well as the same increase in bandwidth usage.

If anyone has any simpler ideas, they would be very much appreciated.

Patrick
OPENSTEP
The One

Joined: May 25, 2003
Posts: 4752
Location: Santa Barbara, CA

Posted: Thu Nov 11, 2004 10:08 pm    Post subject:

Ah, OK, apologies Patrick. I was just pulling the licensing rabbit out of the hat when in fact we were dealing with an aardvark...

This was exactly the same issue I had when making the OOo X11 1.0.x "localizer"...including *everything* for each language was just too durned large. I eventually axed the help and managed to get things down to a reasonable size.

One of the great strengths of what Patrick's done with Neo/J is make it "self-localizing" in that it can translate itself without running lots of extra installers. This makes it really a world-class Mac OS X citizen (no pun intended) with one installer whereas traditional OpenOffice.org has one installer per language (!). Needless to say, mirror sites couldn't host 20+ 100 MB images for each language. OOo localization is, as a result, an utter mess.

Previously my mind was going down the line of making a single "English only" installer (it can still edit foreign languages, but has an English interface and help) and then a second "any language" installer that would cover every foreign language we can think of. While the "any language" installer would probably be an order of magnitude larger, it would be the kind of thing that people who really need localization could download and the kind of thing that could be the main installer for CD/DVD distros. I'm not sure if that line of thinking is amenable, though.

I've generally found in the last few years that folks who want a foreign language and only a foreign language are more than willing to support a local build themselves or pay for a third party to do the bundling, install directions, and support in their own language.

For better or for worse, the main languages of Neo and OOo are English and German due to the size of the user and developer communities in those languages respectively.

ed
sardisson
Town Crier

Joined: Feb 01, 2004
Posts: 4588

Posted: Fri Nov 12, 2004 12:07 am    Post subject:

Wow, I didn't realize they were in the codebase already. Mea maxima culpa. (At least partly. Every clue I found on the OOo website indicated that there were partial translations in progress, only available in "third-party" builds, and I gave up trying to find them in "ViewCVS" which apparently is OOo's way of browsing CVS via the web....)

It sounds like we need a "language summit" now, too, to help come up with better solutions to this problem. :)

A few preliminary questions and then I'll start a new thread (or Ed or Patrick could split the last few posts off from the main body of this (fonts-related) "Arabic" thread):

1. What UI languages does NeoJ currently ship with?

2. NeoJ only ships with English help, right?

3. And all the dictionaries are included in the NeoJ package but only the one that matches system language at first NeoJ launch is enabled because the OOo code sucks up too many CPU cycles or something with all the dictionaries enabled.

4. Any other localized components out there?

Smokey
Shalkar
Red Pill

Joined: Nov 03, 2004
Posts: 5

Posted: Fri Nov 12, 2004 2:06 am    Post subject: wha?

what have I started!
All times are GMT - 7 Hours