Sunday, August 14, 2005

oKular beta1 'monoKle'

API work



While working on oKular I ran into, or was asked to consider, a list of things the file-format generator (backend) might need. The first problem was search: when kpdf was branched it had no page abstraction class, and it could not search across lines either; it could only find words on the same line.



Page abstraction



So the first thing I wrote was the page abstraction class, which holds a vector of text entities and implements the search functions. A text entity is a character, or a set of characters, together with a rectangle describing its normalized position on the screen (coordinates in [0;1]). The text entity class also records whether a new line follows the entity (used when getting the text for given coordinates), along with the standard rotation and baseline data. The search algorithm is quite fast and includes a simplifying function when returning a result.
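
A minimal sketch of how such a page class could search across line breaks. This is not the actual oKular code: the names Page, TextEntity, NormalizedRect and findText are made up for illustration, and the sketch assumes that spaces inside a line are part of the entity text while a line break counts as a single space.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A rectangle in normalized page coordinates: every value lies in [0, 1].
struct NormalizedRect {
    double left, top, right, bottom;
};

// One run of characters plus where it sits on the page (hypothetical layout).
struct TextEntity {
    std::string text;
    NormalizedRect area;
    bool newLineFollows;  // true when this entity ends a line
};

// Hypothetical page class: holds the entities and implements the search.
class Page {
public:
    void add(const TextEntity &entity) { entities_.push_back(entity); }

    // Returns the rectangles of every entity touched by the first match of
    // `query`, or an empty vector when nothing matches. A line break between
    // two entities is treated as one space, so a match may span lines.
    std::vector<NormalizedRect> findText(const std::string &query) const {
        std::string flat;                // the page text as one string
        std::vector<std::size_t> owner;  // owner[i] = entity index of flat[i]
        for (std::size_t i = 0; i < entities_.size(); ++i) {
            for (char c : entities_[i].text) {
                flat.push_back(c);
                owner.push_back(i);
            }
            if (entities_[i].newLineFollows) {
                flat.push_back(' ');
                owner.push_back(i);
            }
        }
        assert(owner.size() == flat.size());

        std::vector<NormalizedRect> result;
        if (query.empty()) return result;
        const std::size_t pos = flat.find(query);
        if (pos == std::string::npos) return result;

        // Collect the boxes of all entities between the first and last hit.
        const std::size_t first = owner[pos];
        const std::size_t last = owner[pos + query.size() - 1];
        for (std::size_t i = first; i <= last; ++i)
            result.push_back(entities_[i].area);
        return result;
    }

private:
    std::vector<TextEntity> entities_;
};
```

Flattening the entities into one string keeps the search itself trivial; the owner table is what maps a match back to rectangles on the page.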



The result is a regular area: a finite set of normalized rectangles that describe the positions of the found words. The great thing about it is that it is a template, which can be used for virtually any object, or even for regular areas themselves if we need regular areas of different shapes! Don't you just love object-oriented coding?
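
To illustrate the template idea, here is a small sketch under stated assumptions. The class name RegularArea comes from the text above, but its interface here (append, size, at, simplify) and the merging rule are guesses for illustration only; the real simplifying function may work differently. The only requirement on the shape type is that it offers intersects() and united().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A rectangle in normalized page coordinates: every value lies in [0, 1].
struct NormalizedRect {
    double left, top, right, bottom;

    // True when the two rectangles overlap or touch.
    bool intersects(const NormalizedRect &o) const {
        return left <= o.right && o.left <= right &&
               top <= o.bottom && o.top <= bottom;
    }

    // Smallest rectangle that covers both.
    NormalizedRect united(const NormalizedRect &o) const {
        return {std::min(left, o.left), std::min(top, o.top),
                std::max(right, o.right), std::max(bottom, o.bottom)};
    }
};

// A finite set of shapes. Shape must provide intersects() and united().
template <typename Shape>
class RegularArea {
public:
    void append(const Shape &shape) { shapes_.push_back(shape); }
    std::size_t size() const { return shapes_.size(); }
    const Shape &at(std::size_t i) const {
        assert(i < shapes_.size());
        return shapes_[i];
    }

    // Assumed simplification rule: merge each shape into its predecessor
    // when the two intersect, so a run of adjacent boxes becomes one box.
    void simplify() {
        std::vector<Shape> merged;
        for (const Shape &shape : shapes_) {
            if (!merged.empty() && merged.back().intersects(shape))
                merged.back() = merged.back().united(shape);
            else
                merged.push_back(shape);
        }
        shapes_.swap(merged);
    }

private:
    std::vector<Shape> shapes_;
};
```

With this rule, two touching word boxes on one line collapse into a single highlight rectangle, while a box on the next line stays separate.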



Information pushing



How should the file format backend inform the program about an error? Or maybe it wants to display a message or a warning. It is as simple as emitting a signal now! The user can configure whether he wants the messages displayed using kpdf's supreme popups in the page view or in a standard KDE message/error/warning box (the latter is not included in beta1).



Settings



Supporting generator settings is easy if you only allow changing settings while the generator is loaded with a document, but what about allowing all backends to be configured at any time? The solution was simpler than I thought, but it comes at a cost. All the generators that have settings support are loaded if the user chooses to configure them; later, if the user opens a file type whose generator is already loaded, it's not loaded again, just reused. Who would have thought it would be that easy.
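
The load-once-then-reuse behaviour can be sketched roughly as below. GeneratorRegistry, its load() method and the plain Generator struct are hypothetical stand-ins, not oKular classes; where the sketch simply constructs an object, real code would load the plugin.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a file-format backend.
struct Generator {
    explicit Generator(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Keeps every generator that was loaded once and hands it out again.
class GeneratorRegistry {
public:
    // Returns the generator called `name`, loading it on first use only.
    std::shared_ptr<Generator> load(const std::string &name) {
        assert(!name.empty());
        auto it = loaded_.find(name);
        if (it != loaded_.end())
            return it->second;  // already loaded (e.g. for the settings dialog)

        // Real code would open the plugin here; the sketch just constructs it.
        auto generator = std::make_shared<Generator>(name);
        ++loadCount_;
        loaded_.emplace(name, generator);
        return generator;
    }

    int loadCount() const { return loadCount_; }

private:
    std::map<std::string, std::shared_ptr<Generator>> loaded_;
    int loadCount_ = 0;
};
```

The cost mentioned above shows up here as well: once the settings dialog has asked for every configurable generator, all of them stay in memory.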



Adding pages to the configuration dialog is done by the generator in Generator::addPage(KConfigDialog *).



Intercepting page events and adding components to the GUI



The need to intercept events in the page view, such as a right mouse click, was reported by Wilfried Huss of kviewshell. To be generous, I am allowing the generator to intercept nearly all events the page view receives: just reimplement the bool Generator::handleEvent(QEvent *) function! Return true if the page view should parse the event itself after your function finishes.
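
A rough, Qt-free sketch of that contract follows. Only the name handleEvent and the meaning of its return value are taken from the text; the Event struct stands in for QEvent, and PageView::dispatch and ContextMenuGenerator are invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-ins for QEvent and its contents.
enum class EventType { MousePress, KeyPress };
enum class Button { None, Left, Right };
struct Event {
    EventType type;
    Button button;
};

class Generator {
public:
    virtual ~Generator() = default;
    // Return true if the page view should still process the event itself.
    virtual bool handleEvent(const Event &) { return true; }
};

// A backend that wants its own context menu on right click.
class ContextMenuGenerator : public Generator {
public:
    bool handleEvent(const Event &event) override {
        if (event.type == EventType::MousePress && event.button == Button::Right) {
            ++menusShown;  // a real backend would pop up its menu here
            return false;  // consumed: the page view should skip this event
        }
        return true;       // everything else goes on to the page view
    }
    int menusShown = 0;
};

class PageView {
public:
    explicit PageView(Generator &generator) : generator_(generator) {}

    // Offers the event to the generator first. Returns true when the page
    // view went on to handle the event itself.
    bool dispatch(const Event &event) {
        if (!generator_.handleEvent(event))
            return false;
        ++handledByView;
        return true;
    }
    int handledByView = 0;

private:
    Generator &generator_;
};
```

The default implementation returns true, so a generator that does not care about events leaves the page view's behaviour untouched.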



Adding components to the GUI was another important aspect. The most important GUI items in kpdf are the QToolBox in the navigation pane and the menu. You can specify an XML menu layout to be merged with the standard menu by reimplementing the newly added getXMLFile function, while the QToolBox and KActionCollection objects are available to the generator in setupGUI(QToolBox *, KActionCollection *).



The Ghostscript generator for Postscript



The next deliverable I listed in my SoC proposal for Google was a backend using Ghostscript. I admit taming the Ghostscript library (libgs) was hell. A partially documented API, alongside some typos (like sending void** instead of void*) and the library having a C object API, made it really no fun. But with the help of the Ghostscript developers (great thanks to ghostgum of #ghostscript, the author of the API) I managed to understand how it works and wrote a perfectly good wrapping class. The road was harsh though, as ghostgum lives in Australia and I live in Poland, so it usually meant no sleep at night for me if I needed to talk to him.



Asynchronous renderer



The asynchronous renderer based on Ghostscript is the first fully asynchronous generator in kpdf. libgs was not thread aware and did not allow more than one instance in the generator, and reusing the synchronous renderer was too slow for me to accept, so I wrote a helper application that uses the wrapper class to generate a pixmap and uses X11 functions to do an ultra-fast transfer of the QPixmap. The asynchronous renderer is a masterpiece; the wrapper around it even includes asynchronous killing of helper-app instances that are no longer used. I'm proud of it.



Finally, one night I managed to overcome the problems.




Replacing kghostview



oKular is now a full replacement for kghostview, minus support for PDF files, which is easy enough to implement later. The replacement in KDE4 will close 90% of kghostview's bugs, many of which ask for things kpdf implemented earlier, like review support or a better navigation panel.



No text search support



Unfortunately there is currently no solution available for PostScript searching: pstotext handles only some PostScript files and does not work with anything other than latin1. Also note that there is no such thing as an encoding in PostScript; it uses a set of glyphs, and when more are needed the unneeded ones are removed and new ones are added, locally on a page, so the encoding might change several times inside a document. Currently there is no sane way to work this out.



Ghostscript library



To use the Ghostscript generator you need Ghostscript compiled as a shared library. To do it manually, take the sources and after configuring do:


make so

and to install
make soinstall

A list of RPMS with libgs can be found here. Debian does not currently provide any package with libgs.



Make sure you have it in your library path because using the --with-gs-library switch does not work yet (no idea how to add it to the list of directories checked by KDE_CHECK_LIB).



oKular Beta 1 "monoKle"


I am releasing oKular SVN revision as beta 1. It should work correctly with PostScript, image and PDF files. My working version also includes CHM support, which I did not put in this beta due to some page size calculation problems. The CHM support is based on the chmlib wrapping code from kchmpart (three files, actually); I made a few changes to make it usable in oKular and wrote the code that uses khtml to render the compressed HTML pages myself. Beta 2 is planned for mid October, when I have most of the kviewshell plugins ported to oKular. So get: the tarball for oKular beta1 'monoKle'. Do not report this to slashdot. If the server gets slashdotted, there will be no betas and you'll have to wait until KDE4.

Monday, July 25, 2005

Bad PR?

Well. One of the Polish translators of Firefox published an unofficial build of Firefox (codename deerpark). It is a great way to humiliate the Qt/KDE community. While being mildly unfair, making fun of the Gecko Qt port is easy. Wrong. It is just a nasty way to do bad PR for Qt/KDE. Another UPDATE: it seems it was just a matter of a sense of humor I didn't get. Apologies to marcoos (the Polish translator) for not understanding this.



He writes in one of the comments (my translation):

The qt/mozilla project works like this
1. Someone says GTK is ugly (bullshit, but hey)
2. Someone starts a qt port
3. Three months pass
4. A port appears, it even compiles, still it remains unusable.
5. A year passes.
6. Port doesn't even compile now, because it's been unmaintained for over 6 months
7. Someone removes the port, which at this point is out of sync, from the official mozilla branch
8. 10 months pass
9. GOTO 1.


UPDATE: Well, at first I thought he had a damn point: although on one hand it is unfair to make fun of this, on the other, couldn't Zack, or anyone, just finish it during the whole year? Many projects that get started in KDE are not finished. It's normal, but they mostly do not go outside the KDE repository. Those which do could at least get finished and maintained, for PR's sake.


Well now that I talked a little more to other KDE developers, I am posting better remarks about this:


  1. The Mozilla Foundation is not making it easier to maintain and develop this port; the time it took Dirk Mueller to get a SuperReview is argument enough. I bet every project would welcome a developer as skilled and experienced as Dirk without making all that trouble. Well, Mozilla is not an AOL/Netscape offspring for nothing, right?

  2. I would like to apologize to Dirk, for the earlier version of this post. The Mozilla/Qt and the Gecko/Konqueror integration are maintained and will be a subject of presentation at the KDE Akademy 2005.


Saturday, July 02, 2005

Localization, now!

Localization, what an important domain of commercial project development it is. Unfortunately it's often forgotten in the OpenSource world. This seems understandable, since OpenSource software does not aim to conquer national markets and vendors hardly ever do local marketing. It is important to distinguish localization from translation. The latter takes care of translating the user interface and documentation into the desired language, while localization is responsible for making sure that programs have the same functionality in different languages.

How important is localization? Imagine buying a Ferrari only to learn that you cannot drive it faster than 60 km/h outside Italy...



Websites providing content


This is about applications that use external websites to serve additional content. Two examples, both from amarok.

The information tab uses Wikipedia to display information about the artist of the currently playing song. Even when running with a Polish locale, amarok searches the English Wikipedia by default. Most Polish artists are not available there, so the user is told that no page was found, while it is very probable that such a page exists in the Polish Wikipedia (Pudelsi is an example band name). Absolute l10n no-go.


A much bigger failure in localization is the lyrics tab. Searching for most Polish lyrics there is simply pointless. This desperately needs localization. Even the encoding of the artist name is broken by the search engine used as the Lyrics tab backend. This is of course much harder to localize, since the process would need coding skills. Still, it remains a huge bug; I doubt a commercial vendor would release this feature on a non-English market in its current state.



Amarok is the first application that uses content-providing websites to such an extent, and therefore it is only natural that the amarok team made those mistakes. I love amarok and I am full of respect for what the amarok folks wrote; please treat this article as a hint and not as a disparagement of amarok.




Speech synthesis


The localization of speech synthesis via festival was beautifully done by the kttsd developers. One Polish voice (male) is present on the voices list. It seems there is one more Polish voice (female) for mbrola, but I cannot check whether it works with festival. Still, kttsd has two issues. First, when I checked the voices file, there was no voice listed from outside Europe and the USA. I could bet there were Arabic voices for festival. Quite a localization issue in a big part of the world. Second, there are no predefined synthesizers for the command plug-in. For Polish there is one free synthesizer for Linux: powiedz, a command-line tool. It could be supported by default too.



Translation service


Half of the KDE applications use the famous babelfish and google translation services, from konqueror's addons, through kopete's translator plug-in, to kbabel. For years no one noticed that translation services exist even for languages other than the several supported by those two engines. An example is translantica, an English<->Polish translation engine created by academic workers from the Polish Academy of Sciences. I am sure similar engines exist for Arabic. Even if there are no Spanish<->Polish or Japanese<->Polish services, one can still try doing Spanish<->English via babelfish or google and English<->Polish via translantica. It can be done for any language that has a bidirectional English translation engine. No localization whatsoever here.
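
The chaining through English described above could look roughly like this. Everything in the sketch is hypothetical: TranslationServices is an invented registry, and each engine is just a function from text to text, standing in for a call to a real web service.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// An engine is any function that turns text in one language into another.
using Translator = std::function<std::string(const std::string &)>;
using LanguagePair = std::pair<std::string, std::string>;  // (from, to)

// Hypothetical registry of per-language-pair translation engines.
class TranslationServices {
public:
    void add(const std::string &from, const std::string &to, Translator engine) {
        assert(engine);
        engines_[LanguagePair(from, to)] = std::move(engine);
    }

    // Translates `text` into `out` and returns true on success. A direct
    // engine is preferred; otherwise the text is routed through English.
    bool translate(const std::string &from, const std::string &to,
                   const std::string &text, std::string &out) const {
        auto direct = engines_.find(LanguagePair(from, to));
        if (direct != engines_.end()) {
            out = direct->second(text);
            return true;
        }
        auto toEnglish = engines_.find(LanguagePair(from, "en"));
        auto fromEnglish = engines_.find(LanguagePair("en", to));
        if (toEnglish == engines_.end() || fromEnglish == engines_.end())
            return false;  // no direct engine and no route through English
        out = fromEnglish->second(toEnglish->second(text));
        return true;
    }

private:
    std::map<LanguagePair, Translator> engines_;
};
```

With a Spanish-to-English engine and an English-to-Polish engine registered, a Spanish-to-Polish request succeeds without any dedicated engine, at the price of whatever quality is lost in the two hops.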



Search providers


I have set the Polish locale but I am using the English language. Now I want to google in Konqueror. Guess which Google version Konqueror loads when using a search provider? Yes, the English one. It is impossible to describe how annoying this situation is. But wait, there are more services like that (e.g. Wikipedia). Of course this flaw is only partial, since KDE with the Polish translation should at least choose the Query[pl] URL over the Query one. Still, what about Polish search providers? I have made a collection of them for PLD Linux, which I am working on, and I will commit them once my todo for them is complete. Still, I am completely astonished that no one else from the community has submitted such data. Two localization issues here, then.
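
For readers unfamiliar with the Query[pl] convention, here is a simplified sketch of how a localized key is preferred over the plain one. It is not KDE's actual lookup code; localizedValue is a made-up helper and the URLs in the usage note are placeholders.

```cpp
#include <cassert>
#include <map>
#include <string>

// Key/value entries as they might be read from a search provider file.
using Entries = std::map<std::string, std::string>;

// Returns the value stored under key[lang] if present, otherwise the value
// stored under the plain key, otherwise an empty string.
std::string localizedValue(const Entries &entries, const std::string &key,
                           const std::string &lang) {
    assert(!key.empty());
    auto it = entries.find(key + "[" + lang + "]");
    if (it != entries.end())
        return it->second;
    it = entries.find(key);
    return it != entries.end() ? it->second : std::string();
}
```

Given the entries Query=https://search.example/?q=%s and Query[pl]=https://search.example/pl/?q=%s, a lookup for "pl" returns the second URL and a lookup for "de" falls back to the first. The situation complained about above is the case where the lookup is driven by the interface language (English) rather than the country setting (Poland), so the Polish entry is never tried.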



Thesauruses and dictionaries


The last localization aspect, and a much underrated one in the KDE community. How many koffice thesauruses can you count? I have yet to see a non-European thesaurus for koffice. The same goes for dictionaries: compared with the quality of Microsoft's dictionaries in the latest Office suite, there is no Polish dictionary around with support for punctuation and other advanced language rules, even in commercial OpenOffice suites. But there is an open thesaurus project in Poland which provides its thesaurus in koffice format too. Still, what about non-European languages?



Conclusions


The situation is not perfect and while most Western European languages are not lacking good localization, the need for localization in the other languages is vast. There are two things to be done by two sides of KDE.

Developers


The developers need to create the localization possibility. Much of it already exists like thesauruses and kspell framework. But still there are things missing: a possibility to add translation engines and specify local equivalents of content-providing websites. One could dream about providing those possibilities without requiring the localization team to have programming skills.


Users


Well, it is up to users to form localization teams; developers cannot know about every aspect of localization. Hey, go for it!


Sunday, June 26, 2005

Summer of code

Well. I'm in. Google accepted the proposal I sent. Although I'm not sure what to do now (check taxes in Poland, read the strange PDFs Google sent), I'm really happy about it. I have already started the project on sourceforge. So what's the idea? I'll take the proposal I sent to Google and try to specify several development milestones. I would like to say a special Thank You to Albert Astals Cid, who agreed to mentor this project from the KDE side :) It is not the first project I'm doing for KDE, but it's the first one which consists of so much code :) I am really happy about working with you Albert :)



oKular - taking kpdf beyond just pdf's


The goal is to make a unified viewer that uses the kpdf shell and has all the superb features it provides, while supporting as many formats as possible (with the formats chosen rationally; there is no use in supporting video files in kpdf).

What does KDE get? A viewer for many formats with:

  1. One unified look and feel

  2. Possibility for third party vendors to provide plugins for their format using the public API

  3. Fast viewer for lots of formats that is easily embeddable in other apps like file browsers, mail readers or any other application that deals with files of different types. (thanks to recently released kpart plugin for Firefox, Firefox users finally get a decent plugin for several formats too)

  4. All the features of kpdf but for many more formats (including the annotation support)

  5. With the kword plugin ready, one will not have to wait for kword to start just to preview a document.



First milestones:

  • have a fairly well designed API for the plugin structure, Stefan Kebekus asked me to check if I could use the (much incomplete) kviewshell API as a basis.

  • port kpdf xpdf backend to the new plugin api

  • write ghostview backend

  • write chm backend

  • write kword backend (and support internally converting the formats we have kword filters for)




I will start coding all this sometime around Wednesday; before that I want to:


  • publish an article about localisations in KDE, localisations not translations, about where they suck and where they are done as properly as the non-English-speaking KDE users deserve

  • publish an article on one of the biggest opensource portals in Poland - 7thguard.net - about the KDE/Wikimedia cooperation and how it can be used to improve user experience

  • reopen my Polish blog :)


Monday, March 28, 2005

No text in kontact's icons pane

My friend from PLD asked me for a patch to allow hiding text in kontact's icons pane. It just took a little too much space. Here's the result:

[Screenshots: each icon size shown with text and without text]
Large icons: with text, without text
Normal icons: with text, without text
Small icons: with text, without text
No icons: with text, without text
RMB menu: when showing only text; when showing icons and text is hidden; when showing icons and text is shown

This patch also prevents the user from setting both Only Text and Hide Text, so we always have some indicators on the icons pane. The drawback of this patch is that it depends on which IconSize value is the highest, because I was unable to make the KPopupMenu take negative values properly. Instead of -3 it took some weird -700-something. And there was no point in making a submenu for it. If anyone knows how to do it with negative numbers or without relying on the IconSize value, please let me know.


Patch version: 1.3 and its webcvs page. Apply it in your kdepim's checkout's root directory with

patch -p1 < kdepim-iconsidepane-showtext.diff

Monday, February 07, 2005

Linux and fonts don't mix well

I did not blog for a while, but for a good reason. In October I got accepted into the 2nd best mathematics and computer science faculty in Poland, the Faculty of Fundamental Problems of Technology at the Wroclaw University of Technology. I did expect it to be hard, but not that hard. During the semester I nearly quit working on any project, including PLD. At the end of the year I got badly sick and stopped studying too; I only remember waking up at 6 am, going to the lectures, getting back home and sleeping. I have been feeling well for a week now, and at least I passed all the exams.



Yes, fonts.

I am an amateur typographer. I like designing stuff, especially websites, but I dream of designing a newspaper one day. Actually, the idea of starting a newspaper had been bugging me for a long time, and that was the main reason I went into typography. Anyway, I need to have my fonts.


Widely available tools on Linux apparently do not support the idea of having many fonts and organising them in a reasonable manner. I do not use many fonts at one time. At the moment I have 900 fonts loaded in X11 (820 TTF, 15 OTF, 65 Type1), which is enough to make using them very uncomfortable. The font dialogs in Qt and in GIMP are one big mistake. How the hell do you comfortably select a font from 900 others in those dialogs? What about comparing fonts? Grouping them? Annotating them, for heaven's sake!


What if I wanted to choose from all the 2000 TTFs, 4742 Type1 and 150 OTFs I have on the HDD? Maybe I would need only 5 from them, I'd have to restart X11 several times to be able to preselect some of them and then (having limited the search to ca 50 fonts) I'd still have a hard time comparing between them with the tools that are widely available at the present time.


Not discouraged yet? We have a bonus especially for you sir, yes you! Take a look at your OTF font: how is it displayed on your Linux box? Let us take an example: Antykwa Toruńska (Torunian Antiqua). A typeface designed by Zygfryd Gardzielewski, one of the most famous Polish typographers, and cast in 1960 in Warsaw, usually used for printing jobbing work, poetry and titles. It was digitised in 1996 by the Polish TeX Users Group and is being maintained by Janusz Nowacki, who is also the author of the OTF version. It is an extremely interesting free font, since it includes properly encoded Latin, Greek, Vietnamese and Cyrillic characters as well as both old-style ("nautical") and normal numerals, mathematical symbols and other characters a professionally designed font should have. It is available in PLD Linux as a package called fonts-OTF-AntykwaTorunska or for download as OTF version 2.01 zip. It is GPL'd, you can see screenshots here.

With TTF fonts you only get to see the name of the font on the font list, and you can choose the bold/regular/condensed/etc. variants in a separate field, right? Well, with OTF fonts you don't. I call it uncomfortable: imagine having 150 OTF fonts with 5-10 variants of each of them and browsing them with the currently available dialogs.


There are two other bugs. The first and most annoying is the slowness of rendering Asian characters on Linux. Konqueror freezes for at least a minute while it waits for freetype to render them (check sf.jp). I have never tried Lycoris, though (they use Bitstream's renderer instead of freetype); anyway, its renderer is not freely available. Another thing with fonts under Linux is fontconfig and matching; the following screenshots demonstrate everything:



[Screenshots: Verdana 11pt, Verdana 12pt]


I had high hopes for Qt4 at some point; I really thought it would address the font problem, but apparently I was in error. Anyway, good things about Qt4: it is going to be GPL'd on Win32! This saves me a lot of work. My current TODO is simple: finish an evil project I signed an NDA about, finish and release my CMS as open source and make sure the Qt application for managing its content works on Windows (a mega thank you goes to the trolls for not making me rewrite the app in GTK), and now a new item, the font browser; maybe I'll start working on it in Q4 2005.