Html2Wml Documentation
You can also read the man page of Html2Wml either as an
HTML page or as a PDF
file.
Html2Wml converts HTML pages to WML decks, suitable for being viewed on a
Wap device. The program can be launched from a shell to statically convert a
set of pages, or as a CGI to convert a particular (potentially dynamic) HTML
resource.
Although the result is not guaranteed to be valid WML, it should be in
most common cases. Good HTML pages will most probably produce valid WML
decks. To check and correct your pages, you can use W3C's software:
the HTML Validator, available online at
http://validator.w3.org
and HTML Tidy, written by Dave Raggett.
Html2Wml provides the following features:
- translation of the links
- limitation of the card size by splitting the result into several cards
- inclusion of files (similar to the SSI)
- compilation of the result (using the WML Tools, see LINKS)
- a debug mode to check the result using validation functions
[Note] Most of these options are also available when
calling Html2Wml as a CGI. In this case, boolean options are given the value
"yes" or "no", and other options simply receive the value they expect. For
example, --ascii becomes ?ascii=yes. See the file t/form.html for an example
of how to call Html2Wml as a CGI.
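The mapping described in the note above can be sketched in a few lines of Python. This is only an illustration: the helper name cgi_query is hypothetical, and it assumes that every CGI parameter uses the same name as its command-line option, which the note confirms only for --ascii.

```python
from urllib.parse import urlencode

def cgi_query(options):
    """Build a CGI query string from command-line style options.

    Booleans become "yes" or "no"; any other value is passed as text.
    Assumption: parameter names match the option names without "--".
    """
    params = {}
    for name, value in options.items():
        if isinstance(value, bool):
            params[name] = "yes" if value else "no"
        else:
            params[name] = str(value)
    return urlencode(params)

query = cgi_query({"ascii": True, "collapse": False, "max-card-size": 1500})
print("?" + query)
```

The resulting string is appended to the URL of the CGI script, wherever it is installed on your server.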
- --ascii
- When this option is on, named HTML entities are converted to US-ASCII
characters using the same 7-bit approximations as Lynx (the famous
text-mode browser). For example, &copy; is translated to "(C)", and
&szlig; is translated to "ss".
By default, this option is off, so that named entities are converted to
numeric entities: &copy; becomes &#169;, and &szlig; becomes &#223;.
- --collapse, --nocollapse
- This option tells Html2Wml to collapse redundant whitespace,
tabs, carriage returns, line feeds and empty paragraphs. The aim
is to reduce the size of the WML card as much as possible. Collapsing
empty paragraphs is necessary for two reasons. First, this avoids empty
screens (and on a device with only 4 lines of display, an empty screen is
quickly made). Second, Html2Wml creates many empty paragraphs when
converting, because of the way I programmed the syntax reconstructor.
Deleting these empty paragraphs is as necessary as cleaning the kitchen :-)
If this really bothers you, you can deactivate this behaviour with the
--nocollapse option.
- --compile
- Setting this option tells Html2Wml to use the compiler from WML Tools
to compile the WML deck. If you want to create a real Wap site, you should
seriously consider using this option in order to reduce the size of the WML
decks. Remember that Wap devices have very little memory. If this is
not enough, use the splitting options.
- --hreftmpl=TEMPLATE
- This option sets the template that will be used to reconstruct the
href-type links. See the Links Reconstruction section for more information.
- --linearize, --nolinearize
- This option is on by default. It makes Html2Wml flatten the HTML
tables (they are linearized), as Lynx does. I think this is better than
trying to use the native WML tables. First, they have extremely limited
features and possibilities compared to HTML tables. In particular, they
can't be nested. In fact this is normal, because Wap devices are not
supposed to have a big CPU running at some zillions of hertz, and the
calculations needed to render tables are the most complicated and
CPU-hungry part of HTML.
Second, as they can't be nested, and as typical HTML pages heavily use
nested tables to create their layout, it's impossible to decide which
one should be kept. So the best thing is to keep none of them.
[Note] Although you can deactivate this behaviour, and
although there is internal support for tables, the unlinearized mode has
not been heavily tested with nested tables, and it may produce unexpected
results.
- --nopre
- This option tells Html2Wml not to use the <pre> tag. This option was
added because the compiler from WML Tools 0.0.4 doesn't support this tag.
- --srctmpl=TEMPLATE
- This option sets the template that will be used to reconstruct the
src-type links. See the Links Reconstruction section for more information.
- --max-card-size=SIZE
- This option allows you to limit the size (in bytes) of the generated
cards. Default is 1,500 bytes, which should be small enough to be loaded
on most Wap devices. See the Deck Splitting section for more information.
- --card-split-threshold=SIZE
- This option sets the threshold of the split event. Default value is
50. See the Deck Splitting section for more
information.
- --next-card-label=STRING
- This option sets the label of the link that points to the next card.
Default is "[&gt;&gt;]", which will be rendered as "[>>]".
- --debug
- This option activates the debug mode. This prints the output
with line numbering and with the result of the XML check. If the WML
compiler was called, the result is also printed in hexadecimal and ASCII
forms. When called as a CGI, all of this is printed as HTML, so that you can
use any web browser for that purpose.
- --xmlcheck
- When this option is on, it sends the WML output to XML::Parser to check
its well-formedness.
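To give an idea of what the well-formedness check performed by --xmlcheck looks like, here is a minimal sketch in Python, using the expat bindings from its standard library. It is not the code used by Html2Wml, which is written in Perl; the function name is_well_formed and the two sample decks are made up for the illustration.

```python
import xml.parsers.expat

def is_well_formed(document):
    """Return (True, None) if the text parses as XML, else (False, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)  # True marks the end of the input
    except xml.parsers.expat.ExpatError as err:
        return False, str(err)
    return True, None

good = '<wml><card id="main"><p>Hello</p></card></wml>'
bad = '<wml><card id="main"><p>Hello</card></wml>'  # <p> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

Keep in mind that a well-formed document is not necessarily valid WML: this kind of check only looks at the XML syntax, not at the WML DTD.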
Deck splitting is a feature that Html2Wml provides in order to
match the low memory capabilities of most Wap devices. Many can't handle
cards larger than 2,000 bytes, therefore the cards must be sufficiently
small to be viewed by all Wap devices. To achieve this, you should compile
your WML deck, which reduces the size of the deck by 50%, but even then your
cards may be too big. This is where the deck splitting feature of Html2Wml
comes in. It allows you to limit the size of the cards, currently only
before the compilation stage.
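The general idea of limiting the card size can be sketched as follows. This is a simplified illustration in Python, not the algorithm used by Html2Wml: the function split_into_cards is hypothetical, it only packs whole paragraphs, and it ignores the card markup, the link to the next card and the --card-split-threshold setting.

```python
def split_into_cards(paragraphs, max_card_size=1500):
    """Greedily pack paragraphs into cards that stay under a byte limit.

    A paragraph larger than the limit gets a card of its own; this
    sketch never cuts inside a paragraph.
    """
    cards, current, size = [], [], 0
    for para in paragraphs:
        para_size = len(para.encode("utf-8"))
        # Start a new card when adding this paragraph would exceed the limit.
        if current and size + para_size > max_card_size:
            cards.append(current)
            current, size = [], 0
        current.append(para)
        size += para_size
    if current:
        cards.append(current)
    return cards

# Five paragraphs of 600 bytes each (3 + 593 + 4 bytes).
paragraphs = ["<p>" + "x" * 593 + "</p>"] * 5
cards = split_into_cards(paragraphs, max_card_size=1500)
print([len(card) for card in cards])  # prints [2, 2, 1]
```

With the default limit of 1,500 bytes, only two of these paragraphs fit in a card, so the five paragraphs end up in three cards.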
[Note] Why compile the WML deck?
If you intend to create real WML pages, you should really
consider always compiling them. If you're not convinced, here is an
illustration.
Take the following WML code snippet:
<a href="http://www.yahoo.com/">Yahoo!</a>
It's the basic and classical way to code a hyperlink. It takes 42 bytes
to code this, because it is presented in a human-readable form.
The WAP Forum has defined a compact binary representation of WML in its
specification (this is what is called "compiled WML"): it's binary, so you
(a human) can't read it, but your computer can. And it's much faster.
The previous example would be, once compiled (printed here as
hexadecimal):
1C 4A 8F 03 y a h o o 00 85 03 Y a h o o ! 00 01
This only takes 20 bytes, about half the size of the human-readable form.
And a Wap device can read this way faster! And of course, a smaller document
means less time to download.
There is a last argument, and not the least important: most Wap devices
only read binary WML.
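The two sizes quoted in the note above can be checked with a few lines of Python. The compiled form is rebuilt here from the dump as printed, assuming that each two-character token is a hexadecimal byte value and each single character is one literal byte; the helper token_to_byte is only there for the illustration.

```python
text_form = '<a href="http://www.yahoo.com/">Yahoo!</a>'
dump = "1C 4A 8F 03 y a h o o 00 85 03 Y a h o o ! 00 01"

def token_to_byte(token):
    # Two-character tokens are hex byte values, single characters are text.
    if len(token) == 2:
        return bytes([int(token, 16)])
    return token.encode("ascii")

compiled_form = b"".join(token_to_byte(t) for t in dump.split())

print(len(text_form.encode("ascii")))  # prints 42
print(len(compiled_form))              # prints 20
```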
Resources
- The WAP Forum
- This is the official site of the WAP Forum. You can find some technical
information there, such as the specifications of all the technologies
associated with WAP.
- WAP.com
- This site has some useful information and links. In particular, it has
a quite well done FAQ.
- The World Wide Web Consortium
- Although not directly related to the Wap stuff, you may find it useful
to read the specifications of XML (WML is an XML application), and the
specifications of the different stylesheet languages (CSS and XSL), which
include support for low-resolution devices.
- MobiliX
- This web site is dedicated to Mobile UniX systems. It leads you to a lot
of useful hands-on information about installing and running Linux and BSD on
laptops, PDAs and other mobile computer devices.
Programmers utilities
- HTML Tidy
- This is a very handy utility which corrects your HTML files
so that they conform to W3C standards.
- WML Tools
- This is a collection of utilities for WML programmers. It includes
a compiler, a decompiler, a viewer and a WBMP converter.
WML browsers and Wap emulators
- wApua
- wApua is a WML browser written in Perl/Tk. I use it for most of my
tests as it is the fastest software available.
- Tofoa
- This is a Wap emulator written in Python.
I don't use it as its installation was quite painful for me (I mean,
really painful compared to some other software which was not easy
to install), and it produces strange results, even with valid WML
files.
- Deck-It
- This Wap emulator is available for Windows and Linux/Intel only.
Too bad, because I can't use it for my tests. Another bad point is that
it can't read local WML files. Apart from that, it could be the best
software in this section if it were available on more platforms.
- WinWAP
- This is a Wap browser, available for Windows only.
- WML Browser
- Its name tells what it is. In practice, I'm waiting for a message
other than "segmentation fault"...
Copyright ©2000, 2001 Sébastien Aperghis-Tramoni