ska: unmasked interrupts
Thursday, 5. March 2009

JavaScript - Source code living in a promiscuously encoded world

Going through a lot of JavaScript debugging with other folks, I've noticed a bug pattern that may be due to unclear statements about, or documentation of, encodings in the JavaScript community at large.

Non-English software developers have a great deal of exposure to encodings and therefore to non-ASCII characters. Most programming languages share the problem JavaScript has: which encoding is the code itself in? Most file systems don't store an explicit encoding for a text file.

Going with ASCII is what most English-speaking developers do naturally, as they don't need any language-specific characters to express string literals or comments.

In the browser world, however, the problem gets a whole lot worse, especially with JavaScript and some of its unique uses.

A lot of people assume that JavaScript itself should be encoded as UTF-8. This is due, for example, to parts of the JSON documentation, which emphasize this point.

Conflicts arise, however, in the following scenarios:

  • JavaScript is included in an HTML page whose explicit encoding is set to something other than UTF-8,
  • The web server's configured default encoding differs from the encoding declared in the HTML header,
  • The web server's configured default encoding differs from what the author of the source code assumed,
  • A web page requests JavaScript from several servers and has an explicit encoding, but it differs from the assumptions the other servers made,
  • You test a static HTML page with an included JavaScript file in your local file system and assume a specific encoding, but your browser disagrees,
  • You have a UTF-8 encoded HTML page and include non-UTF-8 encoded JavaScript. In IE, included JavaScript inherits the requesting page's encoding. Which is bad. Which makes broken JavaScript (non-ASCII ISO-8859-1 bytes are generally not valid UTF-8). Which is bad.
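
The last scenario can be reproduced without IE. This is only a minimal sketch, assuming an environment that provides the standard TextDecoder and TextEncoder (modern browsers, current Node.js). The helper name isValidUtf8 and the sample bytes are my own illustration, not taken from any library:

```javascript
// The source text  "<a-umlaut>"  (with its quotes) as saved in ISO-8859-1:
// 0x22 is the double quote, 0xE4 is a-umlaut in ISO-8859-1.
const latin1Bytes = new Uint8Array([0x22, 0xe4, 0x22]);

// The same literal written with a \u escape: the file contains only
// the ASCII characters  "\u00e4"  and no byte above 0x7F.
const asciiBytes = new TextEncoder().encode('"\\u00e4"');

// Hypothetical helper: true if the bytes are well-formed UTF-8.
function isValidUtf8(bytes) {
  try {
    // fatal: true makes the decoder throw instead of substituting U+FFFD.
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (e) {
    return false;
  }
}

// A lenient decoder does not throw; it silently replaces the bad byte
// with U+FFFD, which is how broken string literals end up on screen.
const lenient = new TextDecoder('utf-8').decode(latin1Bytes);

console.log(isValidUtf8(latin1Bytes)); // false
console.log(isValidUtf8(asciiBytes));  // true
console.log(lenient.includes('\ufffd')); // true
```

The ASCII-only variant decodes identically under UTF-8 and ISO-8859-1, so it does not matter which encoding the page, the server, or the browser assumes.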

To make things worse, libraries (e.g. for Java) that serialize into JSON have their own assumptions, or just don't care.
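
If you control the serializing side, you can take the encoding question out of the JSON altogether. The following is a sketch only, in JavaScript rather than Java; stringifyAscii is a hypothetical helper name of mine, built on the standard JSON.stringify:

```javascript
// Hypothetical helper: serialize to JSON, then replace every UTF-16 code
// unit outside 7-bit ASCII with its \uXXXX escape. Characters beyond the
// BMP become two escapes (a surrogate pair), which is still valid JSON.
function stringifyAscii(value) {
  return JSON.stringify(value).replace(/[\u0080-\uffff]/g, function (ch) {
    return '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');
  });
}

const json = stringifyAscii({ name: 'M\u00fcller' });
console.log(json); // {"name":"M\u00fcller"}

// Any JSON parser restores the original string from the escapes.
console.log(JSON.parse(json).name === 'M\u00fcller'); // true
```

The output contains no byte above 0x7F, so it survives being served or embedded under any ASCII-compatible encoding.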

Living in a multi-encoding and interconnected world is not simple - never has been, but padawans in the JavaScript world need some guidance and preparation to make such bugs easier to locate.

My brute-force recommendation in this case is kind of harsh, but it always works without special care: USE ASCII, Luke.
If it has the eighth bit set: remove it. Use \u Unicode escapes where needed.
Use only ASCII in your comments.
Use an automatic tool to check your code for 7-bit cleanness.
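
Such a check does not need much. A minimal sketch, assuming you can get at the raw bytes of the file; findHighBitBytes is a hypothetical name of mine, not an existing tool. It looks at bytes, not decoded characters, so it does not have to guess the file's encoding:

```javascript
// A string literal written with \u escapes: the source stays pure ASCII,
// the runtime value still contains the umlaut and the sharp s.
const greeting = 'Gr\u00fc\u00df Gott';

// Hypothetical checker: report every byte with the eighth bit set,
// together with the 1-based line it occurs on and its byte offset.
function findHighBitBytes(bytes) {
  const hits = [];
  let line = 1;
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] === 0x0a) {
      line++; // count line feeds to track the current line
    } else if (bytes[i] > 0x7f) {
      hits.push({ line: line, offset: i, byte: bytes[i] });
    }
  }
  return hits;
}

// Sample input: "a", newline, then a quoted 0xE4 (a-umlaut in ISO-8859-1).
const dirty = Uint8Array.from([0x61, 0x0a, 0x22, 0xe4, 0x22]);
const clean = Uint8Array.from([0x61, 0x62, 0x0a]);

console.log(findHighBitBytes(dirty)); // one hit: line 2, offset 3, byte 228
console.log(findHighBitBytes(clean).length); // 0
```

Under Node.js the result of fs.readFileSync(path) without an encoding argument is a Buffer, which is a Uint8Array, so it can be passed in directly; an empty result means the file is 7-bit clean.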

You'll never again have to worry about reusing, including, or JSON-ifying data or JavaScript code. It just works. With every browser, every web server and every local file system.

