Friday, June 24, 2005

script language="python" - in a web browser

An interesting discussion came up on comp.lang.python about adding Python script support to web browsers. Some people said it can't be done because Python can't provide an adequate security sandbox. I wrote the following post to clarify how the browser, not the scripting language, provides the sandbox.
Newsgroups: comp.lang.python
Date: 24 Jun 2005 09:15:37 -0700
Local: Fri, Jun 24 2005 12:15 pm
Subject: Re: Python as client-side browser script language

Sorry to resurrect a slightly older topic, but I want to clarify some points about how the browser DOM and the script language interact.

Paul Rubin wrote:
> Huh?  The language itself has to provide the sandbox.
> Remember that scripts have to be able to see
> certain DOM elements but not others, and some of them have to be
> read-only, etc.

Greg Ewing wrote:
> If the DOM objects are implemented as built-in Python
> types, there shouldn't be any difficulty with that.
> Python objects have complete control over which attributes
> can be read or written by Python code.

In web browsers, JavaScript does not provide the sandbox. The browser provides scripts with certain objects like "document" or "window", but these are not native script objects at all. They're wrappers for browser-native objects, usually written in C/C++ and exposed via an ActiveX or XPCOM interface (IE uses ActiveX, Mozilla uses XPCOM). The interfaces to these browser-native objects enforce the security model by restricting what the scripting language is allowed to do within the browser context.

If you attempt any foul play with a browser-native object, it can simply feed an exception back to the script wrapper, and your script code fails. That's the sandbox.
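
To make that concrete, here is a rough sketch of the idea in plain Python. The names (NativeDocument, ScriptDocument, SandboxError, cookie_store) are made up for illustration and are not any real browser API. It only models the control flow: in a real browser the native object sits behind a C/C++ interface boundary, whereas here a determined script could still reach the wrapped object through introspection, so this is not a secure sandbox by itself.

```python
class SandboxError(Exception):
    """Raised when a script attempts something the host does not allow."""


class NativeDocument:
    """Stand-in for a browser-native object (hypothetical, not a real API)."""

    def __init__(self):
        self.title = "Example page"
        self.url = "http://example.com/"
        self.cookie_store = {"session": "secret"}  # should never reach scripts


class ScriptDocument:
    """Wrapper handed to scripts; forwards only what the policy permits."""

    _READABLE = frozenset({"title", "url"})
    _WRITABLE = frozenset({"title"})

    def __init__(self, native):
        # Bypass our own __setattr__ to store the wrapped object.
        # Assumption: in pure Python this is still reachable as ._native.
        object.__setattr__(self, "_native", native)

    def __getattr__(self, name):
        # Invoked only when normal lookup fails, i.e. for forwarded attributes.
        if name not in self._READABLE:
            raise SandboxError(f"access to {name!r} denied")
        return getattr(object.__getattribute__(self, "_native"), name)

    def __setattr__(self, name, value):
        if name not in self._WRITABLE:
            raise SandboxError(f"{name!r} is read-only or hidden")
        setattr(object.__getattribute__(self, "_native"), name, value)


# "Script" code only ever sees the wrapper.
doc = ScriptDocument(NativeDocument())
doc.title = "Renamed"            # allowed: title is writable
try:
    doc.cookie_store             # not exposed to scripts
except SandboxError as exc:
    print("script failed:", exc)
```

The policy lives entirely in the wrapper, not in the language running the script, which is why the same approach could in principle serve more than one scripting language.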

Following are some relevant links for those interested in further details:

DOM Intro (especially the section "DOM vs. JavaScript"):
"That is to say, [the script is] *written* in JavaScript, but it *uses* the DOM to access the web page and its elements."
http://www.mozilla.org/docs/dom/domref/dom_intro.html

"Mozilla's DOM is coded almost entirely in C++. ... When, in JavaScript, a client tries to access a DOM object or a DOM method on a DOM object, the JS engine asks XPConnect to search for the relevant C++ method to call."
http://www.mozilla.org/docs/dom/mozilla/hacking.html

"The DOM makes extensive use of XPCOM. In fact, to do anything with the DOM implementation you need XPCOM."
http://www.mozilla.org/docs/dom/mozilla/xpcomintro.html

Talking to XPCOM in Python 
"The techniques for using XPCOM in Python have been borrowed from JavaScript - thus, the model described here should be quite familiar to existing JavaScript XPCOM programmers."
http://public.activestate.com/pyxpcom/tutorial.html

So in theory, you should be able to create a Python script interpreter for the browser, at least for Mozilla. In practice, you'd either need to be an expert with the Mozilla source code, XPCOM, and Python... or you'd find yourself becoming an expert by necessity.
---

Incidentally, when people don't understand the difference between DOM objects and JavaScript objects, they end up with lots of memory leaks: http://jgwebber.blogspot.com/2005_01_01_jgwebber_archive.html