October 2005


It’s Friday, so that means it’s time to take a quick break and write something. Today’s topic: Using Javascript in Web Forms

Background
One of my current web-projects, Palmer Datacat (note: authentication required), is an online collection of data, metadata, and dictionaries of units and attributes. One feature we are implementing is the ability to add and/or edit any dictionary item directly from the web page. This requires using html-built web forms to take in user data and send it to a script that processes it and writes to a database.

Problem
Pure html web forms are static. That is, they offer no elements of interactivity. This is not a problem when the form should always contain the same fields. For instance, a guestbook form is static when it always has input fields for name, email, and comment.
However, data about an attribute (in the attribute dictionary) may differ depending on that attribute’s measurement scale. For example, attributes that are “nominal” and “ordinal” require different data input than those that are “interval” and “ratio”. The “datetime” attribute also requires its own specific input fields.

How can we build a form that only shows the needed fields based on previous input values?

Solution: Javascript!
Javascript is a ubiquitous client-side programming language enabled by default in most web browsers. With javascript programs, you can add dynamic effects and layers of interactivity to your web pages.

Before building the attribute form, I did a little research online to learn more about javascript and to see if there were any helpful tools for building javascript-integrated web forms.

quirksmode.org
Quirksmode may be the best javascript reference I’ve found. The author of this site touches many subjects in friendly depth, such as event-handling and best-practices for writing javascript code. I would not have been able to build the attribute form as quickly as I did without consulting this site first.

Form Assembly
Form Assembly has a Form Builder, which is an amazing tool for building a dynamic web form. It’s geared for html-newbies, but it is the best form-generation tool I’ve ever found (most of them are bad). The Form Builder builds compliant javascript-enabled forms that work in any browser. (Note that the Form Builder tool does not work in Safari because it requires a built-in XSLT engine available to javascript, which Safari lacks. The forms it builds, however, do work.)

The features offered by Form Assembly include:

  • Switch behavior – toggling on/off of certain fields based on previous input
  • Repeat behavior – Ability to add/remove fields if more/less data is needed
  • Pagination – Split the form on multiple pages
  • Validation – Client-side form validation

Form Assembly offers a large javascript file that contains functions for each of these behaviors. You do not need to use the Form Builder tool in order to achieve this functionality. Rather, you could develop your own web form from scratch, and attach certain functionality to given fields using class naming conventions.

The downside to Form Assembly is that the javascript file is hefty, with over 1000 lines of messy code and weighing in at 43kb. Though I was initially amazed with the Form Assembly library, I ultimately decided not to use it. I reasoned I could build my own library of javascript functions geared toward specific form elements. This not only keeps the file size low (7kb), but also optimizes the script’s performance since it is not a generic one-size-fits-all library which must accommodate anything.

Prototype
Another javascript library I’ve found is Prototype which appears to be an advanced javascript-based framework for web developers. Prototype is relatively new and is seeing heavy use by lots of “Web2.0” web sites. Using and building on Prototype, you can do lots of cool effects and tricks. Unfortunately, documentation is currently lacking.

Attribute Form
Because the palmer development site is password-protected, I’ve duplicated the attribute form on my local working site. My javascript code is also available.
The two behaviors I’ve implemented are: Switch and Repeat.

Switch Behavior
When clicking on a Measurement Scale value (nominal, ordinal, interval, ratio, and datetime), a different box of fields may show. This is accomplished by registering an “onclick” event to each of the five checkboxes. The javascript then detects an “onclick” event and calls a function toggleMeasurementScale. This function displays and hides certain html boxes (divs) by accessing their styles.

The same function applies in the NonNumericDomain box (for nominal and ordinal measurement scales). When selecting an option in the drop-down box, certain fields will display.
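The switch behavior described above can be sketched roughly as follows. This is not the actual code from the attribute form: only the function name toggleMeasurementScale comes from the description, while the box ids, the field name "measurementScale", and the mapping from scale to box are assumptions made for illustration.

```javascript
// Hypothetical mapping from measurement scale to the id of the box (div)
// that should be visible. The ids are illustrative, not the real form's ids.
const BOX_FOR_SCALE = {
  nominal: 'boxNonNumericDomain',
  ordinal: 'boxNonNumericDomain',
  interval: 'boxNumericDomain',
  ratio: 'boxNumericDomain',
  datetime: 'boxDateTimeDomain',
};

// Show the box that belongs to the chosen scale and hide the others
// by setting each box's style.display.
function toggleMeasurementScale(scale, doc) {
  const visibleId = BOX_FOR_SCALE[scale];
  const allIds = new Set(Object.values(BOX_FOR_SCALE));
  for (const id of allIds) {
    const box = doc.getElementById(id);
    if (box) {
      box.style.display = id === visibleId ? 'block' : 'none';
    }
  }
}

// Register an onclick handler on every input named "measurementScale"
// (the field name is an assumption).
function registerMeasurementScaleHandlers(doc) {
  const inputs = doc.getElementsByName('measurementScale');
  for (const input of inputs) {
    input.onclick = () => toggleMeasurementScale(input.value, doc);
  }
}
```

In a page, registerMeasurementScaleHandlers(document) would be called once after the form has loaded. Passing the document in as a parameter keeps the functions easy to try out away from a browser.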

Repeat Behavior
With “nominal” or “ordinal” selected for Measurement Scale, and “Code Definition” selected for Non Numeric Domain, a table of input fields (with 2 columns) appears. This table is used to define a codeset, being a list of pairings of code values and their definitions. Because different attributes may differ in the size of their codesets, an “Add a row” link helps implement a dynamic table that can increase or decrease in size to accommodate the size of an attribute’s code set.

To accomplish this, the “Add a row” link is not a conventional hyperlink like most links on web pages. Instead of taking you to a new page, it calls a javascript function called addRowTableCodeDefinition. This function generates html for the new row of input fields and appends it to the table element. Likewise, the “Remove” link removes that row’s chunk of html from the table element.
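Here is a minimal sketch of the repeat behavior, under some assumptions. Only the name addRowTableCodeDefinition comes from the form itself. The remove function, the row ids, and the input names are hypothetical, and the real code may build its rows differently.

```javascript
// Build the html for one codeset row: a code value, its definition,
// and a "Remove" link. Ids and field names are illustrative assumptions.
function codeDefinitionRowHtml(rowNumber) {
  return (
    '<tr id="codeRow' + rowNumber + '">' +
    '<td><input type="text" name="code[' + rowNumber + ']"></td>' +
    '<td><input type="text" name="definition[' + rowNumber + ']"></td>' +
    '<td><a href="#" onclick="return removeRowTableCodeDefinition(' +
    rowNumber + ', document);">Remove</a></td>' +
    '</tr>'
  );
}

let nextRowNumber = 1;

// Called by the "Add a row" link. Returning false from the link's onclick
// cancels the normal navigation, so the browser stays on the page.
function addRowTableCodeDefinition(tableBody) {
  tableBody.insertAdjacentHTML('beforeend', codeDefinitionRowHtml(nextRowNumber));
  nextRowNumber += 1;
  return false;
}

// Called by a row's "Remove" link: detach that row from the table.
function removeRowTableCodeDefinition(rowNumber, doc) {
  const row = doc.getElementById('codeRow' + rowNumber);
  if (row && row.parentNode) {
    row.parentNode.removeChild(row);
  }
  return false;
}
```

Each row gets its own number, so removing a row from the middle of the table never changes the field names of the rows that remain.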

Will this form work everywhere?
Unfortunately no. However, with most modern browsers, the answer is yes, as long as javascript is enabled (which it is by default). I tested this form in Win/IE, Firefox, and Safari, and it works fine in all of them. The form will not work in earlier browsers, such as Netscape 4 and IE 4… but seriously, who still uses these browsers?

Javascript is an added luxury. Not all platforms (mobile phones, pdas, etc.) may have it implemented, and thus having a site rely on javascript could limit your user-base. In our case (and most cases), this is not a worry. Our targeted audience is Palmer personnel with proper privileges to add/edit/delete items from our online dictionaries. Such work should always be done from a computer terminal, and with a modern browser.

Summary
Javascript can be a great aid to building dynamic web forms, and in addition, dynamic web pages. Moreover, as an integral part of AJAX (Asynchronous JavaScript and XML), javascript is seeing heavy use by many latest-and-greatest web projects, such as Google Maps. This new frontier of web development helps ensure that javascript is here to stay, and will be around for a long time. Thus, it is advantageous as a web developer to learn where and how to use javascript, whether for enhancing usability or site aesthetics. Building the javascript-enabled attribute form was a great exercise for me to get started, and I am looking forward to developing more javascript applications where needed.

In following up on a researcher’s request to post supplemental material on a project web site in coordination with a published journal article, we’ve discussed establishing a simplified path such as http://pal.lternet.edu/suppl that can be created as a physical location initially and shifted to a virtual pointer as our web structure matures. The idea is to provide directories tied to the related database, in this case the bibliographic database with its attendant unique identifier (or LTER contribution number, i.e. http://pal.lternet.edu/suppl/biblio279).

At the following link (http://www.elsevier.com/wps/find/journaldescription.cws_home/601265/authorinstructions), the use of the Digital Object Identifier is summarized as follows:

“The digital object identifier (DOI) may be used to cite and link to electronic documents. The DOI consists of a unique alpha-numeric character string which is assigned to a document by the publisher upon the initial electronic publication. The assigned DOI never changes. Therefore, it is an ideal medium for citing a document, particularly ‘Articles in press’ because they have not yet received their full bibliographic information. The correct format for citing a DOI is shown as follows (example taken from a document in the journal Physics Letters B): doi:10.1016/j.physletb.2003.10.071”

“When you use the DOI to create URL hyperlinks to documents on the web, they are guaranteed never to change.”
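As a small illustration of turning a DOI citation into a hyperlink, here is a sketch. It assumes the public resolver at https://doi.org/, which the quoted instructions do not name, and the function name doiToUrl is made up for this example.

```javascript
// Convert a citation such as "doi:10.1016/j.physletb.2003.10.071"
// into a resolver URL. Assumes the doi.org resolver.
function doiToUrl(citation) {
  const doi = citation.trim().replace(/^doi:\s*/i, '');
  // A DOI is a prefix starting with "10.", a slash, then a suffix.
  if (!/^10\.[\d.]+\/\S+$/.test(doi)) {
    throw new Error('Not a DOI: ' + citation);
  }
  // Escape each part separately so the slashes between parts survive
  // while characters like "#" or "?" in the suffix are made URL-safe.
  const escaped = doi.split('/').map(encodeURIComponent).join('/');
  return 'https://doi.org/' + escaped;
}
```

With the Physics Letters B example from the quote, doiToUrl returns the resolver address followed by the unchanged DOI string.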

The idea of ‘guaranteed never to change’ brings forward the question of the length of ‘forever’ in contemporary organizational life or in internet timeframes and prompts two thoughts: 1) the lternet virtual pointer has an advantage of stability in addition to the original strengths of network identity and geographic independence; 2) it might be worthwhile inquiring at the sio library about their insights or plans with respect to this type of request.

Every month, a group of web developers at Scripps, collectively referred to as WebHeads, meet to discuss web-related topics. At this morning’s meeting, Edgar Milik talked about his group’s experience using Subversion for web projects.

Subversion is a version control system like CVS. It is newer, faster, and has useful features that are harder to implement in CVS, such as restructuring a file system in the repository. Both Subversion and CVS are powerful tools for versioning software source code, especially with languages like C, Java, etc. Versioning web projects, however, is a little trickier.

There are 3 main problems to consider:

1. Web Server Required
To develop and test any web project, you obviously need a web server. This is analogous to requiring gcc or a java runtime environment for compiling and running C and Java files respectively.

2. Separation of Development, Staging, and Production Areas
It is important to define 3 distinct areas for the web project. The development area is where users make local changes. The staging area is where the development team tests the web project to ensure nothing is broken. The production area is the actual web site that is served to the public.

This definition of 3 areas differs from the traditional approach of software engineering, where each user has his/her own working area (collectively the development area) and updates/commits code back-and-forth to the repository. With developing software, there’s usually no worry for making it immediately available. However, because a web project must always be available 24/7, a well-defined process must be followed to move code from the development stage to the production site.

3. Different Databases and Config Info
In addition to the 3 working areas mentioned above, each area may use a different database, possibly with different user accounts, passwords, and potentially on different servers. This configuration information must be kept locally within the project area (development, staging, and production), and it should not be versioned. It may be a good idea to version a template config file that a user can change whenever he/she checks out a new project.
This may be analogous to changing path variables in a Makefile, for instance, when checking out a project written in C or Java.
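One way to picture the template-config idea is a versioned template whose values are overridden by an unversioned local file in each area. The sketch below is my own assumption about how that could look, not a description of Edgar’s setup; every key, value, and function name in it is hypothetical.

```javascript
// Versioned template: safe defaults and placeholders, no real credentials.
const configTemplate = {
  dbHost: 'localhost',
  dbName: 'CHANGE_ME',
  dbUser: 'CHANGE_ME',
  dbPassword: 'CHANGE_ME',
};

// Merge the unversioned, area-specific settings over the template and
// refuse to continue while any placeholder is still in place.
function buildConfig(template, localSettings) {
  const config = { ...template, ...localSettings };
  const unset = Object.keys(config).filter((key) => config[key] === 'CHANGE_ME');
  if (unset.length > 0) {
    throw new Error('Config values still unset: ' + unset.join(', '));
  }
  return config;
}
```

Failing loudly on a leftover placeholder means a freshly checked-out project cannot quietly run against the wrong database.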

Solution: Develop in a Webspace!
Edgar’s team uses a separate server for each area, creating a more secure system for web development. Each area has its own url, so all internal hyperlinks must be relative paths, never absolute! Edgar suggests using a non-routable domain for the development and staging areas (meaning that it can only be accessed from UCSD and with an authentication scheme in place). This of course means that working remotely requires work-arounds such as webproxy.ucsd.edu or UCSD’s VPN.

Here’s a rough diagram I re-created from memory of the development workflow from Edgar’s team:

Diagram

Use Virtual Hosts
Example: http://domain.ucsd.edu/users/srhaber/svn/project becomes http://domaindev-srh-01.ucsd.edu
This may help prevent problems where the source code assumes the web root is the domain. (Of course, there are other workarounds to this issue, such as defining the web root relative path in each directory of your web project).

Hide or don’t copy .svn directories
Subversion creates and uses hidden .svn directories in checked-out projects. These files exist purely for subversion and should be kept from being browsed on websites. Thus, they should not be copied/shown on the staging and production servers. Assuming the workflow in the diagram above, using rsync (or another tool/script?) instead of scp may help prevent those directories and files from being copied over to the staging area.
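Whatever tool does the copying, the rule it has to apply is simple: skip any path that passes through a .svn directory. rsync can express this with its --exclude option. The sketch below shows the same rule as a small filter, with function names invented for this example.

```javascript
// True when any segment of the relative path is a ".svn" directory.
function isSubversionMetadata(relativePath) {
  return relativePath.split('/').includes('.svn');
}

// Keep only the files that should be copied to the staging area.
function filesToCopy(relativePaths) {
  return relativePaths.filter((path) => !isSubversionMetadata(path));
}
```

Checking whole path segments, rather than searching for the text ".svn" anywhere, avoids dropping an ordinary file that merely has ".svn" somewhere in its name.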

Where to store docs and pdf’s?
Some sites contain links to documents and pdf’s that are not related to the web development process. These files should not be versioned. However, since some web pages may contain hyperlinks to these files, the files should be stored in a shared location where they can be accessed from any area. Another option is to duplicate the files and store them locally within each area. Regardless of the solution, the important part is to make sure the hyperlinks are not broken on the production site.

Trunk, Branches, and Tags
The O’Reilly Subversion book suggests a Trunk, Branches, Tags structure for organizing your repository. The trunk contains the main core of the code. The branches contain personal forks of the project. The tags store version snapshots of the project.
Edgar’s team does not follow this scheme, and perhaps with good reason. The trunk, branches, tags scheme works well for large-scale projects with lots of developers and a constant flux of deadlines and release dates. However, with only a handful of developers, this scheme is overkill. Though it is conceptually a great idea, we can dismiss it for our local projects since we only really use the “trunk” for our versioning needs.

Mounting directories on OS X created crud files?
I’m not aware of the specifics, but OS X can create extraneous files that clutter the web project. These should be detected and deleted.

Can Subversion append log messages at the top of source code files?
A good question brought up during the meeting. We are unsure of the answer. I am not sure it would be a good idea anyway, since logs can quickly grow in size and would bloat the source code files. Perhaps exporting the log to a changelog file is better.

Copying to the staging area is rare, to the production site even rarer
Edgar mentions that copying the source code to the staging area is a rare occurrence, maybe happening once a week. The message here seems to be: Never update the staging/production areas arbitrarily! Any time you move code over into those areas, it should be well thought-out beforehand. Note that this is different from committing your source code to the repository, which should happen more regularly…

Commit and Update Regularly
Edgar stresses this as an important practice. Always commit your code and comment it when you make a change. Never commit broken code. If you have errors lingering, be sure to fix them first so that the repository can always contain a working copy. Always update as much as you can to prevent your code from falling out of sync with other developers. Failing to do so may result in a plethora of conflicts and merges later on, so it’s best to keep up-to-date as much as you can.

Comment in Detail
Don’t write half-assed comments each time you commit. Take a minute or two to write a well thought-out comment. Try to make it specific about the change you’ve implemented. By committing your code regularly, your comments become more precise and plentiful, resulting in a more informative log.

Communicate
It’s always important to communicate with each other, whether via email, aim, or in person. Using collaborative tools like Subversion is not a substitute, or even a medium, for solid communication.

Last week, Shaun and I met with Robert Thombley to discuss website structure goals and best practices. Before this, Shaun and I had never had to explain our website structure to someone who was completely unfamiliar with it. This gave us a good chance to put into words the philosophies behind the structure of the various Ocean Informatics sites. Some of these ideas we have implemented more fully than others; below is a brief outline of the discussion and where I feel we stand on each point.

1) Website file system structure: We all agreed that the file system structure should mirror the site’s navigation. First, it makes the site more sensible to the designers, and easier to work with. Second, it obviates the need for a lot of absolute links or ‘..’-based relative links, which make sites less modular. Third, it allows you to cleanly automate navigation elements (such as the breadcrumbs on the Palmer site). The OI sites have stuck close to this design philosophy, with the exception of the inherited Palmer site, which is currently on the list for some file system restructuring.

2) Naming conventions: Strong naming conventions are worth the time to set and observe. If you need to sift through your site’s files - particularly useful for auto-generating content and navigation - good naming conventions will save a lot of time by making your actions scriptable. If you are auto-generating navigation elements based on the file system, it is also important to have easily readable filenames. You may know that 05aug_phmeet means ‘August 2005 Phytoplankton Meeting’, but your navigation-building script won’t know this unless it is explicitly told. The OI sites are fairly good in terms of naming conventions, but the conventions aren’t always consistent between subsections of a site. For example, files in different projects may use different date formats in their filenames. Also, some sites were set up without auto-generated navigation in mind, so user-unfriendly abbreviations are sometimes an issue.

3) Data storage: The main issue we talked about here is storing data in database software versus storing it in flat files. The pros and cons of each approach are a blog post of their own; I won’t go into detail on that here. The most important point that was raised was that, whichever approach you take, if you implement it cleanly, migrating to a new approach should not be a difficult task. Well-formatted data is more important than the storage medium itself. On the Palmer site, where we are currently migrating from flat files to a database, we’re finding that individual data files are well structured and easily parsed. However, the data as a whole lacks some coherency, because the datasets were not initially recorded with interoperability in mind.

4) Templating and inclusions: Templating is an area in which it is important to strike a balance. A rigid template can be frustrating from a design point of view, but a highly configurable template can become difficult to manage. The OI sites currently lean more towards rigid designs, with little configuration on a per-page basis. We are implementing some PHP code, however, which allows us to override the default template for those few pages where the standard design is not workable.

5) Database-driven sites: We also discussed the merits of database-driven sites. For cases where being able to quickly and easily recreate the structure and navigation is more important than flexibility of design, DB-driven sites are very useful. Robert mentioned that the site he was designing would be re-implemented for each cruise - four sites a year, all using the same structure and navigation, just different content. In a case like this, the initial, higher overhead of setting up a DB-driven site pays off in the long run. Although we have DB-driven elements on most of our sites, none of the OI sites are fully database-driven (yet).
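Points 1 and 2 above fit together: if the file system mirrors the navigation, breadcrumbs can be generated from the path, and the naming convention decides how readable they are. Here is a rough sketch. It is not the Palmer site’s breadcrumb code; it assumes the two-digit-year filename pattern from the 05aug_phmeet example (read as 20xx), and the abbreviation table and function names are hypothetical.

```javascript
const MONTHS = new Map([
  ['jan', 'January'], ['feb', 'February'], ['mar', 'March'],
  ['apr', 'April'], ['may', 'May'], ['jun', 'June'],
  ['jul', 'July'], ['aug', 'August'], ['sep', 'September'],
  ['oct', 'October'], ['nov', 'November'], ['dec', 'December'],
]);

// Abbreviations have to be listed explicitly; the script cannot guess them.
const ABBREVIATIONS = new Map([['phmeet', 'Phytoplankton Meeting']]);

// Expand a name like "05aug_phmeet" into a readable title, or return
// null when the name does not follow the convention.
function titleFromFilename(name) {
  const match = /^(\d{2})([a-z]{3})_(\w+)$/.exec(name);
  if (!match) return null;
  const [, year, month, slug] = match;
  if (!MONTHS.has(month) || !ABBREVIATIONS.has(slug)) return null;
  return MONTHS.get(month) + ' 20' + year + ' ' + ABBREVIATIONS.get(slug);
}

// Turn a directory path such as "/meetings/05aug_phmeet/" into breadcrumb
// entries, one per directory level, each with a label and a link.
function breadcrumbs(urlPath) {
  const segments = urlPath.split('/').filter((segment) => segment !== '');
  const crumbs = [{ label: 'Home', href: '/' }];
  let href = '/';
  for (const segment of segments) {
    href += segment + '/';
    crumbs.push({ label: titleFromFilename(segment) || segment, href });
  }
  return crumbs;
}
```

Names that do not follow the convention fall back to the raw directory name, which is exactly where user-unfriendly abbreviations would show up in the navigation.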

These five points constitute only a small fraction of the considerations one must make when setting up a website. As we continue to present our sites to others, more strengths, shortcomings, and goals will come to light.

Geoffrey Bowker appeared in a “top story” on CNN.com this morning in an article about wireless technology: Wireless technology changing work and play

cnn.com front page

The article discusses advances in wireless technologies and how they impact social etiquette and security.

cnn.com article

Congratulations Geoff for having your name in a worldwide news story!

Last month, we upgraded iOcean to Mac OS 10.4.2. Since then, a series of ‘ripples’ from the update have occurred that show how much impact an environment change has beyond the initial three hours of downtime.

1) After the upgrade, we needed to upgrade to a newer version of the server administration tools from Apple (the 10.3 tools can’t be used to administer a 10.4 server).

2) In order to run the 10.4 administration tools, the OS on the computers from which iOcean is administered had to be upgraded to 10.4 as well.

3) One of the workstations from which iOcean is administered (my workstation, in fact) did not have a DVD drive, and thus OS 10.4 had to be installed on it over the network.

4) We did not have a NetBoot/NetInstall server, so the service needed to be set up on one of our servers, and the NetInstall images created, a process that took several hours of reading, installation, and troubleshooting.

5) The workstation I was using turned out to be incompatible with NetBoot 2.0. Luckily, we had another, newer computer that was not in use, and it became (after some hardware swapping) my new workstation.

In the end, the iOcean upgrade - which was fairly minor, as it didn’t involve a major version change to Apache, PHP, MySQL, or any other services on iOcean - resulted in a great deal of additional work. In this case, the additional work - the 10.4 upgrades, the NetInstall setup, and even the hardware swapping - are all things that were slated to happen soon anyway, so the effort was not really wasted.

However, it is also the case that this additional work could have been planned out ahead of time. For example, although I knew I would have to upgrade my workstation to 10.4, I didn’t think to check that we had compatible media until I sat down with the disk in my hand. Likewise with the NetBoot 2.0 incompatibility - it was discovered during the NetBoot setup process, not beforehand. Knowing that this incompatibility existed would have saved me a few hours in the long run. Having to deal with these issues as the after-effects of the iOcean upgrade, rather than projects in their own right, made them more pressing and meant there was less time for preventative research.

Of course, in hindsight it’s easy to see the problems one should have known about. In practice, there will often be issues that are overlooked, and as such there will often be ripples to deal with.

The article What is Ruby on Rails from onlamp.com helps explain all the fuss.

An excerpt:

You can usually divide web application frameworks and the developers who use them into two distinct categories. At one end of the spectrum, you have the heavy-duty frameworks for the “serious” developers, and at the other end you have the lightweight, easy-to-use frameworks for the “toy” developers. Each of these groups generally regards the other with disdain.

One of the most interesting things is that Rails is attracting developers from both camps. The high-end developers are tired of the repetitive, low-productivity routine that they have been forced to endure, while the low-end developers are tired of battling a mess of unmanageable code when their web apps move beyond the simple. Both of these disparate groups find that Rails provides sustainable relief for their pain. I don’t know about you, but I find this quite remarkable!

I’ve been bookmarking and tagging various sites and articles I encounter on del.icio.us. You can view my delicious page, and even pull the rss feed.

Delicious is a social bookmarking manager. It’s basically a place to bookmark webpages online. This has the advantage of comfortably marking and referring to webpages from anywhere. No need to rely on one local machine, or to sync multiple local machines so that all resources are kept the same.

Delicious takes things one step further. With each bookmark, you can apply a number of tags (keywords). Tagging is a recent phenomenon in providing meta-information to objects. Just like how Gmail works with labels, tagging frees you from the static placement of an object in a given category. For example, this post is tagged with Hints and Tools (Note: WordPress calls them categories; however, tags would be more proper).

Delicious takes things even further still. It not only shows which sites are most popular (based on how many people have bookmarked them), but it also shows sites grouped by tags. For instance, if you want to find sites that contain cooking recipes, you can view all sites that have been tagged with “recipe”. This alternate approach to Google is somewhat more personal, as the human element in the filtering process is clearly visible.

There are rss feeds for every tag. So I can keep working here while my news aggregator quietly keeps collecting new sites or articles from delicious tagged with “php” or “ajax”.

There are other sites that make use of the del.icio.us API. One of them is populicious. Populicious aggregates sites bookmarked within the last day (or two), and organizes them by popularity, allowing you to easily see the latest trends. Many of the sites/articles I’ve bookmarked I’ve found using populicious and trendalicious.

This stuff is all well worth checking out. I keep finding new interesting articles every day, and a lot of them pertain to what we are doing here: ontologies, interfaces, css, php, etc.

Some of my recent items include:

If any of you reading this are intrigued by any of these articles, please browse through my delicious collection, because I have tons of them! Perhaps you may want to start your own account. It’s free!

In response to this

“CMS in a Nutshell” describes one community’s process in selecting, installing, configuring, and using an open-source CMS. They selected PostNuke, explaining in detail the steps they took from selecting a CMS (from a pool of many open-source projects) to migrating their “finished” project to the production server.

Last year, we published an article in LTER Databits about our initial experience with PostNuke, highlighting a few basic features, yet providing no real depth into the inner workings of PostNuke or the thought process behind setting up a CMS. We barely touched the surface of how PostNuke worked for us. The definition of what tools we needed as a community was blurry. Our article was more conceptual in speculating how emerging technologies like blogs and rss feeds could benefit us internally, and also the greater LTER community.

Last month, we published a second article, this time explaining our shortcomings with PostNuke and other “failed” experiments with open-source projects. It’s easy to place blame on the projects themselves (a lot of open-source software is clunky) for not meeting our standards, but I think that part of the blame falls on us. We simply didn’t know what we needed, and hence we didn’t know what to look for. Furthermore, we never set a procedure like the one described in the above article for installing, configuring, and assigning user-specific roles.

How is it that one community has much success with PostNuke, while another community struggles with it?

The issue here is that the first community started with a clearly defined set of goals and deadlines. They knew what they wanted to accomplish and they followed a straightforward process in reaching their goals. The second community started with a fuzzy definition of what they needed, and hence were unable to get anything off the ground.

Our failure with PostNuke resulted from a lack of recognizing our needs as a community. This caused us to dive blindly into PostNuke with no foresight of how it may or may not work for us. We were turned on by PostNuke’s vast offerings of features, hoping we’d find something that “worked” and would stick. Instead, we were overwhelmed by the many features, and ultimately gave up on the CMS altogether.

This wordpress blog has been a success so far. Even though we are a small community, we had one clearly defined goal when initializing the blog: to post and share information within the community. Because it does only this one thing, and does it well, we continue to use it with comfort. It may not be the most optimal solution for us (e.g. it lacks a file manager), but it’s strong enough to keep us engaged as a collaborative community as we continue to discover other “solutions” out there.