Environment


Giving WordPress Its Own Directory - This WordPress Codex page contains the instructions for storing the core WordPress files in a subdirectory. This removes the clutter of wp-* files from our document root.

There may be a few buggy links at this time while we complete the transition to WordPress 2.0.4 and install the k2 theme.

I’ve recently made a couple of changes to subversion and the apache2 instance through which we access our repositories.

First, I’ve upgraded subversion to the latest stable release, 1.3.2. This update contains both new features and bug fixes; you can read about them fully in the release notes. The repository format has not changed since 1.1.x (we had 1.1.3 before), so our repositories were able to remain in place.

Second, I’ve recompiled apache2 with LDAP and SSL support (this instance can be found in /usr/local/apache2). This changes our repository access in two ways. Because SSL is enabled, connections to the repository need to be made with https, not http. Current working copies can be updated by using this command in the root of the working copy:

svn switch --relocate \
http://oceaninformatics.ucsd.edu:8800/svnrep/[path] \
https://oceaninformatics.ucsd.edu:8800/svnrep/[path] .

Also, LDAP support means that apache2 is now authenticating against our OpenDirectory server, not an htpasswd list. Whenever subversion requests a name and password, use your OpenDirectory name and password (this is the same name and password you use for your coast/IOD email account).
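For anyone with several working copies, the relocate step above can be scripted. The following is a minimal Python sketch, not part of the server setup itself: it only builds the `svn switch --relocate` argument list by swapping the URL scheme. The repository path `example-project` is a hypothetical placeholder for [path].

```python
from urllib.parse import urlsplit


def to_https(url: str) -> str:
    """Return the same repository URL with the scheme changed from http to https."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError(f"expected an http URL, got: {url}")
    return parts._replace(scheme="https").geturl()


def relocate_command(old_url: str, working_copy: str = ".") -> list:
    """Build the argument list for 'svn switch --relocate OLD NEW PATH'."""
    return ["svn", "switch", "--relocate", old_url, to_https(old_url), working_copy]


if __name__ == "__main__":
    # 'example-project' is a made-up path used only for illustration.
    old = "http://oceaninformatics.ucsd.edu:8800/svnrep/example-project"
    print(" ".join(relocate_command(old)))
```

The list returned by `relocate_command` could be passed to `subprocess.run` from the root of a working copy; the sketch only prints it so that nothing is changed by accident.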

If you find any strange behavior or cool new features with the upgrade, please post about them here.

O’Reilly’s Ambient Findability (Peter Morville) provided a broad contemporary overview of data and information issues. Here are three recent publications relevant to today’s data environments:

-A Nature journal issue with Microsoft on 2020 Vision
http://research.microsoft.com/towards2020science/nature.htm;
http://www.nature.com/nature/journal/v440/n7083/index.html

-The term “dataspace” creates a wider-than-one-system conceptual umbrella for collections
M. Franklin, A. Halevy, and D. Maier. 2005.
From Databases to Dataspaces: A New Abstraction for Information Management,
ACM SIGMOD Record;
http://portal.acm.org/citation.cfm?id=1107499.1107502

-A look at handling large amounts of streamed data - historic and real-time
S. Chandrasekaran and M. Franklin. 2004.
Remembrance of Streams Past: Overload-Sensitive Management of Archived Streams.
Proceedings of the 30th VLDB Conference, Toronto, Canada
http://www.cs.berkeley.edu/~franklin/Papers/ChandrasekaranVLDB2004.pdf

Over the weekend we had some suspicious activity involving the wiki module in our abandoned PostNuke project. This prompted a major cleanup in the oceaninfo-dev to remove all unused CMS’s that were used or tested by various projects.

The CMS’s included: PostNuke, Xoops, Mambo, Drupal, WordPress, OpenDocMan, and Plone

The projects included: Ocean Informatics, Interoperability, CCE LTER, and Palmer LTER

For some of these projects, I downright deleted the code-base and corresponding database. I only did this for projects/cms’s I was sure have never been (and never will be) used:
- Interoperability/WordPress - we had installed a blog for interoperability, but it was never used
- Interoperability/OpenDocMan - an open source document manager, never used
- Palmer LTER/WordPress - I played with this last month, but have since migrated to a file structure so this is no longer needed
- CCE LTER/WordPress - Same as Palmer LTER
- Plone - Open-source CMS we are not using, nor are capable of running at this time since it requires a ZOPE environment? We’ve had the installation package sitting on our server taking up space. I deleted this (we can always dl it again if we want).
- Additionally, I removed some backed-up code like ‘interop2’ and ‘interop3’ which I once saved mainly as snapshots.

Don’t worry, I didn’t delete everything. For the more significant sites (for historical purposes, etc.), I created tar archives of the code-base and the sql dump so that they can be easily recreated should we want to play with them again. These tar files are stored in project/oceaninformatics/web_archives:
- ccemambo.tar - CCE LTER site in Mambo
- ccepn.tar - CCE LTER in PostNuke
- ccev1.tar - First version of the CCE LTER site. We originally entertained building the CCE LTER in a static file structure and also in a couple CMS’s (above)
- interop_xoops.tar - The interoperability site in Xoops. It ran in Xoops for about a year. Eventually we migrated it out of Xoops, and the entire site got a complete overhaul just last month.
- oi_drupal.tar - Ocean Informatics in Drupal. Not much there, but we can archive it anyway.
- oi_mambo.tar - Ocean Informatics in Mambo. Same deal as Drupal.
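The archives listed above each bundle a code-base with its SQL dump. As a rough illustration of that kind of bundling (not the exact commands that were run), here is a small Python sketch using the standard `tarfile` module. The function name and the example paths are assumptions made for the example.

```python
import tarfile
from pathlib import Path


def archive_site(code_dir, sql_dump, out_tar):
    """Bundle a site's code directory and its SQL dump into one uncompressed tar file.

    Returns the list of member names so the result can be checked.
    """
    code_dir = Path(code_dir)
    sql_dump = Path(sql_dump)
    with tarfile.open(out_tar, "w") as tar:
        # Store paths relative to the site name rather than as absolute paths.
        tar.add(code_dir, arcname=code_dir.name)
        tar.add(sql_dump, arcname=sql_dump.name)
    with tarfile.open(out_tar) as tar:
        return tar.getnames()


# Hypothetical usage (the paths are placeholders, not the real server layout):
# archive_site("/tmp/ccemambo", "/tmp/ccemambo.sql", "/tmp/ccemambo.tar")
```

Restoring a site from such an archive would then mean unpacking the tar file and loading the dump back into MySQL.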

Ironically enough, the one Project/CMS I left intact for now is our Ocean Informatics/PostNuke site. Because we have some referential content in there, I would like to make sure I find the best way to archive/migrate the content. I have since disabled the wiki module (since that’s what gave us the warning emails).

The referential content I am referring to is our old News blog: OI Core and OI Dev. My original plan was to save a couple web pages spanning all entries. However, I realized that this approach wouldn’t save comments for any posts.

What would be the best way to archive an array of threads? Should we import the old posts into WordPress? Or should we recreate a structure of nested web pages, where each sub-page shows the entire thread for one post?

The article on Extreme Programming (Is Design Dead? Martin Fowler) that Shaun shared (http://www.martinfowler.com/articles/designDead.html) provides some great vocabulary and insights, comparisons and contrasts. XP is an abbreviation used to refer to Extreme Programming but includes extreme experience, planning, and design as integral elements to create a highly nimble programming process.

This is a story, told by a voice of experience, of the separation of design and programming. The example used is the case of building a skyscraper over time with first a Chief Architect and then a team of programmers who follow the design plans. Fowler presents XP as an alternative where rapid (re)programming is tied to rapid (re)design. The example given is of building a smaller entity (say, a garden shed or a dog house) and doing it quickly using/developing local expertise that interfaces/integrates local scientific knowledge. XP seems related to our discussions of information infrastructure building. Real-world practices are frequently a hybrid of such ‘extreme’ cases, but the article prompts one to consider the ramifications of the ‘size’ of the need at the time of project initiation. I’m still pondering whether the meaning of ‘refactoring’ is related to ‘iterative design’.

Meanwhile, I decided to present this post as an approach to an informatics environment and/or as a conceptual tool.

An exchange of emails with the UCSD library staff has given me some new insight into adapting UCSD’s Google search appliance to our web sites. The ‘proxystylesheet’ parameter can be set to the URL of an XSL file that is then used to style the appliance’s XML output. The search function on the UCSD Libraries page uses the following XSL file:

http://gort.ucsd.edu/itd/libraries_stylesheet.xsl

I haven’t tried developing an XSL file for the CCE search yet. I’m actually not sure if it is a better approach than the PHP wrapper approach I described in an earlier entry. The XSL file would have to be maintained separately from the CSS files that define our sites’ layout and appearance. This means that any time we made a change to the look and feel of a site, we’d have to duplicate that change in the XSL file. If anyone else has any thoughts on the pros and cons of the different approaches, I’d be interested to hear them.
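To make the ‘proxystylesheet’ idea concrete, here is a small Python sketch of how a search URL carrying that parameter might be assembled. Only ‘proxystylesheet’ comes from the exchange described above; the appliance host name, the `/search` path, and the `q`, `as_sitesearch`, and `output` parameter names are assumptions that would need to be checked against the appliance’s documentation.

```python
from urllib.parse import urlencode


def build_search_url(appliance_base, query, site, stylesheet_url=None):
    """Assemble a query URL for a search appliance.

    With a stylesheet URL the appliance is asked to style its own XML output;
    without one, raw XML is requested (parameter names other than
    'proxystylesheet' are assumed, not confirmed).
    """
    params = {"q": query, "as_sitesearch": site}
    if stylesheet_url:
        params["proxystylesheet"] = stylesheet_url
    else:
        params["output"] = "xml_no_dtd"
    return f"{appliance_base}/search?{urlencode(params)}"


if __name__ == "__main__":
    # 'search.example.edu' and 'cce.example.edu' are placeholder host names.
    print(build_search_url(
        "http://search.example.edu",
        "krill abundance",
        "cce.example.edu",
        "http://gort.ucsd.edu/itd/libraries_stylesheet.xsl",
    ))
```

Seen this way, the trade-off is about where the presentation logic lives: in an XSL file the appliance fetches, or in our own wrapper code.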

I installed DokuWiki on iOcean a couple days ago. For those of you who weren’t there, DokuWiki is a code/project documenting tool that was demoed at last week’s WebHeads meeting. I haven’t had much chance to play around with it, but you’re all welcome to go take a look and try it out. The URL is:

http://oceaninformatics.ucsd.edu/dokuwiki/doku.php

There are no access restrictions in place right now, so don’t post any code that contains sensitive information (e.g. MySQL passwords). Also, you may want to look over the DokuWiki syntax if you want to try adding an entry.

The ongoing effort to bring Google Search to the OI sites has made some progress recently. While it turns out that we can’t register sub-domains of ucsd.edu for Google Public Service Search, we can tap in to UCSD’s Google Search Appliance. There are two ways we can do this:

1) We could create a custom front end, called a ‘client’, that would get registered with the search appliance. We could then create forms that queried the search appliance and requested that the results be displayed in a specific client. Hidden fields in the form could restrict the searches to one or more sub-domains.

2) We could use a PHP script as a proxy to send queries to the search appliance, again restricting searching to one or more sub-domains. The script would then get the results back as XML, parse them, and redisplay them in any format we choose.

So far, I’m not even sure if the first option is offered by UCSD - and even if it is, the second option seems to offer more autonomy and flexibility. So, I’ve started work on a PHP script that implements option two. Using the PHP XML parser functions and the Google search protocol, it’s been relatively painless to get a simple search interface up and running. My first experiment has been a simple test search page on the CCE site (I’m still working on the script, and I make no assurances that this page will be working when you click the link).
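The actual script is written in PHP, but the parse-and-redisplay step of option two can be sketched in a few lines of Python for comparison. The element names (`R` for a result, `U` for its URL, `T` for its title, `S` for its snippet) and the sample XML are assumptions about the appliance’s output made up for this illustration, not a copy of a real response.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the assumed result format (not real search output).
SAMPLE_XML = """<GSP>
  <RES>
    <R N="1"><U>http://cce.example.edu/data/</U><T>Data</T><S>Data sets and metadata</S></R>
    <R N="2"><U>http://cce.example.edu/people/</U><T>People</T><S>Personnel directory</S></R>
  </RES>
</GSP>"""


def parse_results(xml_text):
    """Turn the appliance's XML into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    results = []
    for r in root.iter("R"):
        results.append({
            "url": r.findtext("U", default=""),
            "title": r.findtext("T", default=""),
            "snippet": r.findtext("S", default=""),
        })
    return results


if __name__ == "__main__":
    for hit in parse_results(SAMPLE_XML):
        print(f"{hit['title']}: {hit['url']}")
```

Once the results are in a plain data structure like this, they can be rendered with the same templates and CSS as the rest of the site, which is the flexibility the second option offers.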

So what comes next? The search interface needs more polishing, especially if we want to implement features similar to Google’s advanced search page. Also, now that we’re implementing searches, some issues have come to the forefront. For example, we need to consider page titles - right now, most of the pages on a given site have the same title, which makes search results less informative (thanks, Shaun, for pointing this out). Our current design templates weren’t created with a search bar in mind, so we’ll need to work this element in somehow. We might also want to start adding keyword metadata to our pages for more meaningful search results. I’m sure there are other issues to be addressed as well - any thoughts?

As we’ve been expanding the scope of the services offered through the various Ocean Informatics web sites, the issue of authentication has come up more than once. When a user connects to one of our web applications claiming to be Mason Kortz, how do we know he really is Mason Kortz? Answer: we put the user to the test by requiring a password that only Mason Kortz would know.

The first issue we dealt with was how to securely transmit passwords over the Internet. Sending a password from your computer to a server is like writing it down on a piece of paper and passing it through many hands to another person. If the wrong person grabs it and looks at it along the way, it’s not very secure any more. That’s why last month we enabled SSL on our Ocean Informatics sites. Sending a password over an encrypted connection, like the ones formed with SSL, is like writing it down on a piece of paper and passing it through a cast-iron pipe to another person - significantly more secure.

So, with a safe way for users to submit their credentials, the next issue is to create and maintain a list of names and passwords to check the submitted information against. There are currently two lists of users in the OI environment that could be considered authoritative - the system users for iOcean and the personnel directory. An ideal authentication system would use one of these lists, and avoid the need to maintain a third, independent list.

Our first take on authentication was to use the Apache mod_auth module. Mod_auth allows you to secure your web space on a per-directory basis, using .htaccess files. A simple command line tool, htpasswd, is provided to maintain the text files that form mod_auth’s backend. When a user authenticates through mod_auth, their username is available to PHP through the $_SERVER superglobal array, so web apps can make use of this form of authentication. Mod_auth does have its downsides. It slows down a bit with large numbers of users, and the user/password list it uses stands completely alone, so it would have to be maintained independently of the server users and the personnel directory.
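To give a feel for the flat-file backend described above, here is a Python sketch that checks a name and password against htpasswd-style lines. It assumes the entries were created with `htpasswd -s`, which stores `user:{SHA}` followed by the base64-encoded SHA-1 digest of the password; other htpasswd schemes are not handled, and this is an illustration of the file format rather than a description of how Apache performs the check.

```python
import base64
import hashlib
import hmac


def sha_entry(user, password):
    """Build an htpasswd-style line in the {SHA} scheme for the given user."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return f"{user}:{{SHA}}{base64.b64encode(digest).decode('ascii')}"


def check_password(htpasswd_lines, user, password):
    """Return True if the user's stored {SHA} entry matches the password."""
    expected = sha_entry(user, password).encode("utf-8")
    for line in htpasswd_lines:
        line = line.strip()
        if line.startswith(user + ":"):
            # Constant-time comparison of the full entry.
            return hmac.compare_digest(line.encode("utf-8"), expected)
    return False


if __name__ == "__main__":
    # 'mkortz' and the password are made-up example values.
    lines = [sha_entry("mkortz", "example-password")]
    print(check_password(lines, "mkortz", "example-password"))
```

Because every lookup scans the file line by line, the slowdown with large user lists mentioned above follows directly from the format.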

These issues brought us to the next possibility - creating our own authentication app using PHP over MySQL. Specifically, the login app would draw on the personnel database for user/password matching, and then set a cookie that could be accessed by all other web apps, effectively creating a single sign-on service for the OI web space. This solution would have a more robust backend than mod_auth, and would tie the login information to the contact information in the personnel directory. Again, though, there are downsides. Cookies are stored locally on the user’s computer and are easily read, so keeping our sites secure against ‘spoof’ cookies would mean devising some security measures of our own. Also, the personnel directory is still divorced from the server users list - so although this solution provides single sign-on for the OI web space, it does not extend to the rest of the OI infrastructure.
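One common measure against spoofed cookies is to sign the cookie value with a secret that only the server knows, so a forged or altered cookie fails verification. The Python sketch below shows the idea with an HMAC; it is one possible approach under stated assumptions, not the scheme we have settled on. The secret key, the `|` separator, and the function names are all invented for the example.

```python
import hashlib
import hmac

# Placeholder secret; a real deployment would load a long random value from config.
SECRET_KEY = b"replace-with-a-long-random-secret"


def sign_cookie(username, key=SECRET_KEY):
    """Return 'username|signature', where the signature is an HMAC-SHA256 of the name."""
    mac = hmac.new(key, username.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{username}|{mac}"


def verify_cookie(value, key=SECRET_KEY):
    """Return the username if the signature is valid, otherwise None."""
    username, sep, mac = value.rpartition("|")
    if not sep:
        return None
    expected = hmac.new(key, username.encode("utf-8"), hashlib.sha256).hexdigest()
    if hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return username
    return None


if __name__ == "__main__":
    cookie = sign_cookie("mkortz")  # 'mkortz' is a made-up username
    print(verify_cookie(cookie))
```

A signature like this only proves the cookie was issued by the server; expiry times and transport over SSL would still need to be handled separately.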

This brings us to LDAP/Kerberos, which IOD is moving towards for network authentication. Whenever you sign on to a network resource - which can mean logging in to a computer, connecting to a shared drive, or accessing a secured webpage - your username and password are checked via the Kerberos server against the LDAP database. If they match, you are given a Kerberos ticket, verifying that you are indeed the user you claim to be. This ticket is good for all other (Kerberized) network resources as well, so you have strong single sign-on capabilities. Also, the LDAP database can contain more than just a username and password - it has a customizable schema, and by default supports extensive contact information. This would allow us to maintain one user list that acted as both the authentication database and the personnel directory. Of course, there are drawbacks. Setting up an LDAP server is not a quick task - at the very least it is on par with building our own authentication system. Furthermore, LDAP isn’t a familiar technology, so even once the server is running there will be a learning curve as we figure out how to work with the schema files and customize the server to our needs.

It’s time to recap some of the major changes made to the Ocean Informatics Site this week. Beware, this post is hefty. Let’s get started!

New Theme

As mentioned in the previous post, the Ocean Informatics blog now employs a new theme. For those not in the know, this Ocean Informatics blog is powered by WordPress, an open-source PHP/MySQL application. WordPress is a very popular and widely used blogging platform. Additionally, it benefits from having excellent documentation and strong community support, resulting in an ever-growing collection of plugins (functionality) and templates/themes (presentation).

The original theme we used for this blog was a modification of the default WordPress theme. It was pretty bland. This new theme comes from a 3rd-party designer, and is also found on WordPress’s page of featured themes.

This theme is entitled Connections. I changed the hues of all the images and styles from green to blue to fit better with our original Ocean Informatics header image. I also moved the search bar to the top right of the page, and changed the nav links and the sidebar blocks to better fit our needs.

Development/Modification of a WordPress theme is fairly simple yet time-consuming. Fortunately, the bulk of the layout and structure work had already been accomplished, so my main focus was in changing the colors.

Oceaninformatics.ucsd.edu and the old OI Site

Up until 2 days ago, going to oceaninformatics.ucsd.edu would redirect you to the Ocean Informatics Portal. This portal site was virtually a scaffolding site, resulting from a rapid development that wasn’t given much thought. To be honest, I’m not sure what its purpose ever was, and I know for sure that no one has ever been using it.

I’ve read that WordPress can be used as a simple CMS tool in addition to a blog. After all, WordPress allows for the creation of static pages in addition to blog posts. Also, with the armada of external plugins available, it should be no problem to find (or develop) WordPress plugins to bolster our specific needs. Thus, I decided that the dormant OI Portal site can be integrated with WordPress. Given the power and simplicity of WordPress, we can update extra pages/data in addition to blog posts, creating a multi-functional Ocean Informatics web space.

Our WordPress installation resided in oceaninformatics.ucsd.edu/wordpress. I moved the installation source up one level to oceaninformatics.ucsd.edu. Now, instead of redirecting to the OI Portal Site, the domain points to this blog’s home page (which has the same content as the portal’s home page). The link to the OI blog remains the same.

Technical Notes on the Installation Migration

  • Before the move: The original WordPress location resided in a directory called “wordpress”. A symbolic link named “blog” pointed to the wordpress directory. This is how the oceaninformatics.ucsd.edu/blog link worked. The oceaninformatics.ucsd.edu/wordpress link also worked, though we always used “blog” instead. (Both links are still active at the time of this writing… try them out).
  • The move: I didn’t actually “move” the source code. Instead, I copied it up one level. Thus, the source code is duplicated in two places: the oceaninformatics.ucsd.edu root, and the oceaninformatics.ucsd.edu/wordpress directory. Since changes (hacks, plugins, themes) are only being made to the source code at the root level, the copy at oceaninformatics.ucsd.edu/wordpress is now obsolete.
  • After the move: Originally, oceaninformatics.ucsd.edu pointed not to a home page, but to the blog page. I installed a plugin that enables a WordPress site to have a static front page. To maintain the blog url, I added apache mod_rewrite rules to the .htaccess file so that oceaninformatics.ucsd.edu/blog really points to oceaninformatics.ucsd.edu/category/blog. (This is how the blog url stays the same). More on categories in the next section….
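The mapping described in the last note can be expressed as a single pattern. The sketch below uses Python’s `re` module to illustrate the effect of such a rule; it is not the mod_rewrite syntax that lives in the .htaccess file, and the exact pattern is an assumption based on the behavior described above.

```python
import re

# Match '/blog' on its own or followed by a sub-path, but not e.g. '/blogroll'.
BLOG_RULE = re.compile(r"^/blog(/.*)?$")


def rewrite_path(path):
    """Map /blog (and anything under it) onto /category/blog; leave other paths alone."""
    match = BLOG_RULE.match(path)
    if match:
        return "/category/blog" + (match.group(1) or "")
    return path


if __name__ == "__main__":
    for p in ["/blog", "/blog/", "/blogroll", "/about/"]:
        print(p, "->", rewrite_path(p))
```

Anchoring the pattern at both ends is what keeps unrelated paths that merely start with "blog" from being rewritten.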

Categories

Our list of blogging categories continues to evolve. WordPress supports hierarchical categories, and thus, our categories have become hierarchical. You’ll notice this next time you write a new post. The current state of categories looks like:

- Blog
– Announcements
– Community
– Data/Metadata
– Environment
– Hints
– Questions
– Tools
– Visualization
- Page
- Private
– … (not typed here to protect cat. name)
– … (not typed here to protect cat. name)
- Public
– Reading Group

The only categories this blog is concerned with are those that fall within the parent Blog category. Hence, the blog category page becomes our blog. You do not need to explicitly categorize a blog post with “Blog”. As long as any one of the child categories is used, the Blog category is inferred. By default, all new posts are categorized under Blog.
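The inference rule can be pictured as a walk up the category tree. Here is a short Python sketch using the category names listed above (the two unnamed Private children are left out); the data structure and function names are illustrative and say nothing about how WordPress stores categories internally.

```python
# Each category maps to its parent; top-level categories map to None.
PARENT = {
    "Blog": None, "Page": None, "Private": None, "Public": None,
    "Announcements": "Blog", "Community": "Blog", "Data/Metadata": "Blog",
    "Environment": "Blog", "Hints": "Blog", "Questions": "Blog",
    "Tools": "Blog", "Visualization": "Blog",
    "Reading Group": "Public",
}


def ancestors(category):
    """Return the chain of parents above a category, nearest first."""
    chain = []
    parent = PARENT.get(category)
    while parent is not None:
        chain.append(parent)
        parent = PARENT.get(parent)
    return chain


def is_blog_post(categories):
    """A post belongs to the blog if any of its categories is Blog or a descendant of it."""
    return any(c == "Blog" or "Blog" in ancestors(c) for c in categories)


if __name__ == "__main__":
    print(is_blog_post(["Hints"]))
    print(is_blog_post(["Reading Group"]))
```

Under this rule a post tagged only with Hints still appears on the Blog category page, while a post tagged only with Reading Group does not.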

The Page category used to be called “Uncategorized”. By default, WordPress categorized all pages under Uncategorized. I simply changed its name to Page. There is no interface in the admin panel to change a Page’s category.

The Private category (and its children) may be used to store information that only registered users should access. Likewise, the Public categories may contain information that anyone can access. This information can be stored in posts and also in the custom meta fields. At this time, I am not sure if or how I would implement this functionality. My reasoning for private vs. public stems from the fact that the OI Portal site contained extra “private” information that was only accessible once you were logged in. The addition of these categories is currently serving as an experiment, and they may not ever be used.

Plugins

Plugins are a way to add extra functionality to WordPress. We’ve had some plugins installed since day one. Here’s a run-down of plugins we are currently using.

Active Plugins

  • Extended Live Archive - This recently installed plugin provides an AJAX-based interface for browsing through our entry archives. It rocks.
  • Email Notify Comment Authors - This plugin emails all authors in a thread anytime a new comment appears in that thread.
  • Search Everything - This plugin extends the WordPress search capabilities to look in comments and pages in addition to just posts.
  • Static Front Page - This plugin allows the front page of your WordPress site to be a static page instead of the blog. (To display the blog page, we point to the Blog Category Page).
  • Subscribe2 - This plugin sends email notifications to users any time there is a new post in the blog.
  • Time Zone - This must-have plugin tells WordPress to adjust for time changes related to Daylight Saving Time.

Additionally, I am considering a plugin entitled Private Categories 2, which allows you to mark a category as “private” so as to hide any posts under that category from unregistered users. As of now, I am not impressed with the robustness of the plugin, and may search for alternative solutions. (Hence, I may not use those Private/Public categories at all).

Hacks

Jerry discovered that WordPress lacks the ability for registered users to edit their own comments. What a surprise… honestly! I figured that implementing this functionality would be somewhat trivial, and that surely someone had already developed a plugin to add this feature. Apparently, it’s a tough plugin to write.

Fortunately, for me it was an easy hack. As an aid to group blogging, users can now edit any of their own comments. (See the previous thread for the back-and-forth conversation between Jerry and myself regarding this issue).

Pages

I am in the process of porting over some of the “static” pages from the portal site to WordPress. The Pages links can be seen on the home page in the Pages block in the sidebar. One example of a ported page is Reading Groups.
» New Reading Groups page
» Old Reading Groups page

Concluding Notes and Thoughts

  • I’m partially convinced that the old OI portal site never got used because it wasn’t well supported, especially as a CMS tool (for common users to edit content). I’m not sure if migrating it to WordPress will resurrect any usage, but I feel that providing some simple means for collaboration may help.
  • I turned on the permalinks function for WordPress, which uses mod_rewrite. Basically, what this means is we now have “pretty” urls. So we see http://oceaninformatics.ucsd.edu/2005/11/15/testing-new-theme/ instead of http://oceaninformatics.ucsd.edu/blog/?p=76, even though they both point to the exact same place.
  • Regarding email notifications: No offense, but email notifications are sooooooo 1997. The real method now is RSS. We have 2 plugins used for email notifications, but WordPress offers 2 RSS feeds as part of its core functionality: One for new posts, the other for new comments. For those who like to evolve with technology, consider using these instead.
  • Regarding editing comments and posts: I mentioned this in the previous thread, but any edits to comments and posts should be done with care. It’s one thing to fix a small typo, or to rephrase a sentence. However, it’s bad practice to edit a bulk of content, or remove it completely. If you need to update any information mentioned in a previous comment or post, the best way to do it is to append the new information at the top (or bottom) of the post, headlined with “UPDATE”. This gives the users a clear visual clue that something has changed.
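As a footnote to the permalinks item above, the “pretty” URL is just the post date and a slug arranged into a path. The Python sketch below builds and parses URLs of that shape, using the example URL from the list; the function names are made up, and the sketch does not reproduce how WordPress resolves a permalink internally.

```python
import re
from datetime import date


def permalink(base, posted, slug):
    """Build a date-based 'pretty' URL of the form base/YYYY/MM/DD/slug/."""
    return f"{base}/{posted:%Y/%m/%d}/{slug}/"


PERMALINK_PATH = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})/([^/]+)/?$")


def parse_permalink(path):
    """Split a pretty URL path back into (date, slug), or return None if it does not match."""
    match = PERMALINK_PATH.match(path)
    if not match:
        return None
    year, month, day, slug = match.groups()
    return date(int(year), int(month), int(day)), slug


if __name__ == "__main__":
    print(permalink("http://oceaninformatics.ucsd.edu", date(2005, 11, 15), "testing-new-theme"))
    print(parse_permalink("/2005/11/15/testing-new-theme/"))
```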
