Tools


On 6 March 2006, iocean sent an alert warning of a probable failure in its RAID disk system:


==============================================================
A problem has been detected on this server.

Status Summary

Reason(s) for notification:
Drives

Server:
Host : iocean
Model : RackMac3,1
Uptime : 67016 minutes
OS version : Mac OS X Server 10.4.4 (8G32)
Processor : 2 x 2000 MHz
Memory : 1024 MB
BootROM : $0005.17f1
Serial : QP41703XPNK

Memory:
Memory Slot "DIMM0/J11" : 512MB, ECC DDR SDRAM, PC3200U-30330
Memory Slot "DIMM1/J12" : 512MB, ECC DDR SDRAM, PC3200U-30330

Drives:
Drive 1 (disk2) : Normal
Drive 2 (disk1) : Normal
Drive 3 (disk0) : Warning
==============================================================

In the few days leading up to the failure warning, a number of users reported slow database performance in some web applications and services provided by iocean (Hlab Forum, etc.). The data volume on iocean (/Volumes/iodata) is a RAID 1 (mirrored) disk set consisting of two of the server’s three disks (disk0 and disk2); the third disk, disk1, is a standalone volume, ioceanHD, which contains the operating system and applications. The RAID volume, iodata, is a software-based RAID.

iocean is under an AppleCare warranty, which provides hardware support for up to three years after purchase. We ordered a new disk from Apple through the Upper-Campus Tech Services, and it arrived late in the afternoon on 7 March. We replaced the failed disk with the new disk the next morning, 8 March, and tried to start rebuilding the RAID array using the Disk Utility (GUI) tool on iocean, which is normally done by dragging the new disk icon into the RAID window. However, this didn’t work, and the GUI tool gave no indication of why it failed. We next turned to the commandline tool, ‘diskutil’, to diagnose the problem, using the command:


===========================================
iocean:~ jrw$ diskutil list
/dev/disk0
   #:  type                    name       size        identifier
   0:  Apple_partition_scheme             *233.8 GB   disk0
   1:  Apple_partition_map                31.5 KB     disk0s1
   2:  Apple_Boot                         128.0 MB    disk0s2
   3:  Apple_RAID                         233.6 GB    disk0s3
/dev/disk1
   #:  type                    name       size        identifier
   0:  Apple_partition_scheme             *76.7 GB    disk1
   1:  Apple_partition_map                31.5 KB     disk1s1
   2:  Apple_HFS               ioceanHD   76.6 GB     disk1s3
/dev/disk2
   #:  type                    name       size        identifier
   0:  Apple_partition_scheme             *233.8 GB   disk2
   1:  Apple_partition_map                31.5 KB     disk2s1
   2:  Apple_Boot                         512.0 KB    disk2s2
   3:  Apple_RAID                         233.8 GB    disk2s3
/dev/disk3
   #:  type                    name       size        identifier
   0:  Apple_HFSX              iodata     *233.6 GB   disk3
===========================================

This showed that the old disk, ‘disk2’ (formatted in Panther), had a slightly different partition map than the new disk, ‘disk0’. The old disk had a smaller “Apple_Boot” partition (disk2s2, 512.0 KB) than the new disk formatted with Apple Disk Utility (disk0s2, 128.0 MB), so its main data partition (disk2s3, 233.8 GB) was larger than the new disk’s (disk0s3, 233.6 GB). The software RAID application won’t allow RAIDs with disks of dissimilar partition maps.
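A mismatch like this is easy to miss by eye, so here is a minimal sketch of how it could be checked from saved ‘diskutil list’ output. The helper name boot_size and the temporary file are made up for illustration, and the sample lines are simply copied from the listing above; this is not a tool we actually ran at the time.

```shell
#!/bin/sh
# Hypothetical helper: print the size of the Apple_Boot partition of one disk,
# reading saved `diskutil list` output from a file.
boot_size() {
    awk -v d="$1" '$2 == "Apple_Boot" && index($NF, d "s") == 1 { print $(NF-2), $(NF-1) }' "$2"
}

# Sample input: the disk0 and disk2 partition lines from the listing above.
listing=$(mktemp)
cat > "$listing" <<'EOF'
/dev/disk0
2: Apple_Boot 128.0 MB disk0s2
3: Apple_RAID 233.6 GB disk0s3
/dev/disk2
2: Apple_Boot 512.0 KB disk2s2
3: Apple_RAID 233.8 GB disk2s3
EOF

new=$(boot_size disk0 "$listing")
old=$(boot_size disk2 "$listing")
if [ "$new" != "$old" ]; then
    echo "Apple_Boot differs: disk0 has $new, disk2 has $old"
fi
rm -f "$listing"
```

Because awk splits on any run of whitespace, the same helper should also cope with the column-aligned output that diskutil prints on a real system, though that is an assumption I have only checked against the lines shown here.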

The solution:

  1. Create a single-filesystem disk out of the new disk using the Disk Utility partitioning option (HFS+ w/Case Insensitivity and Journaling).
  2. Copy the current filesystem running on the remaining disk of the degraded RAID set over to the newly formatted disk using Carbon Copy Cloner (or the Restore tab of Disk Utility).
  3. Create a new unpaired mirror RAID set on the new disk, using the “enableRAID” command of the commandline ‘diskutil’ application.
  4. Delete the old RAID array using either Disk Utility (GUI) or diskutil (commandline). Do this with extreme caution, because all of the data on this disk will be erased: make sure all of the data on this disk has been copied to the new unpaired RAID array before taking this step.
  5. Repartition the old disk (the remaining good disk in the old RAID array) so that it matches the partition map of the new disk.
  6. Incorporate the newly partitioned old disk into the newly established RAID array, either with the “repairMirror” option of the commandline ‘diskutil’ application or by dragging the disk into the new RAID set in Disk Utility.

The RAID repair, or rebuild, is run as a background process, which means that the computer continues to function online, though with somewhat degraded performance, throughout the rebuild. All user activity should be unaffected. If the problem with the different partition maps hadn’t cropped up, the entire process outlined above could have happened without ever taking the system offline. The physical disks are “hot-swappable” (they can be removed and inserted without taking the system offline) so they can be replaced and the RAID rebuilt without a break in service.

I’ve been using phpMyAdmin to generate database schemas. Since I use phpMyAdmin pretty much every day, it’s very convenient as opposed to DBDesigner, which resides on a separate server and has the overhead of connecting to and opening another app (just for one function).

The schema diagrams generated by phpMyAdmin are pretty, with colorful straight lines to represent table/field relations. However, the major pitfall is that these schema diagrams don’t show the data type. It’s usually helpful to know whether a field is an int, float, char, or text. This is one thing that DBDesigner does quite well.

Given that we have a couple of meetings next week involving schemas (one of them being the start of a series of schema meetings), I decided to hack the phpMyAdmin code a bit so that it shows the data types side-by-side with the field names. This way I can continue to use phpMyAdmin to generate diagrams without sacrificing that information.

Beware, the rest of the post gets a bit technical (aka geeky).

The only file I hacked was pdf_schema.php. This is located in the root dir of the phpmyadmin install.

This file contains the PMA_RT_Table class, which is where all the hacking was done.


class PMA_RT_Table {
// lots of code...
}

First, I added a new private var to store an array of types. This corresponds to the existing array of fieldnames for the class.


class PMA_RT_Table {
   var $fields = array(); // existing code
   var $types = array(); // my code
}

Next, in the constructor function, I added a line of code to populate the types array:


function PMA_RT_Table(...) {
    // ...snip

    // load fields
    while ($row = PMA_DBI_fetch_row($result)) {
        $this->types[] = $row[1];  // my code
        $this->fields[] = $row[0];
    }

    // ...snip
}

I then added code for the table width calculation to ensure that the drawn table diagrams would be wide enough to display both the field name and data type:


function PMA_RT_Table_setWidth($ff) {
    // ...snip

    foreach ($this->fields AS $key => $field) {
        // srh hack to set width for field and type
        // keep only the first word of the column type, e.g. "int(10) unsigned" -> "int(10)"
        $type_arr = preg_split('/ +/', $this->types[$key]);
        $type = $type_arr[0];
        // collapse enum(...) and set(...) value lists to the bare keyword
        if ('enum' == substr($type, 0, 4))
            $type = 'enum';
        if ('set' == substr($type, 0, 3))
            $type = 'set';
        $field .= " [{$type}]";

        // ...snip
}

Finally, I added the same code to the draw function, so that the data type is actually appended to the field name in the diagram:


function PMA_RT_Table_draw(...) {
    // ...snip

    foreach ($this->fields AS $key => $field) {
        // srh hack to show data types next to field names
        // keep only the first word of the column type, e.g. "int(10) unsigned" -> "int(10)"
        $type_arr = preg_split('/ +/', $this->types[$key]);
        $type = $type_arr[0];
        // collapse enum(...) and set(...) value lists to the bare keyword
        if ('enum' == substr($type, 0, 4))
            $type = 'enum';
        if ('set' == substr($type, 0, 3))
            $type = 'set';
        $field .= " [{$type}]";

        // ...snip
}

I won’t bother going into depth about how and why I hacked the code. Doing so would be hard without presenting more context from the pdf_schema.php file.

The main purpose of this post is to provide me with a memory refresher so I can come back and reference it later at a critical time (i.e. upgrading phpmyadmin). Plus it’s always good to document work like this, no matter how you choose to document it. I chose the blog, and by doing so I am making visible a minor part of my work that would otherwise remain situated in oblivion.

View the difference!

» Before the hack (pdf)

» After the hack (pdf)

A quick note about setting runtime user config settings for Subversion.

First the background:
Lately I’ve been using TextMate (a really cool text editor) for my development work. TextMate likes to create backup files for each file I edit. It saves these files with a ._ prefix.

For example, the backup of file.php is named ._file.php

How does this affect subversion?
Every time I issue a svn status command, it shows me a list of all files that have been modified, added, or deleted in my working copy. It also shows a list of all files that are not versioned, including these hidden TextMate backup files.

Fortunately, there is a way to tell Subversion to ignore these files.

Every user has a .subversion folder in their home directory. In this folder is a config file where you can define a global-ignores parameter (a whitespace-separated list of file-name glob patterns, not regular expressions), which tells Subversion to ignore matching unversioned files. The config file already provides a default set of patterns. I simply added ._* to the list. Now those hidden TextMate files don’t show up in the svn status list.
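To get a feel for what a pattern like ._* will and won’t catch, here is a small sketch that mimics the matching with the shell’s own glob rules. The is_ignored helper is hypothetical, the pattern list is only an illustrative subset, and Subversion does its own matching internally, so treat this as an approximation rather than a description of how svn works.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a file name matches any of the given glob
# patterns, roughly the way a global-ignores list is applied.
is_ignored() {
    name=$1
    shift
    for pattern in "$@"; do
        # An unquoted $pattern in a case label is treated as a glob.
        case "$name" in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

# Illustrative pattern list: two typical entries plus the added ._* pattern.
patterns='*.o *~ ._*'

set -f   # keep the shell from expanding the patterns against the current directory
for f in ._file.php file.php notes.txt~; do
    if is_ignored "$f" $patterns; then
        echo "$f: ignored"
    else
        echo "$f: not ignored"
    fi
done
set +f
```

Here ._file.php and notes.txt~ match a pattern, while file.php does not.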

You can also define patterns to ignore at a directory level. The command
svn propedit svn:ignore [dir]
opens a file where you can specify which files and patterns you want to ignore for that directory. This is useful in the datacat, for example, because I have a couple of unversioned folders: docs, lter

By telling subversion to ignore these folders, I am further eliminating the noise from svn status.

About that propedit command: You need to have defined your editor-cmd in the config file, or pass it in as an argument: svn propedit svn:ignore [dir] --editor-cmd vi

I set my editor-cmd to vi in my config file.

The article on Extreme Programming (Is Design Dead? Martin Fowler) that Shaun shared (http://www.martinfowler.com/articles/designDead.html) provides some great vocabulary and insights, comparisons and contrasts. XP is an abbreviation used to refer to Extreme Programming but includes extreme experience, planning, and design as integral elements to create a highly nimble programming process.

This is a story, told by a voice of experience, of the separation of design and programming. The example used is the case of building a skyscraper over time with first a Chief Architect and then a team of programmers who follow the design plans. Fowler presents XP as an alternative where rapid (re)programming is tied to rapid (re)design. The example given is of building a smaller entity (i.e., let’s say a garden shed or a dog house) and doing it quickly using/developing local expertise that interfaces/integrates local scientific knowledge. XP seems related to our discussions of information infrastructure building. Real-world practices are frequently a hybrid of such ‘extreme’ cases, but from the article comes the thought to consider the ramifications of the ‘size’ of the need at the time of project initiation. I’m still pondering whether the meaning of ‘refactoring’ is related to ‘iterative design’.

Meanwhile, I decided to present this post as both an approach to an informatics environment and a conceptual tool.

Quick note:
With the new interoperability site comes a search box. We should implement the Google searching here.

The interoperability site is now completely under Subversion (once again). Code can be updated either in the production area or the dev area, and the two areas remain in sync quite smoothly.

The caveats:
Images and docs are not stored in the repository. The production area contains 2 non-versioned folders:
/docs
/images

Since the development area uses the same server, I created a symbolic link for each folder above to point to the production area folders.
interop-dev/docs -> interop/docs
interop-dev/images -> interop/images
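For the record, the links can be set up along these lines. This is only a sketch: it builds stand-in directories under a temporary folder, because the real locations of the interop and interop-dev areas on the server aren’t spelled out here.

```shell
#!/bin/sh
# Stand-in directories for the production and dev areas (assumed layout).
root=$(mktemp -d)
mkdir -p "$root/interop/docs" "$root/interop/images" "$root/interop-dev"

# Link each non-versioned folder in the dev area to its production counterpart.
for folder in docs images; do
    ln -s "$root/interop/$folder" "$root/interop-dev/$folder"
done

# List the dev area to confirm where each link points.
ls -l "$root/interop-dev"
```

Anything dropped into the production docs or images folder then shows up in the dev area right away, since both paths lead to the same files.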

Note:
The dev area is “down”. We need to reestablish a virtual pointer in Apache to view the dev area.
[UPDATE 2/3] - http://interoperability-dev.ucsd.edu/ is up. Thanks Mason!

I’ve successfully imported the iodpersonnel code into Subversion.

To do this, I created a new directory called svn. I copied all files and folders that belong in the repository into this folder. This file structure looked somewhat like this:


svn/
- README
- control/
- css/
- includes/
- index.php
- js/
- model/
- view/

Then I imported the project:
svn import . http://domain.com/path/to/repos/projectname \
-m "Importing initial iodpersonnel project"

Subversion automatically creates the directories in this path if it needs to. I called my project ‘iodpersonnel’ (which replaces ‘projectname’ in the code above).

To check out the project, move to what would be the parent directory of where you’d like the working copy to live. For this project, I wanted to store it under /oceaninfo-dev/iodpersonnel. From the oceaninfo-dev folder, I typed:
svn checkout http://domain.com/path/to/repos/iodpersonnel

That’s all. Now I have a working copy of the project straight from the repository. No trunks, no branches, no tags… nothing else to worry about.

Some of the straightforward commands, which mimic UNIX terminal commands:
svn add - Add a file to the repository
svn delete - Delete a file from the repository
svn copy - Copy a file within the repository
svn move - Move a file within the repository

The three most commonly used commands:
svn commit - Commit changes to the repository
svn update - Update your working copy from the repository. Together with svn commit, this completes the 2-way cycle of pushing to and pulling from the repository.
svn status - Check your working copy against the repository

Now that the iodpersonnel project is in Subversion, it’s time to check out the code to the production area as well, to complete the circuit of this project’s workflow.

Note there are some files excluded from the repository:
- config.php is not included, though config-sample.php is (see post below)
- .htaccess is not included. This file should be created manually for each working area.

Next projects are the interoperability and cce-lter web sites. These are more advanced projects, with images and docs. As explained in the post below, the docs folder that sits at the web root will be excluded from the repository. Additionally, any distributed folders that contain documents as part of an integrated web app (iForum) are also to be excluded.

Mason and I talked a bit about using Subversion to store our web projects. I had originally thought it would be best for each web project to be stored in its own repository (Palmer LTER web is stored elsewhere from CCE LTER web). However, we ultimately decided to stick with the many projects to one repository approach.

This decision comes easily. We already have a repository set up, so there’s no need to go about creating new ones and configuring Apache to recognize them. Additionally, none of our projects are on a large enough scale to warrant being separated. Some of the projects may share some common code, and it would be easier to merge like components of different projects within a single repository (as opposed to distributed repositories). Lastly, it’s easy to migrate projects into separate repositories should we ever want to change to the one-project-per-repository approach.

The main issue we discussed is how to store documents: docs, pdfs, imgs, etc… basically files that don’t really need to be versioned. Our solution is to store all such files in a /docs folder that sits at the root level of a given web site. This folder and its contents will not be imported into the repository. When checking out the project, the docs folder will be missing. It will either need to be copied over from another working area, or a symbolic link can be created assuming both working areas are on the same server.

This solution implies that some images may fall under the docs folder. This makes sense, especially if these images are not peripheral but serve to provide information, be it group photos or mapping grids. Thus, some images embedded in pages’ content will not be versioned.

Some integrated web apps (photo gallery, iforum, etc.) store files at a path deeper than the web root. Any such folders and their contents should also be excluded from the repository. In this case, a README file should be created explaining which folders are missing.

Most of our projects have a config.php file. This file should not be versioned since its settings are usually pertinent only to the local working copy. However, we should commit a config-sample.php file, which is a copy of config.php, but with all sensitive data removed (e.g. database connection settings). A global README file should contain instructions for creating the config.php file:

e.g. Rename config-sample.php to config.php and enter the settings.
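The README step could also be scripted. Below is a minimal sketch with a hypothetical init_config helper; it copies the sample instead of renaming it, so the versioned config-sample.php stays in place and Subversion doesn’t report it as missing. The $db_host line is a made-up placeholder, not one of our real settings.

```shell
#!/bin/sh
# Hypothetical setup helper: create config.php from the committed sample,
# without overwriting settings that already exist in this working copy.
init_config() {
    dir=$1
    if [ -f "$dir/config.php" ]; then
        echo "config.php already exists, leaving it alone"
    else
        cp "$dir/config-sample.php" "$dir/config.php"
        echo "created config.php, now enter the local settings"
    fi
}

# Demonstration in a scratch directory standing in for a fresh checkout.
work=$(mktemp -d)
echo '<?php $db_host = ""; // placeholder, fill in locally' > "$work/config-sample.php"
init_config "$work"
```

Running it a second time leaves an existing config.php untouched, which matters once real database settings have been entered.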

A global README file (which rests at the webroot) should also contain information on which folders are being left out of the repository. Each distributed web app can also contain a README file at its root for more in-depth explanation on the information related to that project.

That’s all for now. I’ll add more to this as we make progress loading our projects back into Subversion. The first tests will include the IOD Personnel admin code, and the CCE LTER site.

Last week I began moving the current Palmer LTER site into WordPress, mainly as an experiment to see how well WordPress works as a CMS-lite solution. I had already moved the CCE LTER site into WordPress successfully (being a much much smaller site compared to Palmer). The CCE site performed smoothly under WordPress 1.5. For the Palmer site, I downloaded and installed the newly released WordPress 2.0.

The result was not impressive.

After migrating a dozen pages, I noticed that WordPress 2.0 performs awfully slowly, even compared to version 1.5. There was a lot of lag time (waiting) between page navigations. Though WordPress 2.0 contains some improved features in the administrative area, the cons of the sluggish performance outweigh the pros of the beefed-up admin interface.

At this point I had 3 options:

  1. Continue with the migration to WordPress 2.0. Maybe the slow performance is a transient issue that may go away (a future release of WP 2.1?, or a server-related issue, etc.)…. this however is a fat chance.
  2. Downgrade back to WordPress 1.5. After all, the CCE site runs fine on it, with much faster performance.
  3. Throw out WordPress completely. I considered this briefly, and it quickly made sense. The Palmer site will be easier to maintain if we go back to the good ol’ file structure, and ignore a CMS-type solution completely. I’ll explain why a little later…

First, my reasons for originally choosing WordPress:

  • Success with Ocean Informatics site - We’ve been using WordPress for our Ocean Informatics blog/home site since August, and it is undoubtedly a success. We’ve already established a grounding of understanding and learned the intricacies of WordPress, including plugins, hacks, and themes.
  • CMS-Lite Solution - We’ve spent the last year and a half (though not recently) exploring various open-source CMS applications, including PostNuke, Mambo, and Xoops. Though each of these programs excels in providing various features for building a community portal, they required too much overhead for our simple Ocean Informatics group. We needed something lighter. Thus, I played around with the idea that WordPress could function as a CMS-lite solution: ability to post static pages in addition to blog posts, hierarchical ordering of categories, etc. (I still believe that WordPress is a great CMS-lite platform.)
  • Pretty Themes - The most superficial reason of them all, but maybe also the most important? :-)
    Once I downloaded and tweaked (changed colors, graphics, etc.) the Connections theme for WordPress, I felt a lot more comfortable. Building the navigation and sidebar menus became fun. The pages became colorful but not too distracting. The page title was pronounced, and it was enjoyable to read the page content…. basically it was a really well-designed theme. And I didn’t have to do any design work! (Except the tweaking of course). Because all the visual design work is done for me, it relieves me of the burden of having to design the website, so I can focus more on coding applications and re-structuring the file system.

Why the Palmer site works better with a file structure:

  • What to do with files? The Palmer site contains many non-php, non-html files, like docs, pdfs, images, and text files. All of these need a place to be stored and be accessible with an associated URI (web link). Using a file structure, we can logically place such files in semantically meaningful folders, nested appropriately according to the site’s navigation scheme. This method works much better than in WordPress, where the basic solution is to throw all of these files into a centralized folder, thus flattening the semantic layout and the URI construction of the file’s path.
  • The Connections theme can work outside of WordPress. It’s basically just a stylesheet, a few images, and a few php files (header, footer, etc). Even though the site no longer lives in WordPress, it can look exactly the same.
  • Performance is much much faster… even faster than WordPress 1.5. Navigation between pages is lightning fast compared to WordPress, which stores all page content in the database. Thus, whereas WordPress takes some time to process the code and query the DB, the file structure solution simply retrieves the php file, which has all the content embedded inside.

What we lose when we move away from WordPress:

  • No Web Editor - All editing takes place in the file, usually using a unix editor like vi or emacs. This means that anyone who wishes to edit a file must have some knowledge of html. This also limits the users who can edit files to those who have server accounts or another means of access (ftp). The advantage of WordPress was its built-in web editor, which generates some html for you, making it easier for non-html users to add and update content.
  • Blog, RSS, etc. - WordPress is mainly a blogging platform, with rss publishing and aggregation. If we ever decide to set up a Palmer web blog, we can still use WordPress. The rest of the site, however, would stay outside.

Just why does the Palmer site need to be moved anyway?

The Palmer site doesn’t just need to be moved… it needs to be completely re-organized. The file structure behind the Palmer site is ridiculously out-of-sync with the web site’s navigation scheme. This is confusing because the path you would take to navigate to an item on the web site differs from the path you would take to find the same item on the hard drive. A web site with a good navigation system is one that mimics its directory structure. The Palmer site is suffering from 10+ years of data build-up and low maintenance, creating a messy file system with lots of noise and little logic.

The experiment to move the Palmer site into WordPress was a first attempt at rebuilding the file structure. Because we have abandoned that attempt, we are simply rebuilding the directory structure from the ground-up, creating a much cleaner file system and navigation scheme.

What about the CCE site?

The CCE site is being moved out from WordPress as well. Although it worked a lot better with WordPress 1.5 than Palmer did with version 2.0, it is in our best interests to keep these 2 LTER sites as related to each other as we can, for ease of development and maintenance. Additionally, the CCE site also performs faster outside of WordPress.

When will these sites be ready?

The CCE site is basically done. Mason may still be fine-tuning some specific applications (IForum, etc.). You can compare the 3 sites using the links below. CCE new is the newest revision of the site (using the file system). Compare it to the WordPress site. It’s the same theme (with a revamped nav-bar).

CCE site: http://cce.lternet.edu
CCE wordpress: http://ccelter-dev.ucsd.edu/wordpress
CCE new: http://ccelter-dev.ucsd.edu/new

The Palmer site, being much larger, will need more time. Hopefully by the end of this month, but depending on what other projects come up, it could be even later. I spent the last two days alone migrating the entire Datacat app from the Palmer dev site to the Palmer new site.

Pal site: http://pal.lternet.edu
Pal dev: http://pallter-dev.ucsd.edu
Pal wordpress: http://pallter-dev.ucsd.edu/wordpress
Pal new: http://pallter-dev.ucsd.edu/new

A future plan I’d like to implement is to put these sites into Subversion, using a ‘2-way door’ method. The dev area and the production area would each contain a checkout of the repository. Anytime someone makes an update (in either the dev or the production area), they simply commit that change to the repository. Likewise, to update either area, you simply update from the repository. Nothing else fancy here… no other users or working-spaces, and definitely no trunks or branches! I feel this may be the simplest way to incorporate Subversion into our workflow.
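One pass through the ‘2-way door’ might look something like the sketch below. The sync_area wrapper and the example paths are hypothetical; only svn update and svn commit are real commands. Updating first is my own assumption about the safest order, since it pulls in whatever the other area committed before local edits are sent back.

```shell
#!/bin/sh
# Hypothetical wrapper for one pass through the "2-way door": pull in changes
# committed from the other area, then push local edits back to the repository.
sync_area() {
    area=$1      # path to a working copy (dev or production)
    message=$2   # log message for the commit
    (
        cd "$area" || exit 1
        svn update || exit 1       # bring in what the other area committed
        svn commit -m "$message"   # send this area's changes to the repository
    )
}

# Example calls, with placeholder paths:
#   sync_area /path/to/dev "Fix nav-bar link"
#   sync_area /path/to/production "Update contact page"
```

The subshell keeps the cd from leaking into the calling shell, so the wrapper can be run for the dev and production areas back to back.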

We may also want to merge (normalize) some of the code shared between the Palmer and CCE sites, notably the templating files and a common library of php functions. Both sites are using the exact same template, differing only in the stylesheet. Any differences like these (stylesheet, site title, etc.) can be stored in the site-specific config file. Using a shared library helps ensure the sites are in sync with each other, mainly from the developer’s point-of-view. Mason and I have already noted that slight additions we make to the Palmer code-base are not in CCE, and vice versa. While this typically has a minor (if not insignificant) impact, it requires extra manual labor to ensure that both sites are kept up-to-date with the latest additions of new functions, etc.

I don’t buy many peripherals (no PDA or MP3 players); a much-used small camera and a mini voice recorder complete my digital toolkit, along with my Mac PowerBook. I recently purchased an iPod for reasons I’m still trying to figure out:

-it had a 60GB disk, meaning it could now serve as the archive for an ethnographic body of work accumulated in 2002
-it was demoed in use for me at the LTER CI meeting without any reference to music functionality, which previously seemed to have backgrounded other functions important for me such as calendar sync, remote mount availability on a borrowed laptop, and disk archive options.
-its ability to make selected photo archives available in color
-the availability of video viewing
-the value for gaining experience with alternative portable modularized functions

Some items noticed in using the iPod
-no zoom capability with photos (common on most cameras) is a severe viewing limitation
-the iPod video format is ‘ipod mp4’, so a video converter application is required to transform from standard formats (QuickTime (.mov), RealPlayer (mpg4), and Windows Media Player (.wmv)) to the iPod mp4. Although a number of free converters are available online, to minimize the number of applications on my Mac, I purchased for $30 the upgrade of QuickTime to QuickTime Pro with its built-in converter. Interesting sidenote: I was not able to purchase the QuickTime Pro unlock key at the UCSD bookstore (the last copy they received and sold was more than a year ago) nor from local Apple stores. My only option was to purchase it online from the Apple store, which worked flawlessly. I have tried to keep my own online transaction sprawl to a minimum (one company), but now I’m on record with two companies (Amazon and Apple).

So far I have used the iPod to ask others a question from a photo, to share a photo collection, and to demo a short video. To be continued …
