I found and added a plugin that extends the site search to cover comments as well as posts. Try it out! Search for dos2unix.
August 2005
Tue 16 Aug 2005
Two days after I downloaded WordPress, they issued an upgrade. We are now running version 1.5.2; previously we were on 1.5.1.3. From what I can tell, there are no known security holes with this version. We can refer to the WordPress Development Blog to keep track of future updates.
Tue 16 Aug 2005
I often edit files on my local machine using applications such as Dreamweaver or Excel. When I read these files in a UNIX terminal, the line breaks are not recognized and show up as ^M (carriage return) characters instead. This makes the file appear jumbled and virtually unreadable. Here’s a fix to convert those ^M characters back to newline characters so that the files display correctly.
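Open the file in vi and type the command
:%s/\r/\r/g
This performs a global search and replace through the file (the % applies it to every line rather than only the current one), replacing \r characters with \r. It looks like a no-op, but in Vim the \r in the search pattern matches a carriage return (the ^M character), while the \r in the replacement inserts a line break.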
Note: You can change your settings in Dreamweaver so that files are saved with UNIX line breaks. Go to Preferences > Code Format and set Line Break Type to LF (Unix). This eliminates the problem at the source for Dreamweaver files. I have not yet discovered a solution for Excel files, so the vi workaround works well for those.
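For files that are awkward to open in vi, the same conversion can be scripted. This is only a minimal sketch in Python, not something I have run against our files; the function names normalize_newlines and convert_file are made up for this example.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF (Windows) and bare CR (classic Mac) line endings to LF."""
    # Replace CRLF first so that it does not turn into two line breaks.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place with LF line endings."""
    # Binary mode keeps Python from translating line endings on its own.
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(normalize_newlines(data))
```

Calling convert_file on a text export from Excel should leave one record per line. The script overwrites the file, so it is worth keeping a copy of the original.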
Tue 16 Aug 2005
I was unable to post comments yesterday, despite being logged in. (Logged-in users can post comments; anonymous users cannot.) I kept getting prompted to (re)login. I discovered the root of the problem in the Admin settings: Options > General. I changed the WordPress URI value from “http://oceaninformatics.ucsd.edu/wordpress” to “http://oceaninformatics.ucsd.edu/blog”.
I had originally created the symbolic link “blog” to be an alias for “wordpress”. Before I changed the settings, cookies had been stored with the path “/wordpress”. These cookies were only retrievable if our site url read “http://…/wordpress”. Because we are using the url “http://…/blog”, the site was unable to read any cookies and authenticate its users. Changing the WordPress URI value in the settings fixed the problem.
To recreate this problem as an example, follow these steps:
- Log in to the site. You will be taken directly to your admin “Dashboard” page.
- View the blog site (click on View site)
- Click on the title of the first post you see. You are taken to a new page that displays only that entry.
- Scroll down the page and you will see a text box to insert a comment.
- Change the url in the address bar by replacing “blog” with “wordpress” and go to that new address.
- Scroll down the page again. The text box is gone and you are instead prompted to log in.
The lesson here is to be careful when using symbolic links with cookies.
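The behavior above follows from how browsers decide whether a cookie applies to a URL. Here is a rough sketch of the path-matching rule described in RFC 6265 (section 5.1.4), written in Python. The function names and the example.org URLs are hypothetical, and this is not how WordPress itself checks anything; it only models the browser side.

```python
from urllib.parse import urlsplit


def path_matches(cookie_path: str, request_path: str) -> bool:
    """Return True if a cookie stored with cookie_path applies to request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end on a path-segment boundary, so "/blog"
        # covers "/blog/x" but not "/blogroll".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def cookie_sent(cookie_path: str, url: str) -> bool:
    """Check whether a cookie with cookie_path would be sent to url."""
    return path_matches(cookie_path, urlsplit(url).path or "/")


# Hypothetical URLs standing in for the two aliases of the same directory.
print(cookie_sent("/wordpress", "http://example.org/wordpress/?p=1"))
print(cookie_sent("/wordpress", "http://example.org/blog/?p=1"))
```

The symbolic link makes both URLs serve the same files, but the browser only compares strings, so a cookie stored under “/wordpress” is never offered to a page requested under “/blog”.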
Mon 15 Aug 2005
I installed an email notification plugin called Subscribe2 which notifies users each time there is a new post. This entry will act as a test to see if it works. I also added the list of users (Authors) in the sidebar. This provides another means to filter through entries, in addition to archives by date and categories.
Jerry mentioned Blojsom, the blog engine that runs on Tiger (Mac OS X Server). One review suggests that WordPress has more advanced features (for now), but that blojsom is a nice blogging platform. Furthermore, Blojsom would use the same integrated authentication system for our server that other distributed applications use. WordPress stores its userbase in mysql. This makes user-integration with other applications a bit more tricky (requires scripts!), but possible.
Random things to do:
- Remove Xoops from underneath interoperability.ucsd.edu. Converting this to static site with distributed tools.
- Rewrite tasks on white board
Let’s hope the email notification works….
Mon 15 Aug 2005
I was a member of the PaCOOS mapping team, taking on the role of “Tools Specialist”. I was responsible for working with VINE, a Java application used to perform the mappings. PaCOOS datasets are similar to CTD datasets; both seem to share many terms. Unlike the other groups, which focused on biological variables, we focused mainly on physical variables: latitude, longitude, temperature, depth, date, time, etc.
Some observations:
- Latitude and Longitude variables can be stored as a decimal, or as a union of degrees, minutes, and seconds. We created a new relationship “unionof_#_of_n” where n is the total number of elements in the union. This relationship helped us map longitudes and latitudes that differed in the two stored formats.
- Salinity may be a “proxy” for Conductivity. Depth may be a “proxy” for Pressure. Though we detected the pseudo-relations in those pairs of variables, we did not capture those relations in our mapping file during the workshop.
- Date/Time is a beast that is broken into 6 individual components: year, month, day, hour, minute, second. After much debate on how these components are related, we came to the conclusion that each component is a class, and that the classes are organized in hierarchical fashion. Year is the super class, Month inherits from Year, Day from Month, and so on, with Second as the bottom child class. This hierarchy marks an order of significance: some datasets may contain only a year variable, and others only year and month. We made the assumption that no dataset contains a single date/time component without those higher in the hierarchy. Though we detected how various date and time variables may map to each other, it was not intuitive (or possible) to accomplish this mapping using the VINE tool alone.
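To make the latitude/longitude observation above concrete, here is a small sketch in Python of how a union of three elements (degrees, minutes, seconds) relates to the single decimal form. The function name and the use of a hemisphere letter for the sign are my own assumptions; the datasets we mapped may record the sign differently.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float,
                   hemisphere: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    # Assumed convention: south latitudes and west longitudes are negative.
    return -value if hemisphere in ("S", "W") else value
```

For example, dms_to_decimal(32, 30, 0, "N") gives 32.5, so a dataset storing three separate columns and one storing a single decimal column can describe the same position.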
It is possible to map each term to one of the 6 classes. For instance, a date variable with the format yyyy-mm-dd maps to the Day class. Likewise, a time variable with the format hh:mm:ss maps to the Second class. Further, should the Second class be “narrowerThan” the Day class because it specifies a higher precision?
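One way to picture the six classes is as an ordered scale of precision. The sketch below is in Python rather than OWL and is only an illustration of the idea, not what we built in VINE. The names Precision, FORMAT_PRECISION, is_narrower, and components_required are invented for this example, and treating higher precision as “narrowerThan” is just one possible answer to the question above.

```python
from enum import IntEnum


class Precision(IntEnum):
    """Date/time components ordered from coarsest (YEAR) to finest (SECOND)."""
    YEAR = 1
    MONTH = 2
    DAY = 3
    HOUR = 4
    MINUTE = 5
    SECOND = 6


# The two formats mentioned above, mapped to the finest component each carries.
FORMAT_PRECISION = {
    "yyyy-mm-dd": Precision.DAY,
    "hh:mm:ss": Precision.SECOND,
}


def is_narrower(a: Precision, b: Precision) -> bool:
    """True if a is more precise than b (one reading of 'narrowerThan')."""
    return a > b


def components_required(p: Precision) -> list:
    """Components a value of precision p carries, assuming no gaps in the hierarchy."""
    return [c.name for c in Precision if c <= p]
```

Under this reading, is_narrower(Precision.SECOND, Precision.DAY) is true, and components_required(Precision.DAY) lists YEAR, MONTH, and DAY, which matches our assumption that no component appears without the ones above it.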
Notes from our sessions can be viewed here.
Ontology Observation
I agree there will never be a definitive global ontology that covers everything. David Remsen cited modern biology during his keynote, explaining how the classification system is flawed because the same species may have multiple scientific names. A few reasons for this:
- Some species names change over time
- Some scientists assign new names without realizing older ones exist
- Scientists may disagree on the class of a specific species, resulting in that species being stored in two separate places and thus having 2 separate paths through the biological taxonomic tree.
It is a tedious and on-going task to keep up to date with all possible names for a single species. Further, with constant new discoveries and disagreements between scientists, it is inherently difficult (and near impossible?) to maintain a definitive ontology.
MMI focuses on merging some terms between ontologies, and mapping other terms across ontologies, but not on merging all ontologies together into one global source. If we can provide search engines with enough information about how separate vocabularies are related to each other, we eliminate the need for a single definitive source.
CTD Data Merging
On Wednesday, Cyndy gave a talk on merging CTD data (back) together to provide scientists with a convenient way to retrieve data with a single integrated search rather than multiple brute-force searches. She stressed how important it is for scientists to capture metadata along with data to aid the integration process. The problem is that scientists tend to skip capturing metadata because they already remember everything themselves. The task is tedious and feels like an unnecessary waste of time. Unfortunately, it becomes more difficult to refer to data in the future without its proper metadata. This includes physical variables such as date, time, and location, but should also include more granular information such as cruise event, CTD cast, and bottle number.
Roy Lowry commented on how “nightmarish” it is to splice datasets together keying off of depths, particularly when some scientists adjust depth values in their datasets to correct for offsets while others leave their depth values unchanged. It would be much easier if all scientists recorded the bottle number for each CTD cast.
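A small sketch of why bottle numbers help, in Python. The function merge_on_bottle, the field names, and all sample values are made up for illustration; real datasets would need their own column names and handling for missing bottles.

```python
def merge_on_bottle(left: list, right: list) -> list:
    """Join two lists of sample records on (cast, bottle) rather than on depth."""
    index = {(r["cast"], r["bottle"]): r for r in right}
    merged = []
    for row in left:
        match = index.get((row["cast"], row["bottle"]))
        if match is not None:
            # Fields from the right record are added; on a name clash
            # (such as depth) the value from the left record is kept.
            merged.append({**match, **row})
    return merged


# Made-up records: the two depths disagree because one was offset-corrected.
ctd = [{"cast": 1, "bottle": 3, "depth_m": 50.0, "temperature_c": 14.2}]
chl = [{"cast": 1, "bottle": 3, "depth_m": 51.5, "chlorophyll": 0.8}]
print(merge_on_bottle(ctd, chl))
```

A join on depth would miss this pair because 50.0 and 51.5 do not match, while the cast and bottle number identify the same water sample in both records.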
Mon 15 Aug 2005
I attended the Marine Metadata Interoperability Project workshop in Boulder, CO last week. The MMI Project has long-term goals that include merging scientific data together from various sources to ease data querying using keywords across separate ontologies. Currently there exist many datasets, each with its own set of vocabulary words (for attributes, units, etc). Searching for data in one dataset may require different vocabulary keywords than those used to search a second dataset. Users are limited when they do not know all the possible keywords for a data search. MMI aims to alleviate this problem by mapping and merging like terms spanning various ontologies, ultimately providing more power to search engines by recognizing relationships between ontologies and returning more results.
We were separated into different domain groups for the workshop:
- CTD
- Waves and Currents
- Chlorophyll
- Sensors
- Benthic Habitat
- PaCOOS
Each team mapped terms across ontologies that were specific to that team’s domain. The three basic relations used in the mapping process were ‘sameAs’, ‘narrowerThan’, and ‘broaderThan’. All of the ontology and mapping files are stored as OWL documents, and the mapping files were made available online at the end of the workshop. This initiative is a hopeful stepping stone for future mapping work and crosswalks between ontologies.
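To show how a search engine might use these three relations, here is a rough sketch in Python. It is not the MMI tooling and does not read OWL; the triples, the term names with their “ctd:” and “pacoos:” prefixes, and the expand_query function are all hypothetical. It also follows only direct mappings, not chains of them.

```python
# Hypothetical mappings as (term_a, relation, term_b) triples; the terms are
# illustrative and are not taken from the workshop's mapping files.
MAPPINGS = [
    ("ctd:temp", "sameAs", "pacoos:temperature"),
    ("ctd:sea_surface_temp", "narrowerThan", "pacoos:temperature"),
]


def expand_query(term: str, mappings: list) -> set:
    """Return term plus every directly mapped term that is equivalent or narrower."""
    results = {term}
    for a, relation, b in mappings:
        if relation == "sameAs":
            # Equivalence works in both directions.
            if a == term:
                results.add(b)
            elif b == term:
                results.add(a)
        elif relation == "narrowerThan" and b == term:
            results.add(a)  # a is narrower than the search term
        elif relation == "broaderThan" and a == term:
            results.add(b)  # a broaderThan b means b is the narrower term
    return results
```

Searching for “pacoos:temperature” with these mappings would also look for “ctd:temp” and “ctd:sea_surface_temp”, so one keyword reaches data described with either vocabulary.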
Fri 12 Aug 2005
I decided I want to try using a blog again to help keep notes and capture ideas. I found WordPress (from Ethan’s blackrimglasses.com) and it’s a very nice open-source project. I like this better than Blogger because it uses categories (which Blogger lacks!) to organize posts, and also works as a stand-alone app (no need to go through a 3rd-party’s server).
I’m not going to worry about defining categories now. They can be created over time. For this post, I created the “Tools” category. Posts under this category may involve discussions about various technologies used in our Ocean Informatics environment.

