The ongoing effort to bring Google Search to the OI sites has made some progress recently. While it turns out that we can’t register sub-domains of ucsd.edu for Google Public Service Search, we can tap into UCSD’s Google Search Appliance. There are two ways we can do this:

1) We could create a custom front end, called a ‘client’, that would get registered with the search appliance. We could then create forms that queried the search appliance and requested that the results be displayed in a specific client. Hidden fields in the form could restrict the searches to one or more sub-domains.

2) We could use a PHP script as a proxy to send queries to the search appliance, again restricting searching to one or more sub-domains. The script would then get the results back as XML, parse them, and redisplay them in any format we choose.
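To make the second option a bit more concrete, here is a minimal sketch of the kind of request a proxy might send. It is written in Python purely for brevity, not because the real script is. The host name and the `client` and `site` values are placeholders I made up, and the parameter names (`q`, `output`, `as_sitesearch`) are my reading of the Google search protocol, so check them against the appliance documentation. Using OR-ed `site:` operators to cover several sub-domains is also an assumption.

```python
from urllib.parse import urlencode

# Hypothetical appliance address; the real one is not given in this post.
APPLIANCE_URL = "http://search.example.edu/search"


def build_search_url(terms, subdomains, start=0, num=10):
    """Build a query URL that limits results to the given sub-domains."""
    params = {
        "q": terms,
        "output": "xml_no_dtd",        # ask for XML instead of rendered HTML
        "client": "default_frontend",  # placeholder front-end name
        "site": "default_collection",  # placeholder collection name
        "start": start,                # offset of the first result
        "num": num,                    # results per page
    }
    if len(subdomains) == 1:
        # One sub-domain: use the dedicated restriction parameter.
        params["as_sitesearch"] = subdomains[0]
    elif len(subdomains) > 1:
        # Several sub-domains: append OR-ed site: operators to the query
        # (assumed to be accepted by the appliance).
        site_filter = " OR ".join("site:" + d for d in subdomains)
        params["q"] = terms + " " + site_filter
    return APPLIANCE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("plankton", ["cce.ucsd.edu"]))
```

The same parameters could just as easily be hidden fields in a form, which is roughly what the first option would amount to.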

So far, I’m not even sure if the first option is offered by UCSD - and even if it is, the second option seems to offer more autonomy and flexibility. So, I’ve started work on a PHP script that implements option two. Using the PHP XML parser functions and the Google search protocol, it’s been relatively painless to get a simple search interface up and running. My first experiment has been a simple test search page on the CCE site (I’m still working on the script, and I make no assurances that this page will be working when you click the link).
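For anyone curious what the parsing step involves, here is a rough illustration in Python (again, only a sketch, not the PHP script itself). The sample XML is hand-written to resemble what I understand the appliance returns: one `R` element per result, with `U`, `T` and `S` children for the URL, title and snippet. Treat those element names and the sample data as assumptions rather than a record of real output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, not real appliance output.
SAMPLE_XML = """<GSP VER="3.2">
  <Q>plankton</Q>
  <RES SN="1" EN="2">
    <M>2</M>
    <R N="1"><U>http://cce.example.edu/a.html</U><T>Page A</T><S>First snippet</S></R>
    <R N="2"><U>http://cce.example.edu/b.html</U><T>Page B</T><S>Second snippet</S></R>
  </RES>
</GSP>"""


def parse_results(xml_text):
    """Return a list of dicts with the URL, title and snippet of each result."""
    root = ET.fromstring(xml_text)
    results = []
    for r in root.iter("R"):  # one R element per search result
        results.append({
            "url": r.findtext("U", default=""),
            "title": r.findtext("T", default=""),
            "snippet": r.findtext("S", default=""),
        })
    return results


if __name__ == "__main__":
    for hit in parse_results(SAMPLE_XML):
        print(hit["title"], "-", hit["url"])
```

Once the results are in a plain list like this, redisplaying them in whatever layout a given site needs is just a matter of templating.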

So what comes next? The search interface needs more polishing, especially if we want to implement features similar to Google’s advanced search page. Also, now that we’re implementing searches, some issues have come to the forefront. For example, we need to consider page titles - right now, most of the pages on a given site have the same title, which makes search results less informative (thanks, Shaun, for pointing this out). Our current design templates weren’t created with a search bar in mind, so we’ll need to work this element in somehow. We might also want to start adding keyword metadata to our pages for more meaningful search results. I’m sure there are other issues to be addressed as well - any thoughts?