Since I recently got into this whole blogging thing, and I’m someone who tends to exhaustively research anything I’m interested in (I guess that’s why I like my job), I wanted to share a few WordPress tips I’ve worked out that may help others. One of those is working out the right robots.txt file for your WordPress site. The goal is not so much SEO (search engine optimization) as making sure the right content is indexed by search engines like Google, and the wrong stuff isn’t. I’ll break this down into fairly basic terms for people who may be new to the process. There are a variety of blog posts on the subject, and this is my own spin on the issue. The key is that you don’t want to block too much, so try to block only things that are meaningless to readers (like script files).
The root folder of your site can have a text file in it named robots.txt. This file contains rules you set that tell search engines which files and folders they may crawl and which ones are off-limits. Google has a bad rap for ignoring robots.txt files, but I believe that comes from confusion about how Google interprets the file. By playing with their robots.txt analysis tool I found something that I think many neophytes are missing.
First, a general primer. Below are the first few lines from my robots.txt file.
User-agent: *
# disallow all files in these directories
Disallow: /blog/wp-*
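To get a feel for what a wildcard rule like the one above catches, here is a minimal Python sketch. It assumes Google-style matching, where `*` stands for any run of characters and a trailing `$` anchors the end of the path. The helper names (`rule_to_regex`, `is_blocked`) and the sample paths are made up for illustration, and the sketch ignores `Allow` lines and per-crawler sections, so treat it as a rough check rather than a replacement for a search engine's own testing tool.

```python
import re


def rule_to_regex(path_rule):
    """Turn one robots.txt path rule into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path (Google-style wildcard handling).
    """
    anchored = path_rule.endswith("$")
    if anchored:
        path_rule = path_rule[:-1]
    # Escape the literal pieces and rejoin them with '.*' for each '*'.
    pattern = ".*".join(re.escape(part) for part in path_rule.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_blocked(url_path, disallow_rules):
    """Return True if any non-empty Disallow rule matches the path."""
    return any(
        rule_to_regex(rule).match(url_path) is not None
        for rule in disallow_rules
        if rule  # an empty Disallow line blocks nothing
    )


# Hypothetical paths, checked against the rule from the snippet above.
rules = ["/blog/wp-*"]
for path in ["/blog/wp-admin/", "/blog/wp-content/themes/style.css", "/blog/2008/05/hello/"]:
    print(path, "blocked" if is_blocked(path, rules) else "allowed")
```

Because robots.txt rules already match as prefixes, the trailing `*` here is redundant: `/blog/wp-` alone would block the same paths under this sketch.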
Continue reading ‘A robots.txt for WordPress’
As I set up my blog I want to post mini-reviews of the different components that I find useful. After a brief stint with Google Analytics I’ve convinced myself to buy Mint.

First, a word about how web analytics works. There are really two different classes:
- Log parsers. Your site keeps logs of visitor information in standard log files. There are many packages out there which read those logs and prepare reports. The information they can extract is limited by the kind of data that happens to be logged. A popular example would be AWStats.
- JavaScript-triggered loggers, which interrogate the visitor's browser for additional information and save the data to a SQL database. Examples here would be Google Analytics and Reinvigorate. Generally this second option provides you with much more information than the first, but the data are all collected by (for example) Google, stored and analyzed by Google, and presented back to you by Google. The examples listed are free to the web site owner, but they have a cost in that you are essentially trading away information about your users’ browsing habits. This may or may not be a concern for you.
Now back to Mint. Functionally, Mint falls into the second category in that it collects additional information from your users and stores it in a SQL database, but with two key differences. First, Mint is a PHP software package that you buy for a one-time price of $30 per domain (sub-domains are included in the main domain license) and install on your hosting in the /mint subdirectory. So it is running on your own server, and you own the data. Note that you also need to provide it with a SQL database to store data, but for most hosting companies setting up a new database is a 10-second process, and Mint will happily coexist in an existing database if you prefer not to create another.
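To make the first category more concrete, here is a rough Python sketch of what a log parser does at its core: read each log line, pull out the fields, and tally them. It assumes the server writes the common Apache "combined" log format. The regular expression, the `count_page_hits` helper, and the sample lines are all made up for illustration; this is not how AWStats itself is implemented.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_page_hits(lines):
    """Count successful (status 200) requests per path."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


# Made-up sample lines, for illustration only.
sample = [
    '203.0.113.7 - - [10/May/2008:13:55:36 -0700] "GET /blog/ HTTP/1.1" 200 2326 "http://example.com/" "Mozilla/5.0"',
    '203.0.113.9 - - [10/May/2008:13:56:02 -0700] "GET /blog/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/May/2008:13:56:10 -0700] "GET /missing HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
]

for path, count in count_page_hits(sample).most_common():
    print(path, count)
```

Everything a report can show has to come from those logged fields, which is why this approach yields less detail than a script running in the visitor's browser.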
Continue reading ‘Mint web statistics’
I came across this pretty neat site that compares search terms across two search engines, Yahoo and Google.

Enter a search term and the rows of dots indicate the order in which results appear (e.g. the far left dot is the first result for each search engine). The blue dots are results which occur in both Yahoo and Google, and a blue line connects each matching pair. This is a way to see visually whether Google and Yahoo rank pages on a topic similarly.