Search Analytics for Your Site. Louis Rosenfeld

different.

Whatever patterns emerged for you, you probably performed some or all of these four pattern analysis tasks without even realizing it:

You sampled the content to get a sense of what was there.

You grouped things that seemed to go together.

You may have sorted them to get a different collective look at them.

And you likely iterated your way through a few passes at them before you were satisfied with what you came up with.

If you’re not sure how to begin pattern analysis, try these four tasks and let your mind wander through the data. And remember, no single pattern is the “right” one!

The Fun Part

When I teach workshops on site search analytics, I have my students do a hands-on pattern analysis exercise. Even though I intentionally provide minimal instructions, it’s amazing how quickly they become absorbed in the process of detecting patterns and categorizing queries. (And no, my students aren’t exclusively data modelers, librarians, or other data nerds.) They arrive at conclusions that aren’t the same—their groups overlap, their interpretations differ—and they greatly enjoy comparing their results. Some revel in their differences; others are, frankly, a little uncomfortable with the lack of a “correct” set of patterns. That’s the precise moment at which I recommend that they consider following up their pattern analysis with a more qualitative technique, like card sorting, to determine the most common, if not “correct,” groupings.

Some of my students are skeptical of the idea of “playing” with the data. They feel they should be engaging in more serious statistical analysis. Yet many statisticians will tell you that you should perform what they call “Exploratory Data Analysis” before you tackle formal statistical testing.^[8] Until you first have a sense of the data—and its patterns—you might not have a good idea of which statistical tests you should be applying.

So let’s have some fun.

^[8] See Wikipedia for more on Exploratory Data Analysis: http://en.wikipedia.org/wiki/Exploratory_data_analysis

Getting Started with Pattern Analysis

Good news: you already have the tools (your brain included) necessary for pattern analysis. I’ll wager that you already own a copy of Microsoft Excel; if not, you could certainly create a spreadsheet in a free tool like Google Documents or OpenOffice. To get started, you’ll need some minimal data: queries (at least from the short head) and how frequently they were searched on your site. You might grab these by exporting them from your analytics application or by using this Perl script (www.rosenfeldmedia.com/books/downloads/searchanalytics/loganalyzer.txt) to parse them from your server log.

Next, create two columns in your spreadsheet, one for your unique queries and the other for their frequency counts, and import or paste in your data. If you know the date range for your data sample, note it in the spreadsheet so you won’t forget it later. Figure 3-1 shows an example of such a spreadsheet, containing common queries from Michigan State University’s Web site. We’ll return to the MSU example throughout this chapter.
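If your raw material is a plain list of queries rather than a ready-made export, the two-column table can be sketched in a few lines of Python. This is a minimal illustration, not the Perl script mentioned above: the function name `query_frequencies`, the sample queries, and the light normalization (trimming whitespace and lowercasing) are all assumptions made for the example.

```python
from collections import Counter

def query_frequencies(raw_queries):
    """Return (query, count) pairs, most frequent first (ties sorted alphabetically)."""
    # Assumption: trim whitespace and lowercase so "Map " and "map" count together.
    # Skip this step if you want to study capitalization or spacing patterns.
    cleaned = [q.strip().lower() for q in raw_queries if q.strip()]
    counts = Counter(cleaned)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Made-up sample data, for illustration only.
raw = ["map", "Registrar", "map ", "stuinfo", "registrar", "map"]
table = query_frequencies(raw)

# Tab-separated output pastes directly into two spreadsheet columns.
for query, count in table:
    print(f"{query}\t{count}")
```

Normalizing before counting is a judgment call: it tidies the short head, but it also hides some of the variation you may later want to look at.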

http://www.flickr.com/photos/rosenfeldmedia/5690405511/

Figure 3-1. A week’s worth of Michigan State University queries, sorted by frequency.

I’ve created a souped-up version of this spreadsheet (shown in Figure 3-2), which I encourage you to download and use as a template. (You can get it here: http://rosenfeldmedia.com/books/searchanalytics/blog/free_ms_excel_template_for_ana/.) Here’s what the spreadsheet contains:

Rank: Each query’s rank in terms of frequency.

Percent: The percentage of overall search activity that each unique query is responsible for (out of all your site’s search activity).

Cumulative Percent: The percentages of all the queries added up. If you’re looking at query #3 (registrar), the Cumulative Percent shows the sum of the first three queries’ percentages (4.6391 = 3.0436 + 0.8509 + 0.7446).

Count: How often each unique query was searched.

Unique Query: The query itself.

Link: I’ve done a little fancy programming to provide a live link to execute each unique query on the Michigan State Web site. This just makes it easier to test each query.
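The Rank, Percent, and Cumulative Percent columns are simple to derive from the counts. The sketch below shows one way to do it in Python; it does not reproduce the template's own formulas. The function name, the dictionary keys, and the sample counts are hypothetical, and ties are simply ranked by position, which is an assumption.

```python
def add_rank_and_percent(rows):
    """rows: (query, count) pairs already sorted by descending count."""
    total = sum(count for _, count in rows)
    if total == 0:
        return []
    cumulative = 0.0
    result = []
    for rank, (query, count) in enumerate(rows, start=1):
        percent = 100.0 * count / total  # share of all search activity
        cumulative += percent            # running sum of the shares so far
        result.append({
            "rank": rank,
            "percent": percent,
            "cumulative_percent": cumulative,
            "count": count,
            "query": query,
        })
    return result

# Made-up counts, for illustration only.
sample = [("query a", 50), ("query b", 30), ("query c", 20)]
report = add_rank_and_percent(sample)
for row in report:
    print(row["rank"], f'{row["percent"]:.4f}', f'{row["cumulative_percent"]:.4f}',
          row["count"], row["query"])
```

The cumulative column works exactly like the registrar example: the value at rank 3 is the sum of the first three percentages.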

I’ve also provided some other information at the top (such as the average number of terms per query), as well as a pair of fancy Zipf distributions to help you visualize the data.

http://www.flickr.com/photos/rosenfeldmedia/5690405491/

Figure 3-2. The same data as in Figure 3-1—now all gussied-up.
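The summary figures at the top of the template can be approximated outside a spreadsheet as well. The sketch below rests on two assumptions: that "terms" means whitespace-separated words, and that the average is weighted by how often each query was searched (the template may calculate it differently). It also prepares rank and count on logarithmic scales, which is the usual way to plot a Zipf distribution. The function names and the sample counts are hypothetical.

```python
import math

def average_terms_per_query(rows):
    """Mean number of whitespace-separated terms, weighted by search count."""
    total_searches = sum(count for _, count in rows)
    if total_searches == 0:
        return 0.0
    total_terms = sum(len(query.split()) * count for query, count in rows)
    return total_terms / total_searches

def zipf_points(rows):
    """(log10 rank, log10 count) pairs; a Zipf-like curve looks roughly straight."""
    ordered = sorted(rows, key=lambda row: -row[1])
    return [
        (math.log10(rank), math.log10(count))
        for rank, (_, count) in enumerate(ordered, start=1)
        if count > 0  # log10 is undefined for zero counts
    ]

# Made-up counts, for illustration only.
sample_rows = [("map", 6), ("stu info", 2), ("campus map pdf", 2)]
avg_terms = average_terms_per_query(sample_rows)
points = zipf_points(sample_rows)
print(f"average terms per query: {avg_terms:.2f}")
```

The pairs returned by `zipf_points` can be handed to any charting tool; nothing here depends on a particular plotting library.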

Patterns to Consider

Now go ahead and take a deeper look at the MSU queries and start playing with them. Stare at them for a bit, scan up and down, and then stare again. Do you detect anything interesting, or surprising, about the language searchers are using in their queries? Were you surprised that stuinfo is used more frequently than stu info? Or that map was as high (or low) as it was? Did you happen to notice lots of queries that seemed to deal with places on campus and others that seemed to be about courses?
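Spotting pairs like stuinfo and stu info by eye gets harder as the list grows, so a small script can flag candidates for you to review. The sketch below groups queries that become identical once case, spaces, hyphens, and underscores are removed. The matching rule, the function name, and the counts are assumptions made for illustration; they are not MSU's real numbers, and the output is only a starting point for your own judgment.

```python
import re
from collections import defaultdict

def group_variants(rows):
    """Group queries that differ only in case, spacing, hyphens, or underscores."""
    groups = defaultdict(list)
    for query, count in rows:
        # Collapse the query to a comparison key, e.g. "stu info" -> "stuinfo".
        key = re.sub(r"[\s\-_]+", "", query.lower())
        groups[key].append((query, count))
    # Keep only keys with more than one distinct spelling.
    return {key: members for key, members in groups.items() if len(members) > 1}

# Made-up counts, for illustration only.
rows = [("stuinfo", 40), ("stu info", 25), ("map", 30), ("campus map", 10)]
variants = group_variants(rows)
for key, members in variants.items():
    print(key, members)
```

A rule this simple will miss misspellings and synonyms, which is where reading the list yourself still pays off.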

With each new pass at the data, you’ll come up with more questions. Following are some of the types of patterns you might encounter when analyzing your own query data.
