Data Strategy

October 12, 2007

Controlled experiments on the Web

Filed under: Datamining, People and Data, Statistical experimentation — chucklam @ 2:49 am

Ronny Kohavi of Microsoft (previously Amazon) presented a paper this year at KDD called “Practical Guide to Controlled Experiments on the Web.” As far as I know, it’s the first “academic” paper on what’s often called A/B testing. I say “academic” in quotes because the paper is relatively lightweight and is geared towards an audience of industry practitioners.

Most people who work on A/B testing are computer scientists who know more about systems and databases than statistics, and unfortunately this paper doesn’t do much to correct that. (And by statistics I mean a specific body of knowledge that has been accumulated over the last couple hundred years, not pseudo-scientific Web 2.0 marketese like “long tail” or gratuitous name dropping involving Gauss, Bayes, and Pareto.) However, the paper does point out some system design and usability issues that Amazon and others have learned from their experience. For example, to maintain a consistent user experience, each user must be assigned to the same experimental group on every visit to the site. Since maintaining that state across distributed servers introduces scaling and performance issues, group assignment based on hashing the user ID is the preferred approach: any server can recompute the assignment from the ID alone, with no shared lookup.
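To make the hashing idea concrete, here is a minimal sketch in Python. It is my own illustration, not code from the paper: the function name assign_group, the experiment-name salt, the choice of SHA-256, and the 50/50 default split are all assumptions. The point is only that the group is a pure function of the user ID, so no server has to remember anything.

```python
import hashlib


def assign_group(user_id: str, experiment: str, treatment_fraction: float = 0.5) -> str:
    """Deterministically map a user to "control" or "treatment".

    Hypothetical helper for illustration. The experiment name is mixed into
    the hash so that the same user can land in different groups in
    different experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and scale it into [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if bucket < treatment_fraction else "control"


if __name__ == "__main__":
    # The same user gets the same answer on every visit and on every server.
    print(assign_group("user-12345", "checkout-button-color"))
    print(assign_group("user-12345", "checkout-button-color"))
```

A cryptographic hash is overkill for security purposes here, but it spreads user IDs evenly across buckets, which matters more for an experiment than raw speed.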

Given the lack of good technical publications on doing controlled experiments on the Web, this paper is certainly a welcome start.


August 7, 2007

Advertising’s digital future

Filed under: Advertising, Datamining, Personalization, Statistical experimentation — chucklam @ 12:23 pm

The New York Times yesterday had an article on advertising’s digital future. It mostly discussed the view of David W. Kenny, chairman and chief executive of Digitas, the advertising agency in Boston that was acquired by the Publicis Groupe for $1.3 billion six months ago.

The plan is to build a global digital ad network that uses offshore labor to create thousands of versions of ads. Then, using data about consumers and computer algorithms, the network will decide which advertising message to show at which moment to every person who turns on a computer, cellphone or — eventually — a television.

“Our intention with Digitas and Publicis is to build the global platform that everybody uses to match data with advertising messages,” Mr. Kenny said.

That is, advertising in the future will be much more data driven. Now, if we take that vision for granted, then the interesting question becomes: who will end up controlling what data? No doubt Mr. Kenny would love to see advertising agencies being the central gateway, if not the outright owner, of all such data. However, privacy advocates, media companies, new “intermediaries”, and search engines like Google all have different ideas about their ownership of data and their place in this advertising future. It’s too early to tell how things will turn out, and everyone is making educated guesses.

“How do we see Google, Yahoo and Microsoft? It’s important to see that our industry is changing and the borders are blurring, so it’s clear the three of those companies will have a huge share of revenues which will come from advertising,” said Maurice Lévy, chairman and chief executive of the Publicis Groupe.

“But they will have to make a choice between being a medium or being an ad agency, and I believe that their interest will be to be a medium,” he added. “We will partner with them as we do partner with CBS, ABC, Time Warner or any other media group.”

I wonder if Mr. Lévy has considered the possibility that in this digital future, Google may in fact be CBS, ABC, and Time Warner combined.
