Searching for the Moon

Shannon Clark's rambles and conversations on food, geeks, San Francisco and occasionally economics

Posts Tagged ‘metrics’

Abstractions for Metrics and Targeting – extending OpenSocial

Posted by shannonclark on February 20, 2008

Tonight I attended the MIT/Stanford Venture Lab event “Shaking the Money Tree of Multi-Platform Social Networks” which my friend Jeremiah Owyang moderated. It was a sold out event which drew a very diverse crowd of students, brand advertisers, technologists, entrepreneurs and analysts. The event was great with short presentations and an engaging panel discussion. During the panel discussion I asked a question, which in turn sparked an idea I am exploring in this post. In the next few weeks and months I will be engaging with many people around these ideas and I look forward to comments, criticism and suggestions about how to accomplish these two main ideas.

In the interest of full disclosure, when I asked my question tonight at the event I noted that I was not an impartial questioner – I have a stake in this. To elaborate further, the company I am in the midst of co-founding, Nearness Function, is an ad network working to bring brand advertisers to select applications – including very likely applications running in Social Networks and on OpenSocial. If both of my proposals below happen it certainly will help Nearness Function and our partners and clients – and I hope, will help the entire industry.

Tonight Kevin Marks of Google discussed three important ways in which OpenSocial creates abstractions.

  1. Abstracting the Friend networks of the “viewer” and “owner”, allowing these to be queried and traversed.
  2. Abstracting data persistence for applications
  3. Abstracting the event (“news”) feed which the use of an application can generate

My question and now proposal would add two abstractions – to OpenSocial and likely to more of the web in general.

  1. Abstract metrics
  2. Abstract targeting data

Taking these points in detail, here is what I am suggesting. These are my initial thoughts – I welcome feedback and further discussions.

Abstract metrics

The web 1.0 metrics revolved around “pageviews” and later, slightly more refined, around “impressions” or “uniques”. In the past few years, with the rise of pay-per-click advertising both against search results and increasingly elsewhere across the web, “clicks” and a resulting calculation of “eCPM” (effective cost per thousand) have become commonly used metrics for success. Terms like “uniques” and “impressions” get used a lot – though exactly how to define and calculate them is not always clear. Even “clicks” have to be recalculated to take into account “click fraud” – i.e. automated or malicious attempts to game pay-per-click systems, often by automating clicks on links (sometimes to generate income, but more subtly to exhaust a competitor’s budget).

For OpenSocial, and for much of the web of 2008, I would suggest that we start to think about abstractions for metrics that fit this new environment.

My initial suggestion would be to define active vs. inactive states so that an application can report back when a user is active (with a uniform definition of what that means) within the application. A further refinement to this abstraction would be to measure the time spent in each state, again with uniform ways to start and stop that clock.

Additionally, a defined way to count events within the use of an application – potentially including a measure of where within the application attention is paid – could be highly useful as well. This might start by building on similar tools that are already used to track web activity and interactions. In the OpenSocial case (and the widget case more broadly), one complication is how to log and report these metrics back in a standard manner.
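As a sketch of how such a metrics abstraction might look, here is a minimal, hypothetical TypeScript implementation of the active/inactive clock and event counting described above. Every name and field here is my own illustration – none of this is part of the OpenSocial specification:

```typescript
// Hypothetical sketch of an abstract metrics interface for a social
// application. Names are illustrative, not part of any OpenSocial spec.

type ActivityState = "active" | "inactive";

interface MetricEvent {
  name: string;        // app-defined event name, e.g. "photo_viewed"
  region?: string;     // optional: where in the app attention was paid
  timestamp: number;   // milliseconds since epoch
}

class SessionMetrics {
  private state: ActivityState = "inactive";
  private activeSince: number | null = null;
  private activeMs = 0;
  private events: MetricEvent[] = [];

  // Uniform way to start and stop the "active" clock.
  setState(state: ActivityState, now: number): void {
    if (state === this.state) return;
    if (state === "active") {
      this.activeSince = now;
    } else if (this.activeSince !== null) {
      this.activeMs += now - this.activeSince;
      this.activeSince = null;
    }
    this.state = state;
  }

  // Uniform way to count events within use of the application.
  logEvent(name: string, now: number, region?: string): void {
    this.events.push({ name, region, timestamp: now });
  }

  // A standardized report that could flow back to the hosting social
  // network, the application provider, and (with consent) third parties.
  report(now: number): { activeMs: number; eventCount: number } {
    const extra =
      this.state === "active" && this.activeSince !== null
        ? now - this.activeSince
        : 0;
    return { activeMs: this.activeMs + extra, eventCount: this.events.length };
  }
}
```

The point of the sketch is that the application only reports in a uniform shape; what counts as “active” and which events are logged would be defined by the standard, not by each network separately.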

Ideally these metrics should flow back to the hosting social networks and to the application provider, and potentially (again, this needs clarification) be shareable with third-party providers – such as an ad network (like the one I’m building).

Abstract targeting data

In the panel tonight when I asked about this the conversation shifted to a discussion about what an ad network can and can’t store based on the terms of service of a given social network. That is important, but it missed the point of my suggestion.

What I am proposing here is a bit more complex than the metrics: a set of abstractions around what data flows to the application (which in turn might flow to the systems used to target advertising) and could be employed for targeting. Abstractions are important because even seemingly “simple” elements can, in many cases, prove complex.

Take “gender” – in many, but by no means all, social networks this is a relatively simple “male” or “female” – however this is not always the case. For one, many people leave the field blank (i.e. undefined), and in at least some networks people of another gender (“transgender”, to take one example) can specify that. An abstraction might not resolve all possible nuances – but, for example, it might require the “undefined” case (and likely an “other” case) to be handled.
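To illustrate, a hypothetical type definition can force consumers of the data to handle the “undefined” and “other” cases explicitly; the exhaustive switch below fails to compile if a variant is forgotten. The type and function names are my own illustration:

```typescript
// Hypothetical gender abstraction: a binary field plus required
// "other" and "undefined" cases. Names are illustrative only.

type Gender = "male" | "female" | "other" | "undefined";

// A targeting consumer must handle every case explicitly; an
// exhaustive switch over the union means TypeScript flags any
// forgotten variant at compile time.
function genderSegment(g: Gender): string {
  switch (g) {
    case "male": return "segment-male";
    case "female": return "segment-female";
    case "other": return "segment-other";
    case "undefined": return "segment-unknown";
  }
}
```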

The issue that advertisers, marketers, application developers and social networks all face is that nearly everyone recognizes that targeting messages – if done well and reasonably – adds greatly to the impact and effectiveness of those messages (however you choose to measure that). But each party defines what that targeting should (or could) be, and how it should happen, in very different ways.

My suggestion would be to create some standard and abstracted ways to think about a common set of data that could be available at the point when targeting could occur. Note that this would be done in a manner that could also be kept in compliance with a given social network’s terms of service – e.g. on Facebook, the data which is shared would not be retained for more than 24 hours.

Here are a few of my suggestions for areas where a discussion could (and, I’d say, should) happen. I’m sure I’ve missed or overlooked some things – and in some cases the standard may be very simple.

  • Gender
  • Age – I’d suggest by ranges vs. specifics – with a standard set of ranges
  • Geographic location – potentially in two parts a) of viewer, calculated from IP address etc, at time of use and b) “home” (possibly “homes”) as stated in user profile
  • New user/viewer of a given application vs. returning user/viewer vs. has application installed on own system (ideally even if “own” profile is on a different social network)
  • Path to current session – i.e. via internal to social network search, via link on friend’s page, via link on stranger’s page, via external search, via external deep link
  • Technology – browser type, speed of connection, mobile phone vs computer vs console
  • Measure of frequency of interaction (with social network, with a given application) – i.e. you could target people who use the site every day and have for the past 6 months differently from people who use that particular site only once a week. You might also want to target users who are in their first X days of using an application or the underlying social network in a different manner than users who have been using it for months.

I’m sure there are others.
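As a sketch, the list above might translate into a standard targeting payload along these lines. Every field name, range, and value below is a hypothetical illustration of mine, not a proposed spec:

```typescript
// Hypothetical standard targeting payload, following the list above.
// All names, ranges, and values are illustrative only.

type AgeRange = "under18" | "18-24" | "25-34" | "35-49" | "50plus" | "undisclosed";
type VisitorStatus = "new" | "returning" | "installed";
type SessionPath =
  | "internal-search"   // via search inside the social network
  | "friend-link"       // via a link on a friend's page
  | "stranger-link"     // via a link on a stranger's page
  | "external-search"   // via an outside search engine
  | "external-deep-link";

interface TargetingData {
  gender: "male" | "female" | "other" | "undefined";
  ageRange: AgeRange;             // ranges, not specific ages
  currentGeo?: string;            // derived from IP etc. at time of use
  homeGeo?: string[];             // "home" (possibly "homes") from profile
  visitorStatus: VisitorStatus;
  path?: SessionPath;
  tech: { browser: string; device: "computer" | "mobile" | "console" };
  recentSessions: number;         // frequency, e.g. sessions in last 30 days
  daysSinceFirstUse: number;      // to target new vs. long-time users
}

// Example payload an application might pass to an ad network, subject
// to the host network's terms of service (e.g. no retention past 24hrs).
const example: TargetingData = {
  gender: "undefined",
  ageRange: "25-34",
  currentGeo: "US-CA",
  visitorStatus: "new",
  tech: { browser: "Firefox 2", device: "computer" },
  recentSessions: 12,
  daysSinceFirstUse: 3,
};
```

The optional fields reflect that not every network could (or would be permitted to) supply every value; the abstraction only guarantees the shape.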

The key point here is that what needs to be defined is not just the categories but abstract and standard ways to pass the relevant data. Keep in mind that, at the end of the day, the goal would be to:

  • Make the user experience better by presenting commercial messages more likely to be relevant
  • Make advertiser purchasing opportunities more clearly defined, so advertisers can compare apples to apples
  • Give the developer an easier set of tools to understand users and to offer, if desired, opportunities to advertisers
  • Give third-party providers, such as an ad network, at least a minimum set of data expected to be available which could be used

This abstract targeting data would not preclude additional information being used to enhance and improve results (where that data can be used under the applicable terms of service), but it would help improve targeting, especially for OpenSocial applications which cross multiple social networks. The final result (i.e. which specific ad to show, if any) might take into account a variety of additional factors (which ads were shown to that user or to similar users recently, what the actions of those users were, what various advertisers are willing to pay at the moment, etc.).

This is very much a work in progress. I’m sure there are some overlaps here with activities of various industry groups. I welcome suggestions, enhancements, and other comments!

Posted in advertising, Entrepreneurship, internet, networks, web2.0 | 2 Comments »

One more reason why Comscore and other “surveys” are unreliable

Posted by shannonclark on January 2, 2008

And that’s being kind…

Over the past few days the CA Security Advisor Blog has been posting about the spyware which is installed by Sears and Kmart when you join their “community” – spyware which reports back to Comscore and which, in essence, tracks every single web action, including secured transactions, that the infected users take. And it is this very pool of spyware-infested users which Comscore then relies upon to make sweeping statements about traffic and online activity across the Internet.

This is not a minor issue. These seriously flawed and troubling methods produce the numbers which, in turn, get cited as fact at major events (such as by keynote speakers on stage at AdTech NYC this past fall) and quoted heavily in the major press on and offline. Further, these Comscore numbers are then used to drive much of online ad spending.

On the Pho list I wrote the following analysis a few months back; I am quoting my email in its entirety. The context was a discussion on the list (which is focused on digital music) about the Radiohead In Rainbows experiment and a Comscore report which claimed that 60% of all users who visited the Radiohead site had downloaded the music without paying at all – a statement the band itself vigorously rejected.

A few observations and further fodder for discussion.

This past week I spent the last four days at Ad:Tech NYC. (I was covering Ad:Tech for my friend Allen Stern’s blog, Centernetworks.)

At MULTIPLE times over the course of the conference, most notably at many of the keynote presentations from senior ad industry leaders, the Comscore study was cited without question as being authoritative and as proof that most people won’t pay.

I see any number of very serious flaws in Comscore’s processes and methods.

Here are a few.

1. The underlying, basic assumption of any survey is that your sample population can serve as a proxy and basis to extrapolate up to the whole population. HOWEVER, I think that, especially online today, this is dangerously flawed. People do not act independently – instead people are deeply influenced by the behaviors of their peers, and online this effect can be multiplied many, many fold. In my own personal, online networks literally dozens upon dozens of people have sent twitters, emails, and written blog posts about In Rainbows, so my awareness of it (and of the purchases of it, often down to the exact price paid) is extremely high. Amongst my circle a very, very high percentage of people likely have visited the site, and most have downloaded the album (and in most cases paid over $5 for it).

2. On a related front – essentially zero percent of the population I mention above is part of Comscore’s survey population. Indeed, given that they install an explicit piece of spyware (with permission – but inherently it is spying on your every action) in return for “server based virus scanning, sweepstakes and helping the internet”, I doubt anyone I know would participate – nor would I allow or suggest it to ANYONE I know or advise. Not to mention that almost certainly most corporate security processes would not allow such technology on corporate machines (and with extremely good reason).

Thus almost certainly their survey population, though over 2M strong, is almost entirely home/personal computers (even while more and more workers have internet access at work and use that access for some personal use). Furthermore, since they are installing tracking in the browser, they miss: people who use multiple browsers on the same machine, potentially people who have multiple logins to the same computer (parents sharing a computer with children, for example), people with multiple computers, people with multiple internet-connected non-PC devices (browsing via a game console, for example), and mobile phone (such as iPhone) access.

I use multiple browsers on both my Vista tablet and my iMac desktop – not to mention my occasional use of Parallels on the Mac. I also make extensive use of my iPhone’s web access.

3. I would need to know much more in-depth technical details of how their browser plugin works – but on Vista computers, to take one very large example, by default the OS firewall will block many types of outbound reporting by applications without authorization (though that authorization may be granted as part of their install).

4. I find it somewhat telling that nowhere that I looked could I find a means to choose to join their survey population (they may do this deliberately, in that they want to have some “randomness” to their survey population).

BUT this implies that they are using online ads and other means to attract people to join their survey population.

I don’t know about the rest of you, but almost all savvy Internet users I know generally never click on any survey-driven offer (and/or never give anything accurate in an online survey). I certainly don’t answer surveys via online ads, nor most via emails or popups on a given page. Occasionally I will follow up from a conference by filling out their survey of attendees (but I would NEVER allow such a survey to install spyware on my computer).

5. While Comscore is tracking a survey population designed to measure the “typical” Internet user (though with billions of “internet users” this alone may be essentially meaningless on a global scale), Radiohead never intended to reach the “typical” user.

Radiohead wants to reach Radiohead fans first – mostly current fans, but also to grow and gain new fans. They sing in English, so a large portion of their fans speak English (though hundreds of millions of people online do not) – further, while they tour worldwide, almost certainly countries and cities where they have performed in the past have more fans than those where they have not.

With the millions of existing Radiohead fans (people who have bought their past albums, gone to their shows, etc.) I would guess that the percentage who visited their website is very, very high – and that the percentage who paid is also quite high.


Posted in advertising, internet, networks, web2.0 | 1 Comment »