How would you model the correlation between Foursquare checkins and actual attendance at a venue? Al, Jeff, and Nick have some fun ideas, but they're tough to keep track of in Twitter form. Maybe a Branch will help.
erg: this is why you'd wanna use subway ridership, I think: gist.github.com
Ok, we're most of the way there, I think. Checkins/area population gives us an idea (Times Sq and SoHo will probably have a lot relative to population, somewhere like Jamaica or Bay Ridge relatively few.) We'd use subway frequency as our first way of checking, even though not many people use subway checkins.
What if you took a step back from the subways? They're inherently tricky because they're simultaneously a location and a mode of transportation. What if you looked at venue types that are used primarily by people who live nearby, e.g., post offices or grocery stores (though real-attendance data on those might be tougher to get), and then controlled for census demographics?
I see some problems: 1) What if people check in to different types of venues differently in different neighborhoods? (E.g., someone checks into bars in Williamsburg but not the bars he goes to in Murray Hill, where he checks into his dry cleaner's instead. Or, obviously, some restaurants in a neighborhood are patronized by more of a Foursquare-using population than others in that same neighborhood.)
2) Some neighborhoods' check-in ratios will be distorted by one major destination, like Yankee Stadium in the South Bronx. SoBro probably has a relatively low Foursquare-using population, but it will have a relatively high ratio of total check-ins to population.
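To make the second problem concrete, here's a rough sketch of how one big destination can swamp a neighborhood's check-ins-to-population ratio. The venue names, check-in counts, and population are all made up for illustration, and `checkins_per_resident` is a hypothetical helper, not anything from Foursquare's API.

```python
def checkins_per_resident(venue_checkins, population, drop_top=0):
    """Check-ins per resident, optionally ignoring the `drop_top` busiest venues."""
    counts = sorted(venue_checkins.values(), reverse=True)
    return sum(counts[drop_top:]) / population

# Made-up numbers: one stadium dwarfs everything else in the neighborhood.
neighborhood = {"stadium": 9000, "gym": 600, "diner": 300, "bodega": 100}
population = 50000  # assumed resident count

raw_ratio = checkins_per_resident(neighborhood, population)
trimmed_ratio = checkins_per_resident(neighborhood, population, drop_top=1)

print(f"with stadium:    {raw_ratio:.2f} check-ins per resident")
print(f"without stadium: {trimmed_ratio:.2f} check-ins per resident")
```

Dropping the top venue is a blunt fix. It only shows how sensitive the ratio is to a single outlier.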
Yeah, then I bet you'd have to do a survey of venues to tune your model parameters. Say, send out 1,000 surveys to gather actual attendance so you could tune the ratios.
Spatially, you'd need to find out which subway station people would get off at for each venue. You could do this many ways (Voronoi, for example), but I bet a simple as-the-crow-flies calculation would work wonders.
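A minimal sketch of the as-the-crow-flies idea, assuming we have latitude/longitude for venues and stations. The station list and coordinates below are rough illustrative values, and `nearest_station` is a hypothetical helper. Assigning every venue to its closest station gives the same partition a Voronoi diagram of the stations would.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def nearest_station(venue, stations):
    """Return the name of the station closest to `venue`, a (lat, lon) pair."""
    return min(
        stations,
        key=lambda name: haversine_km(venue[0], venue[1], *stations[name]),
    )

# Approximate coordinates, for illustration only.
stations = {
    "Times Sq": (40.7553, -73.9870),
    "Bedford Av": (40.7171, -73.9566),
}
venue = (40.7580, -73.9855)  # a made-up venue in Midtown

print(nearest_station(venue, stations))
```

This ignores walking routes, rivers, and transfers, so it's only a first pass.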
You'd have to assume people are on average equally mobile, though you could also tune the ratio by user. I expect we could use a refinement algorithm like EM[1] or something to tune the model when survey data isn't available, though it'd be tricky to apply.
[1] en.wikipedia.org
The other necessary multiplier would be a category multiplier. I check in at most bars and restaurants, I don't check in at the doctor's office or the bodega. So you'd have to figure out relative frequency, but it's not clear to me that that would vary by neighborhood. I guess we'd have to check, but do different populations of people use Foursquare differently?
David Yanofsky (@YAN0) suggests another venue type that strikes me as super-useful: movie theaters.
twitter.com
Yeah, anything with tickets is great. Beyond that, we'd need to get data for maybe 8 of each kind of establishment, to see how much they vary by check-in multiplier, then see how much that correlates to neighborhood. We'd then have predictions for everything, and we'd check that against actual real-life visit counts for a larger sample of places.
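Here's a rough sketch of what that could look like: take a small sample of venues where we know both check-ins and real attendance, compute an attendance-per-check-in multiplier for each category, then use it to predict attendance elsewhere. Every number below is invented, and the function names are hypothetical. Using the median is my own assumption, chosen so that one odd venue doesn't skew the category.

```python
from statistics import median

# (category, check-ins, actual attendance) -- made-up sample data
sample = [
    ("movie_theater", 120, 2400),
    ("movie_theater", 80, 2000),
    ("movie_theater", 150, 3000),
    ("bar", 300, 1500),
    ("bar", 200, 1200),
    ("bar", 250, 1000),
]

def category_multipliers(rows):
    """Median attendance-per-check-in ratio for each category."""
    ratios = {}
    for category, checkins, attendance in rows:
        if checkins > 0:  # skip venues with no check-ins to avoid dividing by zero
            ratios.setdefault(category, []).append(attendance / checkins)
    return {category: median(values) for category, values in ratios.items()}

def predict_attendance(category, checkins, multipliers):
    """Estimate real attendance from a check-in count and a category multiplier."""
    return checkins * multipliers[category]

multipliers = category_multipliers(sample)
print(multipliers)
print(predict_attendance("bar", 100, multipliers))
```

To test the neighborhood question, you'd compute the same multipliers grouped by (category, neighborhood) and see how much they spread.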
I can give you one example from my firm's work. Neomonde Bakery and Deli (foursquare.com) has 1,871 total check-ins from 981 people. Compare this to another local restaurant that they are partnered with, Sitti, which has 4,133 check-ins from 2,007 people.
Comparing the two, I would argue there are a few reasons for the gap:
1. Sitti attracts the downtown crowd, which is younger and has more early adopters.
2. Neomonde, a client, just began to invest in social media this year, while Sitti and its sister restaurants have used social media from the beginning, including offering specials for a longer period of time.
Demographics matter, of course, but the differences also come from use.
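For what it's worth, the arithmetic on those two venues is quick to check. This sketch uses only the four numbers quoted above. The variable names are mine.

```python
# (total check-ins, unique people) as quoted in the thread
venues = {
    "Neomonde": (1871, 981),
    "Sitti": (4133, 2007),
}

# Average check-ins per unique person at each venue
per_person = {name: checkins / people for name, (checkins, people) in venues.items()}

for name, rate in per_person.items():
    print(f"{name}: {rate:.2f} check-ins per person")

# How much bigger Sitti is on each measure
checkin_ratio = venues["Sitti"][0] / venues["Neomonde"][0]
people_ratio = venues["Sitti"][1] / venues["Neomonde"][1]
print(f"Sitti has {checkin_ratio:.2f}x the check-ins and {people_ratio:.2f}x the people")
```

That works out to roughly 1.91 check-ins per person at Neomonde versus 2.06 at Sitti, so on these numbers most of the gap seems to come from how many people check in at all rather than how often each one does.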