The Video Store Example

There are two main problems that have bothered people in the past when I have tried to explain my views about social technology and social network optimization to friends and family. One problem is that the concepts are too hard to follow. I keep trying to address that problem with new text, but I am probably doomed to keep doing so for a long time to come, because nothing I write is quite easy enough to understand. The other problem is more approachable, because it concerns only the technical feasibility of what I am proposing.

In general I find it much easier to address technical problems, which are more concrete than some overall failure to understand the conceptual foundations behind them.

So this page is addressed to people who have some idea about what I'm trying to do, but don't see any mechanism for it to work.

I am going to start with the hypothetical video-rental-store problem, an example that first came to mind a decade ago when I found myself taking my daughter Clara and niece Marika over and over again to the video store in search of something to watch.

I assume that this hypothetical store, like many others, maintains a membership list for customers and provides them with a membership card so they don't have to keep providing a lot of ID when renting videos. I will assume they have a computer to handle this membership and that they keep a database of information on each customer -- for the moment let's ignore the privacy and confidentiality issues.

What follows are more unusual assumptions, but quite technically feasible:

So the basic idea is quite obvious. Customers return videos through the appropriate slots and thus rate the video. Ideally the video store wants each customer returning videos to rent some new ones, so as soon as the customer returns the videos from last night the printer on the counter must spew out a page of recommendations, and they should be good ones, truly personalized to that customer's tastes, however bizarre they may be.

It is important to emphasize at this point that I see this all as technology, not science, but for the sake of explaining how this all fits together I will add a video-scientist to the mix at an appropriate time.

But let's see what the technology can do by itself, without a scientist being involved. First of all, as customers rent videos the computer adds the names or numbers of those videos to the database. When videos are returned the customers have been instructed to deposit them in the appropriate slot, to give the video store an electronic review -- did the customer like it or not.

The computer can then maintain two sets of records, one for each customer, and another for each video. Essentially the record for each customer lists each video rented and the slot through which it was returned: liked, disliked, or whatever. The record for each video lists which customers rented it and the same mini-review from the return slots.
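As a rough sketch of this record keeping, assuming Python and made-up identifiers (the class name, the rating codes and the video numbers are all hypothetical, not taken from any real store system):

```python
from collections import defaultdict

# Hypothetical codes for the return slots: +1 = liked, -1 = disliked.
LIKED, DISLIKED = 1, -1


class RentalRecords:
    """Keeps the two mirrored sets of records: one per customer, one per video."""

    def __init__(self):
        self.by_customer = defaultdict(dict)  # customer -> {video: rating}
        self.by_video = defaultdict(dict)     # video -> {customer: rating}

    def record_return(self, customer, video, rating):
        """Store the mini-review implied by the slot the video came back through."""
        self.by_customer[customer][video] = rating
        self.by_video[video][customer] = rating


records = RentalRecords()
records.record_return("clara", "V001", LIKED)
records.record_return("marika", "V001", LIKED)
records.record_return("clara", "V002", DISLIKED)

print(dict(records.by_customer["clara"]))  # every video this customer returned
print(dict(records.by_video["V001"]))      # every customer who returned this video
```

Both tables hold the same facts, just indexed two ways, so a lookup by customer or by video is equally cheap.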

So far so good, but that's just record keeping. What matters to both the customers and the video store is the accuracy of the recommendations made in the printout provided to the customer on entering the store. If these recommendations do truly match the customer's tastes, the customer will be happy, and will probably rent more videos, thus making the store happy too.

Essentially the recommendations are predictions, and these predictions will be confirmed or refuted when the customer returns the videos.
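One simple way to keep score of those predictions, offered only as a sketch: count how many recommended videos came back through the "liked" slot. The function name and the +1/-1 rating codes are assumptions made for illustration.

```python
def prediction_accuracy(recommended, returned):
    """Fraction of recommended videos that the customer turned out to like.

    recommended: set of video ids that were on the printout.
    returned:    dict of video id -> +1 (liked slot) or -1 (disliked slot).
    """
    checked = [video for video in recommended if video in returned]
    if not checked:
        return None  # nothing has come back yet, so nothing is confirmed or refuted
    confirmed = sum(1 for video in checked if returned[video] > 0)
    return confirmed / len(checked)


score = prediction_accuracy({"V001", "V002", "V003"}, {"V001": 1, "V002": -1})
print(score)  # one of the two returned recommendations was liked
```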

The underlying technology for doing this is entirely straightforward. To predict the customer's response to a video the system can compare each customer to all the other customers and find some customers with similar tastes; it can also compare each video to all the other videos and find which other videos are most similar.

Two videos are similar if they appeal to similar customers, and two customers are similar if they like similar videos. That might seem like a paradox in which you can't tell what videos are similar until you know which customers are similar -- but can't tell which customers are similar until you know which videos are similar -- but it is not a paradox at all.

One starts the process by just noting which customers liked the same videos, and which videos are liked by the same customers -- then you can use successive approximation to include similarity data as well.
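Here is one possible way to sketch that successive approximation, assuming NumPy and a small table of ratings (+1 liked, -1 disliked, 0 not rented). The update rule is my own illustrative choice, not the only scheme that would work: start from raw co-occurrence, then repeatedly recompute each kind of similarity from the other.

```python
import numpy as np


def normalize(m):
    """Scale a similarity matrix so every item has similarity 1 with itself."""
    d = np.sqrt(np.clip(np.diag(m), 1e-12, None))
    return m / np.outer(d, d)


def successive_similarity(ratings, rounds=5):
    """ratings: customers x videos array of +1, -1 and 0 entries."""
    r = np.asarray(ratings, dtype=float)
    # Starting point: which customers liked the same videos,
    # and which videos were liked by the same customers.
    cust_sim = normalize(r @ r.T)
    video_sim = normalize(r.T @ r)
    for _ in range(rounds):
        # Customers are similar if they like similar videos ...
        new_cust = normalize(r @ video_sim @ r.T)
        # ... and videos are similar if they appeal to similar customers.
        new_video = normalize(r.T @ cust_sim @ r)
        cust_sim, video_sim = new_cust, new_video
    return cust_sim, video_sim


ratings = [[1, 1, 0, -1],
           [1, 0, 1, -1],
           [-1, -1, 0, 1]]
cust_sim, video_sim = successive_similarity(ratings)
print(np.round(cust_sim, 2))
```

In this toy table the first two customers come out as similar and the third as their opposite, which is what one would expect from reading the rows by eye.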

If the number of videos available for rental is reasonably small, a few thousand perhaps, and the number of customers is also not too large -- again a few thousand -- then the amount of math required to do this is quite small.

Essentially the computer looks up your record, then searches the other customers' records to find similar lists of videos liked and disliked. Then videos liked by customers whose taste seems to be like yours can be recommended to you. Not a difficult concept.
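For a single small store the whole procedure fits in a few lines. This is a minimal sketch, assuming Python, +1/-1 ratings, and invented customer names and video numbers; the scoring rule (agreements minus disagreements) is just one plausible choice.

```python
def taste_match(mine, theirs):
    """Agreements minus disagreements over the videos both customers have rated."""
    shared = set(mine) & set(theirs)
    return sum(mine[v] * theirs[v] for v in shared)


def recommend(customer, by_customer, top_neighbours=2):
    mine = by_customer[customer]
    # Rank every other customer by how closely their likes and dislikes match.
    others = [(taste_match(mine, rec), name)
              for name, rec in by_customer.items() if name != customer]
    best = sorted(others, reverse=True)[:top_neighbours]
    neighbours = [name for score, name in best if score > 0]
    # Collect videos those neighbours liked that this customer has not rented.
    votes = {}
    for name in neighbours:
        for video, rating in by_customer[name].items():
            if rating > 0 and video not in mine:
                votes[video] = votes.get(video, 0) + 1
    # Most-voted first; ties broken by video id so the printout is stable.
    return sorted(votes, key=lambda v: (-votes[v], v))


by_customer = {
    "you": {"V1": 1, "V2": 1, "V3": -1},
    "ann": {"V1": 1, "V2": 1, "V4": 1},
    "bob": {"V1": 1, "V3": -1, "V4": 1, "V5": 1},
    "cat": {"V1": -1, "V2": -1, "V6": 1},
}
print(recommend("you", by_customer))
```

The customer whose ratings disagree with yours is ignored, so the video only she liked never reaches your printout.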

But suppose that instead of a single store doing this on its own what we have is a national or international chain of video stores, like Blockbuster, and let us suppose they collect this information for all of their customers all over the planet, a hundred million or more. I don't know how many titles they carry, but a typical book of video reviews has a little more than 20,000 listed.

This is now a much juicier problem, with lots more data to crunch. It is no longer possible to simply compare your record to those of all the other customers, so a lot more math is needed. Most likely this will involve a kind of data compression based on eigenvectors -- but you don't need to know the details. All you really need to know is that the math attempts to simulate the performance of an uncompressed database.

If it were possible to compare your record of video rentals and ratings with those of the other 100 million customers, it would be quite easy to pick a hundred or so people (1-in-a-million) whose tastes are almost identical to yours, and then recommend to you the things they liked. But we can't do exactly that, since it would require too much computer power, so we use a compression scheme, which attempts to simulate what we can no longer do because of the size of the database. The better the math, the better it can simulate it.
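To make the compression idea concrete, here is a small sketch using NumPy's singular value decomposition, which is one standard eigenvector-based way to do it. I am assuming this technique purely for illustration; the function names and the toy data are invented. Each customer and each video is reduced to k numbers, and a rating is estimated from those numbers alone.

```python
import numpy as np


def compress(ratings, k):
    """Keep only the k strongest factors of a customers x videos rating table."""
    table = np.asarray(ratings, dtype=float)
    u, s, vt = np.linalg.svd(table, full_matrices=False)
    customer_factors = u[:, :k] * s[:k]  # k numbers per customer
    video_factors = vt[:k, :].T          # k numbers per video
    return customer_factors, video_factors


def predict(customer_factors, video_factors, customer, video):
    """Estimated rating, computed from the compressed tables alone."""
    return float(customer_factors[customer] @ video_factors[video])


# Toy data: 6 customers and 5 videos whose ratings are driven by 2 hidden tastes.
rng = np.random.default_rng(1)
tastes = rng.normal(size=(6, 2))
styles = rng.normal(size=(2, 5))
ratings = tastes @ styles

cf, vf = compress(ratings, k=2)
print(round(predict(cf, vf, 0, 3), 3), round(float(ratings[0, 3]), 3))
```

On the scale described above, a full table would need 100 million x 20,000 = 2 trillion cells, while keeping, say, 20 numbers for each customer and each video needs only about 2 billion, roughly a thousand times less.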

There is actually a nice side effect of compression schemes: most of them do a little interpolation within the data or extrapolation beyond it, and so can make predictions about videos that nobody quite like you has seen yet. More about that another time.

Well, the time has come to introduce a video-scientist. Let us suppose that some intrepid doctoral student manages to talk the chain of video stores into letting him use their database -- the whole thing -- in his research. This researcher has read many papers on the use of factor analysis in psychology, and decides to write his dissertation on the most important factors in choosing a video. He performs factor analysis on the data and discovers that the most important single factor is ... well, whatever, and fills out his analysis by listing the other 19 most important factors, for a total of 20.

But just as he has finished his research and written his dissertation he stumbles across another, much smaller database of compressed data that the stores use for making their recommendations, and discovers that instead of using a huge list of customer preferences they have a compressed table that describes each video in 20 numbers based on the eigenvectors of the autocorrelation matrix of the rows in the large uncompressed table.

Of course the video stores don't know what those numbers mean, they're really just compressed data. But compressing by eigenvectors that way is more-or-less the same as the factor analysis used by so many psychologists in their research -- and in this case it turns out that the 20 numbers per video of the compressed table are exactly the same as the 20 largest factors extracted by this young scientist.
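The coincidence can be checked numerically in a small case. The sketch below assumes NumPy, random made-up data, and that the student's factor analysis is of the principal-component kind (other kinds of factor analysis would not agree exactly). It uses the uncentred cross-product matrix as a stand-in for the correlation matrix, and 3 factors instead of 20 to keep it small.

```python
import numpy as np


def factors_by_eigenvectors(table, k):
    """The student's route: top eigenvectors of the videos' cross-product matrix."""
    x = np.asarray(table, dtype=float)
    values, vectors = np.linalg.eigh(x.T @ x)  # eigenvalues in ascending order
    order = np.argsort(values)[::-1][:k]       # indices of the k largest
    return vectors[:, order]


def factors_by_compression(table, k):
    """The store's route: the k leading right singular vectors of the table."""
    x = np.asarray(table, dtype=float)
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return vt[:k].T


rng = np.random.default_rng(0)
table = rng.normal(size=(200, 8))  # 200 hypothetical customers, 8 videos

a = factors_by_eigenvectors(table, 3)
b = factors_by_compression(table, 3)

# Individual vectors may differ in sign, so compare the subspaces they span.
same_subspace = bool(np.allclose(a @ a.T, b @ b.T))
print(same_subspace)
```

Both routes describe each video with the same k directions; the store simply never bothers to give them names.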

So we actually do have a kind of technology that does something not unlike a kind of science, extracting the major factors from some large table of data. But of course the technology just uses the compressed data for its own purposes. It doesn't name the factors or describe them, and it certainly doesn't write them up for publication in Psychological Review.

The important point I want to get across is this: all of what I have just written about is perfectly straightforward stuff using known techniques -- known technology, that is. It does not require some "theory of video preferences" to work, although in some cases part of what the technology generates may be not unlike such a theory. It's all known technology, though, nothing mysterious, nothing controversial.

I could have written this about libraries or bookstores instead of video stores -- and now that I think about it I think I did at some point a while back.

When it comes to people the same basic ideas are involved. If it was just a question of searching for compatible people it would be almost exactly the same. It is not hard to imagine a large bordello which operated exactly on this principle and was able to keep each customer quite happy by using its computer to do exactly what the video store did in my example. As long as customers return their girls through the right slots it would work perfectly. Known technology.

There is a lot more math in what I have called Social Network Optimization, but it is not a question of needing to figure out a new theory of personality. We don't need to do that any more than we needed to figure out a new theory of video preferences -- it's just a question of collecting and using the data. The extra math only comes in the attempt to simultaneously satisfy a large number of people and keep them satisfied, and again no theory is required -- no science is required -- it's just technology.


Copyright © 1998 Douglas P. Wilson        



Copyright © 2009   Douglas Pardoe Wilson

I believe I was the first person on the Internet to use the phrase Social Technology -- years before the Web existed.

Those were the good old days, when the number of people using the net exceeded the amount of content on it, so that it was easy to start a discussion about such an unpopular topic.  Now things are different.  There are so many web pages that the chances of anyone finding this page are low, even with good search engines like Google.   Oh, well.

By Social Technology I mean the technology for organizing and maintaining human society.  The example I had most firmly in mind is the subject of Find Compatibles, what I consider to be the key page, the one with the real solution to all other problems explained.

As I explained on my early mailing lists and later webpages, I find that social technology has hardly improved at all over the years.   We still use representative democracy, exactly the same as it was used in the 18th century.  By contrast, horse and buggy transportation has been replaced by automobiles and airplanes, enormous changes.

In the picture below you will see some 18th century technology, such as the ox-plow in the middle of the picture.  How things have changed since then in agricultural technology.  But we still use chance encounters, engagements and marriages to organize our home life and the raising of children.  

I claim that great advances in social technology are not only possible but inevitable.  I have written three novels about this, one preposterously long, 5000 pages, another merely very very long, 1500 pages.  The third is short enough at 340 pages to be published some day.  Maybe.  The topic is still not interesting to most people.   I will excerpt small parts of these novels on the web sometime, maybe even post the raw text for the larger two.


This site includes many pages dating from 1997 to 2008 which are quite out of date.  They are included here partly to show the development of these ideas and partly to cover things the newer pages do not.  There will be broken links where these pages referenced external sites.  I've tried to fix up or maintain all internal links, but some will probably have been missed.   One may wish to look at an earlier version of this page, rather longer, and at an overview of most parts of what can be called a bigger project.

Type in this address to e-mail me.  The image is interesting.  See Status of Social Technology

Copyright © 2007, 2008, 2009, Douglas Pardoe Wilson

I have used a series of e-mail addresses over the years, each of which eventually became out of date because of a change of Internet services or became almost useless because of spam.  Eventually I stuck with a Yahoo address, but my inbox still fills up with spam and their spam filter still removes messages I wanted to see.  So I have switched to a new e-mail service.  Web spiders should not be able to find it, since it is hidden in a jpeg picture.   I have also made it difficult to reach me.  The picture is not a clickable link.  To send me e-mail you must want to do so badly enough to type this address in.  That is a nuisance, for which I do apologize, but I just don't want a lot of mail from people who do not care about what I have to say.

