Predictive algorithms and what’s so great about the Netflix Prize

Well, on the surface, nothing much. But it represents increasing interest in a very promising field.

It’s a million-dollar prize for the computer experts and statisticians who can improve an algorithm (a set of rules to be executed by a computer).

Netflix, the company paying this $1 million reward, is a movie rental company in the USA (trying to establish an international presence). You choose movies on their website, they send the movies to you by mail, and you send them back when you’re done.

But have you noticed that there are movies out there that aren’t heavily advertised but are really good? I mean, I found Layer Cake by accident. I like the movie, really like it. It wasn’t advertised here in Indonesia, but it’s a really good movie. I wish there were someone who could advise me on these kinds of movies, someone who really understands my taste and can tell me, based on his vast knowledge of movies, what other lesser-known movies I would like.

Well, Netflix has such an adviser: a computer algorithm called Cinematch.

Cinematch can predict, based on your past movie rentals, what kind of movies you would like, even if those movies are unpopular in the market.

This kind of algorithm is called a predictive algorithm: a set of rules that, when run over a data set, produces predictions. For example, say I have a database of static customer data (age, gender, zip code, education, income) and a set of transaction data (grocery purchases). Based on the existing data, I try to predict what kind of people prefer to purchase certain types of groceries.

Now, analyzing this data won’t give me totally certain rules (e.g. a person with higher income always prefers brand X of canned milk over other canned milk), but it does let me see correlations in the data (e.g. higher income correlates more strongly with buying brand X than with buying other canned milk). Not perfect, but it’s better than nothing.

Based on this correlation, I try to predict what other people with certain characteristics would like to buy.
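To make the canned milk example concrete, here is a minimal sketch in Python. The customer data is made up for illustration, and the income brackets and function names are my own assumptions, not anything from a real grocery database. It measures the correlation between income and buying brand X, then uses the purchase rate of each income group as a rough prediction for a new customer.

```python
from math import sqrt

# Hypothetical customers: (monthly income in arbitrary units, bought brand X? 1 = yes, 0 = no)
customers = [
    (2.0, 0), (2.5, 0), (3.0, 0), (4.0, 1),
    (5.5, 0), (6.0, 1), (7.5, 1), (9.0, 1),
]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def purchase_rate(data, low, high):
    """Share of customers with low <= income < high who bought brand X."""
    group = [bought for income, bought in data if low <= income < high]
    return sum(group) / len(group)

incomes = [income for income, _ in customers]
bought = [b for _, b in customers]

r = pearson(incomes, bought)                   # positive: higher income goes with brand X
low_rate = purchase_rate(customers, 0, 5)      # rough prediction for a lower-income customer
high_rate = purchase_rate(customers, 5, 100)   # rough prediction for a higher-income customer

print(f"correlation: {r:.2f}, low-income rate: {low_rate}, high-income rate: {high_rate}")
```

The correlation only says the two things tend to go together in this data. The group rates are the "better than nothing" prediction: a new higher-income customer is more likely, not certain, to pick brand X.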

You can use both static and transaction data for predictive algorithms (or predictive models, another term for the same thing).

Now, Cinematch uses the movies you have rented and your ratings of those movies (this produces a set of transaction data), plus the ratings from millions of other people who also rent and rate movies, to predict what kind of movies you would like in the future.
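I don’t know how Cinematch works internally, so the following is not Netflix’s method. It is only a minimal sketch of one common way to predict from other people’s ratings (usually called user-based collaborative filtering): find people whose ratings look like mine, then average their ratings for a movie I haven’t seen, giving more weight to the people most similar to me. All users, ratings, and the placeholder titles "Movie B", "Movie C" and "Movie D" are made up.

```python
from math import sqrt

# Hypothetical ratings (1-5 stars): user -> {movie: rating}
ratings = {
    "me":   {"Layer Cake": 5, "Movie B": 4, "Movie C": 1},
    "andi": {"Layer Cake": 5, "Movie B": 5, "Movie C": 1, "Movie D": 5},
    "budi": {"Layer Cake": 1, "Movie B": 2, "Movie C": 5, "Movie D": 1},
}

def similarity(a, b):
    """Cosine similarity over the movies that both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def predict(user, movie):
    """Similarity-weighted average of the other users' ratings for a movie."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or movie not in their_ratings:
            continue
        weight = similarity(ratings[user], their_ratings)
        weighted_sum += weight * their_ratings[movie]
        weight_total += weight
    # None means nobody comparable has rated the movie yet.
    return weighted_sum / weight_total if weight_total else None

guess = predict("me", "Movie D")
print(f"predicted rating for Movie D: {guess:.2f}")
```

Because my ratings look much more like andi’s than budi’s, the guess leans toward andi’s 5 stars rather than budi’s 1 star. A real system would have to cope with millions of users and movies, which this toy version does not attempt.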

Pretty cool, no? Netflix actually provides 2 kinds of service: renting movies (where their money comes from) and suggesting movies you might like (to keep customers coming back).

Due to its business model, Netflix is highly dependent on predictive algorithms. They depend on predictive models in at least 2 ways:

1. As their competitive edge with customers. Cinematch allows them to differentiate themselves from other, more established movie rental companies.

2. To maintain an efficient inventory level of movies. They can’t afford to have too few or too many copies of the same title.

That’s why the $1 million Netflix Prize is so important for Netflix.

The Netflix Prize goes to whoever can improve the accuracy of Cinematch’s predictions by 10%.
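What does "improve by 10%" mean in practice? As far as I know, the contest scored predictions with root mean squared error (RMSE): the lower the error between predicted and actual ratings, the better, and the target was an RMSE 10% lower than Cinematch’s. Here is a minimal sketch of that arithmetic. The ratings and both sets of predictions are made up, and "baseline" and "challenger" are just my labels for the existing and the competing algorithm.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical star ratings that customers really gave.
actual = [4, 3, 5, 2, 1]

# Hypothetical predictions from the existing and the competing algorithm.
baseline = [3, 4, 4, 3, 2]               # every prediction is off by 1 star
challenger = [3.2, 3.8, 4.2, 2.8, 1.8]   # every prediction is off by 0.8 stars

baseline_error = rmse(baseline, actual)
challenger_error = rmse(challenger, actual)

# Relative improvement: how much smaller the challenger's error is.
improvement = (baseline_error - challenger_error) / baseline_error

print(f"baseline RMSE: {baseline_error:.2f}")
print(f"challenger RMSE: {challenger_error:.2f}")
print(f"improvement: {improvement:.0%}, beats the 10% target: {improvement >= 0.10}")
```

In this toy case the challenger’s error is about 20% lower, so it would clear a 10% bar. On real rating data, squeezing out even a few percent is much harder.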

But guess what: by creating this prize, Netflix gets many, many damn smart people working to improve its algorithm, not just the ones who win the prize.

Now, beyond Netflix’s own survival, what’s so great about the prize that I’m willing to quote a news story from the NY Times?

It shows a growing interest in predictive modeling.

Predictive modeling allows many decisions to be automated.

Software like MS Word or MS Excel allows the automation of many aspects of work: Ctrl+Z to undo your work rather than retyping, formulas to automate calculations.

But one aspect of human work, making decisions, has so far resisted automation.

Predictive modeling aims to bridge this gap.

Here’s an example of an application of predictive modeling.

As a bank, we process tens of thousands of credit card applications each day.

Now, imagine nearly a hundred credit analysts spending their time analyzing these credit card applications.

Poor guys, no? Spending years analyzing credit applications. A routine job like this is a burden both for the people doing it (danger of overspecializing, risk of being pigeonholed) and for the organization (managing a larger workforce increases organizational complexity).

Now, just like MS Word frees us from the shackles of retyping, I (and many others) hope that predictive modeling can free these credit analysts from the shackles of their job (and let them put their talent into more challenging roles).
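To give a feel for what automating such a decision could look like, here is a minimal sketch, not how any real bank does it. It scores each application with a simple logistic model and only sends the unclear cases to a human analyst. The features, weights, and thresholds are all hypothetical numbers I picked for illustration; in practice the weights would be fitted to a bank’s historical repayment data.

```python
from math import exp

# Hypothetical weights; a real model would learn these from past applications.
WEIGHTS = {"income": 0.4, "years_employed": 0.3, "late_payments": -1.2}
BIAS = -2.0

def repay_probability(applicant):
    """Estimated probability that the applicant repays (logistic model)."""
    z = BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1 / (1 + exp(-z))

def decide(applicant, approve_at=0.8, reject_at=0.3):
    """Approve or reject the clear cases; refer the rest to an analyst."""
    p = repay_probability(applicant)
    if p >= approve_at:
        return "approve"
    if p <= reject_at:
        return "reject"
    return "refer to analyst"

# Hypothetical applications (income in arbitrary units).
strong = {"income": 8, "years_employed": 5, "late_payments": 0}
weak = {"income": 2, "years_employed": 0, "late_payments": 2}
borderline = {"income": 4, "years_employed": 2, "late_payments": 0}

for name, applicant in [("strong", strong), ("weak", weak), ("borderline", borderline)]:
    print(name, "->", decide(applicant))
```

The point of the middle "refer to analyst" band is that the model handles the routine cases while people still look at the hard ones, which is the kind of work I’d rather see the analysts doing anyway.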

That’s just one of the expected uses of predictive modeling.

It’s a great and growing field, and the Netflix Prize helps to support it.

I hope more prizes like it come up.

