Release 0.3.x of Seeks, due in early October, will introduce personalized results.
There are many ways to personalize results and major search engines already do it. This article from a few years back is a good introduction to the different types of personalization.
Roughly, personalization uses your past or current behavior to predict, recommend or re-rank search results so that they better fit your expectations. This is typically what Google does, among other things, to personalize your results (an option that is now on by default). To do this, it tracks your behavior as best it can, through cookies and many other means. So in essence, personalization implies a loss of privacy, since the recommender system, or ‘personalizer’, needs to know you well in order to satisfy you well.
Seeks is different, because Seeks is designed to run on your machine. What Seeks 0.3.x will do is collect locally the URLs you are visiting, much like an automated bookmarking system. When possible it will associate the URLs you visit with the search queries you have performed, and store that locally.
This will basically allow two personalizations to take place:
- things you have done in the past will influence the ranking of search results you get. Typically, the domain names or URLs you visit often will get a boost in your results.
- results you have clicked on in related past searches will also influence the ranking of your present and future search results. This is also known as re-finding.
This means that if you’re a Django developer, you’ll see fewer results about Django Reinhardt and more about the development framework. Similarly, if you find many of your technical answers on stackoverflow.com, results from this domain will get a boost.
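The boost for frequently visited URLs and domains can be sketched in a few lines of Python. This is only an illustration and not the Seeks implementation: the function name `rerank`, the `weight` parameter and the scoring formula are all assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    """Return the lower-cased host part of a URL."""
    return urlparse(url).netloc.lower()


def rerank(results, visited_urls, weight=1.0):
    """Re-rank (url, base_score) pairs using a locally stored visit history.

    Hypothetical scoring: a result's score is multiplied by
    1 + weight * (share of past visits going to its URL or its domain).
    """
    url_hits = Counter(visited_urls)
    domain_hits = Counter(domain_of(u) for u in visited_urls)
    total = sum(url_hits.values())

    def boosted(item):
        url, score = item
        if total == 0:
            return score  # no history yet, keep the original score
        boost = (url_hits[url] + domain_hits[domain_of(url)]) / total
        return score * (1.0 + weight * boost)

    return sorted(results, key=boosted, reverse=True)


# A Django developer's local history (made-up URLs for the example).
history = [
    "https://docs.djangoproject.com/en/stable/",
    "https://docs.djangoproject.com/en/stable/topics/",
    "https://stackoverflow.com/questions/1",
]
results = [
    ("https://en.wikipedia.org/wiki/Django_Reinhardt", 1.0),
    ("https://docs.djangoproject.com/en/stable/", 0.9),
]
print(rerank(results, history))
```

With this history, the framework documentation moves ahead of the page about the guitarist, even though its original score was lower. With an empty history the original order is kept.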
So what is the difference between what Seeks intends to do and what Google does? There are two main differences. First, Google does this, and does it well, but needs to track you on its servers. Seeks instead runs locally, and keeps and uses the data locally, so your privacy is respected. Second, Google tracks you and knows you well, even better than you might imagine (e.g. possibly through google-analytics), but it is not (yet!) on your machine. Seeks is, which means it sees your Web habits more closely. Also, what Seeks stores is for you to control and reuse (e.g. as an enhanced bookmarking system). An API will be released for you to play with this data and reuse it the way you like.
For those who wonder what will happen to the public nodes: every public node will act as one big fat user, which you may see as the ‘mean’ of all users of that public node. While we do recommend that you install Seeks on your own machine, none of the data stored on a public node will carry user identification. Basically, what Seeks will store is data of the form (URL,i), where i is the number of hits for this URL. From there, Bayesian inference does the personalization.
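To give an idea of how plain (URL,i) pairs can feed Bayesian inference, here is a minimal sketch. The exact model Seeks uses is not described here, so the choice below is an assumption: hit counts are treated as multinomial observations with a symmetric Dirichlet prior, and each candidate URL gets its posterior mean probability. The names `posterior_click_probability` and `alpha` are hypothetical.

```python
def posterior_click_probability(hits, candidates, alpha=1.0):
    """Estimate how likely each candidate URL is to be the one you want.

    hits:       dict mapping URL -> number of recorded hits (the i in (URL,i))
    candidates: URLs returned for the current search
    alpha:      prior pseudo-count per URL (assumed > 0), so that URLs
                never visited before still get a non-zero probability
    """
    total = sum(hits.get(url, 0) for url in candidates)
    denom = total + alpha * len(candidates)
    return {url: (hits.get(url, 0) + alpha) / denom for url in candidates}


# Made-up counts for the example.
stored = {"a.example/page": 3, "b.example/page": 1}
candidates = ["a.example/page", "b.example/page", "c.example/page"]
probabilities = posterior_click_probability(stored, candidates)
for url in sorted(candidates, key=probabilities.get, reverse=True):
    print(url, round(probabilities[url], 3))
```

Note that nothing in the stored data identifies who produced the hits: only URLs and counts are needed for the estimate.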
Finally, Seeks 0.3.x will introduce personalization based on your own data as a preparation for the 0.4 release series, which will bring the P2P network. In 0.4.x, other users’ habits will influence your search results and ranking, a scheme referred to as collaborative filtering. Privacy will be respected by giving every user full control over which data fragments may be shared with others in order to improve their search results.

