Mon rang est-il crédible ?

January, 31, 2010
Sylvain

On le sait tous, les algorithmes de classement basés sur l’utilisation d’un rang induit par une popularité « linkificatrice » sont biaisés d’avance à cause des méchants spammeurs qui obtiennent des liens de manière plus ou moins morale depuis des pages plus ou moins pertinentes. Bien sûr de nombreuses méthodes ont été mis au point dans le but de déclasser l’effet des « mauvais » liens : TrustRank, AntiTrustRank, Topical pagerank, Weighted Pagerank etc… Mais la question qu’on peut globalement se poser est la suivante : peut-on repérer quels sont les liens crédibles ? On peut également se demander si […]

Read More

Constants matter

December, 19, 2009
Sylvain

Today I am writing a very small post to encourage you (my beloved readers) to directly go to Richard Lipton’s blog to read this very good post The 3 percent solution. Richard Lipton (professor of Computer Science at Georgia Tech) is talking about a question that arises in its mind about the role of theory w.r.t. real world problems. Indeed, using again and again notation to assess the interest of our work, don’t we loose something? I often tell to my students that the choice of a sorting algorithm in practice does not only depend on the asymptotic complexity, but […]

Read More

New paper on compressive sensing

December, 3, 2009
Sylvain

Together with Alexandre Borghi and Jérome Darbon, we wrote a paper entitled Exact Optimization for the -Compressive Sensing problem using a Modified Dantzig-Wolfe method. the paper is available as a UCLA CAM Report here: cam09-101.pdf Here is the abstract: This paper considers the -Compressive Sensing problem and presents an efficient algorithm that computes an exact solution. The idea consists in reformulating the problem such that it yields a modified Dantzig-Wolfe decomposition that allows to efficiently apply all standard simplex pivoting rules. Experimental results show the superiority of our approach compared to standard linear programming methods. Comments are more than welcome!

Read More

Testing triangular properties (2/2)

November, 18, 2009
Sylvain

To find the beginning of the triangular adventure, you should read this post. To prove the lemma we use the following theorem that I am not proving in this post (if you want the proof, just let me know). Theorem Consider urns each filled with balls, and the following process: pick a ball in an urn chosen uniformly at random until the chosen urn is empty. The probability that one had to pick more than balls tends to a nonzero constant as goes to infinity. Proof (of the lemma) To each triangle in associate an urn, filled with balls labeled […]

Read More

Testing triangular properties (1/2)

September, 19, 2009
Sylvain

A few years ago, I worked together with Yves Verhoeven on the problem of testing triangular properties. We did some proofs, wrote a few pages but decided not to publish our work (since it is far from being optimal, and also far from being really interesting). However I think that this is worth a few lines in this blog. It is clearly more technical than the average post of my blog, so don’t force yourself into reading it. Let be the complete graph on vertices and be a set. Let be a function, and let be a real number. We […]

Read More

MapReduce?

September, 16, 2009
Sylvain

A friend recently asked me to explain her the concept behind MapReduce. The last time we saw each other, I had no time to do that, so I now use my  blog for this purpose. Map and reduce are particular functions that can be found we in numerous functional languages (such as python, ocaml, etc.). Map is a function that takes as input an other function, let’s say and a data structure (most of the time a sequence), that calls the function on each of the structure’s items and returns a list of the return values. Here is an example: […]

Read More

About recommendation systems (3/3)

September, 14, 2009
Sylvain

This is the last of the three posts about recommendation systems, the first can be found here, the second here. In order to validate the effectiveness of our approach we decided to make an experiment with actual products and users. For this purpose we collected from a french medias reviews website krinein a sample of 4400 movies.  From these 4400 movies, we selected uniformly at random 160 of them. This was the data set for our experiment. We then extract uniformly at random from this data set 9 movies. These 9 movies were our witness products set. Our methodology was […]

Read More

About recommendation systems (2/3)

September, 11, 2009
Sylvain

This is the second post about recommendation systems, the first one can be found here. It is pretty obvious that the goal of a recommendation system is to provide users with “good” products. I am going to first introduce our notations and then explain what is a good recommendation in our framework. In the work done with Sebastien hemon and Thomas Largillier, we consider that users belong to a set of distinct users and that products come from , a set of distinct products.  We also suppose that we are given, even implicitly, a function that gives for every couple […]

Read More

About recommendation systems (1/3)

September, 9, 2009
Sylvain

This post is highly inspired from the introduction of a research paper, written by Sébastien Hémon, Thomas Largillier and myself, entitled Partial ranking of products for recommendation systems. Recommending products to customers is a field far older than computer science.  With the tremendous increase of the commercial potential of the e-commerce it has become of the utmost importance to perform well in this field.  Moreover, companies have now the ability to store information about consumption habits of customer: their previous purchases, their tastes. They also have access to information that often allows to cluster customers into communities that share some […]

Read More

Après le PageRank, le AncestorRank (non ce n’est pas une blague)

June, 1, 2009
Sylvain

Je vous ai déjà parlé sur ce blog de Brian D. Davison de l’Université de Lehigh et de son laboratoire : le WUME (Web Understanding, Modeling, and Evaluation Lab). Bon, grosso modo Brian D. Davison est un chercheur connu dans le domaine académique du web, et il n’est pas (plus en fait car il vient de Teoma) affilié à un moteur de recherche plus qu’à un autre. C’est un gage d’indépendance dans les recherches et les résultats présentés, c’est aussi synonyme de moyens plus faibles, mais bon on ne peut pas tout avoir. Bref, tout ça pour dire que ce […]

Read More
Picture: courtesy of Abby Blank