Welcome
The goal of the Web Spam Challenge is to identify and compare Machine Learning (ML) methods for automatically labelling structured data represented as graphs. More precisely, we focus on the problem of labelling all nodes of a graph given labels for only some of them. The application we study is Web Spam Detection, where we want to detect deliberate actions of deception aimed at the ranking functions used by search engines.

The Web Spam Challenge is supported by the EU PASCAL Network of Excellence Challenge Program.
During 2007, the Web Spam Challenge will have two tracks:
Track I: aimed mainly at researchers in Information Retrieval and Machine Learning, jointly organized with the AIRWeb 2007 workshop.
Track II: aimed mainly at researchers in Machine Learning, planned for the second half of 2007.
Timeline
- January 2007: More feature vectors will be available
- 30 March 2007: Deadline for submitting predictions
- April 2007: Assessment and evaluation phase
- 8 May 2007: Results of the evaluation announced at the AIRWeb workshop
Mailing list
If you are interested in participating in the challenge, please subscribe to our mailing list.
History
- September 2006: The challenge was submitted to the PASCAL Network
- November 2006: The AIRWeb Workshop was accepted at WWW'07
- November 2006: Corpus was made available
- December 22, 2006: Host graph is available
- December 2006: First set of feature vectors is available
- January 16, 2007: Evaluation metrics are available
- January 17, 2007: Challenge accepted by PASCAL Network
- February 2007: New feature vectors available

