To participate in the RecSys challenge, follow the rules listed below.
Sharing the dataset outside of teams is not permitted. However, publishing or sharing your code is encouraged.
Team mergers are allowed and can be performed by the team leader. To merge, contact the organizers.
There is no maximum team size.
You can update the progress of your algorithms once a week.
Each team should submit a paper describing the algorithms it developed for the task. More detailed information about submission can be found on the Submission page.
The dataset has not been anonymized or modified other than as described in the dataset section. This means that the tweet ids, user ids, and so on are in fact the original ids. We realize this opens the door to using the Twitter API to match the tweet ids with the actual retweet_count and favorite_count values. This is, of course, not allowed, since it defeats the whole purpose of the challenge. What is allowed is using any other public data source to expand your models with other knowledge you may find useful. Collecting more training data in the same format as the original dataset is, however, not allowed, since it may overlap with the final evaluation dataset and would reduce the challenge to a simple scenario of 'who collected the most data'. Keep in mind that the final evaluation will be performed by the organizers (on a private evaluation dataset) and that a paper describing the approach needs to be submitted to the challenge workshop.
If you are unsure whether something is allowed, contact us and we will be happy to help. Above all, remember it's all for science, so be creative, not evil!