TOPICSROOT/corpora/facebook
Latest revision as of 15:18, 4 October 2007
This directory contains Facebook data files. Please refer to libs/corpora/facebook for libraries useful for dealing with this data. Tools in libs/scrapers were used to obtain the files in this directory.
The files in this directory are:
- userlist. A list of user ids which were found using Facebook's browse feature (automated via a Firefox extension) and processed using facebook_users.py.
- userprofiles. The main database of user profile information. For each entry in userlist, various fields are fetched. The file is a series of profile records. Each record begins with the user id, followed by its fields, each written as the field name and its value separated by a colon. Every field starts on a new line. Note that some field values span multiple lines, so take care when parsing.
- users/. This subdirectory contains the raw data as fetched by the extension. You should not need to use this except to generate userlist.
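As a rough illustration of the parsing caveat for userprofiles, here is a minimal Python sketch. It is not the code in libs/corpora/facebook, and it rests on assumptions this page does not confirm: that a user id sits alone on a line of digits, and that the caller supplies the set of field names to expect. The names parse_profiles and known_fields, and the sample fields "Name" and "About", are hypothetical. Any line that does not open with a known field name is treated as a continuation of the previous value, which is how a colon inside a multiline value is prevented from starting a bogus field.

```python
def parse_profiles(lines, known_fields):
    """Yield (user_id, fields) pairs from an iterable of text lines.

    Assumptions (not confirmed by the wiki page):
      - a record starts with a line containing only digits (the user id);
      - a field line is "<name>:<value>" where <name> is in known_fields;
      - every other line continues the value of the most recent field.
    """
    user_id = None
    fields = {}
    current = None  # name of the field currently being read

    for raw in lines:
        line = raw.rstrip("\r\n")

        # A digits-only line is taken to open a new record.
        if line.strip().isdigit():
            if user_id is not None:
                yield user_id, fields
            user_id, fields, current = line.strip(), {}, None
            continue

        if user_id is None:
            continue  # ignore anything before the first record

        name, sep, value = line.partition(":")
        if sep and name.strip() in known_fields:
            # Start of a new field.
            current = name.strip()
            fields[current] = value.strip()
        elif current is not None:
            # Continuation of a multiline value (may itself contain colons).
            fields[current] += "\n" + line

    if user_id is not None:
        yield user_id, fields


# Hypothetical sample data; the real field names may differ.
sample = [
    "12345\n",
    "Name: Alice\n",
    "About: line one\n",
    "note: still part of About\n",
    "67890\n",
    "Name: Bob\n",
]
profiles = list(parse_profiles(sample, {"Name", "About"}))
```

One weakness of this sketch is that a multiline value containing a digits-only line would be mistaken for the start of a new record, so check the assumptions against the real file before relying on it.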