Read orderings from .soc, .soi, .toc or .toi file types storing election data as defined by PrefLib: A Library for Preferences.

read.soc(file)

read.soi(file)

read.toc(file)

read.toi(file)

# S3 method for preflib
as.aggregated_rankings(x, ...)

Arguments

file

An election data file, conventionally with extension .soc, .soi, .toc or .toi according to data type.

x

An object of class "preflib".

...

Additional arguments passed to as.rankings(): freq, input or items will be ignored with a warning as they are set automatically.

Value

A data frame of class "preflib" with first column Freq, giving the frequency of the ranking in that row, and remaining columns Rank 1, …, Rank p giving the items ranked from first to last place in that ranking. Ties are represented by vector elements in list columns. The data frame has an attribute "items" giving the labels corresponding to each item number.
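To make the layout concrete, here is a minimal base R sketch of a data frame shaped like the one described above. It is a hand-built mock, not output from any read.* function: the item labels, frequencies and the name mock are invented for illustration, and the "preflib" class attribute is deliberately left off so that the snippet runs without the package.

```r
# Hand-built stand-in for the documented structure (assumed values throughout).
mock <- data.frame(Freq = c(12L, 5L))

# Each rank column is a list column, so an element can hold several tied items.
mock[["Rank 1"]] <- list(2L, c(1L, 3L))  # row 2: items 1 and 3 tied for first
mock[["Rank 2"]] <- list(c(1L, 3L), 2L)  # row 1: items 1 and 3 tied for second

# Item labels are kept as an attribute, indexed by item number.
attr(mock, "items") <- c("Item A", "Item B", "Item C")

# Look up the labels of the items tied in first place in row 2.
tied_first <- attr(mock, "items")[mock[["Rank 1"]][[2]]]
tied_first
```

Reading a row left to right gives the ordering, and Freq says how many times that ordering was observed.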

Details

The file types supported are

.soc

Strict Orders - Complete List

.soi

Strict Orders - Incomplete List

.toc

Orders with Ties - Complete List

.toi

Orders with Ties - Incomplete List

Note that the file types do not distinguish between types of incomplete orderings, i.e. whether they are a complete ranking of a subset of items (as supported by PlackettLuce()) or top-\(n\) rankings of \(n\) items from the full set of items (not currently supported by PlackettLuce()).
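The difference between the two readings of an incomplete ordering can be sketched in base R. The example below is illustrative only: the items and the ordering are invented, 0 is used to mark an item that is absent from the ranking, and treating unlisted items as tied in last place is one common way to encode a top-\(n\) ranking rather than something the file format prescribes.

```r
items <- c("A", "B", "C", "D", "E")   # hypothetical full set of items
ordering <- c(3L, 1L)                 # item 3 first, item 1 second, rest unlisted

# Reading 1: a complete ranking of the subset {A, C}.
# Unlisted items take no part in the comparison (coded 0 here).
subset_rank <- integer(length(items))
subset_rank[ordering] <- seq_along(ordering)
names(subset_rank) <- items

# Reading 2: a top-2 ranking from the full set.
# Unlisted items are all ranked below the listed ones, tied in last place.
topn_rank <- subset_rank
topn_rank[topn_rank == 0L] <- length(ordering) + 1L

rbind(subset = subset_rank, top_n = topn_rank)
```

The same line in a .soi file is consistent with either row, which is why the intended reading has to come from knowledge of how the data were collected.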

The numerically coded orderings and their frequencies are read into a data frame, storing the item names as an attribute. The as.aggregated_rankings method converts these to an "aggregated_rankings" object with the items labelled by the item names.
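As a rough picture of what the conversion involves, the base R sketch below turns numerically coded strict, complete orderings into a matrix of ranks with columns labelled by item name. It is not the package's implementation and does not create an "aggregated_rankings" object; the labels, counts and variable names are assumptions made for the example.

```r
items <- c("Film A", "Film B", "Film C")   # hypothetical item labels
orderings <- rbind(c(2, 1, 3),             # item 2 first, then item 1, then item 3
                   c(3, 2, 1))             # item 3 first, then item 2, then item 1
freq <- c(10, 4)                           # how often each ordering was observed

# For a strict complete ordering, order() inverts the permutation:
# rankings[i, j] is the rank that ordering i gives to item j.
rankings <- t(apply(orderings, 1, order))
colnames(rankings) <- items

cbind(Freq = freq, rankings)
```

Orderings list items by position, while rankings list positions by item; carrying the frequencies alongside keeps the aggregated form compact.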

A PrefLib file may be corrupt, in the sense that the ordered items do not match the named items. In this case, the file can be read in as a data frame (with a warning) using the corresponding read.* function, but as.aggregated_rankings will throw an error.
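One plausible form of such a mismatch is an item code appearing in the orderings without a corresponding name. The base R sketch below checks for that case on invented data; it is an assumption about what a consistency check might look like, not the test the package actually performs.

```r
items <- c("A", "B", "C")                    # hypothetical named items
orderings <- list(c(1, 3, 2), c(2, 4, 1))    # code 4 has no matching name

# Item codes used in the orderings that fall outside the named items.
codes <- unique(unlist(orderings))
unmatched <- setdiff(codes, seq_along(items))

# TRUE only when every ordered item has a name.
is_consistent <- length(unmatched) == 0
unmatched
is_consistent
```

A check along these lines can be run on the data frame returned by a read.* function before attempting the conversion.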

Note

The Netflix and cities datasets used in the examples are from Bennett and Lanning (2007) and Caragiannis et al. (2017) respectively. These data sets require a citation for re-use.

References

Mattei, N. and Walsh, T. (2013) PrefLib: A Library of Preference Data. Proceedings of the Third International Conference on Algorithmic Decision Theory (ADT 2013). Lecture Notes in Artificial Intelligence, Springer.

Caragiannis, I., Chatzigeorgiou, X., Krimpas, G. A., and Voudouris, A. A. (2017) Optimizing positional scoring rules for rank aggregation. In Proceedings of the 31st AAAI Conference on Artificial Intelligence.

Bennett, J. and Lanning, S. (2007) The Netflix Prize. Proceedings of The KDD Cup and Workshops.

Examples

# can take a little while depending on speed of internet connection
# NOT RUN {
# url for preflib data in the "Election Data" category
preflib <- "http://www.preflib.org/data/election/"

# strict complete orderings of four films on Netflix
netflix <- read.soc(file.path(preflib, "netflix/ED-00004-00000101.soc"))
head(netflix)
attr(netflix, "items")

head(as.rankings(netflix))

# strict incomplete orderings of 6 random cities from 36 in total
cities <- read.soi(file.path(preflib, "cities/ED-00034-00000001.soi"))

# strict incomplete orderings of drivers in the 1961 F1 races
# 8 races with 17 to 34 drivers in each
f1 <- read.soi(file.path(preflib, "f1/ED-00010-00000001.soi"))

# complete orderings with ties of 30 skaters
skaters <- read.toc(file.path(preflib, "skate/ED-00006-00000001.toc"))

# incomplete orderings with ties of 10 sushi items from 100 total
# orderings were derived from numeric ratings
sushi <- read.toi(file.path(preflib, "sushi/ED-00014-00000003.toi"))
# }