algebraic-graphs-io-0.5.0.1: I/O utilities and datasets for algebraic-graphs
Safe Haskell: Safe-Inferred
Language: Haskell2010

Algebra.Graph.IO.Datasets.LINQS.Cora

Description

Cora document classification dataset, from:

McCallum, A. and Nigam, K. "Automating the construction of internet portals with machine learning." Information Retrieval, 2000.

Qing Lu and Lise Getoor. "Link-based classification." ICML, 2003.

https://linqs.soe.ucsc.edu/data

The dataset consists of 2708 scientific publications, each classified into one of seven classes. The citation network consists of 5429 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 unique words.
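To make the feature encoding concrete, here is a minimal sketch of a 0/1-valued word vector over a toy dictionary. The names `dictionary`, `encode` and `presentWords` are hypothetical and not part of this package, and the four-word dictionary stands in for the real one of 1433 words.

```haskell
-- Hypothetical toy dictionary; the real Cora dictionary has 1433 words.
dictionary :: [String]
dictionary = ["graph", "learning", "network", "neural"]

-- Encode a document as a 0/1 vector: 1 if the dictionary word occurs
-- in the document, 0 otherwise.
encode :: [String] -> [Int]
encode docWords = [ if w `elem` docWords then 1 else 0 | w <- dictionary ]

-- Recover the (zero-based) dictionary indices of the words that are present.
presentWords :: [Int] -> [Int]
presentWords bits = [ i | (i, 1) <- zip [0 ..] bits ]
```

Because most entries of such a vector are 0, storing only the indices of the present words, as `presentWords` does, is usually the more compact representation.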

1. Download the dataset

2. Reconstruct the citation graph

sourceCoraGraphEdges

Arguments

:: (MonadResource m, MonadThrow m) 
=> FilePath

directory of data files

-> Map String (Int16, Seq Int16, CoraDoc)

content data

-> ConduitT i (Maybe (Graph (ContentRow Int16 CoraDoc))) m () 
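The signature suggests that each citation is looked up in the content map and may fail to produce a graph. The sketch below illustrates that idea with plain `Data.Map`. It is not the library's implementation: the `Content` type, the `resolveEdge` function, the whitespace-separated line format, and the rule that an unknown ID yields `Nothing` are all assumptions made for illustration.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for the content data: paper ID -> vertex payload.
type Content = M.Map String Int

-- Resolve one citation line holding two whitespace-separated paper IDs.
-- Returns Nothing when the line is malformed or either ID is unknown.
resolveEdge :: Content -> String -> Maybe (Int, Int)
resolveEdge content line = case words line of
  [a, b] -> (,) <$> M.lookup a content <*> M.lookup b content
  _      -> Nothing
```

Returning `Maybe` per edge lets a streaming consumer decide whether to skip unresolved citations or treat them as errors.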

loadCoraGraph

Arguments

:: FilePath

directory where the data files were saved

-> IO (Graph (ContentRow Int16 CoraDoc)) 

Types

data CoraDoc

document classes of the Cora dataset

Constructors

CB 
GA 
NN 
PM 
RL 
RuL 
Th 
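As an illustration of how these constructors might relate to the class labels of the Cora dataset, the following sketch parses a label string into a constructor. It declares a local replica of `CoraDoc` so that it compiles on its own. The `parseClass` function is hypothetical, and the pairing of label strings with constructors is inferred from the abbreviations rather than confirmed by this page.

```haskell
-- Local replica of the constructors listed above, so the sketch is self-contained.
data CoraDoc = CB | GA | NN | PM | RL | RuL | Th
  deriving (Eq, Ord, Show, Enum)

-- Hypothetical parser from a class label string to a constructor.
-- The label/constructor pairing is an assumption based on the abbreviations.
parseClass :: String -> Maybe CoraDoc
parseClass s = lookup s (zip labels [CB ..])
  where
    labels =
      [ "Case_Based", "Genetic_Algorithms", "Neural_Networks"
      , "Probabilistic_Methods", "Reinforcement_Learning"
      , "Rule_Learning", "Theory" ]
```

The `Enum` instance makes `[CB ..]` enumerate all seven classes in declaration order, which is what lets the sketch pair them with the label list by position.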

Instances

Enum CoraDoc

Defined in Algebra.Graph.IO.Datasets.LINQS.Cora

Generic CoraDoc

Defined in Algebra.Graph.IO.Datasets.LINQS.Cora

Associated Types

type Rep CoraDoc :: Type -> Type

Methods

from :: CoraDoc -> Rep CoraDoc x

to :: Rep CoraDoc x -> CoraDoc

Show CoraDoc

Defined in Algebra.Graph.IO.Datasets.LINQS.Cora

Binary CoraDoc

Defined in Algebra.Graph.IO.Datasets.LINQS.Cora

Methods

put :: CoraDoc -> Put

get :: Get CoraDoc

putList :: [CoraDoc] -> Put

Eq CoraDoc

Defined in Algebra.Graph.IO.Datasets.LINQS.Cora

Methods

(==) :: CoraDoc -> CoraDoc -> Bool

(/=) :: CoraDoc -> CoraDoc -> Bool

Ord CoraDoc

Defined in Algebra.Graph.IO.Datasets.LINQS.Cora

type Rep CoraDoc

Defined in Algebra.Graph.IO.Datasets.LINQS.Cora

type Rep CoraDoc = D1 ('MetaData "CoraDoc" "Algebra.Graph.IO.Datasets.LINQS.Cora" "algebraic-graphs-io-0.5.0.1-DMAyteJuhT8EAUES5OdwH7" 'False) ((C1 ('MetaCons "CB" 'PrefixI 'False) (U1 :: Type -> Type) :+: (C1 ('MetaCons "GA" 'PrefixI 'False) (U1 :: Type -> Type) :+: C1 ('MetaCons "NN" 'PrefixI 'False) (U1 :: Type -> Type))) :+: ((C1 ('MetaCons "PM" 'PrefixI 'False) (U1 :: Type -> Type) :+: C1 ('MetaCons "RL" 'PrefixI 'False) (U1 :: Type -> Type)) :+: (C1 ('MetaCons "RuL" 'PrefixI 'False) (U1 :: Type -> Type) :+: C1 ('MetaCons "Th" 'PrefixI 'False) (U1 :: Type -> Type))))