Importing and Exporting Graphs

Simple formats
The Pajek format
The GraphML format
Connection to Other Network Analysis Software
Connection to Other Software

The igraph library and the igraph R package do not introduce a new file format for storing graphs. Instead, igraph recommends the GraphML format for smaller graphs and a simple edge list format, possibly compressed, for larger graphs.

igraph provides the read.graph() and write.graph() functions for reading and writing graphs from and to files. Both of these functions can handle a number of different file formats, which can be selected by the format argument.

Note, however, that you don't necessarily need special igraph functions to read and write graphs, as R itself has a wealth of functions for importing and exporting data in a variety of formats. For example, you can use the standard R scan() to read an edge list from a file and then graph() to create a graph object from it. Similarly, you can use get.edgelist() to create an edge list from a graph object and the standard write() to export it to a file.
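As a minimal sketch of this base-R route, the following round trip writes a small edge list to a temporary file, reads it back with scan(), and exports it again with write(). The file contents and variable names are made up for illustration. The graph() call is shown only in a comment, assuming the zero-based vertex ids used throughout this document, so that the sketch runs without igraph.

```r
# Create a small edge list file to work with (hypothetical example data).
path <- tempfile(fileext = ".txt")
writeLines(c("0 1", "2 3", "3 2", "1 0"), path)

# scan() returns one flat numeric vector; consecutive pairs are edges.
edges <- scan(path, quiet = TRUE)

# With igraph loaded, this flat vector could be handed to graph():
#   g <- graph(edges)

# View the same data as a two-column edge list matrix, the shape that
# get.edgelist() is described as producing.
el <- matrix(edges, ncol = 2, byrow = TRUE)

# write() emits values in column-major order, so transpose the matrix
# to get one edge per line.
out <- tempfile(fileext = ".txt")
write(t(el), file = out, ncolumns = 2)
exported <- readLines(out)
```

The transpose is the only subtle step: without it, write() would interleave the two columns and pair up the wrong vertices.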

As R can import Excel data exported in the .csv file format, Excel files can also easily be converted to igraph graph objects, just like nearly any other format.

Another possibility for storing graph data in files is to use the save() and load() functions provided by R. Graph objects can be safely saved and loaded this way.
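A minimal sketch of this approach, assuming the igraph package is installed. The temporary file names are placeholders, and the graph is built from a small, made-up edge list file only so that there is something to save.

```r
library(igraph)

# Build a small graph to save (the edge list content is illustrative only).
el_path <- tempfile(fileext = ".txt")
writeLines(c("0 1", "1 2", "2 0"), el_path)
g <- read.graph(el_path, format = "edgelist")

# save() stores the R object under its variable name.
rda_path <- tempfile(fileext = ".RData")
save(g, file = rda_path)

# Remove the object, then restore it; load() recreates the variable "g".
rm(g)
load(rda_path)
c(vcount(g), ecount(g))
```

Note that load() restores the object under the name it had when it was saved, so it silently overwrites any existing variable called g.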

Simple formats

Simple formats can typically store the structure of a graph but no metadata, or only a limited amount of it, such as edge weights or vertex colors.

Edge List Files

Perhaps the simplest file format igraph can handle is the edge list format. This is nothing more than a simple text file with two whitespace-separated columns containing vertex ids (i.e. non-negative integers). So if the file graph.txt contains the following edge list:

0 1
2 3
3 2
1 0

then it can be read by setting the format argument to edgelist:

> read.graph("graph.txt", format="edgelist")
Vertices: 4 
Edges: 4 
Directed: TRUE 

Edges:
[0] 0 -> 1
[1] 2 -> 3
[2] 3 -> 2
[3] 1 -> 0

read.graph() has an optional directed argument; set it to FALSE to create an undirected graph. The default is to create a directed graph. There is also an optional n argument, which gives the number of vertices in the graph. Set it if the graph should have more vertices than the highest vertex id in the edge list file implies. (If it is set to a smaller value, it is ignored.)
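The effect of these two arguments can be sketched as follows, assuming the igraph package is installed. The edge list is the one from the example above, written to a temporary file so that the snippet is self-contained, and the value 10 for n is arbitrary.

```r
library(igraph)

# The example edge list from the text, in a temporary file.
path <- tempfile(fileext = ".txt")
writeLines(c("0 1", "2 3", "3 2", "1 0"), path)

# Defaults: a directed graph with as many vertices as the ids require.
g1 <- read.graph(path, format = "edgelist")

# Undirected, and padded with isolated vertices up to 10 in total.
g2 <- read.graph(path, format = "edgelist", directed = FALSE, n = 10)

c(vcount(g1), vcount(g2))
```

The extra vertices in g2 have no edges; n changes only the vertex count, not the edges read from the file.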

Large Graph Layout Formats

These two formats were introduced by the Large Graph Layout (LGL) program. They are able to store edge weights and symbolic vertex names. See also the LGL homepage.

The ncol format is simply a weighted edge list with symbolic vertex names. The vertex names and the weight are separated by white space, so the vertex names themselves cannot contain white space. The weights are optional. The vertex names are stored in the vertex attribute name, and the edge weights are stored in the edge attribute weight. You may set the optional names and/or weights arguments to FALSE if you don't want to set these attributes.

If the file graph.ncol contains the following graph:

foo    bar 1
bar    baz -1
foobar foo 
foo    baz 2

then we get the following graph object:

> g <- read.graph("graph.ncol", format="ncol")
> g
Vertices: 4 
Edges: 4 
Directed: FALSE 

Edges:
[0] 0 -- 1
[1] 1 -- 2
[2] 0 -- 3
[3] 0 -- 2
> V(g)$name
[1] "foo"    "bar"    "baz"    "foobar"
> E(g)$weight
[1]  1 -1  0  2
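To make the mapping from file to attributes concrete, here is a rough base-R sketch of how such a file could be parsed by hand. It is not igraph's own reader; the variable names are invented, and filling the missing weight with 0 simply mirrors the output shown above.

```r
# Recreate the example file from the text in a temporary location.
path <- tempfile(fileext = ".ncol")
writeLines(c("foo    bar 1", "bar    baz -1", "foobar foo", "foo    baz 2"), path)

# fill = TRUE pads the line that has no weight with NA.
el <- read.table(path, fill = TRUE, col.names = c("from", "to", "weight"),
                 stringsAsFactors = FALSE)

# Mirror the output shown above, where the missing weight appears as 0.
el$weight[is.na(el$weight)] <- 0

# Vertex names in order of first appearance, reading the file row by row.
vertex_names <- unique(as.vector(t(as.matrix(el[, c("from", "to")]))))

# Zero-based vertex ids for the endpoints of each edge.
from_id <- match(el$from, vertex_names) - 1
to_id   <- match(el$to, vertex_names) - 1
```

Here vertex_names plays the role of the name vertex attribute and el$weight that of the weight edge attribute.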

The other LGL format is the lgl format. This is an adjacency-list-like format that may contain symbolic vertex names and, optionally, edge weights:

# vertex1name
vertex2name [optionalWeight]
vertex3name [optionalWeight]

For each vertex, the neighboring vertices are listed. The name of the initial vertex is preceded by “#”, and the following lines list the names of its neighbors, optionally with the weights of the edges.

The LGL software works with undirected graphs containing no multiple edges. This means that if vertex “B” is listed as a neighbor of vertex “A”, then vertex “A” must not be listed as a neighbor of vertex “B”. igraph, of course, is happy with multiple edges and is able to read and write lgl files that would break the original LGL software.
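The structure of an lgl file can be illustrated with a small hand-written parser in base R. This is only a sketch under stated assumptions: the sample lines are invented, the “#” is assumed to be separated from the vertex name by white space, and a missing weight is recorded as NA. It is not how igraph itself reads the format.

```r
# Hypothetical lgl content: A is adjacent to B and C, B is adjacent to C.
lines <- c("# A", "B 1.5", "C", "# B", "C 2")

current <- NA_character_
from <- character(0)
to <- character(0)
weight <- numeric(0)

for (ln in lines) {
  fields <- strsplit(trimws(ln), "[[:space:]]+")[[1]]
  if (length(fields) == 0) next          # skip blank lines
  if (fields[1] == "#") {
    current <- fields[2]                 # a new initial vertex starts here
  } else {
    from <- c(from, current)             # neighbor line: one edge
    to <- c(to, fields[1])
    weight <- c(weight, if (length(fields) > 1) as.numeric(fields[2]) else NA)
  }
}

edges <- data.frame(from, to, weight, stringsAsFactors = FALSE)
```

Each neighbor line becomes one row of edges, so listing the same pair under both of its endpoints would produce a multiple edge.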

The Pajek format

Pajek is a popular network analysis program for Windows. (See the Pajek homepage.) It has a quite flexible but not very well documented file format; see the Pajek manual on the Pajek homepage for some information about it.

igraph can read and write Pajek files, with some limitations:

  • Only .net files are supported; Pajek project files (which can contain many graphs and also other types of data) are not. Project files might be supported in a forthcoming igraph release if they turn out to be needed.

  • Time events networks are not supported.

  • Hypergraphs (graphs with non-binary edges) are not supported as igraph cannot handle them.

  • Graphs containing both directed and undirected edges are not supported as igraph cannot represent them.

  • Bipartite (also called affiliation) networks are not supported. The current igraph version imports the network structure correctly, but vertex type information is omitted.

  • Graphs with multiple edge sets are not supported.

igraph also reads the non-structural information from Pajek files, like edge weights and vertex colors, and assigns these as vertex and edge attributes. Note, however, that the names of the attributes are not always the same as in the Pajek file; some of them are renamed to be more informative or for other reasons. The following table contains the vertex attributes created by igraph:

igraph attribute name                               Description, Pajek attribute
id                                                  Vertex id
x, y, z                                             The “x”, “y” and “z” coordinate of the vertex
vertexsize                                          The size of the vertex when plotted (size in Pajek).
shape                                               The shape of the vertex when plotted.
color                                               Vertex color (ic in Pajek) if given with symbolic name
color-red, color-green, color-blue                  Vertex color (ic in Pajek) if given in RGB notation
framecolor                                          Border color (bc in Pajek) if given with symbolic name
framecolor-red, framecolor-green, framecolor-blue   Border color (bc in Pajek) if given in RGB notation
labelcolor                                          Label color (lc in Pajek) if given with symbolic name
labelcolor-red, labelcolor-green, labelcolor-blue   Label color (lc in Pajek) if given in RGB notation
xfact, yfact                                        The x_fact and y_fact Pajek attributes.
labeldist                                           The distance of the label from the vertex (lr in Pajek).
labeldegree, labeldegree2                           The la and lphi Pajek attributes
framewidth                                          The width of the border (bw in Pajek).
fontsize                                            Size of the label font (fos in Pajek).
rotation                                            The rotation of the vertex (phi in Pajek).
radius                                              Radius, for some vertex shapes (r in Pajek).
diamondratio                                        For the diamond shape (q in Pajek).

These igraph attributes are only created if at least one vertex in the Pajek file has the corresponding information. E.g. if there are vertex coordinates for at least one vertex, then the “x”, “y” and possibly also “z” vertex attributes will be created. Vertices for which the attribute is not defined are assigned NaN.

The following edge attributes might be created:

igraph attribute name                Description, Pajek attribute
weight                               Edge weights.
label                                l in Pajek.
color                                Edge color, if the color is given with a symbolic name, c in Pajek.
color-red, color-green, color-blue   Edge color if it was given in RGB notation, c in Pajek.
edgewidth                            w in Pajek.
arrowsize                            s in Pajek.
hook1, hook2                         h1 and h2 in Pajek.
angle1, angle2                       a1 and a2 in Pajek, Bezier curve parameters.
velocity1, velocity2                 k1 and k2 in Pajek, Bezier curve parameter.
arrowpos                             ap in Pajek.
labelpos                             lp in Pajek.
labelangle, labelangle2              lr and lphi in Pajek.
labeldegree                          la in Pajek.
fontsize                             fos in Pajek.
arrowtype                            a in Pajek.
linepattern                          p in Pajek.
labelcolor                           lc in Pajek.

Note that vertices are numbered starting from one in Pajek, but igraph numbering starts from zero.
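In practice this is a shift by one, as the following small sketch shows. The endpoint vectors are invented for illustration and do not come from any particular file.

```r
# Edge endpoints as they might appear in a Pajek file (one-based ids).
pajek_from <- c(1, 2, 1, 4)
pajek_to   <- c(1, 1, 2, 2)

# The corresponding zero-based igraph vertex ids.
igraph_from <- pajek_from - 1
igraph_to   <- pajek_to - 1

cbind(igraph_from, igraph_to)
```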

Here is an example Pajek file; it is part of the Pajek distribution under the name LINKS.NET:

*Network TRALALA
*vertices 4
   1 "1"                                           0.0938 0.0896   ellipse x_fact 1 y_fact 1
   2 "2"                                           0.8188 0.2458   ellipse x_fact 1 y_fact 1
   3 "3"                                           0.3688 0.7792   ellipse x_fact 1
   4 "4"                                           0.9583 0.8563   ellipse x_fact 1
*arcs
1 1 1  h2 0 w 3 c Blue s 3 a1 -130 k1 0.6 a2 -130 k2 0.6 ap 0.5 l "Bezier loop" lc BlueViolet fos 20 lr 58 lp 0.3 la 360
2 1 1  h2 0 a1 120 k1 1.3 a2 -120 k2 0.3 ap 25 l "Bezier arc" lphi 270 la 180 lr 19 lp 0.5
1 2 1  h2 0 a1 40 k1 2.8 a2 30 k2 0.8 ap 25 l "Bezier arc" lphi 90 la 0 lp 0.65
4 2 -1  h2 0 w 1 k1 -2 k2 250 ap 25 l "Circular arc" c Red lc OrangeRed
3 4 1  p Dashed h2 0 w 2 c OliveGreen ap 25 l "Straight arc" lc PineGreen
1 3 1  p Dashed h2 0 w 5 k1 -1 k2 -20 ap 25 l "Oval arc" c Brown lc Black
3 3 -1  h1 6 w 1 h2 12 k1 -2 k2 -15 ap 0.5 l "Circular loop" c Red lc OrangeRed lphi 270 la 180

This can be read into igraph with

> g <- read.graph(file="LINKS.NET", format="pajek")
> g
Vertices: 4 
Edges: 7 
Directed: FALSE 

Edges:
[0] 0 -- 0
[1] 0 -- 1
[2] 0 -- 1
[3] 1 -- 3
[4] 2 -- 3
[5] 0 -- 2
[6] 2 -- 2
> E(g)$color
[1] "Blue"       ""           ""           "Red"        "OliveGreen"
[6] "Brown"      "Red"

When writing a Pajek file with write.graph() the vertex and edge attributes in the previous two tables are written to the file if they're present in the graph.

The Pajek file format is included in igraph to allow users to convert their Pajek files to formats more suitable for igraph, like GraphML for example. The Pajek format is not intended to be used as a standard file format of igraph because of the lack of proper documentation.
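Such a conversion might look like the following sketch, which assumes the igraph package is installed. The tiny network and the temporary file names are made up for illustration; a real .net file would be read in the same way.

```r
library(igraph)

# A tiny Pajek network written to a temporary file (illustrative content).
net_path <- tempfile(fileext = ".net")
writeLines(c("*Vertices 3", '1 "a"', '2 "b"', '3 "c"',
             "*Edges", "1 2", "2 3"), net_path)

# Read the Pajek file into an igraph graph object.
g <- read.graph(net_path, format = "pajek")

# Write the same graph out again as GraphML.
graphml_path <- tempfile(fileext = ".graphml")
write.graph(g, graphml_path, format = "graphml")

# Peek at the beginning of the generated XML.
head(readLines(graphml_path), 3)
```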

The GraphML format

GraphML is an XML-based file format (an XML application, in XML terminology) for describing graphs. It is a modern format that can store graphs with an extensible set of vertex and edge attributes, as well as generalized graphs that igraph cannot handle. Thus igraph supports only a subset of the GraphML language:

  • Hypergraphs are not supported.

  • Nested graphs are not supported.

  • Mixed graphs, i.e. graphs with both directed and undirected edges, are not supported. read.graph() creates a directed graph if this is the default in the GraphML file, even if all the edges are in fact undirected.

See the GraphML homepage for more information about the GraphML format.

Connection to Other Network Analysis Software

The SNA R Package

Pajek

As igraph can read and write Pajek .net files, this is quite easy: all you have to do is save your graph in Pajek format and read it into Pajek, or, the other way around, read your .net file into igraph. If you happen to have a different kind of Pajek file, then you can do the following: TODO.

Connection to Other Software

Microsoft Excel