Network Analysis with igraph |
---|

The igraph library and the igraph R package do not introduce a new file format to store graphs. Instead, igraph recommends using the GraphML format for smaller graphs and some simple edge list format for larger graphs, perhaps also compressed.

igraph provides the `read.graph()` and `write.graph()` functions for reading and writing graphs from and to files. Both of these functions can handle a number of different file formats, which can be selected by the *format* argument.

Note, however, that you don't necessarily need special igraph functions to read and write graphs in various formats, as there is a wealth of R functions for importing and exporting data in a variety of formats. For example, you can use the standard R `scan()` to read an edge list from a file and then `graph()` to create a graph object from it. Similarly, you can use `get.edgelist()` to create an edge list from a graph object and the standard `write()` to export it to a file.
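The round trip described above can be sketched as follows. This is a minimal sketch, assuming a recent igraph release in which `make_graph()` and `as_edgelist()` are the current names of `graph()` and `get.edgelist()`, and in which vertex ids inside R start at one. The temporary file names are made up for the example.

```r
library(igraph)

# Hypothetical input file, created here so the sketch is self-contained.
in_path <- tempfile(fileext = ".txt")
writeLines(c("1 2", "3 4", "4 3", "2 1"), in_path)

# scan() returns one flat numeric vector: 1 2 3 4 4 3 2 1.
ids <- scan(in_path, quiet = TRUE)

# Consecutive pairs of the vector are interpreted as edges.
g <- make_graph(ids, directed = TRUE)

# as_edgelist() returns a two-column matrix, one row per edge.
el <- as_edgelist(g)

# write() emits the matrix column by column, so transpose it first
# to get one "from to" pair per line.
out_path <- tempfile(fileext = ".txt")
write(t(el), file = out_path, ncolumns = 2)
```

The transpose is the only subtle step: without it the output file would interleave the "from" and "to" columns.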

As R can also import Excel files via the `.csv` file format, Excel files can also easily be converted to igraph graph objects, just like nearly any other format.
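As an illustration of the CSV route, here is a minimal sketch. It assumes a recent igraph release that provides `graph_from_data_frame()`; the file contents and column names (`from`, `to`, `weight`) are invented for the example.

```r
library(igraph)

# Hypothetical CSV export of a spreadsheet with one edge per row.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("from,to,weight",
             "foo,bar,1",
             "bar,baz,2"), csv_path)

edges <- read.csv(csv_path, stringsAsFactors = FALSE)

# The first two columns name the endpoints of each edge; the
# remaining columns become edge attributes (here: weight).
g <- graph_from_data_frame(edges, directed = FALSE)
```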

Another possibility to store graph data in files is to use the `save()` and `load()` functions as provided by R. Graph objects can be safely saved and loaded this way.
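A minimal sketch of this approach, assuming a recent igraph release (`make_ring()` is used only to have some graph to store, and the file name is temporary):

```r
library(igraph)

g <- make_ring(5)
V(g)$name <- letters[1:5]

rdata_path <- tempfile(fileext = ".RData")
save(g, file = rdata_path)

# Load into a separate environment so the original `g` is not overwritten.
env <- new.env()
load(rdata_path, envir = env)
g2 <- env$g
```

Unlike the text formats, this keeps all vertex and edge attributes, but the resulting file can only be read back by R.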

Simple formats can typically store the structure of a graph but no meta-data, or just a limited amount of it, like edge weights, vertex colors, etc.

Perhaps the simplest file format igraph can handle is the edge list format. This is nothing more than a simple text file with two whitespace-separated columns containing vertex ids (i.e. non-negative numbers). So if the file `graph.txt` contains the following edge list:

    0 1
    2 3
    3 2
    1 0

then it can be read by setting the *format* argument to `edgelist`:

    > read.graph("graph.txt", format="edgelist")
    Vertices: 4
    Edges: 4
    Directed: TRUE
    Edges:
    [0] 0 -> 1
    [1] 2 -> 3
    [2] 3 -> 2
    [3] 1 -> 0
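The call above can be reproduced end to end with a small script. This is a sketch, assuming a recent igraph release in which `read_graph()` is the current name of `read.graph()`; the file is a temporary one with the same four edges. It also reads the file a second time as an undirected graph.

```r
library(igraph)

# Same edge list as above, written to a temporary file.
path <- tempfile(fileext = ".txt")
writeLines(c("0 1", "2 3", "3 2", "1 0"), path)

# Directed is the default for the edge list format.
g_dir <- read_graph(path, format = "edgelist")

# The same file read as an undirected graph.
g_undir <- read_graph(path, format = "edgelist", directed = FALSE)

is_directed(g_dir)
is_directed(g_undir)
```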

`read.graph()` has an optional *directed* argument; set this to `FALSE` to create an undirected graph (the default is directed). There is also an optional `n` argument.

The `ncol` and `lgl` formats were introduced by the Large Graph Layout (LGL) program; they are able to store edge weights and symbolic vertex names. See also the LGL homepage.

The `ncol` format is simply a weighted edge list with symbolic vertex names. The vertex names and the weight are separated by white space; the vertex names themselves cannot contain white space. The weights are optional. The vertex names are stored in the vertex attribute `name`, the edge weights are stored in the edge attribute `weight`. You may set the *names* and/or *weights* arguments to `FALSE` if you don't want to set these attributes. If the file `graph.ncol` contains the following graph:

    foo bar 1
    bar baz -1
    foobar foo
    foo baz 2

then we get the following graph object:

    > g <- read.graph("graph.ncol", format="ncol")
    > g
    Vertices: 4
    Edges: 4
    Directed: FALSE
    Edges:
    [0] 0 -- 1
    [1] 1 -- 2
    [2] 0 -- 3
    [3] 0 -- 2
    > V(g)$name
    [1] "foo" "bar" "baz" "foobar"
    > E(g)$weight
    [1] 1 -1 0 2
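A self-contained variant of this session is sketched below. It assumes a recent igraph release in which `read_graph()` is the current name of `read.graph()`. Unlike the file above, every edge carries a weight here, so the sketch does not depend on how a missing weight is filled in.

```r
library(igraph)

# A small ncol file: two symbolic vertex names and a weight per line.
path <- tempfile(fileext = ".ncol")
writeLines(c("foo bar 1",
             "bar baz -1",
             "foo baz 2"), path)

g <- read_graph(path, format = "ncol")

# Names and weights end up in the attributes described above.
V(g)$name
E(g)$weight
```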

The other LGL format is the `lgl` format. This is an adjacency-list-like format; it may contain symbolic vertex names and, optionally, edge weights:

    # vertex1name
    vertex2name [optionalWeight]
    vertex3name [optionalWeight]

The neighboring vertices of each vertex are listed: the name of the initial vertex is preceded by “#”, and the following lines give the names of its neighbors, optionally with the weights of the edges.

The LGL software works with undirected graphs containing no multiple edges. This means that if vertex “B” is listed as a neighbor of vertex “A”, then vertex “A” must not be listed as a neighbor of vertex “B”. igraph, of course, is happy with multiple edges and is able to read and write `lgl` files which break the original LGL software.
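To make the layout concrete, here is a minimal sketch that writes a three-vertex `lgl` file and reads it back. It assumes a recent igraph release in which `read_graph()` is the current name of `read.graph()`; the vertex names and weights are invented. Each pair of vertices is listed only once, as the LGL software requires.

```r
library(igraph)

# "# A" opens the neighbor list of A; "B 1.5" is the edge A--B with weight 1.5.
path <- tempfile(fileext = ".lgl")
writeLines(c("# A",
             "B 1.5",
             "C 2",
             "# B",
             "C 3"), path)

g <- read_graph(path, format = "lgl")

V(g)$name
E(g)$weight
```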

Pajek is a popular network analysis program for Windows. (See the Pajek homepage.) It has a quite flexible but not very well documented file format; see the Pajek manual on the Pajek homepage for some information about the file format.

igraph can read and write Pajek files, with some limitations:

Only `.net` files are supported; Pajek project files (which can contain many graphs and also other types of data) are not. Project files might be supported in a forthcoming igraph release if they turn out to be needed.

Time events networks are not supported.

Hypergraphs (graphs with non-binary edges) are not supported as igraph cannot handle them.

Graphs containing both directed and undirected edges are not supported as igraph cannot represent them.

Bipartite (also called affiliation) networks are not supported. The current igraph version imports the network structure correctly but vertex type information is omitted.

Graphs with multiple edge sets are not supported.

igraph also reads the non-structural information from Pajek files, like edge weights and vertex colors, and assigns these as vertex and edge attributes. Note, however, that the names of the attributes are not always the same as in the Pajek file; some of them are renamed to be more informative, or for other reasons. The following table contains the vertex attributes created by igraph:

igraph attribute name | description, Pajek attribute |
---|---|
id | Vertex id |
x, y, z | The “x”, “y” and “z” coordinate of the vertex |
vertexsize | The size of the vertex when plotted (`size` in Pajek). |
shape | The shape of the vertex when plotted. |
color | Vertex color (`ic` in Pajek) if given with symbolic name |
color-red, color-green, color-blue | Vertex color (`ic` in Pajek) if given in RGB notation |
framecolor | Border color (`bc` in Pajek) if given with symbolic name |
framecolor-red, framecolor-green, framecolor-blue | Border color (`bc` in Pajek) if given in RGB notation |
labelcolor | Label color (`lc` in Pajek) if given with symbolic name |
labelcolor-red, labelcolor-green, labelcolor-blue | Label color (`lc` in Pajek) if given in RGB notation |
xfact, yfact | The `x_fact` and `y_fact` Pajek attributes. |
labeldist | The distance of the label from the vertex (`lr` in Pajek). |
labeldegree, labeldegree2 | The `la` and `lphi` Pajek attributes |
framewidth | The width of the border (`bw` in Pajek). |
fontsize | Size of the label font (`fos` in Pajek). |
rotation | The rotation of the vertex (`phi` in Pajek). |
radius | Radius, for some vertex shapes (`r` in Pajek). |
diamondratio | For the diamond shape (`q` in Pajek). |

These igraph attributes are only created if there is at least one vertex in the Pajek file which has the corresponding associated information. E.g. if there are vertex coordinates for at least one vertex, then the “x”, “y” and possibly also “z” vertex attributes will be created. For those vertices for which the attribute is not defined, `NaN` is assigned.

The following edge attributes might be created:

igraph attribute name | description, Pajek attribute |
---|---|
weight | Edge weights. |
label | `l` in Pajek. |
color | Edge color, if the color is given with a symbolic name, `c` in Pajek. |
color-red, color-green, color-blue | Edge color if it was given in RGB notation, `c` in Pajek. |
edgewidth | `w` in Pajek. |
arrowsize | `s` in Pajek. |
hook1, hook2 | `h1` and `h2` in Pajek. |
angle1, angle2 | `a1` and `a2` in Pajek, Bezier curve parameters. |
velocity1, velocity2 | `k1` and `k2` in Pajek, Bezier curve parameters. |
arrowpos | `ap` in Pajek. |
labelpos | `lp` in Pajek. |
labelangle, labelangle2 | `lr` and `lphi` in Pajek. |
labeldegree | `la` in Pajek. |
fontsize | `fos` in Pajek. |
arrowtype | `a` in Pajek. |
linepattern | `p` in Pajek. |
labelcolor | `lc` in Pajek. |

Note that vertices are numbered starting with one in Pajek, but igraph numbering starts with zero.

Here is an example Pajek file; it is part of the Pajek distribution under the name `LINKS.NET`:

    *Network TRALALA
    *vertices 4
    1 "1" 0.0938 0.0896 ellipse x_fact 1 y_fact 1
    2 "2" 0.8188 0.2458 ellipse x_fact 1 y_fact 1
    3 "3" 0.3688 0.7792 ellipse x_fact 1
    4 "4" 0.9583 0.8563 ellipse x_fact 1
    *arcs
    1 1 1 h2 0 w 3 c Blue s 3 a1 -130 k1 0.6 a2 -130 k2 0.6 ap 0.5 l "Bezier loop" lc BlueViolet fos 20 lr 58 lp 0.3 la 360
    2 1 1 h2 0 a1 120 k1 1.3 a2 -120 k2 0.3 ap 25 l "Bezier arc" lphi 270 la 180 lr 19 lp 0.5
    1 2 1 h2 0 a1 40 k1 2.8 a2 30 k2 0.8 ap 25 l "Bezier arc" lphi 90 la 0 lp 0.65
    4 2 -1 h2 0 w 1 k1 -2 k2 250 ap 25 l "Circular arc" c Red lc OrangeRed
    3 4 1 p Dashed h2 0 w 2 c OliveGreen ap 25 l "Straight arc" lc PineGreen
    1 3 1 p Dashed h2 0 w 5 k1 -1 k2 -20 ap 25 l "Oval arc" c Brown lc Black
    3 3 -1 h1 6 w 1 h2 12 k1 -2 k2 -15 ap 0.5 l "Circular loop" c Red lc OrangeRed lphi 270 la 180

This can be read into igraph with

    > g <- read.graph(file="LINKS.NET", format="pajek")
    > g
    Vertices: 4
    Edges: 7
    Directed: FALSE
    Edges:
    [0] 0 -- 0
    [1] 0 -- 1
    [2] 0 -- 1
    [3] 1 -- 3
    [4] 2 -- 3
    [5] 0 -- 2
    [6] 2 -- 2
    > E(g)$color
    [1] "Blue" "" "" "Red" "OliveGreen"
    [6] "Brown" "Red"

When writing a Pajek file with `write.graph()`, the vertex and edge attributes in the previous two tables are written to the file if they're present in the graph.

The Pajek file format is included in igraph to allow users to convert their Pajek files to formats more suitable for igraph, like GraphML for example. The Pajek format is not intended to be used as a standard file format of igraph because of the lack of proper documentation.
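Such a conversion can be sketched as follows. The sketch assumes a recent igraph release in which `read_graph()` and `write_graph()` are the current names of `read.graph()` and `write.graph()`, and an igraph build with GraphML support. The tiny Pajek file is invented for the example.

```r
library(igraph)

# Hypothetical three-vertex Pajek network with two weighted arcs.
net_path <- tempfile(fileext = ".net")
writeLines(c("*Vertices 3",
             '1 "a"',
             '2 "b"',
             '3 "c"',
             "*Arcs",
             "1 2 1",
             "2 3 2"), net_path)

g <- read_graph(net_path, format = "pajek")

# Write the same graph, including its attributes, as GraphML.
graphml_path <- tempfile(fileext = ".graphml")
write_graph(g, graphml_path, format = "graphml")

# Read it back to check that structure and weights survived.
g2 <- read_graph(graphml_path, format = "graphml")
```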

GraphML is an XML-based file format (an XML application in the XML terminology) to describe graphs. It is a modern format, and can store graphs with an extensible set of vertex and edge attributes, and generalized graphs which igraph cannot handle. Thus igraph supports only a subset of the GraphML language:

Hypergraphs are not supported.

Nested graphs are not supported.

Mixed graphs, i.e. graphs with both directed and undirected edges, are not supported.

`read.graph()` makes the graph directed if directed is the default in the GraphML file, even if all the edges are in fact undirected.

See the GraphML homepage for more information about the GraphML format.
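The effect of the file-level default on directedness can be checked with two hand-written GraphML documents that differ only in their `edgedefault` attribute. This is a sketch, assuming a recent igraph release built with GraphML support, in which `read_graph()` is the current name of `read.graph()`. The helpers `graphml_doc()` and `read_with_default()` are hypothetical names defined here, not part of igraph.

```r
library(igraph)

# Hypothetical helper: a minimal GraphML document with the given default.
graphml_doc <- function(edgedefault) {
  c('<?xml version="1.0" encoding="UTF-8"?>',
    '<graphml xmlns="http://graphml.graphdrawing.org/xmlns">',
    sprintf('  <graph id="G" edgedefault="%s">', edgedefault),
    '    <node id="n0"/>',
    '    <node id="n1"/>',
    '    <edge source="n0" target="n1"/>',
    '  </graph>',
    '</graphml>')
}

# Hypothetical helper: write the document to a temporary file and read it.
read_with_default <- function(edgedefault) {
  path <- tempfile(fileext = ".graphml")
  writeLines(graphml_doc(edgedefault), path)
  read_graph(path, format = "graphml")
}

g_dir <- read_with_default("directed")
g_undir <- read_with_default("undirected")

is_directed(g_dir)
is_directed(g_undir)
```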
