com.bigdata.rdf.graph (Blazegraph Database Platform 2.1.5 API)

Interface Summary
Interface	Description
IBinder<VS,ES,ST>	An interface that may be used to extract variable bindings for the vertices visited by the algorithm.
IBindingExtractor<VS,ES,ST>	This interface makes it possible to extract bindings for variables from an `IGASProgram`.
IGASContext<VS,ES,ST>	Execution context for an `IGASProgram`.
IGASEngine	The interface used to submit an `IGASProgram` for evaluation.
IGASOptions<VS,ES,ST>	Interface for options that are understood by the `IGASEngine` and which may be declared by the `IGASProgram`.
IGASProgram<VS,ES,ST>	Abstract interface for GAS programs.
IGASScheduler	Interface for scheduling a vertex for execution.
IGASSchedulerImpl	Extended `IGASScheduler` interface.
IGASState<VS,ES,ST>	Interface that exposes access to the VS (vertex state) and ES (edge state) that are visible during a GATHER or SCATTER operation.
IGASStats	Statistics for GAS algorithms.
IGraphAccessor	Interface abstracts access to a backend graph implementation.
IPredecessor<VS,ES,ST>	An interface for `IGASProgram`s that compute paths and track a predecessor relationship among the visited vertices.
IReducer<VS,ES,ST,T>	An interface for computing reductions over the vertices of a graph.
IStaticFrontier	Interface abstracts the fixed frontier as known on entry into a new round.

Class Summary
Class	Description
BinderBase<VS,ES,ST>	A base class for IBinders.
Factory<V,T>	Singleton pattern for initializing a vertex state or edge state object given the vertex or edge.

Enum Summary
Enum	Description
EdgesEnum	Type-safe enumeration used to specify whether a GATHER or SCATTER phase is applied to the in-edges, the out-edges, or both, or is not run at all.
FrontierEnum	Type-safe enumeration characterizing the assumptions of an algorithm concerning its initial frontier.
TraversalDirectionEnum	Type-safe enumeration of the manner in which an RDF graph will be traversed by an `IGASProgram` based on its `EdgesEnum`.

Package com.bigdata.rdf.graph Description

The GAS (Gather, Apply, Scatter) API was developed for PowerGraph (aka GraphLab 2.1). This package is a port of that API to the Java platform and to schema-flexible attributed graphs using RDF.

Graph algorithms are stated using the GAS API. This API provides a vertex-centric approach to graph processing ("think like a vertex") that can be used to write a large number of graph algorithms (PageRank, triangle counting, connected components, SSSP, betweenness centrality, etc.). The GAS API allows the GATHER operation to be efficiently decomposed using fine-grained parallelism over a cluster.
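
The vertex-centric pattern can be illustrated with a minimal, single-threaded sketch of single-source shortest paths (SSSP). It is not the API of this package: `GasSsspSketch`, `Edge`, and `shortestPaths` are hypothetical names invented for the example, the graph is a plain in-memory edge list rather than RDF, and edge weights are assumed to be non-negative. In each round, GATHER reads the in-edges of every frontier vertex, APPLY updates the vertex state, and SCATTER schedules the out-neighbors of every vertex whose state changed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy GAS loop for illustration only. Every name in this class is
// hypothetical; none of it is part of com.bigdata.rdf.graph.
final class GasSsspSketch {

    /** A weighted, directed edge in a small in-memory graph. */
    static final class Edge {
        final String from;
        final String to;
        final double weight;

        Edge(String from, String to, double weight) {
            this.from = from;
            this.to = to;
            this.weight = weight;
        }
    }

    /** Shortest distance from source to each reachable vertex (assumes non-negative weights). */
    static Map<String, Double> shortestPaths(List<Edge> edges, String source) {
        // Index the edges by target (for GATHER) and by source (for SCATTER).
        Map<String, List<Edge>> inEdges = new HashMap<>();
        Map<String, List<Edge>> outEdges = new HashMap<>();
        for (Edge e : edges) {
            inEdges.computeIfAbsent(e.to, k -> new ArrayList<>()).add(e);
            outEdges.computeIfAbsent(e.from, k -> new ArrayList<>()).add(e);
        }

        // Vertex state: best known distance. Unvisited vertices have no entry.
        Map<String, Double> dist = new HashMap<>();
        dist.put(source, 0.0);

        // Initial frontier: the vertices one hop away from the source.
        Set<String> frontier = new LinkedHashSet<>();
        for (Edge e : outEdges.getOrDefault(source, Collections.<Edge>emptyList())) {
            frontier.add(e.to);
        }

        while (!frontier.isEmpty()) {
            // GATHER: for each frontier vertex, reduce (min) over its in-edges,
            // reading only the state left by the previous round.
            Map<String, Double> gathered = new HashMap<>();
            for (String v : frontier) {
                double best = Double.POSITIVE_INFINITY;
                for (Edge e : inEdges.getOrDefault(v, Collections.<Edge>emptyList())) {
                    Double d = dist.get(e.from);
                    if (d != null) {
                        best = Math.min(best, d + e.weight);
                    }
                }
                gathered.put(v, best);
            }

            Set<String> next = new LinkedHashSet<>();
            for (String v : frontier) {
                double best = gathered.get(v);
                // APPLY: keep the gathered value only if it improves the state.
                if (best < dist.getOrDefault(v, Double.POSITIVE_INFINITY)) {
                    dist.put(v, best);
                    // SCATTER: schedule the out-neighbors of a changed vertex.
                    for (Edge e : outEdges.getOrDefault(v, Collections.<Edge>emptyList())) {
                        next.add(e.to);
                    }
                }
            }
            frontier = next;
        }
        return dist;
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
                new Edge("a", "b", 1.0),
                new Edge("b", "c", 2.0),
                new Edge("a", "c", 5.0),
                new Edge("c", "d", 1.0));
        System.out.println(shortestPaths(edges, "a"));
    }
}
```

Because the GATHER step for one vertex reads only state from the previous round, the per-vertex (and per-edge) work inside a round is independent, which is the property that makes fine-grained parallel decomposition possible. This sketch runs the rounds sequentially for clarity.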

Part of our effort under the XDATA program is to examine how fine-grained parallelism can be leveraged on GPUs and other many-core devices to deliver extreme performance on graph algorithms. We are looking at how the GAS abstraction can be evolved to expose more parallelism.

The interfaces of this API are stated in terms of RDF Value objects (for vertices) and Statement objects (for edges). Link attributes are handled efficiently by the bigdata implementation, which co-locates them in the indices with the links and then applies prefix compression to deliver a compact on-disk footprint. See the section on Reification Done Right (below) for more details.

Reification Done Right and Property Graphs

Reification Done Right (RDR) explains the relationship between the somewhat opaque concept of RDF reification (which we use only for interchange) and statements about statements (more generally, the ability to turn any edge into a vertex and make statements about that vertex). There are different ways to handle statements about statements efficiently in the database, but these are internal physical schema design questions. From a user perspective, the main concern should be the performance of the database platform when using this feature. Bigdata uses a combination of inlining and prefix compression to provide a dense, fast, bi-directional encoding of statements about statements, and fast access paths whether querying by vertices, property values, or link attributes. You can also write queries in a high-level query language (SPARQL) that are automatically optimized and executed against the graph.

The RDR approach is more general than the Property Graph Model: anything that you can do with a property graph you can do at least as efficiently in an intelligently designed RDF database. Further, RDF graphs allow efficient handling of the following cases, which are disallowed under the property graph model:

A vertex may have multiple property values for the same key.
A link may have multiple link attributes for the same key.
A link may serve as a vertex - thus you may have links whose sources or targets are other links (hypergraphs).
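
The three cases above can be made concrete with a small in-memory model in which a statement is itself a term that other statements may refer to. This is only a sketch of the idea, not the bigdata storage layout or the RDF object model: `RdrModelSketch`, `Term`, `Node`, `Literal`, `Statement`, and `Graph` are hypothetical names, and the model is looser than RDF (for example, it does not stop a literal from being used as a subject).

```java
import java.util.ArrayList;
import java.util.List;

// Toy data model for illustration only. All names are hypothetical and
// unrelated to the bigdata implementation.
final class RdrModelSketch {

    /** Anything that may appear as the subject or object of a statement. */
    interface Term {}

    /** A named vertex. */
    static final class Node implements Term {
        final String name;

        Node(String name) {
            this.name = name;
        }
    }

    /** A property value. */
    static final class Literal implements Term {
        final Object value;

        Literal(Object value) {
            this.value = value;
        }
    }

    /** A link. It is also a Term, so other statements may point at it. */
    static final class Statement implements Term {
        final Term subject;
        final String predicate;
        final Term object;

        Statement(Term subject, String predicate, Term object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** A bag of statements with no cardinality constraint per key. */
    static final class Graph {
        private final List<Statement> statements = new ArrayList<>();

        Statement add(Term subject, String predicate, Term object) {
            Statement st = new Statement(subject, predicate, object);
            statements.add(st);
            return st;
        }

        /** All objects for the given subject (compared by identity) and predicate. */
        List<Term> objects(Term subject, String predicate) {
            List<Term> result = new ArrayList<>();
            for (Statement st : statements) {
                if (st.subject == subject && st.predicate.equals(predicate)) {
                    result.add(st.object);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Graph g = new Graph();
        Node alice = new Node("alice");
        Node bob = new Node("bob");
        Node carol = new Node("carol");

        // Case 1: two values for the same key on one vertex.
        g.add(alice, "email", new Literal("alice@example.org"));
        g.add(alice, "email", new Literal("alice@example.com"));

        // Case 2: two attributes with the same key on one link.
        Statement knows = g.add(alice, "knows", bob);
        g.add(knows, "since", new Literal(2010));
        g.add(knows, "since", new Literal(2012));

        // Case 3: a link whose source is another link.
        g.add(knows, "assertedBy", carol);

        System.out.println(g.objects(alice, "email").size());
        System.out.println(g.objects(knows, "since").size());
        System.out.println(g.objects(knows, "assertedBy").size());
    }
}
```

The only design decision that matters here is that `Statement` implements `Term`; once an edge can stand in a subject or object position, link attributes and links between links need no special machinery.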

Because RDF is general and places no cardinality constraints on property values, RDF data sets may be freely combined and then leveraged. Data-level collisions simply do not occur.