
Graph

Faced with the need to generate ever-greater insight and end-user value, some of the world’s most innovative companies — Google, Facebook, Twitter, Adobe and American Express among them — have turned to graph technologies to tackle the complexity at the heart of their data.

To understand how graphs address data complexity, we need first to understand the nature of the complexity itself. In practical terms, data gets more complex as it gets bigger, more semi-structured, and more densely connected.

We all know about big data. The volume of net new data being created each year is growing exponentially — a trend that is set to continue for the foreseeable future. But increased volume isn’t the only force we have to contend with today: On top of this staggering growth in the volume of data, we are also seeing an increase in both the amount of semi-structure and the degree of connectedness present in that data.

Semi-Structure

Semi-structured data is messy data: data that doesn’t fit into a uniform, one-size-fits-all, rigid relational schema. It is characterized by the presence of sparse tables and lots of null checking logic — all of it necessary to produce a solution that is fast enough and flexible enough to deal with the vagaries of real world data.

Increased semi-structure, then, is another force with which we have to contend, besides increased data volume. As data volumes grow, we trade insight for uniformity; the more data we gather about a group of entities, the more that data is likely to be semi-structured.
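
To make this concrete, here is a small, invented illustration in plain Python: the same three entities forced into one rigid, sparse table full of null placeholders, versus kept as naturally varying, semi-structured records. The data and attribute names are made up purely for exposition.

# Invented example data: three "user" entities with differing attributes.

# Rigid, one-size-fits-all table: every column must exist for every row,
# so most cells end up as null placeholders and the query code fills up
# with null checks.
columns = ["name", "employer", "twitter", "university", "spouse"]
rows = [
    ["Alice", "Acme",    None,   None,  "Bob"],
    ["Bob",   None,      "@bob", "MIT", "Alice"],
    ["Carol", "Initech", None,   None,  None],
]

# Semi-structured records: each entity carries only the attributes it
# actually has, so nothing needs padding with nulls.
users = [
    {"name": "Alice", "employer": "Acme", "spouse": "Bob"},
    {"name": "Bob", "twitter": "@bob", "university": "MIT", "spouse": "Alice"},
    {"name": "Carol", "employer": "Initech"},
]

# The null-checking logic mentioned above, made explicit:
for row in rows:
    employer = row[columns.index("employer")]
    if employer is not None:   # this kind of check multiplies quickly
        print(row[0], "works at", employer)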

Connectedness

But insight and end-user value do not simply result from ramping up volume and variation in our data. Many of the more important questions we want to ask of our data require us to understand how things are connected. Insight depends on us understanding the relationships between entities — and often, the quality of those relationships.

Here are some examples, taken from different domains, of the kinds of important questions we ask of our data:

  • Which friends and colleagues do we have in common?
  • What’s the quickest route between two stations on the metro?
  • What do you recommend I buy based on my previous purchases?
  • Which products, services and subscriptions do I have permission to access and modify? Conversely, given this particular subscription, who can modify or cancel it?
  • What’s the most efficient means of delivering a parcel from A to B?
  • Who has been fraudulently claiming benefits?
  • Who owns all the debt? Who is most at risk of poisoning the financial markets?

To answer each of these questions, we need to understand how the entities in our domain are connected. In other words, these are graph problems.

Why are these graph problems? Because graphs are the best abstraction we have for modeling and querying connectedness. Moreover, the malleability of the graph structure makes it ideal for creating high-fidelity representations of a semi-structured domain. Traditionally relegated to the more obscure applications of computer science, graph data models are today proving to be a powerful way of modeling and interrogating a wide range of common use cases. Put simply, graphs are everywhere.

Graph Databases

Today, if you’ve got a graph data problem, you can tackle it using a graph database — an online transactional system that allows you to store, manage and query your data in the form of a graph. A graph database enables you to represent any kind of data in a highly accessible, elegant way using nodes and relationships, both of which may host properties:

  • Nodes are containers for properties, which are key-value pairs that capture an entity’s attributes. In a graph model of a domain, nodes tend to be used to represent the things in the domain. The connections between these things are expressed using relationships.
  • A relationship has a name and a direction, which together lend semantic clarity and context to the nodes connected by the relationship. Like nodes, relationships can also contain properties: Attaching one or more properties to a relationship allows us to weight that relationship, or describe its quality, or otherwise qualify its applicability for a particular query.

The key thing about such a model is that it makes relationships first-class citizens of the data, rather than treating them as metadata. As real data points, they can be queried and understood in their variety, weight and quality: important capabilities in a world of increasing connectedness.
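
To make the model concrete, here is a deliberately product-agnostic sketch in Python. The entities, relationship names and properties are invented for illustration and do not correspond to any particular graph database's API.

# Minimal property-graph sketch: nodes and relationships both carry properties.

# Nodes are containers for key-value properties describing an entity.
nodes = {
    "alice":  {"kind": "Person",   "name": "Alice"},
    "bob":    {"kind": "Person",   "name": "Bob"},
    "report": {"kind": "Document", "title": "Q3 report"},
}

# Relationships are named, directed, and may carry their own properties
# (here a 'since' year and an access level used to qualify a query).
relationships = [
    ("alice", "KNOWS",      "bob",    {"since": 2015}),
    ("alice", "CAN_ACCESS", "report", {"level": "write"}),
    ("bob",   "CAN_ACCESS", "report", {"level": "read"}),
]

# Because relationships are first-class data, we can query them directly:
# who is allowed to modify the report?
writers = [
    nodes[src]["name"]
    for src, rel, dst, props in relationships
    if rel == "CAN_ACCESS" and dst == "report" and props["level"] == "write"
]
print(writers)  # ['Alice']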

Graph Databases in Practice

Today, the most innovative organizations are leveraging graph databases to solve the challenges around their connected data. These include major names such as Google, Facebook, Twitter, Adobe and American Express. Graph databases are also being used by organizations in a range of fields, including finance, education, the web, independent software vendors, and telecom and data communications.

The following examples offer use case scenarios of graph databases in practice.

  • Adobe Systems currently leverages a graph database to provide social capabilities to its Creative Cloud, a new array of services for media enthusiasts and professionals. A graph offers clear advantages in capturing Adobe’s rich data model fully, while still allowing for high-performance queries that range from simple reads to advanced analytics. It also enables Adobe to store large amounts of connected data across three continents, all while maintaining high query performance.
  • Europe’s No. 1 professional network, Viadeo, has integrated a graph database to store all of its users and relationships. Viadeo currently has 40 million professionals in its network and requires a solution that is easy to use and capable of handling major expansion. Upon integrating a graph model, Viadeo has accelerated its system performance by more than 200 percent.
  • Telenor Group is one of the ten largest wireless telecommunications companies in the world, and uses a graph database to manage its customer organizational structures. The ability to model and query complex data such as customer and account structures with high performance has proven critical to Telenor’s ongoing success.

An access control graph. Telenor uses a similar data model to manage products and subscriptions.

  • Deutsche Telekom leverages a graph database for its highly scalable social soccer fan website, which attracts tens of thousands of visitors during each match. The graph database provides painless data modeling, seamless data model extensibility, and high performance and reliability.
  • Squidoo is the popular social publishing platform where users share their passions. It recently created a product called Postcards: single-page, beautifully designed recommendations of books, movies, music albums, quotes and other products and media types. A graph database serves as the primary data store for the Postcards taxonomy and powers the recommendation engine that suggests what people should explore next, helping ensure a smooth user experience.

Such examples demonstrate how pervasive connections are within data, and how well a graph model maps those relationships. A graph database allows you to further query and analyze such connections to provide greater insight and end-user value. In short, graphs are poised to deliver true competitive advantage by offering deeper perspective into data as well as a new framework to power today’s revolutionary applications.

A New Way of Thinking

Graphs are a new way of thinking for explicitly modeling the factors that make today’s big data so complex: Semi-structure and connectedness. As more and more organizations recognize the value of modeling data with a graph, they are turning to the use of graph databases to extend this powerful modeling capability to the storage and querying of complex, densely connected structures. The result is the opening up of new opportunities for generating critical insight and end-user value, which can make all the difference in keeping up with today’s competitive business environment.

Emil is the founder of the Neo4j open source graph database project, which is the most widely deployed graph database in the world. As a life-long compulsive programmer who started his first free software project in 1994, Emil has with horror witnessed his recent degradation into a VC-backed powerpoint engineer. As the CEO of Neo4j’s commercial sponsor Neo Technology, Emil is now mainly focused on spreading the word about the powers of graphs and preaching the demise of tabular solutions everywhere. Emil presents regularly at conferences such as JAOO, JavaOne, QCon and OSCON.


Infogr.am's splash page lets you choose between creating a new design, or accessing previously stored works.

Let's say you want to compare the airspeed velocity of various unladen swallows. You could put that data into a spreadsheet, or present it using a graph. Chances are you'll pick the latter.

The question is, how do you create a graph that not only looks nice, but presents said data in a coherent, easy to understand way? You could go the traditional route, and make something basic in Microsoft Office or your productivity suite of choice. Or, if you have a bit more skill, you could opt for the flexibility of Adobe Illustrator instead. But perhaps you have neither software nor skill, and just want to make a simple graph or chart.

Infogr.am takes the same approach that Tumblr took to blogging, by turning the otherwise daunting task of creating great infographics into a process that's dead-simple. You bring the raw data to Infogr.am, and the site's online tool helps you turn that data into a nice-looking chart or full-blown infographic in minutes. It's really that simple (assuming you have the Facebook or Twitter account required to log in), though at times it is simple to a fault.



For this homework you will implement a simple distributed hill-climbing algorithm and test its behaviors on various graphs. The goal is for you to become proficient at using the NetLogo link primitives and to gain first-hand experience with the problems and benefits of distributed hill-climbing algorithms.

  1. Read the link primitives section of the NetLogo manual.
  2. Implement a NetLogo model that generates a random graph, by using a simple algorithm: first create num-nodes nodes (a slider), then create num-nodes * edge-ratio edges, each one connected to two randomly chosen nodes.
  2. Implement a NetLogo model that generates a random graph, by using a simple algorithm: first create num-nodes nodes (a slider), then create num-nodes * edge-ratio edges, each one connected to two randomly chosen nodes.
  3. Implement another button which instead generates the graph using preferential attachment. Specifically: first create num-nodes nodes, then create num-nodes * edge-ratio edges, each one created by picking two nodes chosen with a probability proportional to their degree (number of incident edges). For example, if there are 3 nodes, one with two edges and the other two with one edge each (the graph is a line), then the one with two edges gets chosen with probability 2 / 4 = 1/2, while each of the other two nodes is chosen with probability 1/4. The denominator is always the sum of all node degrees (twice the number of edges) and the numerator is the degree of that node. (See the sketch after this list.)
  4. Implement a num-colors slider and randomly color the nodes using that many colors.
  5. Implement a layout button which calls one of the built-in NetLogo layout methods to make the graph look pretty.
  6. Implement a basic hill-climbing algorithm. On each tick every node looks at the colors of its neighbors and changes its color to one that does not conflict with any of them. If there is no such color then it changes to one that minimizes the number of constraint violations with its neighbors (the min-conflict heuristic). If at any tick none of the nodes changes its color then we stop, since a coloring has been found. (See the sketch after this list.)
  7. Add a test button which performs a more extensive test. Specifically, for the given number of nodes and edge ratio, it will generate 100 graphs of each type (random and preferential attachment), run the hill-climbing algorithm on each, and plot the number of time steps it took to find a solution in two histograms, one for random graphs and one for preferential-attachment graphs. You will need to set an arbitrarily large cutoff for the 'does not stop' case.

Add your name and describe your results in the model description tab. Email me your .nlogo file by Wednesday, September 21 @ 9am.


I recently did a complete rewrite of my graph-based A* pathfinder example because I received a lot of questions on how to implement path-finding using the new ds library. So here is the updated version which works with ds 1.32:

I transform the point cloud into a graph structure using Delaunay triangulation; the system then computes and draws the shortest path between two selected points.
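
The example itself is written in haXe against the polygonal ds library. Purely as an illustration of the same pipeline, here is a rough Python sketch using SciPy's Delaunay triangulation and an A* search with a straight-line-distance heuristic; none of the names below come from the polygonal library.

import heapq
import math
import numpy as np
from scipy.spatial import Delaunay

# 1. Point cloud -> Delaunay triangulation -> adjacency list.
points = np.random.rand(50, 2)
tri = Delaunay(points)
adjacency = {i: set() for i in range(len(points))}
for a, b, c in tri.simplices:               # every triangle contributes three edges
    for u, v in ((a, b), (b, c), (a, c)):
        adjacency[u].add(v)
        adjacency[v].add(u)

def dist(i, j):
    return math.dist(points[i], points[j])

# 2. A* shortest path between two selected points, using straight-line
#    distance to the goal as the heuristic.
def astar(start, goal):
    frontier = [(dist(start, goal), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        for nxt in adjacency[node]:
            new_cost = cost + dist(node, nxt)
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost + dist(nxt, goal), new_cost, nxt, path + [nxt]))
    return None

print(astar(0, len(points) - 1))            # list of node indices along the path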

Compile instructions

Running and examining the example is really easy:

  1. Use the automated installer to install haXe from http://haxe.org/download.
  2. Download the latest haXe nightly build and overwrite the existing ‘haxe.exe’ and ‘std’ folder with the downloaded version.
  3. Install the polygonal library by opening the command prompt and typing:
    haxelib install polygonal

Sources should now be in {haxe_install_dir}/lib/polygonal/1,18/src/impl/sandbox/ds/astar, where {haxe_install_dir} is usually C:/Motion-Twin/haxe on Win7.
The demo can be compiled with:
cd C:\Motion-Twin\haxe\lib\polygonal\1,18\build
haxe.exe compile-ds-examples.hxml

Extending the Graph class

You basically have two options for extending the functionality of the Graph object: composition or inheritance. While I highly recommend using composition whenever possible, I’ve also included a version using inheritance, just so you can see the difference.

The composition version looks like this:
astar using composition
The Graph object manages GraphNode objects, and each GraphNode holds a Waypoint object, which defines the world position of the waypoint as well as intermediate data used by the A* algorithm. Notice that GraphNode and Waypoint cross-reference each other, as a Waypoint object has to query the graph for adjacent nodes. As a result, you get a clean separation between the data structure (Graph, GraphNode) and the algorithm (AStar, Waypoint), and you don’t need object casting, which is good news because casting is a rather slow operation.
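
For readers without the class diagram at hand, here is a rough analogue of that composition layout, sketched in Python rather than haXe; the class and field names follow the description above, not the actual polygonal API.

class Waypoint:
    # World position plus the per-search bookkeeping used by A*.
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.distance = float("inf")   # best known cost from the start
        self.parent = None             # previous waypoint on that best path
        self.node = None               # back-reference to the owning GraphNode

class GraphNode:
    # Pure structure: owns a Waypoint and knows its neighbours.
    def __init__(self, waypoint):
        self.waypoint = waypoint
        waypoint.node = self           # the cross-reference described above
        self.neighbours = []

class Graph:
    # Pure structure: holds GraphNodes and manages edges.
    def __init__(self):
        self.nodes = []
    def add_node(self, waypoint):
        node = GraphNode(waypoint)
        self.nodes.append(node)
        return node
    def connect(self, a, b):
        a.neighbours.append(b)
        b.neighbours.append(a)

# The pathfinding code only ever touches Waypoints; reaching the neighbours
# of a waypoint goes through its node, so no casting is needed anywhere.
def neighbour_waypoints(waypoint):
    return [n.waypoint for n in waypoint.node.neighbours]

g = Graph()
a = g.add_node(Waypoint(0.0, 0.0))
b = g.add_node(Waypoint(1.0, 1.0))
g.connect(a, b)
print(len(neighbour_waypoints(a.waypoint)))   # 1

The structural classes never need to know anything about the A* algorithm beyond storing a Waypoint, which is what keeps this version free of casts.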

Now let’s look at the version using inheritance:
astar using inheritance
Here, Waypoint directly subclasses GraphNode. Since the Graph is defined to work with GraphNode objects, we need a lot of (unsafe) down-casts to access the Waypoint class. Furthermore, the use of haxe.rtti.Generic will be very restricted or even impossible (implementing this marker interface generates unique classes for each type in order to avoid Dynamic).


Narrative graphs are a useful tool for charting out any narrative, and are especially useful for the development of game stories. This article gives an overview of how this design tool can be used.



Running Large Graph Algorithms: Evaluation of Current State-Of-the-Art and Lessons Learned

Google Tech Talk, February 11, 2010. Presented by Dr. Andy Yoo, Lawrence Livermore National Laboratory.

ABSTRACT: Graphs have gained a lot of attention in recent years and have been a focal point in many emerging disciplines such as web mining, computational biology, social network analysis, and national security, just to name a few. These so-called scale-free graphs in the real world have very complex structure, and their sizes have already reached unprecedented scale. Furthermore, most of the popular graph algorithms are computationally very expensive, making scalable graph analysis even more challenging. To scale these graph algorithms, which have different run-time characteristics and resource requirements than traditional scientific and engineering applications, we may have to adopt vastly different computing techniques than the current state of the art. In this talk, I will discuss some of the findings from our studies on the performance and scalability of graph algorithms on various computing environments at LLNL, hoping to shed some light on the challenges in scaling large graph algorithms.

Andy Yoo is a computer scientist in the Center for Applied Scientific Computing (CASC). His current research interests are scalable graph algorithms, high-performance computing, large-scale data management, and performance evaluation. He has worked on large graph problems since 2004. In 2005, he developed a scalable graph search algorithm and demonstrated it by searching a graph …
