May 16, 2011

In the functional model that our PURSUIT work is based upon, we see three major functions being implemented, namely the finding of information, the building of an appropriate delivery graph, and the forwarding along this graph. Let us now address considerations for separating these functions at some levels of the architecture, while performance and optimization demand merging them at others.

We start with the issue of separating the finding of information from the forwarding along a possibly appropriate graph. The problem we try to outline is best captured by: if you don’t know its location, you might end up looking everywhere!

Our starting point is an interest in information at a consumer’s end, with the information being available somewhere in the network. First, let us assume that the finding of information is merged with the forwarding of the actual information from a potential provider to the consumer of the information. For this, let us consider the following example:

Think of a social networking service very much akin to today’s solutions, like Facebook. A large-scale social network à la Facebook is likely to be distributed all over the world in terms of publishers and subscribers. In other words, one can think of it as a social construct in which information pertaining to this social construct is unlikely to be locally constrained. For now, we assume that there is at least one scope that represents the information space of the social network.

There are now two options for implementation. First, take one that is similar to today’s Facebook, in which (the scope) ‘facebook.com’ points to a (set of) server(s) that ‘host’ the service Facebook. Hence, all publications to the Facebook scope are in fact uploads, i.e., any subscription to a named data piece is in fact routed to the Facebook server farm. In this case, all information has a location, whether one wanted it or not, by virtue of the upload operation to a set of dedicated servers. Merging the finding of information with the forwarding along a desirable graph is now possible, since any local egress router (called, e.g., a content router in NDN) can simply forward the interest request to the (limited number of) domain(s) hosting the Facebook servers.
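
To make this concrete, here is a minimal Python sketch of the merged model under this first option; everything in it (the SCOPE_TABLE name, the domain identifiers, the forward_interest function) is an illustrative assumption, not PURSUIT or NDN code. Because the scope resolves to a small, known set of domains, finding collapses into a single table lookup:

```python
# A minimal sketch of the merged find-and-forward model: the egress router
# keeps a BGP-like table mapping scopes to the few domains that host them,
# so an interest can be forwarded directly. All names are illustrative.

SCOPE_TABLE = {
    "facebook.com": ["AS-fb-us-east", "AS-fb-eu-west"],  # small, known set
}

def forward_interest(scope: str, item: str) -> list[str]:
    """Return the forwarding decisions for an interest in scope/item."""
    hosting_domains = SCOPE_TABLE.get(scope, [])
    # With a dedicated server farm the candidate set is small, so finding
    # and forwarding collapse into one table lookup plus unicast.
    return [f"{domain}: interest({scope}/{item})" for domain in hosting_domains]

if __name__ == "__main__":
    for hop in forward_interest("facebook.com", "status/4711"):
        print(hop)
```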

Let us consider another approach to Facebook that builds on the power of storing the data at the publisher or at any other node. In other words, we do not assume that the information is uploaded to a server. Instead, we merely assume that the publisher (of Facebook information) signals the availability of the data within the scope of Facebook. It is now the task of the network to provide an appropriate route to this publisher for any future interest request. This model is appealing to a company like Facebook since it still allows control over the data by virtue of possible access control and profiling of usage patterns. But it relieves Facebook from the burden of hosting the actual data, i.e., it removes the need to operate upload servers and therefore reduces the overall cost of its operations. Any entity that happens to have a particular information item (such as a status update or photo) can provide the information to the interested subscriber.
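
As a rough illustration of the difference between the two options, the following sketch (all names assumed) contrasts uploading the data with merely signalling its availability; in the latter case, only a pointer to the publisher’s node enters the network, while the data itself stays put:

```python
# A toy contrast of the two publication models. In the upload model the
# network stores the data; in the signalling model it only learns WHO has it.
# All names (SERVER_FARM, AVAILABILITY, node labels) are assumptions.

SERVER_FARM: dict[str, bytes] = {}   # upload model: the data itself moves
AVAILABILITY: dict[str, str] = {}    # signalling model: only a pointer moves

def publish_by_upload(item: str, data: bytes) -> None:
    SERVER_FARM[item] = data             # the network now stores the data

def publish_by_signal(item: str, publisher_node: str) -> None:
    AVAILABILITY[item] = publisher_node  # the network only learns the holder

if __name__ == "__main__":
    publish_by_upload("status/1", b"moved to a server farm")
    publish_by_signal("status/2", "alice-laptop")
    print(SERVER_FARM)    # {'status/1': b'moved to a server farm'}
    print(AVAILABILITY)   # {'status/2': 'alice-laptop'}
```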

In this form of a social network, what would happen if the functions of finding and delivery were not separated? Let’s assume that the item is not available within the domain where it is requested (leaving out caching, since we are concerned with an original request for an item that has not been cached yet). An interest in a particular (social network) information item now needs to be forwarded to ANY domain that might host the information. If we assume a BGP-like table at the egress router of each domain, the Facebook entry is likely to point to a large number of domains that might host Facebook content (which, given the scenario, could be any of them). Slowly, the interest request will propagate over many domains, although it is likely that only one hosts the actual information item at hand. As a result, ANY status update of ANY social network member is likely to be spread over many, if not all, domains in the Internet! Depending on the intra-domain approach to determining whether or not an interest request can be fulfilled (NDN, for instance, uses local broadcast), this could easily amount to a global flooding of status updates in any network that might hold viable information about this social network (which, in the case of Facebook, is a LARGE number!).
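
A back-of-the-envelope simulation (all numbers and names assumed) gives a feel for the cost: if the table entry points to n candidate domains and the interest propagates without any location knowledge, roughly half of the domains are probed on average before the single holder is found:

```python
# An illustrative simulation, not a measurement: with no location knowledge,
# an interest probes candidate domains blindly until it hits the one domain
# that actually holds the item.

import random

def domains_touched(num_domains: int, trials: int = 1000) -> float:
    """Average number of domains probed before the holder is reached."""
    total = 0
    for _ in range(trials):
        holder = random.randrange(num_domains)
        order = list(range(num_domains))
        random.shuffle(order)          # no location knowledge: probe blindly
        total += order.index(holder) + 1
    return total / trials

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"{n} candidate domains -> ~{domains_touched(n):.0f} probed on average")
```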

A similar problem arises when bringing information mobility into play, i.e., information that is exclusively available at a moving target (e.g., my personal laptop). If it is to be reachable by potentially interested parties, interest requests need to be forwarded to a large number of ISPs. (Surely, movement patterns could be used to limit the region of discovery, but that would require not only the disclosure of this information but also additional logic in the network; information and logic that one does not want to associate with the fast-path forwarding function.)

What is the problem here? Returning to our statement from the beginning, we can conclude that if you don’t know its location, you might end up looking everywhere! What is the lesson learned? It is that, if information is location-less (which is often the case), finding the information needs to be separated from the construction of an appropriate delivery graph, in order to optimize the operation of each function. While we appreciate the cases where information has a clearly limited location, e.g., in content distribution, where dedicated content networks serve the interests of their customers, we consider it a safe assumption that (application-level) information in general is location-less, either due to the nature of its information space or due to mobility.
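
The separated model might then look roughly like the following toy sketch (assumed names throughout, not the PURSUIT implementation): a rendezvous step resolves who has the item, a topology step builds a delivery graph, and forwarding merely follows that graph:

```python
# A minimal sketch of the three functions kept separate. The rendezvous
# directory, the trivial path computation, and all node names are
# illustrative assumptions.

RENDEZVOUS: dict[tuple[str, str], str] = {}   # (scope, item) -> publisher node

def publish(scope: str, item: str, node: str) -> None:
    """The publisher only signals availability; no data is uploaded."""
    RENDEZVOUS[(scope, item)] = node

def build_delivery_graph(src: str, dst: str) -> list[str]:
    """Topology management: compute a path (trivial placeholder here)."""
    return [src, "core", dst]

def subscribe(scope: str, item: str, subscriber: str) -> list[str]:
    publisher = RENDEZVOUS[(scope, item)]               # 1. find (rendezvous)
    path = build_delivery_graph(publisher, subscriber)  # 2. build the graph
    return path                                         # 3. forward along it

if __name__ == "__main__":
    publish("facebook.com", "status/4711", "alice-laptop")
    print(subscribe("facebook.com", "status/4711", "bob-phone"))
```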

However, if information HAS a location, merging these functions not only causes no problems but even allows the operation to be optimized. These cases are more likely, however, at lower levels of implementation. Take as an example the segmentation of a larger information item along a dedicated (byte-transfer-limited) forwarding link from one node to the other. Separating finding from delivery is pointless here, since the location of the information is obvious (it is the sending node) from the receiving node’s perspective. Hence, the rendezvous signal of interest can be interpreted as the very send operation from the one physical node to the other.
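
A toy sketch of this merged link-level case (assumed names, with Python’s pull-based iteration standing in for the link protocol) shows how each expression of interest by the receiver is at once the send trigger at the sender:

```python
# The receiver already knows where the data is (at the sending node), so its
# interest in the next segment IS the send operation; finding and delivery
# collapse into a single pull. Segment size and names are assumptions.

SEGMENT_SIZE = 4  # bytes per transfer on the link, chosen for illustration

def sender(item: bytes):
    """Yield the next segment each time the receiver pulls (signals interest)."""
    for i in range(0, len(item), SEGMENT_SIZE):
        yield item[i:i + SEGMENT_SIZE]

def receiver(link) -> bytes:
    out = b""
    for segment in link:  # each iteration is an interest for the next segment
        out += segment
    return out

if __name__ == "__main__":
    data = b"hello, pursuit"
    assert receiver(sender(data)) == data
    print("transferred", len(data), "bytes in",
          -(-len(data) // SEGMENT_SIZE), "segments")
```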

It is this separation of functions that constitutes the powerful notion of the PURSUIT functional model, and we need to do more work to better understand this power.

 Posted at 07:07

  3 Responses to “On Separation of Functions”

  1. I am not sure whether the Facebook example is very suitable. The trend is to move all data to data centres because they offer 24/7 reliable operation. It would be frustrating to have data at user hosts: their operation is ephemeral and (with today’s access technologies, e.g. xDSL) their upload capacity is limited.

    It would be interesting to have a thorough discussion on this subject: to what extent will user hosts act as publishers in the future? If we expect more and more data to be moved to the cloud, do we need a solution that optimizes network operation for the small amount of data being stored at user hosts?

    The MAD architecture (presented on day 1) suggested that it’s better to differentiate the network operation: optimize only for the heavy stuff (say, data centres) and let the rest be handled by less efficient (and lighter) mechanisms. In the end, you don’t lose much.

    • The Facebook example is not very suitable for a variety of reasons, but it still serves the point of making the issue of separation of functions understandable. Whether or not Facebook would be implemented in such a way (within an ICN approach) is a very different issue. I’m not sure that the “cloud” issue is the point here; it remains to be seen how much people will rely on some notion of cloud-based storage. How a social network might be implemented is an issue of control more than of technology. So you need to look beyond the specific example at the architectural issue at hand; we can discuss the implementation of FB-type services separately.

      P.S.: if there’s no utilization beyond data-centre-based storage, why do ICN in the first place?

  2. “It would be interesting to have a thorough discussion on this subject: to what extent will user hosts act as publishers in the future?”

    Networking research seems to bear a great similarity to the fashion world in that it follows trends in cycles. Less than ten years ago, P2P networking was the way forward, but today it is the cloud, which (at least to me) looks like a return to a lost world of mainframes solving computational problems (EC2) and being used to store our data.

    Ten years ago, many people were sharing their music using P2P file-sharing systems, as these were perceived as providing less exposure to copyright enforcement authorities than centralised mechanisms. Connections were far slower then, but uptake was significant. Clearly there is past precedent for content being published from end-user devices. Legality or illegality was the motivation in P2P file sharing, but there are many other motivations for a variety of application contexts.

    Twitter strikes me as a service that shouldn’t be implemented the way it is. I can’t quite work out why I should be posting status updates of short-term interest to a centralised service. I publish the status update on my pc/tablet/mobile, and the network should deliver it to those who are interested. With Twitter, I have to search for individuals who publish content of interest to me, and I have to put up with their publications that are of no interest to me. This content need not be persistent (when I log into Twitter, I don’t read all unread tweets), and my HSDPA connection should have no issues fulfilling its role in disseminating the content.

    To answer the question, I think the extent to which end-user devices will be publishers is going to depend upon the end users. If that is what they require of the network, then the network has to be able to adapt to their needs. Ultimately, non-computing considerations such as access control, legal issues, locality, mobility and user behaviour may be among the key issues.
