Our most recent architecture deliverable, D2.3, is now available on our webpage (you can find it here)! Feel free to download and study this latest update on the PURSUIT architecture and the design choices we have been working on.
At the recent SIGCOMM ICN workshop, one particular paper by Perino and Varvello evaluated the feasibility of realizing a CCN content router, measured against a roadmap of memory technologies (given that the authors were from Alcatel-Lucent, we can assume some truth in the outlined memory roadmaps). The paper concluded: “We find that today’s technology is not ready to support an Internet scale CCN deployment, whereas a CDN or ISP scale can be easily afforded.”
In other words, the authors assert that the feasibility of CCN content routers just about ‘rides the wave of state-of-the-art memory technologies’. One needs to look closer at the authors’ assumptions in arriving at these conclusions, i.e., what was the input into the model used to derive the expected FIB (forwarding information base), content store and PIT (pending interest table) sizes? As outlined on page 47 (of the proceedings), the authors assume a starting point as well as a progression similar to that of today’s reachable hostnames (derived from Google statistics), i.e., starting at today’s roughly 280 million indexed routable hostnames. Given that starting point and progression, the required sizes for FIBs, PITs etc. just about keep pace with the expected availability of different memory types.
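To see why the starting point matters so much, a rough back-of-envelope sketch helps. The 280 million figure is the one cited above; the per-entry size and growth rate below are purely illustrative assumptions, not numbers from the paper:

```python
# Back-of-envelope projection of CCN FIB memory demand.
# ROUTABLE_NAMES is the figure cited in the post; BYTES_PER_ENTRY and
# ANNUAL_GROWTH are illustrative assumptions, not the paper's model.

ROUTABLE_NAMES = 280e6   # indexed routable hostnames today
BYTES_PER_ENTRY = 40     # assumed: name-prefix digest + outgoing-face list
ANNUAL_GROWTH = 1.2      # assumed 20% yearly growth in routable names

def fib_bytes(years_from_now: int) -> float:
    """Projected FIB size in bytes after the given number of years."""
    return ROUTABLE_NAMES * (ANNUAL_GROWTH ** years_from_now) * BYTES_PER_ENTRY

for y in (0, 5, 10):
    print(f"year +{y:2d}: ~{fib_bytes(y) / 2**30:.1f} GiB")
```

Under these (hypothetical) numbers the FIB stays in the tens of gigabytes over a decade, i.e., within reach of the memory roadmap. Replace the starting point with billions of publishers, as discussed below, and the same arithmetic breaks down.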
Why is this conclusion so interesting for PURSUIT?
It is interesting because PURSUIT has a significantly different starting point than the one the authors present in their paper. This different starting point comes from a very different assumption of what an information-centric Internet really is. The paper sees a CCN-like Internet as a progression of the same model that we know today. In other words, it assumes a hosting model with globally routable servers that host the information to be disseminated (sending an interest request to fp7-pursuit.eu/query, an information space for the PURSUIT project that is hosted by a commercial ISP!).
But can we assume that any future (information-centric) Internet really has the same model?
In our predecessor project PSIRP, we went through a lengthy discussion on this at the very beginning of the project. What were our assumptions about how information is pushed around in this brave new world that we envision? While Pekka Nikander’s numbers of items and scopes of information were certainly VERY high for some in the project (I recall numbers like 10^21 individual items per user with some 10^15 scopes), they certainly showed a direction of thought that has been commonly shared since: an information-centric Internet has a vast number of publishers of data; publishers that, we assume, need to be reachable in some form or another in order to disseminate information to those interested in it. For instance, if my local pub would like to publish the availability of spaces on its premises, we assume a simple device stuck on the wall, publishing the availability through an embedded system directly to the (future) Internet. No need to upload to a hosting provider that would then become the globally routable hostname!
This is a VERY different starting point. If we apply this starting point to the aforementioned paper, the conclusions are obvious: an in-router state approach such as CCN (in its current form) cannot work, given the memory technologies known to come.
But apart from this difference in conclusions in a paper published some four years after the start of PSIRP, there was a more immediate conclusion that we arrived at very early within the project: we have to exploit a spectrum of forwarding technologies that ranges from truly stateless (albeit limited in capabilities) to fully in-router state!
It is this insight that we have exploited ever since in our work on forwarding in information-centric networks. The very first step, the point at the ‘far left’ of the aforementioned spectrum, was developed in the PSIRP project. It is known as LIPSIN forwarding and was published at SIGCOMM 2009. It is this scheme that is currently implemented in our Blackadder prototype. Sure, its ability to implement native multicast with constant-length headers is limited due to the inevitable false positives when increasing the tree size. But it still covers a remarkable range of scenarios with local reach of information dissemination. And we are working on a number of extensions to the basic scheme, either through a relay architecture (see our deliverable D2.2 for a first version of that) or through moving towards a variable-length scheme (more on this in the upcoming D2.3).
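The core idea behind LIPSIN can be sketched in a few lines: the link identifiers of a delivery tree are OR-ed into one constant-length in-packet Bloom filter, and each node forwards on exactly those links whose identifier is contained in the filter. The filter width and bits-per-link below are illustrative choices, not the published parameters:

```python
# Sketch of LIPSIN-style in-packet Bloom filter forwarding.
# FILTER_BITS and BITS_PER_LINK are illustrative, not LIPSIN's exact values.

import random

FILTER_BITS = 256   # constant-length header, regardless of tree size
BITS_PER_LINK = 5   # bits set in each link identifier (assumed)

def make_link_id(rng: random.Random) -> int:
    """A link ID is a sparse random bit string of FILTER_BITS bits."""
    lid = 0
    for b in rng.sample(range(FILTER_BITS), BITS_PER_LINK):
        lid |= 1 << b
    return lid

def build_zfilter(path_link_ids) -> int:
    """OR together the link IDs of the delivery tree into one filter."""
    zf = 0
    for lid in path_link_ids:
        zf |= lid
    return zf

def forward_on_link(zfilter: int, link_id: int) -> bool:
    """Forward on a link iff all of the link's bits are set in the filter.
    This can yield false positives (spurious forwarding) once a large
    tree fills the filter, but never false negatives."""
    return zfilter & link_id == link_id

rng = random.Random(1)
tree = [make_link_id(rng) for _ in range(8)]
zf = build_zfilter(tree)
assert all(forward_on_link(zf, lid) for lid in tree)  # no false negatives
```

The attraction is that forwarding needs no per-flow router state at all, only a bitwise AND per link; the price, as noted above, is the false-positive rate growing with the tree size.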
The ‘right-hand side’ of the spectrum is well known, namely full in-router state approaches such as CCN or, come to that, IP multicast. But we truly believe that only a range of forwarding solutions along this spectrum of state tradeoffs can bring us the set of solutions required to fully accommodate a future (information-centric) Internet where publishers exist in abundance!
Our prototype plan (D3.1) as well as our dissemination plan (D5.5) have been published in our Deliverables section! Please have a look at what we plan on the implementation side of things and what we achieved so far in disseminating our results.
Within PURSUIT, we have completed our first architecture deliverable. It is available here. There you can find our functional model for the architecture, the major functions of the architecture, and descriptions of our current work on caching, transport and other areas.
In the functional model that our PURSUIT work is based upon, we see three major functions being implemented, namely the finding of information, the building of an appropriate delivery graph, and the forwarding along this graph. Let us now consider why these functions should be separated at some levels of the architecture, while performance and optimization demand merging them at others.
We start with the issue of separating finding of information from the forwarding along a possibly appropriate graph. The problem we try to outline is best captured by: If you don’t know its location, you might end up looking everywhere!
Our starting point is that of an interest in information at a consumer’s end, with the information being available somewhere in the network. First, let us assume that the finding of information is merged with the forwarding of the actual information from a potential provider to the consumer. For this, let us consider the following example:
Think of a social networking service very much akin to today’s solutions, like Facebook. A large-scale social network à la Facebook is likely to be distributed all over the world in terms of publishers and subscribers. In other words, one can think of it as a social construct in which the information pertaining to it is unlikely to be locally constrained. For now, we assume that there is at least one scope that represents the information space of the social network.
There are now two options for implementation. First, take one that is similar to today’s Facebook, in which (the scope) ‘facebook.com’ points to a (set of) server(s) that ‘host’ the Facebook service. Hence, all publications to the Facebook scope are in fact uploads, i.e., any subscription to a named data piece is in fact routed to the Facebook server farm. In this case, all information in fact has a location by virtue of the upload operation to a set of dedicated servers, whether one wanted it or not. Merging the finding of information with forwarding along a desirable graph is now possible, since any local egress router (called, e.g., a content router in NDN) can simply forward the interest request to the (limited number of) domain(s) hosting the Facebook servers.
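In this first option, the egress router's job can be sketched as a simple longest-prefix match over hierarchical names. The names, face labels and matching rule below are illustrative stand-ins, not NDN's exact specification:

```python
# Sketch of merged finding+forwarding: a content router holds a
# name-prefix FIB and forwards an interest toward the hosting domain.
# Names and face labels are illustrative, not a real NDN FIB.

def longest_prefix_match(fib: dict, name: str):
    """Return the face registered for the longest '/'-delimited prefix
    of the interest name, or None if no prefix matches."""
    components = name.strip("/").split("/")
    for i in range(len(components), 0, -1):
        prefix = "/" + "/".join(components[:i])
        if prefix in fib:
            return fib[prefix]
    return None

fib = {
    "/facebook.com": "face-to-AS-hosting-facebook",   # few, known domains
    "/fp7-pursuit.eu": "face-to-AS-hosting-pursuit",
}

# Any interest under the scope resolves to the hosting domain's face:
face = longest_prefix_match(fib, "/facebook.com/alice/status/42")
assert face == "face-to-AS-hosting-facebook"
```

The crucial property is that the FIB entry for the whole scope points to a small, fixed set of domains, so one table lookup both 'finds' the information and picks the next hop. It is exactly this property that disappears in the second option below.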
Let us consider another approach to Facebook that builds on the power of storing the data at the publisher or at any other node. In other words, we do not assume that the information is uploaded to a server. Instead, we merely assume that the publisher (of Facebook information) signals the availability of the data within the scope of Facebook. It is now the task of the network to provide an appropriate route to this publisher for any future interest request. This model is appealing to a company like Facebook since it still allows control over the data by virtue of possible access control and profiling of usage patterns. But it relieves Facebook from the burden of hosting the actual data, i.e., it removes the need for operating uploading servers and therefore reduces overall costs of their operations. Any entity that happens to have a particular information item (such as a status update or photo) can provide the information to the interested subscriber.
In this form of a social network, what would happen if the functions of finding and delivery were not separated? Let’s assume that the item is not available within the domain where it is requested (leaving out caching, since we are concerned with an original request for an item that has not been cached yet). An interest in a particular (social network) information item now needs to be forwarded to ANY domain that might host the information. If we assume a BGP-like table at the egress router of each domain, the Facebook entry is likely to point to a number of domains that might host Facebook content (which can be any, given the scenario). Slowly, the interest request will propagate over many domains, although it is likely that only one hosts the actual information items at hand. As a result, ANY status update of ANY social network member is likely to be spread over many, if not all, domains in the Internet! Depending on the intra-domain approach to determining whether or not an interest request can be fulfilled (NDN, for instance, uses local broadcast), this could easily amount to a global flooding of status updates across any network that might hold viable information about this social network (which is, in the case of Facebook, a LARGE number!).
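The flooding argument can be made concrete with a toy simulation: when the FIB entry for a location-less scope effectively points to 'any neighbour', the interest propagates by flooding and touches (nearly) every domain, even though only one holds the item. The topology below is made up purely for illustration:

```python
# Toy illustration of the flooding argument: an interest for a
# location-less item spreads by breadth-first flooding across domains.
# The ring topology is invented for illustration only.

from collections import deque

def domains_touched(adjacency: dict, origin: int) -> int:
    """Flood an interest from the requesting domain; count the domains
    the interest visits before the search is exhausted."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        d = queue.popleft()
        for nxt in adjacency[d]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)

# A small ring of 12 domains; only one of them holds the item, yet the
# interest still visits all 12.
N = 12
ring = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}
print(domains_touched(ring, 0))
```

With a separate finding function (the rendezvous in PURSUIT terms), the same request would resolve the publisher's location first and then touch only the domains on the resulting delivery graph.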
A similar problem arises when bringing information mobility into play, i.e., information that is exclusively available at a moving target (e.g., my personal laptop). To remain reachable by potentially interested parties, interest requests need to be forwarded to a large number of ISPs (surely, movement patterns could be used to limit the region of discovery, but that would require not only disclosure of this information but also additional logic in the network – information and logic that one does not want to associate with the fast-path forwarding function).
What is the problem here? Returning to our statement from the beginning, we can conclude that if you don’t know its location, you might end up looking everywhere! What is the lesson learned? It is that, if information is location-less (which is often the case), finding the information needs to be separated from the construction of an appropriate delivery graph in order to optimize the operation of each function. While we appreciate the cases where information has a clearly limited location, e.g., in content distribution where dedicated content networks serve the interests of their customers, we hold to the assumption that (application-level) information in general is location-less, either due to the nature of its information space or due to mobility.
However, if information DOES have a location, merging these functions not only causes no problems but even allows for optimization of the operation. Such cases are more likely, however, at lower levels of implementation. Take as an example the segmentation of a larger information item along a dedicated (byte-limited) forwarding link from one node to the other. Separating finding from delivery is futile here, since the location of the information is obvious (the sending node) from the receiving node’s perspective. Hence, the rendezvous signal of interest can be interpreted as the very send operation from one physical node to the other.
It is this separation of functions that is the powerful notion of the PURSUIT functional model, and we need to do more work to better understand this power.
Check out our short film about PURSUIT and how you could experience some of the PURSUIT features in the future. The focus of this film is on the security (access control) and accountability aspects enabled by the information centrism of PURSUIT.
Enjoy the video!
We have been working on going live with our PURSUIT test bed over the past few months. Our new prototype Blackadder is now deployed in a 25+ node setup across various European sites (all our partners contributed to the setup). And we also welcome MIT’s Advanced Network Architecture group to the pool of PURSUIT users!
In order to emulate a native Ethernet-based deployment, we overlay the test bed on IP through an OpenVPN setup. Each node is implemented in a virtual machine running on dedicated machines at each site. This gives us a growing opportunity for demonstrating and testing new developments within the PURSUIT project. A first demonstration streams high-quality video from a server at Cambridge University to anybody who subscribes in the network – all in native PURSUIT multicast fashion!
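The kind of OpenVPN configuration involved can be sketched as follows; host names, key file and port are placeholders, not the actual test-bed settings. The essential choice is a TAP (layer-2) device, which tunnels whole Ethernet frames rather than IP packets, so Blackadder's Ethernet-based forwarding works unchanged across the overlay:

```
# Sketch of a layer-2 OpenVPN tunnel (placeholders, not the real setup)

# --- server side ---
dev tap0              # TAP device: carries whole Ethernet frames
proto udp
port 1194
secret static.key     # placeholder key file

# --- client side ---
dev tap0
proto udp
remote testbed-gw.example.org 1194   # placeholder gateway name
secret static.key
```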
We are proud to announce the availability of our new PURSUIT prototype under an open source license model. The prototype, called Blackadder, implements a fully DAG-based information-centric service model. It is based on the Click modular router and has been tested under Debian and Ubuntu Linux platforms.
The open source package, available on our new Blackadder webpage, includes the source code as well as the configuration scripts necessary for you to implement your own information-centric applications.
For any feedback, feature requests or comments in general, please post on our Feature Request page or send a direct email to the team.
The test bed originally built within the PSIRP project has been revived for testing and demonstrating PURSUIT technology. Initial installations have been completed and first internal tests performed successfully – a simple video streaming application successfully transferred video content over several native forwarding nodes to a video subscriber.
The test bed tunnels native Ethernet frames via the public Internet, using OpenVPN between the various sites. Ethernet topologies are currently configured through scripts, but a graphical configuration tool is in the works.
We hope to finalize the test bed setup during April, which will allow us to demonstrate the main functions of our prototype. We will keep you updated in the News section.