NOTE: The wcsplus mailing list is no longer active. The list archives are made available for historical reasons.
Dear all,

(I'm replying to Stefano's email under John's new thread title since this makes more sense! I've copied Stefano's email below my message.)

Stefano - this is a really interesting email, thanks. I must admit I'm not sure I've fully grasped all the implications of what you say but perhaps you can help me understand a couple of points:

1) I agree that the REST approach gives a neater and more logical URI syntax, which can also apply to WMS, WFS etc. It seems logical to have a URI that represents a coverage (or Layer in a WMS), and to use this URI as the base for data and metadata queries. However I'm afraid I'm still struggling to see the gains beyond this:

2) The web is highly scalable (largely) because it is document-oriented and these documents (being generally "not very big") can easily be cached in various places (in your browser, in proxy servers, at ISPs etc) so that the load on the primary servers is reduced. However, dynamic content is usually not cached (caching would destroy the dynamic nature) so web servers that serve dynamic content don't benefit too much from this scalability. Serving of "large" files is usually made scalable, not through the caching system, but through the creation of mirror sites (i.e. an application outside the architecture of the Web).

3) Web servers are easy to implement because, for static content, there is a very simple mapping between a filesystem on disk and a URI hierarchy. Servers that serve dynamic content are harder to implement.

Given that OWS servers are highly dynamic and will often deal with large datasets, I can't yet see how a REST approach brings benefits to efficiency or ease of implementation. The server still has to do basically the same tasks of query parsing, data extraction and data formatting.

To take the example of caching - what do we cache? Entire coverages (e.g. http://someserver.net/coverages/foo)? I don't think this is feasible in the general case. In any case you will need most or all of your business logic to exist on all cache servers in the system to support subsetting.

It is of course highly desirable to make OWS servers more efficient and easier to implement. However, I'm not sure the analogy with the Web is valid. I still think that OWS are query-oriented rather than resource-oriented: a general WCS server will have "a few" resources (coverages) but will need to be able to serve an infinity of possible subsetting queries. I think a better analogy is with data-driven dynamic websites, which can only scale up to a large number of simultaneous users through "clustering" the back-end, which of course adds to the complexity.

Perhaps I'm missing your point though. It is Monday morning after all! ;-)

Jon

On 11/2/07, Stefano Nativi <nativi@xxxxxxxxxxx> wrote:
Dear all,

I really appreciate this discussion, which touches several of the issues we have been discussing and facing in our research and development activity. We have been developing OWS on SOAP; recently, we decided to play with some REST implementations (especially for asynchronous interactions). Therefore, I'd like to add some comments stemming from our understanding of REST and experience with it. Please forgive the long content of this email; actually I put together Paolo's and my comments :-) .

Let me distinguish between the REST approach (the architectural style) and the RESTful implementation (the current technological solutions for implementing REST).

The REST approach has proved to be highly scalable and sufficiently flexible in many contexts, primarily the WEB infrastructure but also DB and filesystem access. In all these cases we have resources individually addressed through a uniform interface. Indeed the possible REST actions are limited by the uniform interface, which typically maps to the simple CRUD (create, retrieve, update and delete) paradigm. Often simplicity means generality and flexibility (see the netCDF data model case); in fact, this simplicity was one of the reasons for the WEB's pervasive success and for its scalability. On the other hand, advanced semantic actions (e.g. resource processing actions) must be mapped to the basic CRUD vocabulary.

For example, in the DB domain we can use SQL: a DB is the resource domain; the uniform interface is made of the SELECT/INSERT/CREATE/UPDATE/DELETE methods; resource-IDs are all the possible SQL "WHERE" clauses. For the WEB (which may be seen as a globally distributed DB), resource-IDs are the WEB URIs (i.e. the "WEB clauses"). In both cases the resource-ID may become really complex (i.e. very long KVP strings, or complex SQL JOIN SELECTs) and, hence, it may be difficult to manage these IDs efficiently.
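As a rough illustration of why long KVP strings are awkward as resource-IDs: two requests that differ only in parameter order or in the case of parameter names point at the same resource, yet look like different IDs to a cache or an index. The sketch below is only one possible way to reduce such strings to a single canonical form. The function name canonical_resource_id is hypothetical, the URLs are illustrative (the parameters elided in the thread's example are simply left out), and it assumes that parameter names are case-insensitive and that parameter order carries no meaning.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_resource_id(url: str) -> str:
    """Return one canonical spelling for a KVP request URL.

    Assumptions (not taken from the thread): parameter names are
    case-insensitive and parameter order is irrelevant. Values are
    left exactly as given.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Lower-case the parameter names, then sort so that order no longer matters.
    normalised = sorted((key.lower(), value) for key, value in pairs)
    # Keep commas readable, since bbox-style values are comma-separated lists.
    query = urlencode(normalised, safe=",")
    # Host names are case-insensitive, so lower-case the authority as well.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, query, ""))


if __name__ == "__main__":
    a = canonical_resource_id("http://someserver.net/wcs?name=foo&BBOX=-180,-90,180,90")
    b = canonical_resource_id("http://SomeServer.net/wcs?bbox=-180,-90,180,90&NAME=foo")
    print(a)
    print(a == b)
```

Both spellings collapse to the same string, which could then serve as a cache or lookup key. Normalising the values themselves (number formatting, default parameters and so on) is a harder problem and is deliberately left out here.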
For a REST WCS implementation (at the abstract level; no implementation details), resource-IDs are the GetCoverage clauses (analogous to the "SELECT" request content).

In our opinion, this is the real asset/limitation of REST: the application business logic must be faced and partially addressed at the interaction level (the protocol level), leaving the rest of the business logic to the server which, consequently, may turn out simpler (almost any Institution can manage a WEB server, today). With the Service-oriented approach, the entire application business logic is left to the server (i.e. the service provider), implementing an even simpler interaction: Exchange/Send an Electronic Document. Thus, SOA guarantees high flexibility, but the server (the service provider) has to face all the resource-related issues (e.g. resource caching, ID, creation, encoding, etc.) anyway.

Thus the focus of REST is on the uniform interface and on resource addressing, not on the nature of the resources (discrete, existing, etc.). If we can provide a uniform interface and complete resource addressing, we can adopt a REST architecture. In our opinion WCS seems to be implicitly based on a uniform interface (since we GET coverages, GET coverage descriptions and GET server capabilities, and we do not explicitly define other actions like INTERPOLATE, SUBSET, etc.), which allows each resource to be addressed. Hence, a REST architecture seems an effective choice for this domain.

As to a RESTful implementation for Geospatial resources, several issues must be considered. First of all we should define what "resource" and "resource representation" are in this domain. We could decide that a dataset is the resource and all the features extracted from the dataset through interpolation, subsetting and resampling are simply different representations. In such a case we should only address the dataset with a known URI and possibly create new resources if required.
On the other hand we could consider each feature extracted from a dataset as a different resource. In such a case we should address each feature with a different URI. Presently, we are working on this second approach for two reasons: for theoretical consistency (according to the Web architecture a representation should only affect formats), and for implementation reasons (different URIs could support server-side caching).

Concerning the addressing problem, we do not need to explicitly define URIs for each possible feature. We can simply provide a functional mapping between a URI-space and resource representations. In the OWS the URL-encoding of the KVP string in a GET request IS the resource addressing. The fact that the feature is dynamically created is not an architectural problem but an implementation issue which might require smart caching servers. For example:

http://someserver.net/wcs?name=foo&bbox=-180,-90,180,90&...

is the URI for the feature extracted from the coverage named "foo" with the interpolation, subsetting and resampling defined by the bbox (and other) parameters. (A better URI could be defined leaving only non-hierarchical parameters in the query part of the URI. Something like:

http://someserver.net/coverages/foo?bbox=-180,-90,180,90&... )

When the request is encoded in a POST it should be considered as a query to the root resource, which responds with the representation of the target resource. This could also be viewed as an extraction-from-dataset service; however, this may introduce unnecessary complexity since the request is still a GET action. In fact, there exists an implicit hierarchy of our features, and the root feature (the "foo" coverage in our example) doesn't support only its own GET operation, but also the selection of its children via a POST operation.

These considerations seem to be valid not only for WCS but for all the data access services (e.g. WCS, WFS and WMS).
They conform to a resource-oriented approach and can be implemented in a RESTful architecture with "minimal" modifications of existing specifications. Besides, the RESTful implementation might be easily adopted by data providers, since it should be based on well-known technologies.

The case of WPS and WCTS seems to be different. In fact, they don't define a uniform interface for the many operations they should support; on the contrary, they introduce a uniform interface to receive a message which contains specific operation requests. In this case we should use the POST method as the extension point for interaction with HTTP-based services which create new addressable resources (a sort of endpoint in the SOA view). In such a way we should have the advantages of pervasive and scalable data provision (through the RESTful implementation) and modular and composable processing (through the service-oriented architecture).

Some possible conclusions:

- A RESTful implementation is valuable for scalability and extensibility (derived from the REST architectural style) as well as for simplicity (the implementation is simple since it is based on well-known technology and only simple operations must be supported server-side).
- The RESTful implementation seems feasible for data access services because they are typically resource-based.
- The RESTful architecture must interact with a Service-oriented architecture for basic and advanced processing.
- XML and HTTP are the key technologies for bridging.

Thank you for your patience,

Stefano and Paolo
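To make the "functional mapping between a URI-space and resource representations" from the message above a little more concrete, here is a minimal sketch of how a KVP-style request URI could be rewritten into the hierarchical form suggested there, with the coverage name moved into the path and only the non-hierarchical parameters left in the query. The function name to_hierarchical_uri and the "coverages" path segment are assumptions chosen to match the example URIs in the thread; nothing here is taken from a WCS specification, and the parameters elided in the original example are simply left out.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def to_hierarchical_uri(kvp_url: str, collection: str = "coverages") -> str:
    """Move the 'name' parameter of a KVP request into the URI path.

    Hypothetical helper: 'name' and the 'coverages' path segment follow
    the example URIs in the thread, not any specification.
    """
    parts = urlsplit(kvp_url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    names = [value for key, value in pairs if key.lower() == "name"]
    if len(names) != 1:
        raise ValueError("expected exactly one 'name' parameter")
    # Everything except the coverage name stays in the query part.
    rest = [(key, value) for key, value in pairs if key.lower() != "name"]
    path = f"/{collection}/{quote(names[0], safe='')}"
    # Keep commas readable, since bbox-style values are comma-separated lists.
    query = urlencode(rest, safe=",")
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


if __name__ == "__main__":
    print(to_hierarchical_uri("http://someserver.net/wcs?name=foo&bbox=-180,-90,180,90"))
```

The mapping is purely syntactic: the server still has to compute the subset behind each URI. What it buys is that the coverage itself gets a stable URI that the subsetting queries hang from.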
--
--------------------------------------------------------------
Dr Jon Blower              Tel: +44 118 378 5213 (direct line)
Technical Director         Tel: +44 118 378 8741 (ESSC)
Reading e-Science Centre   Fax: +44 118 378 6413
ESSC                       Email: jdb@xxxxxxxxxxxxxxxxxxxx
University of Reading
3 Earley Gate
Reading RG6 6AL, UK
--------------------------------------------------------------