Yaron responded to my Web3S posts on identity in hierarchies and graph serialization support. (Yaron's comment is on the second link).
My original concern was that Web3S duplicates the values of a single resource when serializing a tree, which obscures the identity of resources.
I now realize the issue goes beyond obscured identity: Web3S has no standard way to expose the primary resources in a system.
The properties of a Web3S:ID are:
- unique only within the containing element,
- used to generate URIs, but those URIs are relative to the path of the containing elements
This URI (a non-prefixed, short version from the example in section 7):
http://example.net/stuff/morestuff/articles/article(8383)/authors/author(23455)
doesn't provide a primary resource identifier for the author. Why? This path dependent URI is hardly different than the same URI with a fragment identifier (for a secondary resource):
http://example.net/stuff/morestuff/articles/article(8383)#author23455
From Architecture of the World Wide Web, section 2.6 Fragment Identifiers:
The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. The terms "primary resource" and "secondary resource" are defined in section 3.5 of [URI].
The author in all of these examples should be (able to be) identified as a primary resource, not just a secondary one. Whether authors are primary or secondary resources is really a question of server implementation. The Web3S spec, though, makes that decision by default for all services.
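To make the comparison between the two URIs concrete, here is a minimal Python sketch. The helper name `containing_resource` and its rule for stripping the trailing `/authors/author(...)` segments are my own assumptions for illustration; they are not part of the Web3S spec. It only shows that both forms identify the author by way of the same article.

```python
from urllib.parse import urlsplit, urldefrag

# The two URIs from the text above.
PATH_URI = "http://example.net/stuff/morestuff/articles/article(8383)/authors/author(23455)"
FRAG_URI = "http://example.net/stuff/morestuff/articles/article(8383)#author23455"


def containing_resource(uri: str) -> str:
    """Return the URI of the resource that the identifier depends on.

    Hypothetical helper: for a fragment URI this is the primary resource;
    for the path-dependent form it drops the last two path segments
    (collection name and element), which is an assumed convention.
    """
    base, fragment = urldefrag(uri)
    if fragment:
        return base
    parts = urlsplit(uri)
    container_path = parts.path.rsplit("/", 2)[0]
    return parts._replace(path=container_path).geturl()


# Both identifiers only make sense relative to the same article.
same_container = containing_resource(PATH_URI) == containing_resource(FRAG_URI)
```

In both cases the author cannot be named without first naming the article, which is the sense in which the path-dependent URI behaves like a secondary resource identifier.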
Here is the rest of Yaron's comment, with my additional thinking added in.
Yaron writes: Web3S’s infoset is just the core set of primitive data containers, they don’t define the data models that sit on top. So, for example, one could easily define a graph based data model on top of the tree based infoset that used links to say things like “These two things are the same”. In fact section 10.1 and 10.2 of the Web3S spec define a standard HREF style element for exactly this reason.
True enough, it could be done. However, the Web3S infoset does explicitly define a notion of identity, Web3S:ID, and that notion doesn't provide a way to express those things.
The definition you mentioned of HREF elements makes no statement about what resource is identified, so no client can assume anything about it.
Yaron writes: By Value – If the canonical author entry and the references to author all exist within the same ‘system’ (I’m being vague intentionally but think of examples like a single DB) then likely would one just use by value. The author values would show up where needed and changing one author value would change the other. Astoria does this today but they add the additional guarantee that if two instances of a particular element (e.g. author) are in fact the same underlying object then the ID will be the same. That is completely legal in Web3S. Web3S just says that the caller can only assume that IDs are locally unique. But the server is free to offer a higher guarantee if it wants and then advertise that fact. Heck a server could choose to give every element instance a GUID/UUID and so guarantee global uniqueness.
That is all based on optional elements and out-of-band (published schema) communication, which means:
- I can't write a general purpose Web3S tool that takes advantage of that, and
- Tools that do take advantage of that are more tightly coupled to that one service.
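A small Python sketch of the problem, from the point of view of a general-purpose client. The `Element` type and the `ids_are_global` flag are hypothetical names of mine; the flag stands in for whatever a client learned out of band from a particular service's published schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    container: str  # URI of the containing element
    local_id: str   # Web3S:ID value, unique only within the container


def same_resource(a: Element, b: Element, ids_are_global: bool = False) -> Optional[bool]:
    """Decide whether two elements are the same underlying resource.

    Returns None when the client cannot know, which is the default case
    when IDs are only guaranteed to be locally unique.
    """
    if a.container == b.container:
        # Within one container the ID is unique, so it settles the question.
        return a.local_id == b.local_id
    if ids_are_global:
        # Only valid for a service that advertises a stronger guarantee.
        return a.local_id == b.local_id
    return None


author_in_article_1 = Element("http://example.net/articles/article(8383)/authors", "23455")
author_in_article_2 = Element("http://example.net/articles/article(9001)/authors", "23455")

generic_answer = same_resource(author_in_article_1, author_in_article_2)
coupled_answer = same_resource(author_in_article_1, author_in_article_2, ids_are_global=True)
```

The generic client gets no answer; only a client that hard-codes knowledge about this one service can conclude the two authors are the same.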
Yaron writes: Also, for whatever it’s worth, Astoria supports both hard and soft linking. Our current thinking about this in Web3S is that we would allow servers to advertise schemas that define object relationships, explain ID guarantees, specify hard versus soft linking, etc. We will also probably provide mechanisms to allow servers to annotate data with this information directly rather than requiring schema look up but given the bandwidth expense I’m not sure how often we would use this.
I hope you would use it all the time; otherwise you will be promoting a code generation solution with published schemas. See my own blog posts, as well as the recent storm of discourse on WADL and REST.
I'm not exactly sure what hard and soft linking refer to here, can you expand? I don't have experience with Astoria yet, and I'm thinking inode filesystem hard links...
Yaron writes: By Reference – Alternatively there would be a single canonical author entry and anyone who wanted to refer to that entry would just use a URL ala section 10.1.
This should be the default for how Web3S works: the identifier can be a primary or secondary resource identifier in all cases. It should be absolutely server-dependent whether authors can be independently identifiable, but providing canonical and absolute identity for any EII should not require client-server coupling.
I'll think more about how to achieve this, but a first random idea is to allow the HREF element inside (and in place of) the Web3S:ID element.
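As a rough Python sketch of that idea, and nothing more: the `Identity` type, its field names, and the `/authors/author(99)` URL are all hypothetical, and the path-building rule simply imitates the shape of the example URI earlier in this post rather than any rule from the spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identity:
    """Either a locally unique ID or an absolute HREF (assumed model)."""
    local_id: Optional[str] = None
    href: Optional[str] = None


def canonical_uri(container_uri: str, element_name: str, ident: Identity) -> str:
    """Return the URI a client should treat as the element's identifier."""
    if ident.href is not None:
        # The server chose to expose the element as a primary resource.
        return ident.href
    if ident.local_id is None:
        raise ValueError("identity needs a local_id or an href")
    # Fall back to the path-dependent (secondary) form.
    return f"{container_uri}/{element_name}({ident.local_id})"


shared = Identity(href="http://example.net/authors/author(99)")
first = canonical_uri("http://example.net/articles/article(1)/authors", "author", shared)
second = canonical_uri("http://example.net/articles/article(2)/authors", "author", shared)
local = canonical_uri("http://example.net/articles/article(1)/authors", "author", Identity(local_id="23455"))
```

With this shape a generic client can tell that `first` and `second` name the same author without consulting a schema, while the server remains free to hand out only local IDs.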
Yaron writes: In either case you can get there from here.
Yes, you can, but the default, standard, and primary means of identification should directly support a canonical identifier system without resorting to optional elements and shared schemas.