# README_Shrub

DISCLAIMER: Files in this directory may be added, deleted, or changed at any time, especially in the short term (Winter 2020).

What is it?

An experimental format for representing graphs with edges as first-class citizens. In this case the graph is the Monarch RDF straight out of Dipper, before any reasoning, clique-merging, inference, or other ontology-based enhancements are done.

Transformations from Dipper RDF output will be listed here as they are implemented.

1) De-reification of Monarch's OBAN associations; name-spaced as 'MONARCH-EDGE'

The basic premise is that each statement gets its own name-spaced identifier.

    graph  edge  subject  predicate  object

This implementation partitions each component into a name-space and a local-identifier, giving ten tab-separated items. URIs are represented compactly wherever possible; this facilitates non-native/non-canonical downstream renderings.

Here, in this prototype example:

- graph name-space: the name of the ingest script which produced the RDF
- graph local-id: the datestamp of when the ingest script ran (which RELEASE)
- edge name-space: the made-up name-space 'EDGE' or a prefixed variant ending in '-EDGE', e.g. 'MONARCH-EDGE' for de-reified Monarch OBAN associations
- edge local-id: the Dipper RDF OBAN node id, or a digest of the (RDF) subject, predicate, and object that form the statement
- subject name-space: the curie-prefix for the URI
- subject local-id: the local identifier for the URI
- predicate name-space: the curie-prefix for the URI of the ontological term
- predicate local-id: the ontological term for the URI
- object name-space: the curie-prefix for the URI
- object local-id: the local identifier for the URI

OR

- object name-space: the made-up name-space 'LITERAL'
- object local-id: the literal

--------------------------------------------------------------------------------

This does introduce a few nigh-constants which contribute a bit of "bloat":

- the graph-id "RELEASE"
- the edge name-space 'EDGE'
- the object name-space 'LITERAL'

but in time, or with a more evolved graph, these name-spaces could vary more than they do here.

--------------------------------------------------------------------------------

For this (2020 Nov) release I took a first pass at clique merging or, in RDF-land, creating a "smushed" graph, which collapses various flavors of "equivalent" items into a representative proxy that accumulates all the properties of its members.

It is not production ready, nor does it even pretend to represent the complete graph. For example, singleton entities (those with no equivalences) are not in the smushed graph unless they happen to be directly related to a clique member.

Another important distinction from the SciGraph approach is that no effort is made to select a "clique-leader" from the pool of members, as the most appropriate clique member (or members) will vary depending on the needs of whoever is asking.

This has some positive ramifications:

- A property isn't directly associated with an entity it was never reported to have.
- The smushed graph covers its members' search space with perhaps a third as many statements.

A mapping file from each clique-member to the clique it is in is included as "clique_list_202011.tab.gz", with three columns:

- entity-namespace (curie-prefix)
- local-ID
- clique-identifier
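
As a rough illustration of how the ten-column edge rows and the three-column clique mapping described above might be read, here is a minimal Python sketch. It is not part of the release tooling. It assumes the files have no header row and that literals contain no embedded tabs, neither of which this README confirms. The names `Term`, `ShrubEdge`, `parse_edge`, `load_clique_map`, and every value in `SAMPLE_ROW` are invented placeholders for illustration.

```python
from typing import Dict, Iterable, NamedTuple


class Term(NamedTuple):
    """A (name-space, local-id) pair, as used for every component."""
    namespace: str
    local_id: str


class ShrubEdge(NamedTuple):
    """One statement: five components, each a name-space/local-id pair."""
    graph: Term
    edge: Term
    subject: Term
    predicate: Term
    obj: Term

    @property
    def object_is_literal(self) -> bool:
        # The README reserves the made-up name-space 'LITERAL' for literals.
        return self.obj.namespace == "LITERAL"


def parse_edge(line: str) -> ShrubEdge:
    """Split one tab-separated row into its five components."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError(f"expected 10 tab-separated fields, got {len(fields)}")
    # Consecutive pairs of fields form (name-space, local-id) terms.
    terms = [Term(fields[i], fields[i + 1]) for i in range(0, 10, 2)]
    return ShrubEdge(*terms)


def load_clique_map(lines: Iterable[str]) -> Dict[Term, str]:
    """Map each clique member (name-space, local-id) to its clique identifier."""
    clique_of: Dict[Term, str] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip blank or malformed rows
        namespace, local_id, clique_id = fields
        clique_of[Term(namespace, local_id)] = clique_id
    return clique_of


# Placeholder row; none of these values come from a real release.
SAMPLE_ROW = "\t".join([
    "some_ingest", "202011",      # graph name-space, graph local-id
    "MONARCH-EDGE", "abc123",     # edge name-space, edge local-id
    "EX", "0001",                 # subject curie-prefix, local-id
    "EXPRED", "0002",             # predicate curie-prefix, local-id
    "LITERAL", "some text",       # object: a literal value
])

# Reading the real mapping file would look roughly like:
#   import gzip
#   with gzip.open("clique_list_202011.tab.gz", "rt") as handle:
#       clique_of = load_clique_map(handle)
```

With a mapping loaded, looking up `clique_of.get(edge.subject)` would give the clique a subject belongs to, or `None` for entities that are not clique members.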