This is a difficult topic and surprisingly hard to address in a simple discussion thread.
The challenges include (but are not limited to):
1. Syntactic correctness/validity - do the suggested expressions conform to the compositional grammar specification?
2. Model conformance - are the suggested associations allowed when compared with the published concept model (and its machine-readable MRCM counterpart)?
3. Consistency with current modelling orthodoxy - are the suggested expressions consistent with the modelling pattern used in published pre-coordinated content (can equivalence/subsumption be detected)?
4. Retrieval behaviour - will the suggested expressions behave as intended when the data is analysed (whether in their original recording context or after e.g. cross-institution or cross-border transfer)?
5. Broader risks relating to compositional coding / ‘post-coordination’ - notably (if this is what is being done) will the expressions created be comparable with pre-coordinated content at some future date?
Taking each in turn:
Points 1 & 2 are the simplest and ought to be testable by automated validation. That said, I was unable to make the ‘Postcoordination Implementation Demo’ do anything relevant and useful today. Syntactically I can see a missing comma and closing brace in the second example, but in principle the patterns you have used (give-or-take the need for role grouping) are broadly ‘model legal’.
Point 3 (current modelling orthodoxy) requires more unpacking and takes us into (a) the complex world of Disorder Combination Modelling and (b) the related but less documented world of ‘Syndrome’ modelling (in particular how a syndrome is related to its ‘essential features’).
I shan’t attempt to restate the documentation that is linked, but it is fair to say that reproducibility in its interpretation and application is variable, and that the modelling approach has changed over time (and may change again).
Broadly the modelling pattern options are:
a. Causal: Diabetes: {Due to=Genetic disease} (your suggested pattern)
b. Co-occurrence: Diabetes + Genetic disease (how essential features of a syndrome tend to be modelled)
c. Both: Diabetes + Genetic disease: {Due to=Genetic disease} (some mixture of a and b! See the section ‘Determining causation only versus causation and co-occurrence’ in the editorial guide link for more).
The pattern you are considering is the first one - the ‘diabetes’ that eventually manifests is ‘caused by’ the genetic disease.
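For concreteness, the three patterns can be written out as compositional grammar strings. The Python below is only a sketch: the helper names are hypothetical, and I have used 41864002 |Autoimmune polyendocrinopathy| as the ‘cause’ purely because it is a concept already mentioned in this thread, not because I am asserting it is the clinically correct choice for your cases.

```python
DIABETES = "73211009 |Diabetes mellitus|"
DUE_TO = "42752001 |Due to|"
# Illustrative stand-in for the causing/co-occurring genetic condition.
CAUSE = "41864002 |Autoimmune polyendocrinopathy|"


def pattern_a(disorder: str, cause: str) -> str:
    """Causal only: the disorder refined by a role-grouped 'Due to'."""
    return f"{disorder} : {{ {DUE_TO} = {cause} }}"


def pattern_b(disorder: str, cause: str) -> str:
    """Co-occurrence only: both concepts as focus concepts."""
    return f"{disorder} + {cause}"


def pattern_c(disorder: str, cause: str) -> str:
    """Co-occurrence and causation: both focus concepts plus 'Due to'."""
    return f"{disorder} + {cause} : {{ {DUE_TO} = {cause} }}"


for build in (pattern_a, pattern_b, pattern_c):
    print(build(DIABETES, CAUSE))
```

Writing them side by side makes the difference visible: in pattern a the expression is subsumed only by diabetes, whereas in patterns b and c it is also subsumed by the genetic condition itself.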
I don’t know enough about the conditions, but just looking at the examples you offer makes me wonder whether these are best considered ‘causal’ associations or whether the diabetes is a ‘variable/optional/late’ syndromic feature (and may, in some cases, be an essential part of a named syndrome variant/specialisation - as seems to be the case for some of the detailed specialisations of 41864002 |Autoimmune polyendocrinopathy|) and therefore might be better treated more like patterns b or c.
Testing the current data against patterns a, b, and c (where Diabetes mellitus is either an essential syndrome feature or is a consequence of the syndrome) indicates that all three patterns can be found. A graphical representation is shown here (select the ‘Raw’ option on the right for a searchable web page) - pattern a in blue, pattern b in yellow and pattern c in purple. This shows that a small majority are currently modelled as pattern c, with a sizeable number as patterns a or b.
This brings us onto point 4 - retrieval behaviour. The variable patterns used (to represent a mixture of ‘diabetes as part of a genetic syndrome’ and ‘diabetes as a consequence of a genetic syndrome’ - currently predominantly the former) mean that consistent and complete retrieval is not simple! In fact there is always going to be a trade-off between specificity and sensitivity, and it would appear that no one approach will guarantee (without further checks) that you will only return the concepts you wanted.
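A toy illustration of why retrieval is awkward, using made-up in-memory data rather than real SNOMED CT content. The concept names, the `Concept` class and both query functions are hypothetical; the comments show roughly what the equivalent ECL might look like, on the assumption that ancestors and ‘Due to’ targets have already been expanded by a classifier.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    ancestors: set = field(default_factory=set)  # inferred supertypes
    due_to: set = field(default_factory=set)     # 'Due to' targets and their supertypes


# One invented concept per modelling pattern, plus an unrelated diabetes concept.
CONCEPTS = [
    Concept("A", {"diabetes"}, {"genetic disease"}),                     # pattern a
    Concept("B", {"diabetes", "genetic disease"}),                       # pattern b
    Concept("C", {"diabetes", "genetic disease"}, {"genetic disease"}),  # pattern c
    Concept("N", {"diabetes"}),                                          # neither
]


def causal_query(concepts):
    # Roughly: < diabetes : due to = << genetic disease
    return {c.name for c in concepts
            if "diabetes" in c.ancestors and "genetic disease" in c.due_to}


def cooccurrence_query(concepts):
    # Roughly: < diabetes AND < genetic disease
    return {c.name for c in concepts
            if {"diabetes", "genetic disease"} <= c.ancestors}


print(sorted(causal_query(CONCEPTS)))        # finds patterns a and c, misses b
print(sorted(cooccurrence_query(CONCEPTS)))  # finds patterns b and c, misses a
print(sorted(causal_query(CONCEPTS) | cooccurrence_query(CONCEPTS)))
```

Neither query alone is complete. Taking the union improves sensitivity, but in real data it may also pull in concepts you did not intend, so further checks would still be needed.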
Which finally brings us to point 5. As Kai asks, are you creating/modelling new extension content, or are you considering the use of code phrases directly in records? Code composition (often called ‘post-coordination’ but strictly speaking this is the classification stage) is a risky activity. Anyone creating a code phrase is an accidental SNOMED CT modeller and their work needs to be aligned with contemporary modelling practice. What might happen if SNOMED revises its modelling approach to disorder combinations? For pre-coordinated content this is less of a problem - modelling of such content (whether in the International data or in National/local Extensions) can be updated accordingly and (hopefully) retrieval and analysis behaviour will be as intended. However, this revision is harder for compositional expressions - in particular when they only sit in records. At worst they become undetectable to retrieval queries specified against the (revised) modelling of the reference data.
It’s therefore hard to say what is “…the right/allowed approach…”. Whether modelling extension content or (IMHO inadvisably) creating compositional expressions for entry in records, the approach may well depend on prevailing orthodoxy in pre-coordinated modelling as well as a detailed understanding of each causing syndrome and its relationship to the caused diabetes. Pattern a is not ‘wrong’ but is not the commonest approach. Pattern b has an attractive simplicity, retrieval completeness and consistency with the modelling of essential syndrome features, but it may not always be suitable as it omits the additional explicit ‘causation’, in which case pattern c would be more suitable.
Ed