I researched alcoholic myopathy the other day. The results were appalling. There’s been very little research in the past 25-30 years, fundamental questions remain unanswered, and papers keep citing the same old prevalence numbers.
It reminded me again of the big gaps in our medical knowledge base. Some topics get a lot of research; others seem to be forgotten. So I wondered about a way to characterize the gaps.
I think this would work, and it would be a nice paper for someone:
- From Medicare data, get a list of the 1,500 most frequently used ICD-9 codes (the US still uses ICD-9).
- Map any “other” codes to their parent level in the hierarchy.
- For this set, obtain the ICD long names (optionally, use the better-structured SNOMED names via ICD-to-SNOMED maps).
- For each name, run a PubMed query using the NLM’s API.
- For each result set, count how many are reviews, randomized controlled trials, etc. (standard NLM metadata).
- Analyze the results for outliers with few reviews, trials, etc. I’d expect 50-100 of the conditions to turn out neglected.
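
The last three steps can be sketched in a few lines of Python. This is only a rough illustration, not a tested pipeline: it assumes the NCBI E-utilities `esearch` endpoint with JSON output, searches each name as a quoted phrase (long ICD names may need looser matching), and uses a made-up cutoff of "fewer than 10% of the median review count" to call a condition neglected. The function names (`build_term`, `pubmed_count`, `count_by_type`, `flag_neglected`) are mine, purely for illustration.

```python
import json
import statistics
import urllib.parse
import urllib.request

# NCBI E-utilities search endpoint (returns a hit count for a query).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Label -> PubMed publication type filter (None means no filter).
PUBLICATION_TYPES = {
    "all": None,
    "reviews": "Review",
    "rcts": "Randomized Controlled Trial",
}


def build_term(condition, publication_type=None):
    """Build a PubMed query: the condition as a phrase, optionally filtered by type."""
    term = f'"{condition}"'
    if publication_type:
        term += f' AND "{publication_type}"[Publication Type]'
    return term


def pubmed_count(term):
    """Ask esearch how many PubMed records match the term (needs network access)."""
    params = urllib.parse.urlencode({"db": "pubmed", "term": term, "retmode": "json"})
    with urllib.request.urlopen(f"{ESEARCH_URL}?{params}", timeout=30) as response:
        payload = json.load(response)
    return int(payload["esearchresult"]["count"])


def count_by_type(condition, fetch=pubmed_count):
    """Return hit counts per publication type; `fetch` can be swapped for testing."""
    return {
        label: fetch(build_term(condition, ptype))
        for label, ptype in PUBLICATION_TYPES.items()
    }


def flag_neglected(counts_by_condition, key="reviews", fraction=0.1):
    """Flag conditions whose count for `key` is below `fraction` of the median.

    The 10% default is an arbitrary assumption, not an established threshold.
    """
    median = statistics.median(c[key] for c in counts_by_condition.values())
    return sorted(
        name for name, c in counts_by_condition.items() if c[key] < fraction * median
    )
```

To try it, call `count_by_type("alcoholic myopathy")` for each name on the list, collect the results in a dict keyed by name, and pass that dict to `flag_neglected`. With 1,500 names and three queries each, the calls need to be throttled: NCBI asks for no more than three requests per second without an API key.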