Wikipedia:Reference desk/Archives/Computing/2023 December 20
December 20
Question about Format of Keywords on Cochrane Medical Documents Web Site
This is a very specific question, but I've tried several forums (including the help desk of the site) and haven't been able to get an answer to what seems to me a very simple question. This really should go to people with a background in library science and medical research, but I don't see a high-level category for that. I'm helping a medical expert build a tool for semantic search of certain kinds of medical documents. The idea is to enable search that is much more specific than simple keywords by bringing together various AI technologies. One of the sites we are using is the Cochrane Library (https://www.cochranelibrary.com/). They have a great UI for running a search and downloading CSV files with metadata and links to the articles, which gives us a great starting point for our corpus. Here's the question: the keyword field on that site has some kind of simple syntax, probably a very simple kind of regular expression. Here's an example:
*Pyroglyphidae [drug effects]; Acaricides; Adolescent; Adult; Animals; Bedding and Linens; Child; Child, Preschool; Dust [*prevention & control]; Eczema [etiology, *prevention & control]; Elbow; Environmental Exposure [*prevention & control]; Household Work [*methods]; Humans; Knee; Middle Aged; Randomized Controlled Trials as Topic; Respiratory Hypersensitivity [etiology, prevention & control]; Young Adult
All I want to know is the meaning of the asterisks (I'm sure some kind of wildcard) and the brackets. My background is computer science, not healthcare, but I've started to study various medical coding systems. My best guess is that these are terms from the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary (https://meshb-prev.nlm.nih.gov/treeView). I think the semicolon is a delimiter, the words in brackets are what MeSH calls qualifier terms, and the words outside the brackets are what MeSH calls Main Heading (Descriptor) terms. The asterisk perhaps means that there are multiple terms that end with that word and that all of them should match. But that is just a guess and could be completely wrong. I sent in a help request, but their answer was to point me to documentation that would take me hours to read through. At this point I understand all I need to about that site, but I want to make the most of those keywords. I know this is really niche, so I would really appreciate any help. --MadScientistX11 (talk) 20:19, 20 December 2023 (UTC)
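For anyone wanting to work with that field programmatically, here is a minimal Python sketch of the structure guessed at above: entries separated by semicolons, an optional bracketed list of comma-separated qualifiers, and an optional leading asterisk on either part. The names `Keyword`, `Qualifier` and `parse_keywords` are made up for illustration and are not part of any Cochrane or MeSH tooling. The asterisk is only recorded as a flag, not interpreted. In MEDLINE-style MeSH indexing a leading asterisk conventionally marks a major topic, but whether the Cochrane export follows that convention is an assumption that would need checking against their documentation.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Qualifier:
    name: str
    starred: bool  # True if the qualifier was written with a leading "*"


@dataclass
class Keyword:
    heading: str
    starred: bool  # True if the heading was written with a leading "*"
    qualifiers: list = field(default_factory=list)


# One entry: a heading, optionally followed by "[qualifier, qualifier, ...]".
_ENTRY = re.compile(r"^(?P<head>[^\[\]]+?)\s*(?:\[(?P<quals>[^\[\]]*)\])?$")


def _strip_star(text):
    """Return (text without a leading asterisk, whether one was present)."""
    text = text.strip()
    if text.startswith("*"):
        return text[1:].strip(), True
    return text, False


def parse_keywords(raw):
    """Split a keyword string into Keyword objects.

    Assumes semicolons separate entries and commas separate qualifiers
    inside brackets; commas outside brackets (e.g. "Child, Preschool")
    are kept as part of the heading.
    """
    keywords = []
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        match = _ENTRY.match(entry)
        if match is None:
            raise ValueError(f"unrecognised keyword entry: {entry!r}")
        heading, starred = _strip_star(match.group("head"))
        qualifiers = []
        for part in (match.group("quals") or "").split(","):
            name, q_starred = _strip_star(part)
            if name:
                qualifiers.append(Qualifier(name, q_starred))
        keywords.append(Keyword(heading, starred, qualifiers))
    return keywords


if __name__ == "__main__":
    sample = (
        "*Pyroglyphidae [drug effects]; Child, Preschool; "
        "Eczema [etiology, *prevention & control]"
    )
    for keyword in parse_keywords(sample):
        print(keyword)
```

Keeping the asterisk as a separate boolean, rather than dropping it or treating it as a pattern, means the data survives intact whichever interpretation turns out to be right.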
- Yes, it is a wildcard. Look at the search help. It says: "Use * to search for zero or more characters, and ? for zero or one characters". Graeme Bartlett (talk) 08:59, 21 December 2023 (UTC)
- The Advanced Search help page of Cochrane Library does indeed explain the asterisk as a wildcard operator. That same page gives MeSH searches as having the form "[mh ...]", with any qualifiers preceded by a slash ("/"). Is the extensive documentation not searchable? A search for "square brackets" may lead you to the answer. --Lambiam 09:51, 21 December 2023 (UTC)
- When I enter the example above into the Advanced Search box, I get this message:
- Error: this line contains one or more special characters that are not supported. See Help for more information.
- --Lambiam 09:59, 21 December 2023 (UTC)
- @Lambiam Excellent! Thanks a lot. The documentation is searchable but I wasn't finding what I needed. That really helps. I appreciate it. --MadScientistX11 (talk) 13:23, 21 December 2023 (UTC)
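As a footnote to the wildcard rule quoted from the search help ("* to search for zero or more characters, and ? for zero or one characters"), here is a small Python sketch of how those two operators could be translated into a regular expression. The function names `wildcard_to_regex` and `matches` are hypothetical, and this only models the quoted sentence. It makes no claim about how the Cochrane search engine is actually implemented (case handling, stemming and phrase rules are all left out).

```python
import re


def wildcard_to_regex(pattern):
    """Translate a search-box wildcard pattern into a compiled regex.

    "*" becomes ".*" (zero or more characters) and "?" becomes ".?"
    (zero or one character), following the quoted help text.
    Every other character is escaped so it matches literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".?")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches(pattern, word):
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


if __name__ == "__main__":
    for word in ("allergy", "allergic", "allerg", "eczema"):
        print(word, matches("allerg*", word))
```

Note that "?" here differs from the shell-glob meaning (exactly one character): under the quoted rule, a pattern such as "wom?n" would also match "womn".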