Wikipedia:WikiProject Microformats/Species

The draft species microformat has been applied to the {{Taxobox}} and other templates.

There are currently 471,218 Wikipedia articles emitting the microformat. It is also used by other organisations, including the BBC; and is recognised by:

Wikipedia editors are invited to comment on or contribute to the development of this microformat, on this page's talk page.

Purpose

"Species" will allow the disambiguation of common names and the lookup of species in databases, recording software, etc., as well as lookup in the other direction, from such sites or software to Wikipedia.

Imagine viewing a web page with a reference to a species (or genus or other rank; or cultivar, breed or hybrid, etc.) - and being able to use a browser add-on to be taken directly to information about that species on, say, Wikispecies, Google Images, or another site of your choosing, such as an academic database.

Your software would automatically know to search site A if the scientific name referred to a moth, site B for a bird, and site C for a plant - and you could set your preferences as to which sites those were to be, and in which order two or more were to be searched (e.g. for moths, try UK Moths (http://ukmoths.org.uk/) first, if not found try The Global Lepidoptera Names Index (http://www.nhm.ac.uk/research-curation/projects/lepindex/index.html)).
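The preference-ordered lookup described above can be sketched in a few lines of Python. This is only an illustration: the `make_site` helper, the `example` URLs and the sets of known names are all hypothetical stand-ins for real sites, and nothing here is defined by the draft microformat itself.

```python
from typing import Callable, Optional

# A lookup takes a scientific name and returns a URL, or None if the
# site has no entry for it.
Lookup = Callable[[str], Optional[str]]


def make_site(base_url: str, known: set[str]) -> Lookup:
    """Build a stand-in for a site that knows a fixed set of names (hypothetical)."""
    def lookup(name: str) -> Optional[str]:
        if name in known:
            return base_url + name.replace(" ", "_")
        return None
    return lookup


# Made-up sites and contents, for illustration only.
site_a = make_site("https://moths-a.example/", {"Biston betularia"})
site_b = make_site("https://moths-b.example/",
                   {"Biston betularia", "Tyria jacobaeae"})

# User preferences: for each group of organisms, the sites to try, in order.
PREFERENCES: dict[str, list[Lookup]] = {
    "moth": [site_a, site_b],
}


def resolve(name: str, group: str) -> Optional[str]:
    """Return the first URL found by trying the preferred sites in order."""
    for lookup in PREFERENCES.get(group, []):
        url = lookup(name)
        if url is not None:
            return url
    return None
```

Here `resolve("Tyria jacobaeae", "moth")` falls through to the second site because the first does not list that name, while a group with no configured sites simply yields `None`.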

Or suppose someone writes a long article about all the birds, insects, mammals and plants of a particular place, listed alphabetically by common name, but you want to extract a list of species sorted alphabetically within taxonomic class (birds first, then insects, then...) or in taxonomic order. Software that understands the microformat could produce that list for you.
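Once the names have been extracted, the re-sorting itself is straightforward. The sketch below assumes the extraction step has already produced one record per organism with `vernacular`, `binomial` and `class` keys; those key names, the sample records and the `CLASS_ORDER` list are illustrative assumptions, not part of the microformat.

```python
# Assumed display order for taxonomic classes (birds first, then insects, ...).
CLASS_ORDER = ["Aves", "Insecta", "Mammalia", "Magnoliopsida"]

# Sample records, listed alphabetically by common name as in the source article.
records = [
    {"vernacular": "Badger", "binomial": "Meles meles", "class": "Mammalia"},
    {"vernacular": "Blackbird", "binomial": "Turdus merula", "class": "Aves"},
    {"vernacular": "Cinnabar", "binomial": "Tyria jacobaeae", "class": "Insecta"},
    {"vernacular": "Daisy", "binomial": "Bellis perennis", "class": "Magnoliopsida"},
    {"vernacular": "Magpie", "binomial": "Pica pica", "class": "Aves"},
]


def sort_by_class(recs):
    """Sort by the preferred class order, then alphabetically by scientific name."""
    rank = {name: i for i, name in enumerate(CLASS_ORDER)}
    # Classes missing from CLASS_ORDER sort after all the known ones.
    return sorted(
        recs,
        key=lambda r: (rank.get(r["class"], len(rank)), r["binomial"]),
    )


ordered = [r["binomial"] for r in sort_by_class(records)]
```

Sorting into full taxonomic order would need an external reference sequence for each group, which this sketch does not attempt.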

Your software, or a search engine, would be able to differentiate between a page discussing HMS Beagle, the ship, and a page discussing a Beagle dog.

Another benefit is that user agents could be instructed to treat text marked up in this way as not being in the base language of the document or element in which it occurs: it should be pronounced as Latin; it should not be translated (e.g. where a component word happens also to be a valid word in that language, such as the genus Colon, Circus cyaneus, Hesperia comma, or anything with major or minor on an English-language page); and it should not be spell-checked, or should be spell-checked with a specialised dictionary (a need identified in this 2003 ietf-languages discussion of language values for taxonomic names (http://www.alvestrand.no/pipermail/ietf-languages/2003-February/000590.html)).

A further benefit of the species microformat is that it would enrich species checklists, which are common on the web. Broadly speaking, a species checklist is a list of taxa, usually for a particular group of similar organisms such as birds or vascular plants, found within a particular geographical area (a country, a region (http://www.westmidlandbirdclub.com/records/lists.htm), a county, or a specific site, large or small). A typical example is the Checklist of Beetles of the British Isles (http://www.coleopterist.org.uk/checklist.htm). A Google search for "species checklist" (http://www.google.com/search?q=species+checklist) will reveal many other examples. Species checklists are presented in a broadly consistent manner, but computers usually cannot parse and use them because there is no common standard for marking them up in HTML. The species microformat would provide that common standard. A fully microformat-enabled checklist could be parsed by computers, giving developers a means to aggregate and otherwise reuse this invaluable content beyond the current, rather limited, use of simple online viewing.
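As a rough illustration of what parsing a marked-up checklist might involve, the sketch below uses Python's standard `html.parser` module to collect scientific names from a fragment of HTML. The class names used here (`biota`, `vernacular`, `binomial`, `genus`, `species`) and the sample markup are assumptions made for the example; the list of classes for the draft microformat is the authority on what is actually used.

```python
from html.parser import HTMLParser

# Sample checklist fragment; the markup and class names are assumed.
SAMPLE = """
<ul>
  <li class="biota"><span class="vernacular">Seven-spot ladybird</span>
      <i class="binomial">Coccinella septempunctata</i></li>
  <li class="biota"><i class="binomial"><span class="genus">Lucanus</span>
      <span class="species">cervus</span></i></li>
</ul>
"""


class BinomialExtractor(HTMLParser):
    """Collect the text of every element whose class list contains 'binomial'."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._depth = 0    # nesting depth inside a matching element (0 = outside)
        self._buffer = []  # text fragments of the element being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            # Already inside a match: track nesting so the right end tag closes it.
            # Limitation: an unclosed void element such as <br> would upset the count.
            self._depth += 1
        elif "binomial" in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Join the fragments and collapse runs of whitespace.
                self.names.append(" ".join("".join(self._buffer).split()))

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


parser = BinomialExtractor()
parser.feed(SAMPLE)
names = parser.names
```

A real aggregator would also need to handle authorities, ranks other than species and malformed markup, none of which is covered here.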

A specific example of checklist use might be enabling biological recording software (http://www.aditsite.co.uk/html/choosing_recording_software.html [dead link]) to parse and aggregate checklists in order to include them in its own species dictionaries. Typically there are waits of many months or even years while humans collate checklist changes manually for inclusion in recording software. Automated checklist parsing and aggregation would greatly speed up this process and make it more efficient.

Classes

For a list of HTML classes used by the species microformat, see classes#species.


Category

See Category:Templates generating 'species' microformats