Linguistic categories

Linguistic categories include

Lexical category, a part of speech such as noun, preposition, etc.
Syntactic category, a similar concept which can also include phrasal categories
Grammatical category, a grammatical feature such as tense, gender, etc.

The definition of linguistic categories is a major concern of linguistic theory, and the definition and naming of categories therefore vary across theoretical frameworks and across the grammatical traditions of different languages. The operationalization of linguistic categories in lexicography, computational linguistics, natural language processing, corpus linguistics, and terminology management typically requires resource-, problem- or application-specific definitions of linguistic categories. In cognitive linguistics, it has been argued that linguistic categories have a prototype structure like that of the categories of common words in a language.^[1]

Linguistic category inventories

To facilitate interoperability between lexical resources, linguistic annotations and annotation tools, and to handle linguistic categories systematically across different theoretical frameworks, a number of inventories of linguistic categories have been developed and are in use; examples are given below. The practical objectives of such inventories are to perform quantitative evaluation (for language-specific inventories), to train NLP tools, or to facilitate cross-linguistic evaluation, querying or annotation of language data. At a theoretical level, the existence of universal categories in human language has been postulated, e.g., in Universal grammar, but also heavily criticized.

Part-of-Speech tagsets

Schools commonly teach that there are nine parts of speech in English: noun, verb, article, adjective, preposition, pronoun, adverb, conjunction, and interjection. However, there are clearly many more categories and sub-categories. For nouns, the plural, possessive, and singular forms can be distinguished. In many languages, words are also marked for their case (role as subject, object, etc.), grammatical gender, and so on, while verbs are marked for tense, aspect, and other properties. In some tagging systems, different inflections of the same root word get different parts of speech, resulting in a large number of tags. For example, the POS tags used in the Brown Corpus include NN for singular common nouns, NNS for plural common nouns, and NP for singular proper nouns. Other tagging systems use a smaller number of tags and either ignore fine differences or model them as features somewhat independent of part of speech.^[2]
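The two design choices described above can be contrasted in a minimal sketch. It covers only the three Brown Corpus noun tags named in the text; the feature names ("type", "number") and the helper functions are illustrative assumptions, not part of any tagset specification.

```python
# Illustrative decomposition of three Brown Corpus noun tags into a coarse
# part of speech plus features. The feature names are assumptions made for
# this sketch, not part of the Brown Corpus documentation.
BROWN_NOUN_TAGS = {
    "NN": {"pos": "noun", "type": "common", "number": "singular"},
    "NNS": {"pos": "noun", "type": "common", "number": "plural"},
    "NP": {"pos": "noun", "type": "proper", "number": "singular"},
}


def coarse_pos(tag: str) -> str:
    """Return the coarse part of speech for a fine-grained tag."""
    return BROWN_NOUN_TAGS[tag]["pos"]


def features(tag: str) -> dict:
    """Return what the fine-grained tag encodes beyond the coarse POS."""
    entry = BROWN_NOUN_TAGS[tag]
    return {name: value for name, value in entry.items() if name != "pos"}


if __name__ == "__main__":
    for tag in BROWN_NOUN_TAGS:
        print(tag, coarse_pos(tag), features(tag))
```

A fine-grained tagset keeps the three tags distinct, whereas a coarse tagset would keep only the value returned by coarse_pos and, optionally, store the remaining distinctions as separate features.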

In part-of-speech tagging by computer, it is typical to distinguish from 50 to 150 separate parts of speech for English. POS tagging work has been done in a variety of languages, and the set of POS tags used varies greatly with language. Tags are usually designed to include overt morphological distinctions, although this leads to inconsistencies such as case-marking for pronouns but not nouns in English, and to much larger cross-language differences. The tag sets for heavily inflected languages such as Greek and Latin can be very large; tagging words in agglutinative languages such as Inuit languages may be virtually impossible. Work on stochastic methods for tagging Koine Greek (DeRose 1990) has used over 1,000 parts of speech and found that about as many words were ambiguous in that language as in English. For morphologically rich languages, a morphosyntactic descriptor is commonly expressed using very short mnemonics, such as Ncmsan for Category = Noun, Type = common, Gender = masculine, Number = singular, Case = accusative, Animate = no.
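The positional encoding of such a descriptor can be illustrated with a small decoder. This is a sketch only: the tables below contain just the codes that occur in the example Ncmsan, and the function name decode_msd is hypothetical. A real specification defines many more categories and values per position.

```python
# Attribute order for noun descriptors, as listed in the example above.
NOUN_ATTRIBUTES = ["Type", "Gender", "Number", "Case", "Animate"]

# Only the codes that occur in "Ncmsan" are included; everything else is
# deliberately left out rather than guessed.
CATEGORY_CODES = {"N": "Noun"}
VALUE_CODES = {
    "Type": {"c": "common"},
    "Gender": {"m": "masculine"},
    "Number": {"s": "singular"},
    "Case": {"a": "accusative"},
    "Animate": {"n": "no"},
}


def decode_msd(msd: str) -> dict:
    """Decode a positional noun descriptor into attribute-value pairs."""
    if not msd:
        raise ValueError("empty descriptor")
    decoded = {"Category": CATEGORY_CODES[msd[0]]}
    # Each following character fills the attribute at the same position.
    for attribute, code in zip(NOUN_ATTRIBUTES, msd[1:]):
        decoded[attribute] = VALUE_CODES[attribute][code]
    return decoded


if __name__ == "__main__":
    print(decode_msd("Ncmsan"))
```

Codes outside these tables raise a KeyError, which keeps the sketch from silently inventing values it does not know.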

The most popular "tag set" for POS tagging for American English is probably the Penn tag set, developed in the Penn Treebank project.

Multilingual annotation schemes

For Western European languages, cross-linguistically applicable annotation schemes for parts-of-speech, morphosyntax and syntax have been developed with the EAGLES Guidelines. The "Expert Advisory Group on Language Engineering Standards" (EAGLES) was an initiative of the European Commission that ran within the DG XIII Linguistic Research and Engineering programme from 1994 to 1998, coordinated by Consorzio Pisa Ricerche, Pisa, Italy. The EAGLES guidelines provide guidance for markup to be used with text corpora, particularly for identifying features relevant in computational linguistics and lexicography. Numerous companies, research centres, universities and professional bodies across the European Union collaborated to produce the EAGLES Guidelines, which set out recommendations for de facto standards and rules of best practice for:^[3]

Large-scale language resources (such as text corpora, computational lexicons and speech corpora);
Means of manipulating such knowledge, via computational linguistic formalisms, mark up languages and various software tools;
Means of assessing and evaluating resources, tools and products.

The EAGLES Guidelines have also inspired subsequent work in other regions, e.g., Eastern Europe.^[4]

A generation later, a similar effort was initiated by the research community under the umbrella of Universal Dependencies. Petrov et al.^[5]^[6] proposed a "universal", but highly reductionist, tag set with 12 categories (for example, no subtypes of nouns, verbs, punctuation, etc.; no distinction of "to" as an infinitive marker vs. preposition (hardly a "universal" coincidence), etc.). Subsequently, this was complemented with cross-lingual specifications for dependency syntax (Stanford Dependencies)^[7] and morphosyntax (Interset interlingua,^[8] partially building on the Multext-East/EAGLES tradition) in the context of Universal Dependencies (UD), an international cooperative project to create treebanks of the world's languages with cross-linguistically applicable ("universal") annotations for parts of speech, dependency syntax, and (optionally) morphosyntactic (morphological) features. Core applications are automated text processing in the field of natural language processing (NLP) and research into natural language syntax and grammar, especially within linguistic typology.

The annotation scheme has its roots in three related projects: Stanford Dependencies, the universal part-of-speech tag set, and the Interset interlingua. The UD annotation scheme represents syntax in the form of dependency trees rather than phrase structure trees. As of February 2019, just over 100 treebanks for more than 70 languages were available in the UD inventory.^[9] The project's primary aim is to achieve cross-linguistic consistency of annotation. However, language-specific extensions are permitted for morphological features (individual languages or resources can introduce additional features). In a more restricted form, dependency relations can be extended with a secondary label that accompanies the UD label, e.g., aux:pass for an auxiliary (UD aux) used to mark passive voice.^[10]
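The extension mechanism for dependency relations lends itself to a short illustration. The sketch below assumes only what the text states, namely that a secondary label follows the universal label after a colon; the function names are hypothetical and not taken from any UD tool.

```python
from typing import Optional, Tuple


def split_relation(label: str) -> Tuple[str, Optional[str]]:
    """Split a relation label into its universal part and optional subtype."""
    universal, _, subtype = label.partition(":")
    # partition() yields an empty string when there is no colon.
    return universal, (subtype or None)


def same_universal_relation(label_a: str, label_b: str) -> bool:
    """Compare two labels while ignoring language-specific subtypes."""
    return split_relation(label_a)[0] == split_relation(label_b)[0]


if __name__ == "__main__":
    print(split_relation("aux:pass"))
    print(split_relation("aux"))
    print(same_universal_relation("aux:pass", "aux"))
```

Comparing only the universal part is one way a cross-linguistic query could treat aux:pass and aux alike, while a language-specific query could still inspect the subtype.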

The Universal Dependencies have inspired similar efforts for the areas of inflectional morphology,^[11] frame semantics^[12] and coreference.^[13] For phrase structure syntax, a comparable effort does not seem to exist, but the specifications of the Penn Treebank have been applied to (and extended for) a broad range of languages,^[14] e.g., Icelandic,^[15] Old English,^[16] Middle English,^[17] Middle Low German,^[18] Early Modern High German,^[19] Yiddish,^[20] Portuguese,^[21] Japanese,^[22] Arabic^[23] and Chinese.^[24]

Conventions for interlinear glosses

In linguistics, an interlinear gloss is a gloss (series of brief explanations, such as definitions or pronunciations) placed between lines (inter- + linear), such as between a line of original text and its translation into another language. When glossed, each line of the original text acquires one or more lines of transcription known as an interlinear text or interlinear glossed text (IGT)—interlinear for short. Such glosses help the reader follow the relationship between the source text and its translation, and the structure of the original language. There is no standard inventory for glosses, but common labels are collected in the Leipzig Glossing Rules.^[25] Wikipedia also provides a List of glossing abbreviations that draws on this and other sources.
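The layout of an interlinear gloss can be sketched as a column alignment of a source line with its gloss line. The example assumes the widespread convention that source and gloss are aligned word by word; the Spanish phrase, its segmentation with hyphens, and the function name align_gloss are illustrative choices, and PL is used as the label for plural.

```python
def align_gloss(source: str, gloss: str) -> str:
    """Align a source line and its gloss line word by word in columns."""
    words = source.split()
    glosses = gloss.split()
    if len(words) != len(glosses):
        raise ValueError("source and gloss must contain the same number of words")
    # Each column is as wide as the longer of the word and its gloss.
    widths = [max(len(w), len(g)) for w, g in zip(words, glosses)]
    top = "  ".join(w.ljust(n) for w, n in zip(words, widths))
    bottom = "  ".join(g.ljust(n) for g, n in zip(glosses, widths))
    return top.rstrip() + "\n" + bottom.rstrip()


if __name__ == "__main__":
    # Hypothetical two-word example: "gato-s negro-s" ('black cats').
    print(align_gloss("gato-s negro-s", "cat-PL black-PL"))
```

Raising an error on a mismatch in word count is a design choice of this sketch; it makes alignment problems visible instead of producing a misleading layout.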

General Ontology for Linguistic Description (GOLD)

GOLD ("General Ontology for Linguistic Description") is an ontology for descriptive linguistics. It gives a formalized account of the most basic categories and relations used in the scientific description of human language, e.g., as a formalization of interlinear glosses. GOLD was first introduced by Farrar and Langendoen (2003).^[26] Originally, it was envisioned as a solution to the problem of resolving disparate markup schemes for linguistic data, in particular data from endangered languages. However, GOLD is much more general and can be applied to all languages. In this function, GOLD overlaps with the ISO 12620 Data Category Registry (ISOcat); it is, however, more stringently structured.

GOLD was maintained by the LINGUIST List and others from 2007 to 2010.^[27] The RELISH project created a mirror of the 2010 edition of GOLD as a Data Category Selection within ISOcat. As of 2018, GOLD data remained an important terminology hub in the context of the Linguistic Linked Open Data cloud, but because GOLD is no longer actively maintained, its function is increasingly being taken over by OLiA (for linguistic annotation, building on GOLD and ISOcat) and lexinfo.net (for dictionary metadata, building on ISOcat).

ISO 12620 (ISO TC37 Data Category Registry, ISOcat)

ISO 12620 is a standard from ISO/TC 37 that defines a Data Category Registry, a registry of linguistic terms used in various fields of translation, computational linguistics and natural language processing, together with mappings both between different terms and between the same terms as used in different systems.^[28]^[29]^[30]

An earlier implementation of this standard, ISOcat, provides persistent identifiers and URIs for linguistic categories, including the inventory of the GOLD ontology (see above). The goal of the registry is that new systems can reuse existing terminology, or at least be easily mapped to existing terminology, to aid interoperability.^[31] The standard is used by other standards such as Lexical Markup Framework (ISO 24613:2008), and a number of terminologies have been added to the registry, including the EAGLES Guidelines, the National Corpus of Polish, and the TermBase eXchange format from the Localization Industry Standards Association.

However, the current edition, ISO 12620:2019,^[32] no longer provides a registry of terms for language technology and terminology; it is now restricted to terminology resources, hence the revised title "Management of terminology resources — Data category specifications". Accordingly, ISOcat is no longer actively developed.^[33] As of May 2020, the successor systems, CLARIN Concept Registry^[34] and DatCatInfo,^[35] were only emerging.

For linguistic categories relevant to lexical resources, the lexinfo vocabulary represents an established community standard,^[36] in particular in connection with the OntoLex vocabulary and machine-readable dictionaries in the context of Linguistic Linked Open Data technologies. Just as the OntoLex vocabulary builds on the Lexical Markup Framework (LMF), lexinfo builds on (the LMF section of) ISOcat.^[37] Unlike ISOcat, however, lexinfo is actively maintained and, as of May 2020, was being extended in a community effort.^[38]

Ontologies of Linguistic Annotation (OLiA)

Similar in spirit to GOLD, the Ontologies of Linguistic Annotation (OLiA) provide a reference inventory of linguistic categories for syntactic, morphological and semantic phenomena relevant for linguistic annotation and linguistic corpora in the form of an ontology. In addition, they also provide machine-readable annotation schemes for more than 100 languages, linked with the OLiA reference model.^[39] The OLiA ontologies represent a major hub of annotation terminology in the (Linguistic) Linked Open Data cloud, with applications for search, retrieval and machine learning over heterogeneously annotated language resources.^[37]
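The idea of querying heterogeneously annotated resources through a shared reference inventory can be sketched with plain dictionaries. The reference category names, the scheme names, and the function find_tokens are hypothetical placeholders chosen for this sketch; they are not actual OLiA identifiers or an OLiA API. The Brown and Penn noun tags are used only as familiar examples of two differing tagsets.

```python
# Hypothetical links from tagset-specific tags to shared reference categories.
PENN_TO_REFERENCE = {"NN": "CommonNoun", "NNS": "CommonNoun", "NNP": "ProperNoun"}
BROWN_TO_REFERENCE = {"NN": "CommonNoun", "NNS": "CommonNoun", "NP": "ProperNoun"}

SCHEMES = {"penn": PENN_TO_REFERENCE, "brown": BROWN_TO_REFERENCE}


def find_tokens(corpus, category):
    """Return tokens whose tag maps to the given reference category.

    corpus is a list of (token, scheme, tag) triples, so that tokens from
    differently annotated resources can be mixed in one query.
    """
    return [
        token
        for token, scheme, tag in corpus
        if SCHEMES[scheme].get(tag) == category
    ]


if __name__ == "__main__":
    corpus = [
        ("London", "penn", "NNP"),
        ("Paris", "brown", "NP"),
        ("dogs", "penn", "NNS"),
    ]
    print(find_tokens(corpus, "ProperNoun"))
```

The query is phrased once against the reference category, and each tagset contributes only its own mapping, which is the general pattern behind linking annotation schemes to a common reference model.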

In addition to annotation schemes, the OLiA Reference Model is also linked with the EAGLES Guidelines,^[40] GOLD,^[40] ISOcat,^[41] CLARIN Concept Registry,^[42] Universal Dependencies,^[43] lexinfo,^[43] etc., and thus enables interoperability between these vocabularies. OLiA is being developed as a community project on GitHub.^[44]

References

^ John R Taylor (1995) Linguistic Categorization: Prototypes in Linguistic Theory, 2nd ed., ch.2 p.21
^ Universal POS tags
^ The essentials of EAGLES
^ Dimitrova, L., Ide, N., Petkevic, V., Erjavec, T., Kaalep, H. J., & Tufis, D. (1998, August). Multext-east: Parallel and comparable corpora and lexicons for six central and eastern european languages. In Proceedings of the 17th international conference on Computational linguistics-Volume 1 (pp. 315-319). Association for Computational Linguistics.
^ Petrov, Slav; Das, Dipanjan; McDonald, Ryan (11 Apr 2011). "A Universal Part-of-Speech Tagset". arXiv:1104.2086 [cs.CL].
^ Petrov, Slav (11 Apr 2011). "A Universal Part-of-Speech Tagset". arXiv:1104.2086 [cs.CL].
^ "Stanford Dependencies". nlp.stanford.edu. The Stanford Natural Language Processing Group. Retrieved 8 May 2020.
^ "Interset". cuni.cz. Institute of Formal and Applied Linguistics (Czech Republic). Retrieved 8 May 2020.
^ "Universal Dependencies". universaldependencies.org. Retrieved 2020-05-14.
^ "aux:pass". universaldependencies.org. Retrieved 2020-05-14.
^ UniMorph. "UniMorph: Universal Morphological Annotation". UniMorph. Retrieved 2020-05-14.
^ System-T/UniversalPropositions, System-T, 2020-05-14, retrieved 2020-05-14
^ Prange, J., Schneider, N., & Abend, O. (2019, August). Semantically Constrained Multilayer Annotation: The Case of Coreference. In Proceedings of the First International Workshop on Designing Meaning Representations (pp. 164-176).
^ "Penn Parsed Corpora of Historical English: Other Corpora". www.ling.upenn.edu. Retrieved 2020-05-14.
^ "Icelandic Parsed Historical Corpus (IcePaHC)". www.linguist.is. Retrieved 2020-05-14.
^ Taylor, Ann; Warner, Anthony; Pintzuk, Susan; Beths, Frank (September 2003). "The York-Toronto-Helsinki Parsed Corpus of Old English prose (YCOE)".
^ "Penn-Helsinki Parsed Corpus of Middle English 2". www.ling.upenn.edu. Retrieved 2020-05-14.
^ "Corpus of Historical Low German". www.chlg.ac.uk. Retrieved 2020-05-14.
^ Light, C., & Wallenberg, J. (2011). On the use of passives across Germanic. Presented at 13th Meeting of the Diachronic Generative Syntax (DIGS) Conference DIGS 13, University of Pennsylvania. June 5, 2011
^ Beatrice Santorini (1993) The rate of phrase structure change in the history of Yiddish. Language Variation and Change 5, 257-283.
^ "Tycho Brahe Project". www.tycho.iel.unicamp.br. Retrieved 2020-05-14.
^ "NPCMJ – Ninjal Parsed Corpus of Modern Japanese". Retrieved 2020-05-14.
^ "Arabic Treebank: Part 3 (full corpus) v 2.0 (MPG + Syntactic Analysis) - Linguistic Data Consortium". catalog.ldc.upenn.edu. Retrieved 2020-05-14.
^ "Penn Chinese Treebank Project". verbs.colorado.edu. Retrieved 2020-05-14.
^ Comrie, B., Haspelmath, M., & Bickel, B. (2008). The Leipzig Glossing Rules: Conventions for interlinear morpheme-by-morpheme glosses. Department of Linguistics of the Max Planck Institute for Evolutionary Anthropology & the Department of Linguistics of the University of Leipzig. Retrieved January, 28, 2010.
^ Scott Farrar and D. Terence Langendoen (2003) "A linguistic ontology for the Semantic Web." GLOT International. 7 (3), pp.97-100.
^ GOLD versions
^ "ISO 12620:1999 - Computer applications in terminology -- Data categories". iso.org. 2011. Retrieved 9 November 2011.
^ "ISO 12620:2009 - Terminology and other language and content resources -- Specification of data categories and management of a Data Category Registry for language resources". iso.org. 2011. Retrieved 9 November 2011.
^ "ISO 12620:2019 Management of terminology resources — Data category specifications". ISO. Retrieved 20 January 2020.
^ Bononno, Robert (2011). "Terminology for Translators -- an Implementation of ISO 12620". Meta. 45 (4): 646–669. CiteSeerX 10.1.1.136.4771. doi:10.7202/002101ar.
^ "ISO 12620:2019 Management of terminology resources — Data category specifications". ISO. Retrieved 20 January 2020.
^ "The Data Category Repository (DCR) has changed address". www.iso.org. Retrieved 2020-05-08.
^ "CLARIN Concept Registry | CLARIN ERIC". www.clarin.eu. Retrieved 2020-05-08.
^ "DatCatInfo". www.datcatinfo.net. Retrieved 2020-05-08.
^ "LexInfo". www.lexinfo.net. Retrieved 2020-05-14.
^ ^a ^b Cimiano, P., Chiarcos, C., McCrae, J. P., & Gracia, J. (2020). Linguistic Linked Data (pp. 137-160). Springer, Cham.
^ ontolex/lexinfo, OntoLex Community Group, 2020-03-07, retrieved 2020-05-14
^ "OLiA ontologies". purl.org/olia. Retrieved 2020-05-14.
^ ^a ^b Chiarcos, C. (2008). An ontology of linguistic annotations. In LDV Forum (Vol. 23, No. 1, pp. 1-16).
^ Chiarcos, C. (2010, May). Grounding an ontology of linguistic annotations in the Data Category Registry. In LREC 2010 Workshop on Language Resource and Language Technology Standards (LT&LTS), Valetta, Malta (pp. 37-40).
^ Rehm, G., Galanis, D., Labropoulou, P., Piperidis, S., Welß, M., Usbeck, R., et al (2020). Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability. arXiv preprint arXiv:2004.08355.
^ ^a ^b Christian Chiarcos, Maxim Ionov and Christian Fäth (2020), Annotation interoperability in the post-ISOcat era, LREC 2020
^ acoli-repo/olia, ACoLi, 2020-03-10, retrieved 2020-05-14
