Wikipedia talk:WikiProject Opera/Archive 19

From Wikipedia, the free encyclopedia

Recategorization using AWB

My watchlist tells me that User:BrownHairedGirl has created 300+ new categories entitled Category:1700s operas, Category:1704 operas, Category:1707 operas and so on, and is moving operas into these using AWB. Did anyone request this? If not, do we think that this is a good idea? --GuillaumeTell 16:24, 15 May 2007 (UTC)

  • Well, I certainly didn't and I'm not so sure how much of a good idea it is. I'm not absolutely opposed but I do think we've reached saturation point as far as categories are concerned and I'd really like a moratorium on them for the foreseeable future (unless someone creates one that is obviously necessary). --Folantin 16:51, 15 May 2007 (UTC)
  • Hang on sec. What does this mean? Does the category refer to operas performed in that year? Or operas composed in that year? I'm slightly confused as to what is meant here. Apart from that, the category is telling you something about the years involved. That's hopefully redundant to the first sentence. I'm not quite sure what this is meant to achieve. Moreschi Talk 16:56, 15 May 2007 (UTC)
  • I've left a note on her talk page asking about this. My first impression is that operas-by-year is too fine-grained (operas by decade might be more educationally useful -- but perhaps not more so than a list?), assuming an answer to Moreschi's performed/composed/??? question. There are enough variables here that this should be discussed first... Fireplace 18:02, 15 May 2007 (UTC)
  • Hi folks (and thanks Fireplace for the msg). While doing a CfD recategorisation I noticed that many operas were not categorised by year, but some were categorised under Category:Works by year. It seemed to me to be a good idea to categorise them by year, to tie with the other artistic creations of that year (some do not have year of composition, so in those cases I have used year of first performance). It also seemed to me that if they were being categorised, they might as well go in an operas-by-year category as in the general works-by-year category. There appear to be about 900 articles on operas, so assuming that most are 1700 onwards, that'll be an average of just under 3 per year. Since these things tend to cluster, the actual average will be higher than that (many years will have no operas), which is just about the lowest level of usefulness for by-year categories, but if they went in the operas-by-decade categories (i.e. without an operas-by-year category) they should really be categorised in both Category:Works by year and in the operas-by-decade category. That just seems to me to be creating category clutter and impeding navigation, so by-year seemed the better choice, especially since there were already some (inadequately parented) operas-by-year categories.
    I'll hold off pending further discussion. --BrownHairedGirl (talk) • (contribs) 18:32, 15 May 2007 (UTC)
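For anyone wanting to check the "just under 3 per year" estimate above, here is a rough sketch of the arithmetic. The count of 900 and the 1700 start come from the comment; the end year of 2007 (the date of this discussion) and the figure of 200 populated years are assumptions made purely for illustration.

```python
def average_per_year(article_count: int, first_year: int, last_year: int) -> float:
    """Mean number of articles per year over an inclusive year range."""
    year_span = last_year - first_year + 1
    return article_count / year_span


# Figures from the comment above: roughly 900 opera articles, mostly from 1700 onwards.
# The end year 2007 is an assumption (the date of this discussion).
overall = average_per_year(900, 1700, 2007)

# Hypothetical: if only 200 of those years had any operas at all,
# the average size of a non-empty by-year category would be higher.
populated_years = 200  # invented figure, for illustration only
per_populated_year = 900 / populated_years

print(f"{overall:.2f} per year overall")
print(f"{per_populated_year:.2f} per populated year")
```

The second figure only shows the direction of the clustering effect; the real number of populated years would have to be counted from the categories themselves.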
  • In reply to Moreschi, about usefulness: many (most?) of the articles categorised so far don't mention a date in the first sentence, although presumably they will soon be fixed to include that. But the point of a category is not to directly tell the reader anything about the articles in question (that's tagging, which WP doesn't do), but rather to facilitate navigation between related articles. Different readers will be interested in different aspects of an article, but those interested in the historical context will find it useful to be able to navigate directly to other works of the same period. Similarly, anyone starting by looking at the year or decade will find that the categories allow them to easily find operas from that year or period. --BrownHairedGirl (talk) • (contribs) 18:48, 15 May 2007 (UTC)
  • Lists are fine too (a year article like 1861 is just a list). But there isn't an either/or choice between lists and categories: the two can coexist and complement each other. Categories are easier and quicker to create, and can relate to each other in multiple ways; lists can be more detailed, and can be referenced. Ideally, there should be both, but categories are a good way of starting to build the lists.
    A further problem with the year articles like 1861 is that they cover all sorts of events, and the more recent ones are getting big and need pruning, so that they hold only the most significant events. It's unlikely that they could be used to list all operas, though they are useful for listing a few of the more notable ones.
    You're right that it's not a single step between categories (though with popups it's not far off it), but it's the best route available. --BrownHairedGirl (talk) • (contribs) 19:11, 15 May 2007 (UTC)

One other thing that I've noticed that's happened during this process is that in some cases (e.g. La battaglia di Legnano) the Category:Operas has been replaced by (in this case) Category:1849 operas, but in others (e.g. Il pastor fido), Category:Operas is still there, along with the new Category:1712 operas. If operas are to be categorised by year (and there's no consensus on that here yet), do we wish to retain the Operas category or not? --GuillaumeTell 20:42, 15 May 2007 (UTC)

The Category:Operas list has been useful. It was built up over the past three years. Operas are sub-categorized in a number of different ways: by country, by composer, by genre etc. For that reason it isn't possible simply to go to one set of subcategories to re-assemble the whole series of works. (The other lists we have, such as The opera corpus, have been compiled by hand and don't show coverage as accurately, or in the same way.) This may be unorthodox, but it is explained on the project page item 14.1 here, and it's never been challenged. I think we should keep to this system. I don't see any reason to change it.

What exactly is the situation now? How many operas have been recategorized? (I can see at least 70 or 80.) Is it reversible? I'm a bit surprised you (BrownHairedGirl) launched into this without talking to us. After all, we have had discussions before ([1]), and you know we are here. IMO this demonstrates again just how vulnerable editing is to automated processes (AWB, bot etc.).

Having said all that, I am not opposed to some kind of categorization by date, or period, but as Moreschi sensibly explains above we need to have a viable system that has been discussed and defined, not an automated ('combined-harvester') process randomly extracting dates, sometimes of composition, sometimes of performance, sometimes of revision etc. There are lots of anomalies, especially with early works, and we would need a systematic approach to deal with them. -- Kleinzach 00:10, 16 May 2007 (UTC)

BTW, Category:Operas by year is miscategorized under Y under Category:Operas. -- Kleinzach 04:22, 16 May 2007 (UTC)

Umm, no, it's correctly indexed under Y ... but as below, correct indexing doesn't work when the parent category is overpopulated :( Thanks for the pointer, now kludged to appear on the first page. --BrownHairedGirl (talk) • (contribs) 05:32, 16 May 2007 (UTC)
Kleinzach, per WP:CAT, "Articles should not usually be in both a category and its subcategory", so the articles should not have been in Category:Operas. (And indeed, they were not all there; the few categories by composer which I checked included articles which quite correctly did not include both Category:Operas and a subcategory). I intended no offence, but it never occurred to me that there would be any need to seek consensus for implementing a well-established guideline, and I am a little bit surprised to find that this breach of it was a project objective rather than (as I had assumed) an oversight or the work of one or two individuals — and in any case, I wasn't actually aware of the project, having missed that link in the CfD discussion you linked to (sorry, I probably should have spotted it, but it was a v small link).
Anyway, I have now re-read the discussions in the archives, and this issue seems to have been last discussed in 2005, around the time when the category operas-by-title was deleted. That was before my time on wikipedia, but I am not aware of any other situation where a parent category is intended to directly contain all the articles in its sub-categories.
About 200 articles have been recategorised so far, and while it would of course be reasonably easy to replace the articles in Category:Operas, that would be contrary to guidelines. I don't think that the put-everything-in Category:Operas practice is sustainable in the face of the well-established guidelines. (For example, Category:Musicals doesn't do duplicate categorisation; why is it argued that this is needed for operas, but not for anything else?) There is also a usability downside to lumping everything into the parent category, in that correctly-indexed sub-cats are unlikely to be on the first page of the category: e.g. Category:Polish operas was one of two or three subcats appearing on page 2 or later until I kludged the indexing by adding a space to force them onto the first page (will do the same for the by-year category which you noticed; it was indexed correctly, but correct indexing doesn't work when the parent category is overpopulated). Anyway, if removing the articles from duplicate categorisation in Category:Operas is controversial, I suggest taking the issue to WT:CAT for wider input.
I am surprised that there should be any controversy about the principle of categorising articles by year: it has been done for many other category trees, and parallels in the arts include Category:Films by year, Category:Books by year, and Category:Songs by year.
The dates have not been somehow extracted by an automated process, nor have they been done at random. WP:AWB simply provides a means of editing in turn a list of articles, with optional automated replacements, which were not used in this case (except where an article was already categorised under Category:Works by year). Nor were the dates "randomly extracted", as you suggest; as explained briefly above, I have used the date of first performance in most cases because in most cases that's the only date available, and in most others it coincides with the year that the opera was composed. In the very few instances where there has been a lack of clarity about which date to use (such as an opera not performed until after the composer's death, or where there has been a major revision), I have either added more than one date or used the decade category; and a number of early operas which are unclear have been skipped altogether.
The reason I followed that path was to mirror the principle used for books and songs: to categorise by date of first publication or release, (or, in the case of unreleased songs, the year in which they were composed or finished). Any other suggestions on how to select the date(s)?
If it helps for maintenance purposes, it would not be a big job to use AWB's list-making function to generate a list of all articles under Category:Operas; would that help? --BrownHairedGirl (talk) • (contribs) 05:32, 16 May 2007 (UTC)
Thank you for your long response. Perhaps we can all work this out together? Without going into a huge amount of detail, I'd agree that the category system here is full of anomalies. We've been working on this for the last few years. It's much better than it was, but it certainly isn't perfect. (As with WP generally, there is a lot of cross-linking so it isn't always clear what is a category and what is a subcategory.) Newly created categories like the Category:Polish operas - which is actually one day old! - will sometimes be added in the wrong place. All we can do is to try pragmatically to make the scheme as clear as possible.
As I explained above, I am not myself opposed to Category:Operas by year and I am reassured that you are using first performance dates. (As you will have seen we have been trying to put these in the opening paragraphs.) However, personally, I would like to see Category:Operas restored to its pre-AWB condition, given that its subcategorization is partial and complex. What do other people think? -- Kleinzach 06:24, 16 May 2007 (UTC)
Thanks for your nice reply :) I'm sure we'll all work our way towards a consensus.
I will doodle a few notes below on how to implement categorisation by year, and see what y'all think: it sounds like we're on a similar track, and I think that once it's nearer to completion it will be very useful (still underpopulated, but see Category:19th century operas for an illustration of how easy it is to navigate).
I take your point about the subcategorisation being incomplete, but I really do think for lots of reasons that it's a very bad idea to put everything in Category:Operas, and that we should take it to WP:CAT to see whether this really is a case to breach the guidelines. Alternatively, why not just start with a list and get to work finishing the subcats? --BrownHairedGirl (talk) • (contribs) 09:31, 16 May 2007 (UTC)
Thanks but hasn't the categorization re-started? I thought we were "holding off pending further discussion"! Wouldn't it be better to wait until GuillaumeTell, Folantin, Moreschi, and Fireplace have had a chance to read what you have written and reply if they want to? Or is your AWB doing its own thing? -- Kleinzach 09:37, 16 May 2007 (UTC)
P.S. I didn't say the subcategorisation was incomplete! That wasn't my point! -- Kleinzach 10:01, 16 May 2007 (UTC)
I'd give deference to WP:CAT on this one. Consistency is valuable and I don't see a relevant difference between opera articles and books, musicals, etc. As best I can tell, Kleinzach's primary objection is that Category:Operas auto-generated a list that was helpful for project-wide editing jobs. But, BHG correctly points out that AWB has a list function that accomplishes the same purpose (not a perfect substitute, but not insurmountably inconvenient either).
Regarding Category:Works by year, per my comments above and Moreschi's comments below, I don't see the value of this scheme and I'd probably scrap the whole thing. But, (unlike infoboxes), it's not particularly harmful either. Fireplace 13:15, 16 May 2007 (UTC)
Replying to Kleinzach, I have not been removing any more articles from Category:Operas, but have been adding by-year categories, since we seemed to have reached agreement on which year to use. --BrownHairedGirl (talk) • (contribs) 14:21, 16 May 2007 (UTC)
I am unhappy you didn't hold off as promised to allow the others to make comments. (I was only expressing my own personal view that Category:Operas by year, based rigorously on first performance date, would probably pose no problems; I was not making any agreement on behalf of the others.) Will you now replace Category:Operas for the first series where it was removed? I think that would show good faith and we could then work out how to reform the system, perhaps asking your advice as you obviously have the technical expertise I lack. -- Kleinzach 22:45, 16 May 2007 (UTC)

I'm still just a little confused as to what this is intended to achieve, though in all fairness I don't see the point of Category:Works by year either. The point of categorization is to help the reader, no? In which case, I don't see the point of this any more than I see the point of Category:Works that contain the word "festinate". The point of the comparison is that both are, in my view, pointless categorisations based on coincidence. Just because one work was written in the same year as another does not mean that there's any thematic link between the two: in many cases, quite the contrary! This is not just true for operas but also for everything else artistic, as far as I can see. Now, there's a thematic link between Verdi's operas because they're all composed by one man, Verdi - but the same is not true of, say, an opera by J.C. Bach and one by Gluck, though they probably wrote operas in the same year! This may be a wiki-wide problem, not just an opera one. Bemused in London, Moreschi Talk 10:41, 16 May 2007 (UTC)

Comparing things by year (or other time period) is one line of analysis, and it may or may not produce interesting results for the reader; that depends what they are interested in. But the same applies to many other categories: the Category:English-language operas includes a variety of stuff which may have little thematic link, such as The Pirates of Penzance and The Rape of Lucretia.
However, one of the questions a historian asks is "what else was going on at the same time", and the by-year categories are a route towards answering that. Maybe you don't often take a historical perspective on art? That's fine, but others do. --BrownHairedGirl (talk) • (contribs) 14:21, 16 May 2007 (UTC)
I'm not particularly opposed to your scheme; however, IMO you misunderstand the nature of opera, something that came out in your argument against renaming the Comic opera category. Opera is international/cross-cultural. It developed in different places at different times. It's not particularly useful to group operas performed in the same year, because of the cultural time lag factor. (There's a huge difference between, say, Naples, London and Moscow in 1750, so grouping them together is not really meaningful.)
However it would be very useful to group together different kinds of performance (including spoken theatre, dance etc.) that took place in the same time and place. (For example it would be interesting to correlate Beaumarchais and Goldoni drama performances with operas they influenced.) Grouping together unrelated events (what happened in China and England in 1217 or whatever) may seem like fun, especially if you have a techy toy like AWB to whizz through hundreds of pages with, but if it is just GIGO it detracts from WP. Speaking as a historian (former) myself, these things have to be thought out properly. The end is more important than the (techy) means. -- Kleinzach 23:23, 16 May 2007 (UTC)
Agreed. Even by decade that categorisation is too broad to be useful for analysis, because of what Kleinzach has aptly phrased as the "cultural time lag factor" - and that factor is very big and very important to the history of opera. But I'm not so bothered, I guess there are worse evils to slay. We can always come back to this later. Moreschi Talk 11:23, 17 May 2007 (UTC)
Umm, I understand what you are all saying, but it seems to me that those are arguments about how to use or interpret year categories rather than arguments against the classification itself. I hope that no-one would argue that opera is the only field of human endeavour in which things may develop at different times in different places: similar time lags can be observed in literature, in sculpture, in politics, in warfare, and many other fields, all of which are categorised by year, because the date when something was created is a defining characteristic.
Of course it's not the only defining characteristic, but it is one of several defining characteristics. Taking Kleinzach's comment about 1750 and the cultural time-lag factor, categorisation by date allows the reader to easily see what different approaches were being followed at that time in Naples, London and Moscow ... and also, of course, to see what happened at the same time within the same cultural circles. Date does not define things in the same way in different places, but it is always a relevant factor.
I'm surprised at your rather patronising assumption that I am not aware that opera is international/cross-cultural. Of course it is, but so are many other areas of the arts. Why do you presume that classification by date somehow denies that? On the contrary, it can assist in illustrating that point. (And on the comic opera, I have no problem with a category restricted to English comic opera; my objection was to having no grouping for comic opera in different languages or cultures. Comedy may be expressed in many different ways, and in different cultures it can be very different; but across cultural and linguistic boundaries there remains a distinction between a tragedy and a comedy).
It's also unhelpful to make sneering remarks about "techy toys" etc. You are right about the end being more important than the means, and the end in this case is very simple: to add a basic category, which the reader can use as they see fit. It does not cause category clutter, and if it is garbage in, that would be only because the dates in the articles are themselves garbage, which I doubt is the case.
Your suggestion of grouping together different kinds of performance (including spoken theatre, dance etc.) that took place in the same time and place is an interesting one, and if you choose to do it, I'd say that's a great idea. In doing so you'll find it helpful to refer to categories such as operas by year and plays by year.
I think, though, that many of you are missing an important point here. Wikipedia already has Category:Works by year and Category:Years in music; if Category:operas by year is removed, the result would not be to remove the date category entirely, but for operas to be categorised instead in one of those category trees, as some articles already were. How would that be an improvement? --BrownHairedGirl (talk) • (contribs)
FYI, for some perspective from someone who has been engaged in a similar project: we have developed lists by year for literature (e.g., List of years in literature and List of years in poetry), and they are immensely useful for historical articles. They make it much easier to compare across countries and cultures, to develop timelines for key developments and to do comparative work (e.g., did the English ballad really coexist as a major form with iambic pentameter? When and where did rhyme get introduced into the Romance and Germanic languages from the Celtic and Arabic languages? How does the number of women writers in Medieval Japan and Medieval Europe compare?). Work has also been done on similar projects in Art, Architecture, and Film. Whether in a list or category, it really is useful, and broadening the scope of these projects into other artistic and cultural areas could make WP a great and more unique reference for this kind of work. A Musing 18:56, 17 May 2007 (UTC)
Ok, disclaimer, I'm committing the cardinal sin of commenting without reading the whole discussion. I would first like to remind everyone of the halo of the Jimbo and the beauty of WikiLove, let us all come round the fire and bask in its loving glow. Ok, that's over. There have been many long and fruitless discussions about categories and sub-categories. I don't think we'll be able to come up with a solution here. But my views are 1) both lists and categories serve different but both useful functions, even if there is some redundancy. It would be nice if categories were navigable, but the fact of the matter is, they're not. Try to navigate any category scheme in Wikipedia using Logic, and you will come out mystified and disappointed. This is because, despite years of Western thinking and the attempts of encyclopedists and others, knowledge isn't simply hierarchical - the connections are far too complex to put into a hierarchy. The best idea I've heard about categories lately is that they should be conceived as tags, rather than hierarchical schemes, and "super" categories should always be used - operas should always have an "opera" tag, baroque operas should have "baroque" "music" and "opera" tags, etc. etc. Then, rather than being navigated by confused humans, they will be happily collated (or have their intersections parsed) by computer programs. I think there are legitimate uses for the intersections of "1872 works" and "opera". Right now I think the best way to move toward this is to maintain keeping all operas in Category:Operas while also adding Category:1985 operas. I have seen a number of researchers on the mailing lists pleading for categories to be used this way, and I still remember my initial frustration with the uselessness of categories for any sort of navigation when I was new to Wikipedia. I think we need to give over that idea, and create something actually useful. Also, don't get me started on hierarchies of knowledge. You might regret it.
Mak (talk) 23:21, 17 May 2007 (UTC)
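To make the tags-and-intersections idea above concrete, here is a minimal sketch of how a program might collate flat tags. It does not describe how MediaWiki or any existing tool works: the function names, the tag strings and the placeholder novel are all invented for illustration. The two opera titles and their years are the ones mentioned earlier in this thread.

```python
from collections import defaultdict


def build_index(article_tags):
    """Invert {article: tags} into {tag: set of articles}."""
    index = defaultdict(set)
    for title, tags in article_tags.items():
        for tag in tags:
            index[tag].add(title)
    return index


def intersect(index, *tags):
    """Return the articles that carry every one of the given tags."""
    if not tags:
        return set()
    # .get avoids creating empty entries for unknown tags
    return set.intersection(*(index.get(tag, set()) for tag in tags))


# Hypothetical tagging: each article keeps its "super" tag as well as its year tag.
article_tags = {
    "La battaglia di Legnano": {"operas", "1849 works"},
    "Il pastor fido": {"operas", "1712 works"},
    "Example novel (placeholder)": {"novels", "1849 works"},
}

index = build_index(article_tags)
operas_of_1849 = intersect(index, "operas", "1849 works")
print(sorted(operas_of_1849))
```

Under this scheme nothing like a dedicated "1849 operas" category needs to exist in advance; it falls out of the intersection, which is why the suggestion depends on every opera keeping the general "operas" tag.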
Good. Mak said all that better than I could have done. Is it going to be possible to replace all the Category:Operas items we have lost? Is it easy to do with AWB? (I can't use AWB myself as I don't have a Windows machine.) -- Kleinzach 04:31, 18 May 2007 (UTC)
Mak, please read WP:CAT; categories are not solely hierarchical. Nor are they un-navigable; that depends on how they are implemented. Computer-generated category intersection is an idea that has been on the table for ages, and the code has been written, but it is not used on wikipedia because of the server load it can cause. It would be great, but don't expect it any time soon :(
It would, however, be useful to create a little schematic map of the opera categories, a simpler version of what has been done with {{Christian denomination tree}} for Category:Christian denominations -- that would make category navigation even easier.
As to placing all the articles in the Category:Operas, the guidelines have for ages recommended that this should not be done, for a variety of good reasons which I won't repeat here. I don't see anyone making a case for why the opera categories should be treated differently to books or sportspeople or Christian denominations or ships or any of the many other categories where articles are correctly dispersed to the subcategories. As I have said before, if people think that opera is an exceptional case, why not take the question to WT:CAT and see what the categorisation specialists have to say? --05:32, 18 May 2007 (UTC)
I tend to agree with BHG and precedent, but there is also this duplication exception at WP:CAT (although, if it applies here, it would also apply to books or Christian denominations). Fireplace 05:43, 18 May 2007 (UTC)