Wikipedia:Village pump (idea lab)/Archive 44
This page contains discussions that have been archived from Village pump (idea lab). Please do not edit the contents of this page. If you wish to revive any of these discussions, either start a new thread or use the talk page associated with that topic.
What would a community-focused donation option look like?
I was just reading the RfC about the WMF's upcoming fundraising banners, and see some familiar arguments, including several about how the money is used. There is clearly some resentment that a lot of money is being spent on things that aren't sufficiently connected to the improvement or maintenance of Wikipedia and/or its community. Spitballing here, but what if the banners also included an option to donate directly to the community? Of course, if we start building up a big mediating entity to figure out how to spend it, we're going to recreate the WMF, but what if it's a much smaller project with specific needs, and an option for donors to address those needs.
For example, you need a book? A subscription? Something else to support your work on an article? List it. Maybe it's like a collection of patreons or a wedding registry or just a donation to a fund that's restricted to specific kinds of purchases. It's tempting to also include things like internet connections, computers, cameras, etc., but there's also the "mo money mo problems" factor.
What else could such an option look like? Maybe an option to donate generally, or an option to fund specific projects/grants. Lots of potential for problems, indeed -- just spitballing, as I said.
Maybe this is just so unrealistic or fraught that it's not worth the thought experiment, but what would a community-focused donation option look like? — Rhododendrites talk \\ 23:35, 14 November 2022 (UTC)
- It's worth considering. However, it might abstract funds from the WMF's own collections, so we could expect a reaction, possibly including legal or technical measures to prevent Wikipedia from soliciting donations and the withdrawal of the existing meagre funds which trickle back down to Wikipedia. Certes (talk) 11:31, 15 November 2022 (UTC)
- Are these things available via the grants program already, assuming someone wants to manage it (c.f. meta:Grants:Programs/Wikimedia Community Fund)? If so, diverting away to a "chapter" type fund would likely just reduce the grant budget. — xaosflux Talk 16:04, 15 November 2022 (UTC)
- These are some of the questions, yeah. Maybe what I'm most trying to get at here is an option for people to donate money directly into a fund controlled by the community in some way. Maybe this is how we fund a larger "community tech wishlist". Maybe it's a book fund where you ask on-wiki instead of going through a grantmaking process. Maybe something else... — Rhododendrites talk \\ 16:46, 15 November 2022 (UTC)
- I feel like anything conventional is going to eventually recreate the WMF, yeah. You'd have to look to alternative grantmaking models. A Patreon-style directory sounds interesting, though I can see some problems like how do you stop all the money going to the people who make the slickest WikiPatreon page, or how do you make sure it isn't used to circumvent the paid editing guidelines. The only other thing I can think of is no-questions-asked microgrants (inspired by Sportula and the Black Trowel Collective): basically you give small sums away to any Wikipedian that wants it, and trust them not to abuse it. – Joe (talk) 16:28, 15 November 2022 (UTC)
- A community-focused donation option doesn't have to follow a non-profit organization model, but it does have advantages. Having a centralized fundraising appeal alleviates donor fatigue (avoiding many little appeals that potential donors have to vet). Having a registered non-profit do the collecting saves on taxes (which is why the WMF has educational goals in its mission, in order to qualify). Spending money in a way that can benefit the entire community rather than helping individual editors avoids turning editors into paid editors (and avoids income tax considerations for those editors).
- I think getting the WMF to support directed donations may be the best way to keep the advantages of centralized fundraising while also giving donors some more control over where their gifts are used. How to get the community to agree upon how to disburse the money from specific funds it controls is, of course, another problem. isaacl (talk) 17:28, 15 November 2022 (UTC)
- Rather than a community banner, a template would work, saying "Only X% of money goes to improving software for editors; please donate to this fund administered by the EFF" or "Please don't give money to Wikipedia".
- @Certes There is actually nothing the WMF can do, except delegitimize the editors that propose it as a toxic minority (so we need to reduce conflict, and have more editors involved or aware of decisions) and control the ability to make software changes.
- @Isaacl I agree getting the community to decide how to disburse funds is a big issue; maybe consensus decision making doesn't work well if there are lots of vested interests, or if it is about strategic issues? Wakelamp d[@-@]b (talk) 16:14, 17 November 2022 (UTC)
- Rather than saying "please don't give money to Wikipedia", we should say "the money you give goes to the WMF, and is spent on..." As a separate point, yes, a separate donation opportunity administered by someone like the EFF and spent on Wikipedia rather than other WMF activities could be very helpful. It's unclear whether we could or should endorse a competing fund on pages bearing the WMF-owned Wikipedia brand and hosted on WMF-controlled servers. Certes (talk) 16:21, 17 November 2022 (UTC)
Make no small plans. WMF has gone off course and needs to be reformed or maybe fired or at least be influenced by that latter possibility. Wikipedia (including commons) is what people want to support, not WMF. Wikipedia is more than the flagship, it is THE ship that the WMF ivory tower is riding on. Wikipedia really needs to build strength and governance and have money separate from WMF. So setting up something to make that happen from a donation side would be a good start. North8000 (talk) 18:16, 17 November 2022 (UTC)
- If we had $10 million of donations in the bank, what would we spend it on?
- And is there something about Wikipedia (processes, editors, community, software/hardware) that makes it inevitable that a new organisation would go off course?
- For instance, does WMF exist because the current meta wiki software forces a centrally controlled server system, allows them to force a single UX and a central admin system on us, necessitates broad and deep knowledge by admins and editors, and discourages communication? Maybe we don't need WMF, if we extend the federated approach discussed in this Wired interview with Ward Cunningham and this blog article (wmfblog:2019/08/01/structured-data-on-commons-part-two-federated-wikibase-and-multi-content-revisions/). Wakelamp d[@-@]b (talk) 12:03, 18 November 2022 (UTC)
Changes to Wikipedia's visual design
Wikipedia's design is minimal and good, but what about making some changes to it, such as more pleasing color schemes and less harsh colors? If you could give more thought to the topic, I would be very thankful. Prode101 (talk) 11:35, 19 November 2022 (UTC)
- Hi Prode101, there is considerable discussion on this topic at WP:VECTOR2022. You may wish to comment there if you have some specific suggestions. Loopy30 (talk) 12:35, 19 November 2022 (UTC)
- Your question assumes that the current design is unpleasant and harsh. It isn't. --User:Khajidha (talk) (contributions) 15:33, 19 November 2022 (UTC)
Restricting translations
Should we make it a requirement that translations from foreign Wikipedia articles must be done by people competent in the source language? I'm starting to see more and more machine translations, often with non-trivial factual errors. Alternatively or additionally, should there be a deletion criterion for machine-translated articles? WP:MACHINE discourages their creation, but I don't know whether it's a well-accepted argument for deletion (besides via WP:PROD). Ovinus (talk) 20:27, 19 October 2022 (UTC)
- It's always been a requirement. —Cryptic 21:01, 19 October 2022 (UTC)
- I understand that machine translation has improved significantly in the 19 years since Brion VIBBER wrote that. I suspect that modern advice would say not to use machine translation without manually checking the translation and improving upon it.
- I see two problems with declaring that translations into English must be done by "people competent in the source language".
- There's no way for either editors or software to determine who is "competent" until they've published their translation.
- The level of linguistic competence needed depends on what you're translating. You need less linguistic competence if you translate a short, simple article in an area you thoroughly understand. You need more linguistic competence for long, complex articles about a subject you don't understand.
- Most of us are probably competent to "translate" most substubs that basically say "<Subject> is a <nationality> <profession>." I could probably use machine translation to produce an accurate translation of some basic articles about medical subjects from multiple European languages into English. (Machine translation is stronger for European languages than for others, plus I know enough Spanish and German to make a guess at what's written in related languages. Here's an example of me 'translating' from Swedish – nobody's ever complained about it.) Few of us, though, have the skills to translate complex articles on unfamiliar subjects in a completely unfamiliar script. I think you have to know your own limitations, and use your own judgment about what's within your power and what's beyond your abilities. WhatamIdoing (talk) 21:29, 20 October 2022 (UTC)
- In recent months I'm seeing a number of translations, from Spanish and Portuguese, of long and important articles on art history that might or might not be machine translations (probably they are) but are certainly not checked by anyone competent in English, as the original title of the latest I've seen is enough to show: Greek Classicism sculpture (now moved). Oh God, I see the same editor User:Racnela21, who says they are a paid editor, has now followed up with Ancient Rome Painting (73k bytes, which somebody turned into a redirect) and Rococo Painting (61k bytes), both on the same day, so they must be machine translations. Needless to say these are done without checking to see if they duplicate existing articles, which they normally do. Johnbod (talk) 22:10, 20 October 2022 (UTC)
- I think that Ovinus is talking about people who speak English well, but don't read the source language (e.g., Spanish) well enough to know that "Ella no tiene pelos en la lengua" is a warning that she tends to be blunt, rather than a statement that she's recovered from Hairy tongue. WhatamIdoing (talk) 00:10, 21 October 2022 (UTC)
- Actually I think it's the machine, who doesn't speak either language well. Do you really think that people doing machine translations bother to check their work? I don't. Johnbod (talk) 22:47, 22 October 2022 (UTC)
- @Johnbod, when I use machine translation into any language that I can read, I always check the result. Why wouldn't I expect another editor to do the same thing, for the same reasons? WhatamIdoing (talk) 06:51, 8 November 2022 (UTC)
- I 100% agree. The question is, what should we do about editors who don't seem to have that judgment, and despite being asked to stop, continue to badly translate articles? Ovinus (talk) 22:54, 20 October 2022 (UTC)
- I think we treat it as a behavioral problem, probably with an eye towards a WP:TBAN or even a block.
- What we don't want to end up with is incentivizing licensing/attribution problems. It's better to have people using the Wikipedia:Content translation tool than to have them "secretly" translating articles and "forgetting" to say that's what they are doing. Note, for the record, that "using the tool whose edit summary automatically complies with the license" is not the same thing as "enabling the machine translation option inside that tool". I'm only saying that, since we're going to have people translating articles whether we like it or not, it would be better to have them automatically comply with license requirements, and with the interlanguage links in place. Right now, we limit this to editors who have made more edits than 99.75% of registered accounts here (or who know the multiple workarounds), and that risks license/copyright violations rather pointlessly. I'd rather limit it to people who have made more edits than 95% of our accounts. WhatamIdoing (talk) 00:03, 21 October 2022 (UTC)
- Every page on Wikipedia carries a disclaimer: "Please be advised that nothing found here has necessarily been reviewed by people with the expertise required to provide you with complete, accurate or reliable information." To require expertise would be a fundamental change, requiring presentation of qualifications, passing a test or other demonstration of competence. For an example of a project that works in this way, see Scholarpedia. The last time I checked, they were producing about 1 article per year. Andrew🐉(talk) 22:16, 20 October 2022 (UTC)
- This seems to be an extreme view. To be clear, I didn't mean that "tests" would be administered, just that the editor made a good-faith assertion. Also there is some precedent here, like with ContentTranslation (WP:X2). Ovinus (talk) 22:54, 20 October 2022 (UTC)
- There are several problems with incompetent translations. One is misunderstandings like what WAID is pointing out, another is lack of awareness of the cultural context. What I find worst is when translators do not check the sources but just assume the other language Wikipedia has perfect source to text integrity, no close paraphrasing and that all citations there are correct. This is unlikely to be the case, and we should hold people to account for the sourcing they pretend to use. —Kusma (talk) 05:56, 21 October 2022 (UTC)
- Comment. I think a bad translation, as long as it doesn't violate policies, is better than no translation at all when one is needed. Even professional translators disagree in the wording of translated texts and are not immune to mistakes. Certainly to a lesser degree than non-translators.
- Besides, there are not that many translators available. Out of tens of thousands of experienced editors in English Wikipedia, there is only a handful of active Spanish translators at Wikipedia:Translators available. Therefore, I err on the topic of this thread on the side of free flow of information. Thinker78 (talk) 17:29, 22 October 2022 (UTC)
- I would agree, except that many of these bad translations have genuine factual errors, or are incredibly misleading. Ovinus (talk) 20:06, 22 October 2022 (UTC)
- I disagree, as we seem to have more translators willing to fill red links than willing to fix poor translations, see for example the six-year backlog at WP:PNTCU. Fixing poor translations is hard work, almost as much as creating a new article, and not half as rewarding. Also, machine translation software is getting better and better and is available as a browser plugin or feature, so sending people to foreign Wikipedias via {{ill}} and letting them read a live translation using the most up-to-date software of a recently updated article can be better than giving them a poorly made out-of-date translation of the article of ten years ago using the software of ten years ago. I personally stopped translating years ago, and try to write a new article based on the same sources (plus anything in English I can find) instead. Working directly from the sources (which often aren't available for machine translations) reduces errors introduced by multiple rounds of paraphrasing and translation. —Kusma (talk) 21:49, 22 October 2022 (UTC)
- There is a reason why there are classes of articles, like stub, start, etc. Those articles can be in bad shape, but they can be improved. Also, translations—be it words, paragraphs or articles— are also subject to regular policies and guidelines. There are cases where information can be removed due to not complying with said policies. Thinker78 (talk) 00:29, 23 October 2022 (UTC)
- Following up the ones I mentioned above, it is now clear that there is a large paid for campaign to add articles machine translated, mostly from Spanish (some Portuguese), to Wikipedia. The Open Knowledge Association (OKA) is funding this; these are their instructions, and they are recording progress here. They've done about 70 and want to do another 180 odd. I think we should urge them to stop. The quality is abysmally low, and many of them duplicate existing articles. We don't want this. The "freelancers" names are given - all seem Spanish. Johnbod (talk) 22:47, 22 October 2022 (UTC)
- There are many articles with abysmally low quality and we don't urge all new editors to stop just because of it. We let them know policies and guidelines and guide them on how to do it better. Besides, in the link you provided they have a provision that urges to follow Wikipedia's policies. So I don't see a problem with it. Instead, they seem to want to help the project. Thinker78 (talk) 00:35, 23 October 2022 (UTC)
- Of course they want to help the project, as well as earn money. But so do most of the editors who get reverted every day. Johnbod (talk) 03:35, 23 October 2022 (UTC)
- Most paid editors are here to promote the organisation that pays them. To be fair, OKA seems to have a more noble goal and has done some good work but also some well-intentioned harm. On the plus side, we're getting usable articles on missing topics. Some of those articles were obviously not written by native English speakers, but we tolerate that from other editors and such work gets cleaned up. On the minus side, other work duplicates existing articles, and the effort would be better spent elsewhere. I hope the compilers of to-do lists check the inter-language links (hidden behind a dropdown top right on ptwp) for an English equivalent first, but that's not always enough: for example, Animal husbandry in Brazil wasn't listed as related to pt:Pecuária no Brasil until an editor cleaned up Wikidata after replacing the new translation by a redirect. One suggestion might be to search enwp more thoroughly for the proposed title and see if an article on the topic comes up; if so then all that need be created is a redirect from the proposed title and an interwiki link in Wikidata. Certes (talk) 14:51, 27 October 2022 (UTC)
- @Johnbod I checked a couple of the articles created by them and I won't say at all their "quality is abysmally low". Although some articles get turned down, or are duplicates, I found a few of them actually of very good quality. Check for example article 1, article 2, article 3. This latter one got even a wow and a barnstar from an administrator. Thinker78 (talk) 01:04, 23 October 2022 (UTC)
- That was from User:CaptainEek, not known for article work, who evidently didn't know it was a machine translation. But that one is better than the others, & at first glance seems to fill a gap, which some of the obscure Hispanic history ones may also do. It had clearly been looked through and adjusted. I can see considerable problems with the first two though, though neither are on areas where I know the terminology. The big art history ones by the prolific User:Racnela21 are half in fluent gobbledegook, as shown by the original titles Greek Classicism sculpture and Ancient Rome Painting. Both of these duplicate well-developed existing articles (Ancient Greek sculpture and Roman art, with their subsidiary articles). Someone else has redirected the Roman one as a CFORK, btw, which should probably be done to others. Johnbod (talk) 03:32, 23 October 2022 (UTC)
- "The big art history ones by the prolific User:Racnela21 are half in fluent gobbledegook, as shown by the original titles Greek Classicism sculpture and Ancient Rome Painting." I read the lead of the first one and found no evident problems, much less gobbledegook. Actually I think the quality is above that of the average Wikipedia article.
- No idea why you are making such drastic criticism, that in my view doesn't nearly reflect the relevant work. If you read a problematic sample, if the work was as bad as you say, then said sample would reflect the condition of the whole work in general, and not be an isolated situation. Thinker78 (talk) 00:29, 24 October 2022 (UTC)
- Well firstly the lead (but not the rest) has been considerably cleaned up by other people, but I think if you think it is ok you probably don't know much about Ancient Greek sculpture (like what "Classical" means in this context). Are you really happy with "... With it, a form of representation of the human body was inaugurated that was one of the fulcrums for the birth of a new philosophical branch, aesthetics, and was the stylistic foundation of later revivalist movements of enormous importance, such as the Renaissance and Neoclassicism, and remains influential to this day. Thus, its impact on Western culture cannot be emphasized enough, and it is a central reference for the study of Western art history. But apart from its historical value, its intrinsic artistic quality has rarely been questioned, the vast majority of ancient and modern critics praise it vehemently, and the museums that preserve it are visited by millions of people every year"? Mind you, I think you are a native Spanish speaker, so perhaps are used to the airy waffle of the style here, which is unfortunately very prevalent in art history in the Romance languages. Johnbod (talk) 04:46, 24 October 2022 (UTC)
- Yeah, I am a native Spanish speaker. In contrast to the airy waffle, the Germanic languages—like English—in Guatemala have a reputation of excessive dryness. Lol.
- I understand your point that the language used in Wikipedia English should be objective. But, I think this may be out of the purview of this thread (unsanctioned translations). Do you consider the quoted text translated improperly? Thinker78 (talk) 05:15, 24 October 2022 (UTC)
- Well, we English-speakers like it that way, and it does mean you can, you know, convey information efficiently, like an encyclopaedia is supposed to do. I haven't checked either the original the machine produced, or the Hispanic original. It's the final product presented in English I'm concerned with. Johnbod (talk) 05:20, 24 October 2022 (UTC)
- Hi @Johnbod, I am the founder of OKA. I would like to provide a few clarifications here, but before that, I must say that I find some of the statements a bit demeaning to the work of our translators. They indeed receive a stipend from us to cover their costs of living, which is why they state that they are paid, however this is a very meager compensation compared to the hours they put into their work. In that sense, they are more like volunteers, as they could have taken much better paying opportunities, but decided instead to work on Wikipedia. Additionally, OKA is not a for-profit organization financed by companies, but a non-profit (officially recognized as tax exempt in Switzerland), which I have so far funded entirely out of my own pocket. It is ok to have different opinions and to discuss them, but making extreme statements such as "abysmally low quality" and "fluent gobbledegook" doesn't help having a constructive discussion.
- As you can see in our process and our website, our translators are expected to manually review every sentence. So the articles are not pure machine translation. As this is a lot of text, and because none of them are native speakers, it may be that some paragraphs or articles are of lower quality than others, but from my experience, this is more an exception rather than the norm. Additionally, in many cases, the translation work is not the root cause, but rather the fact that the original article was written in a way that may sound less natural in English (as Thinker78 pointed out).
- Whenever a quality issue is flagged on an article, our translators are supposed to work on it to fix it, so their work doesn't stop once the article is published. If it is a more fundamental problem with the source article and the community decides that an article should be deleted, then we take note of that decision and try to update our processes to ensure this doesn't happen again in the future.
- As far as I am aware, out of the 70+ articles we have translated so far (most of them very long), almost none was taken down. Most of these articles could of course be improved, but I am of the view that it is better to have an article in English with small things to improve here and there than no article at all. I know that this is a contentious topic as not all Wikipedians share the same view, but we are trying our best to abide by all the policies of the English Wikipedia community on that front.
- In general, we are very open to the feedback of the community on how we can strengthen our impact. We are happy to involve other Wikipedians in the design of our process, or in the process of reviewing or translating articles altogether. What we are creating is a taskforce of full-time translators dedicated to editing Wikipedia; at the moment, they are focused on creating content, but I am also happy to adjust our processes so that a share of our translators work on quality improvement work and other review process if the community feels it can help balance the situation and add value. If this is of interest, you can reach out to info@oka.wiki 7804j (talk) 18:40, 24 October 2022 (UTC)
- Taking your comments in reverse order, afaik you made no effort to inform the wp community this was coming, or ask for advice on how best to do it, but just started uploading. There are all sorts of issues with your instructions and the articles you have selected. A number of your articles have been "taken down" and I expect more will be in the future, even if only draftified. It is perfectly clear that some of your "translators" don't really speak English at all, or we wouldn't get titles like Greek Classicism sculpture, Ancient Regime of Spain, Brazilian Romanticism Painting and Ancient Rome Painting. And these are just the titles! The quality of the articles I've looked at is variable, with some ok, but others terrible. It is clear that not enough work is done to see we don't already have an article on the subject; there have been several where we did, in fact most of the general (non-Iberian) ones. I imagine your translators' English was too poor to find them. Your instructions tell your people to avoid subjects that are too Spanish or Portuguese: "Some articles may have high interest in Spanish but low interest in English, because they concern topics where the readers are most likely already Spanish-speaking (e.g., articles about local celebrities). It is usually better to prioritize articles that are universal, i.e. not language or region-specific". This is very bad advice, and exactly the wrong way round - you should prioritize actual gaps on Iberian topics, and avoid universal topics completely. The Spanish and Portuguese wps do not have a very high reputation, and on universal or wide topics like art history understandably place a great deal of emphasis on contributions from the Iberian countries - far too much for what an Anglophone audience is interested in. The art history ones have clearly not been read over by anyone with the slightest familiarity with the correct vocabulary in English.
When the subject is something like Termination of employment in Argentina, well, who cares really (actually the English seems ok). You say your translators are "expected" and "supposed" to do various things, but it's pretty clear to me they don't. I could go on (and on) but.... Thanks for responding anyway. Johnbod (talk) 04:46, 25 October 2022 (UTC)
- Dear @Johnbod. You are a textbook case of someone who sees the glass half empty instead of half full. Never mind that 7804j contributes their own money to help the project. You don't care about that, why?
- I think it is a great initiative that very few people do without a commercial interest. And it highlights why there shouldn't be too much restriction about translation qualifications. If anyone can be an editor in Wikipedia, I don't see why only very few people could be translators in Wikipedia.
- I also have to point out that you have experience creating articles. But you just threw at us a wall of text. What happened to paragraphs?
Shorter sentences and paragraphs make your content easier to skim and less intimidating. Paragraphs should top out around 3 to 8 sentences. Ideal sentence length is around 15 to 20 words.
— Harvard Library, "Writing Guide"
- Impressive record otherwise, kudos! Thinker78 (talk) 03:10, 26 October 2022 (UTC)
- This glass is more than half-empty, and I'm afraid 7804j has wasted a high proportion of her money, on work that won't survive, as WP:CFORKs etc. As I expect you know, the community is suspicious, after many years of bad experiences, of paid-for editing initiatives. Many of these are completely well-meaning, but we judge by the results, not the intentions. If 7804j had come to the community & explained her intended initiative, she could have received a lot of advice on how to avoid problems. Instead she seems to have started the initiative in April or before, but only at the end of August mentioned anything about it on her user page and only yesterday, over 6 months in, did she inform the community on a public page. Who suggested that "only very few people could be translators in Wikipedia"? Nobody. But minimum qualifications should be some ability to speak English, and a willingness to check through what the machine throws at you. It's odd that you should complain about a wall of text, as paragraphs many times that length are very characteristic of some of the OKA translations. Perhaps you haven't looked at many. Johnbod (talk) 12:04, 26 October 2022 (UTC)
- @Johnbod So far, the community didn't agree with your assessment of whether the pages deserve a separate entry. There were a few of our pages that were nominated for potential merge, but where the community thought they would be better to remain as a separate page. Maybe some of the pages will be removed or merged along the way, and I think that's ok -- we will learn from this as well and adjust our ways of working as a result.
- Also, as I stated in my other post, the reason why I waited for bringing it up in the Village pump is that I first wanted to set up everything and run a MVP to see if the model works. I wasn't sure that I would be able to recruit translators and get the non-profit recognition from the government, nor that it would be possible to train them on editing Wikipedia with reasonable effort, so I didn't want to bother the community with hypotheticals. I prefer to act than to talk. Now that the concept has been tested and that I have had time to improve the processes, I am inviting the community to share its inputs.
- Our translators do not define the lengths of the paragraphs they write. It is defined by the source they work on. So far, we have only translated articles that were considered "Featured" or "Good articles" in the source language.
- (by the way, I don't know what makes you feel that I am a "she". If you spent so much time digging in my profile, you should be able to find out the right pronoun)
- 7804j (talk) 12:58, 26 October 2022 (UTC)
- Ah, apologies for that. Obviously, I haven't spent any time at all 'digging in your profile', or I wouldn't have got that wrong! To be honest I had confused your operation with this lot, where the top brass are all female. Can you link to the "few of our pages that were nominated for potential merge, but where the community thought they would be better to remain as a separate page"? I certainly haven't looked at all the articles. But you should realize that when pages appear with (often) no categories, wikiproject ratings & so on, nobody sees them except a new page patroller, whose tick in no way constitutes acceptance by the community - they are highly unlikely to spot a content fork. I had noticed a number of strange art history forks appearing for some time, but hadn't realized there was a project until just recently, when your translators started adding proper declarations on their user pages. It was unhelpful (and against the rules) of you not to have declared your conflict of interest, or any involvement at all, when we were in a discussion back in May at Draft talk:Early modern art. The fact is that mostly, the community just hasn't considered "whether the pages deserve a separate entry" at all, and the process may not be quick. Some cases are not 100% content fork, but only mostly, and may best be broken up for spare parts, with some used elsewhere, as I suggest at Talk:History_of_engraving.
- Now that I have the right website, I think it is a great pity your project hasn't been following your own declared aims to assist with "...topics where volunteers are missing. For example, articles in topics such as Science, technology, engineering, and Finance are lacking compared to topics such as History, Geography, and Humanities." And yet the great majority of your articles are on "topics such as History, Geography, and Humanities." Johnbod (talk) 18:10, 26 October 2022 (UTC)
- @Johnbod I am not sure about the objectivity or accuracy of your criticism. You would have to compare the frequency of declined drafts or of deleted articles created by random editors vs 7804j's project.
Meanwhile, I lean to believe that said project illustrates the need to avoid instruction creep in regulating translations. Because the evidence I have read seems to point at least to good quality work.
Thinker78 (talk) 01:58, 27 October 2022 (UTC)
- If you think I'm rude, try Wikipediocracy. Johnbod (talk) 03:16, 27 October 2022 (UTC)
- I think human translators are awesome!!!!
- The machine translations are not. A link to one of the featured articles the other day was machine translated. I spent four hours and still couldn't make sense of it. (Major issues were loss of nuance, literal translation of place names, non-existent links, and word order.)
- Can someone explain how changes are synched? One wiki has translated 6 million articles
- Also are 350 different versions sustainable? Wakelamp d[@-@]b (talk) 13:37, 25 October 2022 (UTC)
- They're mostly not translated. The Cebuano Wikipedia likes to run article creation bots. These don't translate articles. They construct them in a kind of mail merge system – every article says the same things, but you fill in the blanks. See Lsjbot and Rambot's contributions if you want an idea of how it works. WhatamIdoing (talk) 06:56, 8 November 2022 (UTC)
- Taking your comments in reverse order, afaik you made no effort to inform the wp community this was coming, or ask for advice on how best to do it, but just started uploading. There are all sorts of issues with your instructions and the articles you have selected. A number of your articles have been "taken down" and I expect more will be in the future, even if only draftified. It is perfectly clear that some of your "translators" don't really speak English at all, or we wouldn't get titles like Greek Classicism sculpture, Ancient Regime of Spain, Brazilian Romanticism Painting and Ancient Rome Painting. And these are just the titles! The quality of the articles I've looked at is variable, with some ok, but others terrible. It is clear that not enough work is done to see we don't already have an article on the subject; there have been several where we did, in fact most of the general (non-Iberian) ones. I imagine your translators' English was too poor to find them. Your instructions tell your people to avoid subjects that are too Spanish or Portuguese: "Some articles may have high interest in Spanish but low interest in English, because they concern topics where the readers are most likely already Spanish-speaking (e.g., articles about local celebrities). It is usually better to prioritize articles that are universal, i.e. not language or region-specific". This is very bad advice, and exactly the wrong way round - you should prioritize actual gaps on Iberian topics, and avoid universal topics completely. The Spanish and Portuguese wps do not have a very high reputation, and on universal or wide topics like art history understandably place a great deal of emphasis on contributions from the Iberian countries - far too much for what an Anglophone audience is interested in. The art history ones have clearly not been read over by anyone with the slightest familiarity with the correct vocabulary in English.
When the subject is something like Termination of employment in Argentina, well, who cares really (actually the English seems ok). You say your translators are "expected" and "supposed" to do various things, but it's pretty clear to me they don't. I could go on (and on) but.... Thanks for responding anyway. Johnbod (talk) 04:46, 25 October 2022 (UTC)
- Well, we English-speakers like it that way, and it does mean you can, you know, convey information efficiently, like an encyclopaedia is supposed to do. I haven't checked either the original the machine produced, or the Hispanic original. It's the final product presented in English I'm concerned with. Johnbod (talk) 05:20, 24 October 2022 (UTC)
- Well firstly the lead (but not the rest) has been considerably cleaned up by other people, but I think if you think it is ok you probably don't know much about Ancient Greek sculpture (like what "Classical" means in this context). Are you really happy with "... With it, a form of representation of the human body was inaugurated that was one of the fulcrums for the birth of a new philosophical branch, aesthetics, and was the stylistic foundation of later revivalist movements of enormous importance, such as the Renaissance and Neoclassicism, and remains influential to this day. Thus, its impact on Western culture cannot be emphasized enough, and it is a central reference for the study of Western art history. But apart from its historical value, its intrinsic artistic quality has rarely been questioned, the vast majority of ancient and modern critics praise it vehemently, and the museums that preserve it are visited by millions of people every year"? Mind you, I think you are a native Spanish speaker, so perhaps are used to the airy waffle of the style here, which is unfortunately very prevalent in art history in the Romance languages. Johnbod (talk) 04:46, 24 October 2022 (UTC)
- That was from User:CaptainEek, not known for article work, who evidently didn't know it was a machine translation. But that one is better than the others, & at first glance seems to fill a gap, which some of the obscure Hispanic history ones may also do. It had clearly been looked through and adjusted. I can see considerable problems with the first two though, though neither are on areas where I know the terminology. The big art history ones by the prolific User:Racnela21 are half in fluent gobbledegook, as shown by the original titles Greek Classicism sculpture and Ancient Rome Painting. Both of these duplicate well-developed existing articles (Ancient Greek sculpture and Roman art, with their subsidiary articles). Someone else has redirected the Roman one as a CFORK, btw, which should probably be done to others. Johnbod (talk) 03:32, 23 October 2022 (UTC)
- There are many articles with abysmally low quality and we don't urge all new editors to stop just because of it. We let them know policies and guidelines and guide them on how to do it better. Besides, in the link you provided they have a provision that urges to follow Wikipedia's policies. So I don't see a problem with it. Instead, they seem to want to help the project. Thinker78 (talk) 00:35, 23 October 2022 (UTC)
- I have only skim-read the discussion above, but need to say something about translation. Proficiency is needed in the source language when translating, but even more proficiency is needed in the target language. For example, I am a native speaker of English but am pretty fluent in Polish (I've spoken it most days for over 40 years). I would be perfectly happy translating from Polish to English, but would not presume to be good at translating from English to Polish. I'm sure that my writing would make it pretty obvious that I am not a native speaker. I'm pretty sure that the translators are proficient in Spanish or Portuguese, but is their knowledge of English good enough for them to be writing articles here? Phil Bridger (talk) 18:42, 26 October 2022 (UTC)
- I have confidently translated a few articles into my native English. I wouldn't have made a good job of translating anything from English. Knowing the source language is useful, but fluency in the target language is critical. The alternatif be half-wrote mess for other someone to leave. Certes (talk) 14:37, 27 October 2022 (UTC)
- Question: Any example of a poor machine translation job in Wikipedia in the last 2 years? Thinker78 (talk) 21:07, 27 October 2022 (UTC)
- WP:PNTCU.—S Marshall T/C 22:22, 27 October 2022 (UTC)
- There must be at least 50 in the OKA spreadsheet here, some mentioned above, including a number of downright ungrammatical titles. Johnbod (talk) 00:26, 28 October 2022 (UTC)
- @S Marshall:. I checked WP:PNTCU, but that was not a specific example of bad machine translation job, but just a collection of articles that for one reason or another need various degrees of cleanup.
- @Johnbod:, I checked the work of another one of the editors of the OKA project. Although I had already checked a few pages not finding evidence of deserving such poor rating as you provided, I checked once more. I found evidence backing my opinion doubting the objectivity or accuracy of your criticism of said articles.
- You made a statement to User:Racnela21, "It's fairly clear (from the titles alone) that your English isn't good enough to do any checking of these". @Andrew Davidson: replied, "I just started reviewing Kassite dynasty. I've not noticed any problems with the English [...]".
- I also read the lead of said article and I concur with Andrew Davidson not noticing any problems with the use of English.
- You may do very good editing work, but translation work, assessing statistically others' work or objective analytical criticism may not be your fields. Thinker78 (talk) 18:08, 28 October 2022 (UTC)
- Thinker78, you say you can't see problems with the articles I linked and I assume that you genuinely can't. But most other editors can.—S Marshall T/C 23:39, 28 October 2022 (UTC)
As noted by S Marshall and Johnbod, we already have Wikipedia:Pages needing translation into English#Translated pages that could still use some cleanup as the coordinating list for those editors able and willing to undertake the Sisyphean task of fixing bad translations. (Note the backlog extending back to 2016, and click on almost anything listed there, or in the associated Category:Wikipedia articles needing cleanup after translation, to see how useless to readers a machine translation can be.) It is indeed our policy that machine-translated text is worse than no article, or no expansion. By their nature, all machine translations not only produce errors (omitting negatives, misidentifying antecedents, mistranslating by choosing wrongly among alternatives), they tend to produce plausible-looking mistranslations of some passages, because they are ultimately based on text search. The WMF's translation tool, mentioned positively above, flooded en.wikipedia with very bad translations and was disabled for use here after community outcry; I understand it can now be used, but only by extended-confirmed editors, enforced by an edit filter. The remnants of the clean-up list from editors being encouraged to use it to add articles to en.wikipedia can be found starting here (linked in its original location at the top of the Pages needing translation section).
Unfortunately, as Phil Bridger notes, there's not one problem, but two: to either translate, check a machine translation, or clean up another editor's translation, both knowledge of the original language (including idioms) and proficiency in the target language (English) are necessary. English Wikipedia attracts a lot of well-intentioned editors who write poor English from scratch; I suspect we have this problem more than do other-language Wikipedias. But while we editors may have become inured to a certain level of ESLese, especially in some topic areas, it's not reasonable to expect readers to wade through incomprehensible prose. We regularly, sadly, block editors for insufficient competence in English, as well as other things. In fact I would say that insufficient competency in English is more of a problem when translating for en.wikipedia than is insufficient competency in the source language. Machine translating programs help in understanding the original; a surprising amount can be understood and fixed in the translation by following the wikilinks in the original and looking at the English interwikis (something I'm surprised that most creators of poor translations evidently haven't thought to do; these are linked databases); there are sometimes even references in English (and we recommend searching for and adding English-language references anyway, to help satisfy WP:V for that vast majority of potential readers who won't be able to read references in the original language)—but it requires competency in English to render the translation in clear English and to spot false friends and other nonsense in machine output.
This was and is the problem with the WMF promoting their machine translations, and this, from the editor whose work I've seen and the editor Johnbod ran into, is the problem with the Open Knowledge Association project; the translators don't even have sufficient competency in English to realize that livestock farming in Brazil can also be called "animal husbandry in Brazil", and are creating content forks. In my opinion, that project, despite its good intentions, is not only a problem for English Wikipedia but should not be fund-raising until it puts in place adequate managerial oversight, including testing the ability of its proposed translators to write professional-level English. It's an axiom in translation that one should only translate into one's native language, and this kind of poor work demonstrates why.
I can't propose any feel-good solution; in my opinion (as a translator and as someone familiar with how thinly stretched our qualified translation checkers are here on en.wikipedia) the OKA project should have been brought to one of the administrators' noticeboards, not flagged at an off-wiki criticism site and mentioned here and elsewhere on the Village Pump with pride. The road to Hell is paved with good intentions, but this project needs quality control implementing immediately at the source or deprecating as harmful to en.WP out of all proportion to the few useful articles it may be adding. (We have processes for requesting translations; see Wikipedia:Translation#Translation from another language to English.) Yngvadottir (talk) 02:23, 28 October 2022 (UTC)
- the problem with the WMF promoting their machine translations – What are you talking about? WhatamIdoing (talk) 06:57, 8 November 2022 (UTC)
Previous community decisions on this
- There's a relevant community decision from 2016. Here is the whole vast, sprawling discussion, if you'd like to read it all in context, but the short version is that the community decided to restrict the use of automatic translation tools to extended-confirmed users. This is implemented using an edit filter (Special:AbuseFilter/782). The community has also authorised speedy deletion of edits made using automated translation tools prior to that consensus, which was at WP:CSD X2. I personally deprecated X2 after the community decided that automated translations can also be speedily draftified.
- A scant few months after the community authorised speedy draftification of automated translations, the community then passed another rule that articles in draft space can be deleted after six months. Nowadays speedy draftification has become highly entangled with New Pages Patrol, and there seems to be a rule that only recently-created pages can be speedily draftified. This conflicts with the automated translation decision.
- If large-scale automated translation is taking place once again, then we need to revisit these old discussions.—S Marshall T/C 15:42, 27 October 2022 (UTC)
- This highlights something I have been thinking a lot about recently… I think we may be using Draftspace to solve too many things. I think it could be divided into two parts…
- A “Triage” space for new articles… focused mostly on basic sourcing and establishing notability… this would continue to be managed by the NPP and have all the current rules and time limits.
- A new “fix it” space for other types of problematic articles. This would not be under the remit of NPP, and would have a much more generous (or perhaps no) time limit attached. The article would simply be removed from Mainspace until fixed.
- Poorly done machine translations of otherwise acceptable topics could go into this new “fix it” space until someone who knows the original language can review it and make appropriate corrections. Blueboar (talk) 16:11, 27 October 2022 (UTC)
- I think we already have a fix-it space in the form of userspace. It's only sensible to userfy poor articles when we can identify the person who will fix them, but I think that's a feature and not a bug -- when we can't identify a fixer, the disputed content does need to go in the compost heap. With translations in fix-it space, I think the foreseeable problem is that people whose first language is English are notoriously poor at foreign languages and our translator numbers are extremely low relative to other-language wikipedias. WP:PNT has more than a decade of backlog and it's getting worse, and the reason is because when you do have the appropriate dual fluency, it's always so much easier, quicker, and more fun to do your own translation from scratch than to fix someone else's. So such content will tend to linger in fix-it space until the mainspace article is written by someone else.—S Marshall T/C 22:05, 27 October 2022 (UTC)
Making machine translation available again
I propose we add machine translation back into the WMF's translation tool for English Wikipedia. It's currently removed for all users. This would be a privilege extended only to extended-confirmed users, and, importantly, we should withdraw it on an individual basis for those who consistently produce mediocre machine translations.
- Competence should be assumed. Why remove the tool from all users, when they may be perfectly capable of using it intelligently?
- Removing machine translation from the Translate tool doesn't prevent people from doing machine translations; it's a complete waste of time for competent editors to start from scratch or waste time copy-pasting, when they could spend that time improving the translation and checking the sources, etc.
- Some machine translation tools are very competent (DeepL is far better than Google Translate), so most problems could be better addressed by being more careful about which machine translation services we build-in. For example, I used DeepL to create fr:Alliance militaire, IMO a pretty decent translation (my first).
- It would significantly help address WP:SYSTEMICBIAS, a priority for Wikipedia.
- WP:MACHINE is incorrect when it claims that machine translations are easily accessible; many browsers do not have built-in translation, especially on mobile (or obviously the Wikipedia app). It also only forbids "unedited" machine translations, and cannot be used to support removing machine translations from the Translate tool altogether.
- I support Blueboar's proposals above; but even the current mechanisms (drafts, AfC, new page review) would be sufficient here.
- Bad previous translations are far better addressed by simply re-translating them today. Even an unedited machine translation with today's improved algorithms would be miles better than the crap produced by Google Translate back then. Returning machine translation to the translate tool would make this easier, and would help clear the backlog.
The idea that "people are more interested in creating translations than fixing them" may be true, but it's true for all articles. Our vital articles are very neglected, that's not an argument for anything. DFlhb (talk) 07:07, 4 November 2022 (UTC)
- Oppose, just in case that wasn't obvious from the long discussions and positions I linked above.—S Marshall T/C 10:35, 4 November 2022 (UTC)
- How would it "significantly help address WP:SYSTEMICBIAS, a priority for Wikipedia."? Wakelamp d[@-@]b (talk) 12:16, 8 November 2022 (UTC)
- Oppose anything that makes it easier to add articles without verifying the sources. Wikipedias are not reliable sources, and blindly translating them is dangerous. —Kusma (talk) 12:27, 8 November 2022 (UTC)
- Oppose, other Wikipedias are not reliable sources. CMD (talk) 13:52, 8 November 2022 (UTC)
- Comment there is no point in allowing people to create machine translations, because if you want to read a machine translation, all you need to do is click on the little symbol in the English WP's article, or on the inter-language link if you've found one, and your browser will do a machine translation for you. This will be an up-to-date translation containing the latest best version of the target article, translated to the latest standards of technology, while a machine translation created in English WP is neither. Elemimele (talk) 16:23, 21 November 2022 (UTC)
- Oppose Correctly translating Wikipedia articles, along with all the required referencing, is no easy task. Wikipedia does not need any more poorly translated and broken articles, which is the only kind that machine translation is currently capable of creating.
A current example from DYK
I first heard about OKA just today when reviewing Template:Did you know nominations/Gothic sculpture. Oddly enough, the submission had initially been approved, then the machine translation issue came up and folks ran in the other direction. -- RoySmith (talk) 16:15, 14 November 2022 (UTC)
Archived discussions are not being de-archived
Hi everyone, there is a warning on this page: "Discussions are automatically archived after remaining inactive for two weeks." It must be adjusted with a statement "And they do not get returned, whether they are active or not after that period!" If it is easy to implement such returning, that'd be great, but it seems I'm asking for too much.
I've verified that by replying on an idea raised almost six years ago. I have the same idea, I found it via searching, suggested at the top of this page. (Yeah, I follow algorithms, and this is an old rule to search before asking).
My initial idea is discussed on a link: https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(idea_lab)/Archive_22#A_new_button_%22contest_edit_on_talk_page%22
Tosha Langue (talk) 09:48, 24 November 2022 (UTC)
- Why would that need to be stated? Why would anyone expect discussions to be unarchived? Just start a new discussion.--User:Khajidha (talk) (contributions) 19:02, 24 November 2022 (UTC)
- @Khajidha, I (perhaps) made a tough claim, and you replied with rhetorical questions. Sorry for that! I'll try to explain. I didn't mean that it should be done as I proposed (I assumed it just as an idea).Tosha Langue (talk) 09:07, 25 November 2022 (UTC)
- If you want to revive an archived discussion, you can just do that as well. — xaosflux Talk 01:16, 25 November 2022 (UTC)
- @Xaosflux: that's understood, but how? May I just copy and paste the whole discussion to a new one? To make a clone, so to speak... Or should I retell what happened (to make a recap), or provide a link to the archived discussion in passing (as I've already done)? What is acceptable, and what is better? Tosha Langue (talk) 09:22, 25 November 2022 (UTC)
- @Tosha Langue "it depends". In general, if a discussion was recently archived (it would be in the most-recent archive) because it was stale, then just cut-paste it from the archive back to the main and continue it. If it wasn't so recent, just start a new discussion and link to the old one. What you should not do is reply in the archive as you did above, the archive says
Please do not edit the contents of this page. If you wish to revive any of these discussions, either start a new thread or use the talk page associated with that topic.
If you want to talk about that idea from 2016, just start a new thread as the direction says. — xaosflux Talk 10:12, 25 November 2022 (UTC)
- Indeed. I've gone and reverted your edit to that archive. Graham87 07:59, 26 November 2022 (UTC)
- But I overleaped right to that section from a search result, and didn't notice the warning. Anyway, thanks, @Graham87 Tosha Langue (talk) 15:21, 26 November 2022 (UTC)
- Archives do not have section editing links to prevent such accidents. Unfortunately, the "reply" tool works on pages where section editing is disabled, making it easy to accidentally edit archives. —Kusma (talk) 19:18, 26 November 2022 (UTC)
- According to phab:T249293, people are working on a new magic word to prevent "reply" links on archive pages. —Kusma (talk) 20:46, 26 November 2022 (UTC)
- Marginally related to this, it would be awesome if there was some mechanism for links to discussion threads to continue to work when the thread gets archived. I could imagine something like every new section heading getting a UUID, and a "link to this" clickable thing next to the header. Then the various archiving tools could maintain some sort of forwarding map which could be consulted on the fly to get you to the right archive automatically with the same URL that got you to the pre-archived live thread. -- RoySmith (talk) 20:54, 26 November 2022 (UTC)
- Shamelessly plugging Wikipedia:Convenient Discussions which automatically searches archives for you ■ ∃ Madeline ⇔ ∃ Part of me ; 21:01, 26 November 2022 (UTC)
- @RoySmith, permalinks are planned, but I don't know how soon we'll get them. I believe that a database is being filled this month. The current link structure looks like this: Wikipedia:Village pump (idea lab)#c-RoySmith-20221126205400-Kusma-20221126204600 (for your comment). I'm not sure whether this will change. Whatamidoing (WMF) (talk) 02:02, 29 November 2022 (UTC)
- Cool, thanks! -- RoySmith (talk) 02:09, 29 November 2022 (UTC)
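To make the forwarding-map idea above concrete, here is a minimal sketch in Python. It is not how any existing archiving tool or the planned permalink feature works. `ForwardingMap` and `parse_comment_id` are hypothetical names, and the ID layout (author, 14-digit timestamp, parent author, 14-digit timestamp) is an assumption inferred only from the single example link quoted in the thread.

```python
import re
from dataclasses import dataclass, field

# Assumed layout, inferred from the one example quoted in the thread:
# c-<author>-<14-digit timestamp>-<parent author>-<14-digit timestamp>
COMMENT_ID = re.compile(
    r"^c-(?P<author>.+?)-(?P<posted>\d{14})"
    r"-(?P<parent_author>.+?)-(?P<parent_posted>\d{14})$"
)


def parse_comment_id(fragment: str) -> dict:
    """Split a comment ID into its parts, or raise ValueError if it does not fit."""
    match = COMMENT_ID.match(fragment)
    if match is None:
        raise ValueError(f"unrecognised comment ID: {fragment!r}")
    return match.groupdict()


@dataclass
class ForwardingMap:
    """Hypothetical lookup from a stable ID to the page that currently holds it."""

    locations: dict[str, str] = field(default_factory=dict)

    def register(self, stable_id: str, page: str) -> None:
        # Called when the thread or comment is first posted on the live page.
        self.locations[stable_id] = page

    def archive(self, stable_id: str, archive_page: str) -> None:
        # Called by the archiving tool when it moves the thread.
        if stable_id not in self.locations:
            raise KeyError(stable_id)
        self.locations[stable_id] = archive_page

    def resolve(self, stable_id: str) -> str:
        # The same ID always resolves, wherever the thread lives now.
        return f"{self.locations[stable_id]}#{stable_id}"
```

A link would then carry only the stable ID, and the archiver would call `archive()` whenever it moves a thread, so old links keep resolving without being rewritten.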
- @RoySmith, permalinks are planned, but I don't know how soon we'll get them. I believe that a database is being filled this month. The current link structure looks like this: Wikipedia:Village pump (idea lab)#c-RoySmith-20221126205400-Kusma-20221126204600 (for your comment). I'm not sure whether this will change. Whatamidoing (WMF) (talk) 02:02, 29 November 2022 (UTC)
- Shamelessly plugging Wikipedia:Convenient Discussions which automaticaly searches archives for you ■ ∃ Madeline ⇔ ∃ Part of me ; 21:01, 26 November 2022 (UTC)
- Marginally related to this, it would be awesome if there was some mechanism for links to discussion threads to continue to work when the thread gets archived. I could imagine something like every new section heading getting a UUID, and a "link to this" clickable thing next to the header. Then the various archiving tools could maintain some sort of forwarding map which could be consulted on the fly to get you to the right archive automatically with the same URL that got you to the pre-archived live thread. -- RoySmith (talk) 20:54, 26 November 2022 (UTC)
- According to phab:T249293, people are working on a new magic word to prevent "reply" links on archive pages. —Kusma (talk) 20:46, 26 November 2022 (UTC)
- Archives do not have section editing links to prevent such accidents. Unfortunately, the "reply" tool works on pages where section editing is disabled, making it easy to accidentally edit archives. —Kusma (talk) 19:18, 26 November 2022 (UTC)
- But I overleaped right to that section from a search result, and didn't notice the warning. Anyway, thanks, @Graham87 Tosha Langue (talk) 15:21, 26 November 2022 (UTC)
- Indeed. I've gone and reverted your edit to that archive. Graham87 07:59, 26 November 2022 (UTC)
- @Tosha Langue "it depends". In general, if a discussion was recently archived (it would be in the most-recent archive) because it was stale, then just cut-paste it from the archive back to the main and continue it. If it wasn't so recent, just start a new discussion and link to the old one. What you should not do is reply in the archive as you did above, the archive says
- @Xaosflux: that's understood, but how? May I just copy and paste the whole discussion into a new one, to make a clone, so to speak? Or should I summarize what happened (as a recap), or just link to the archived discussion in passing (as I've already done)? What is acceptable, and what is better? Tosha Langue (talk) 09:22, 25 November 2022 (UTC)
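RoySmith's suggestion above (a stable ID per section plus a forwarding map maintained by the archiving tools) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class `ThreadLinkMap`, its method names, and the "Example thread" heading are hypothetical, and this is not how MediaWiki or any existing archiving bot works.

```python
import uuid


class ThreadLinkMap:
    """Hypothetical forwarding map from stable thread IDs to current locations."""

    def __init__(self):
        # thread_id -> (page title, section heading)
        self._location = {}

    def register(self, page, heading):
        """Assign a new stable ID when a section is created on a live page."""
        thread_id = str(uuid.uuid4())
        self._location[thread_id] = (page, heading)
        return thread_id

    def archive(self, thread_id, archive_page):
        """Record the new page when an archiving tool moves the thread."""
        _, heading = self._location[thread_id]
        self._location[thread_id] = (archive_page, heading)

    def resolve(self, thread_id):
        """Return the 'Page#Heading' target the stable link should point to."""
        page, heading = self._location[thread_id]
        return f"{page}#{heading}"


# Usage: the same ID resolves to the live page first, then to the archive.
links = ThreadLinkMap()
tid = links.register("Wikipedia:Village pump (idea lab)", "Example thread")
links.archive(tid, "Wikipedia:Village pump (idea lab)/Archive 44")
print(links.resolve(tid))
```

The sketch keeps the heading text in the target for readability; a real implementation would also need to handle renamed headings and URL encoding, which are left out here.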
Idea: Enable the AbuseFilter blocking action
Background: Edit filter actions
The edit filter (or AbuseFilter) is a tool that allows editors in the edit filter manager group to set automated controls, primarily to address common patterns of harmful editing.
When an edit filter's pattern matches an edit, a number of actions can be configured to be triggered. On the English Wikipedia, we have enabled the following edit filter actions:
- Warning—The user is warned that their edit may not be appreciated, and is given the opportunity to submit it again.
- Disallowing—Actions matching the filter will be prevented.
- Revoking auto-promoted groups—Actions matching the filter will cause the user in question to be barred from receiving any auto-promotion (i.e. autoconfirmed / extended confirmed) for 5 days.
- Tagging—The edit or change can be 'tagged' with a particular tag, which will be shown on Recent Changes, contributions, logs, new pages, history, and everywhere else.
Initial idea
If progressed to a RfC, I would like to propose that the English Wikipedia enables the blocking action for our edit filters (as, for example, Meta Wiki has done). When enabled, and explicitly set as an action, any users matching the filter will be blocked for the time specified, with a descriptive block summary indicating the rule that was triggered. Some specifics about how this would work were recently discussed at the edit filter noticeboard, but rather than move to a RfC at WP:VPR, I thought it best to gain wider feedback and ideas from the community.
Currently, edit filters set to disallow must be tested in logging mode first. The enabling of the disallow action is then announced on the edit filter noticeboard for review. For a filter to have the blocking action enabled, I would propose that it would need to be rigorously tested; first with the logging action, and then with the disallowing action (with attendant announcement to WP:EFN), before the blocking action could be enabled.
I welcome comments, suggestions and feedback on the idea. Of particular interest are the concerns raised at that EFN discussion around how we would handle non-admin EFMs and false-positive blocks. — TheresNoTime (talk • they/them) 11:47, 29 November 2022 (UTC)
- Here is an example of what the blocks look like: meta-wiki AF blocks
- Note, the AF can not leave a "talk page" notice about the block. Generally the editor will get the Abuse Filter notice when they trip the filter; and then they would get the standard blocked interface if attempting to edit during the block.
- Not sure this is a great idea on enwiki. I've seen several instances of people complaining about incorrect blocks on wikis that use this feature (one memorable instance involved a misconfigured filter on enwikinews that would indef any non-autoconfirmed user who tried to edit a page containing the word "contact", which meant that people trying to report the false positive would also get blocked). Enwiki's sheer number of editors makes false positives more likely and I suspect that our ratio of genuine newbies vs. spambots and LTAs is much higher than sites like meta and wikinews. Enwiki also has a large number of active admins compared to most other wikis, which makes me wonder if this is really necessary. If this is ever implemented I think it should be limited to short-term blocks (a few hours at most) which are immediately reported to somewhere like AN or AIV for review. Spicy (talk) 16:04, 29 November 2022 (UTC)
- I'm not convinced this is necessary. An edit filter set to "disallow" should already stop a spambot attack, and blocking in such a case isn't urgent. On the other hand if an editor/bot makes some edits that are disallowed and some edits that are let through, their edits will likely need human review, and reporting to WP:AIV via DatBot provides that already.
- Of course, given the state of WP:RFA, we are likely to need this in ten years. But I hope we can avert that. —Kusma (talk) 17:09, 29 November 2022 (UTC)
- It was an earth-shattering experience to get auto-blocked at https://meta.wikimedia.beta.wmflabs.org/ just because I added a link to my enwiki userpage, and it was not even my home wiki, just a test account, that led to this unblock request to TNT on their meta talk page. If this proposal passes, there should be a very strong check on its usage, possibly mandating a minimum number of admin support votes before doing anything. —CX Zoom[he/him] (let's talk • {C•X}) 17:22, 29 November 2022 (UTC)
- Removing blatant vandalisms is the worst use of an admin's or patroller's time, and somehow we always choose to show more love towards vandals than towards these hard working people. An IP adding 100k of text, C&P-ing "your mom" 30 times, swearing, adding emojis (...) is likely to continue vandalizing the project, one way or another, flooding the filter along the way, and should be stopped... be it just for 2 hours - trust me, it helps! Full support, Ponor (talk) 18:24, 29 November 2022 (UTC)
- Removing blatant vandalism (which is very rare these days compared to pre-edit filter times) is quick and easy work. Winning back an editor who has been incorrectly blocked by an imperfect filter is hard (or even impossible) and takes a long time, so a blocking filter would need to be absolutely perfect in order to be worth it. Admins try to choose the best block parameters (what range to block, from which pages, soft or hard). If we need to review and adjust filter blocks, this doesn't actually help all that much. —Kusma (talk) 19:15, 29 November 2022 (UTC)
- Fair point, @Kusma. You know "your" vandals better than I, and I'm sure enwiki's many filters are already doing a great job. So blocking would be good (@TheresNoTime?) to prevent EF log flooding, and what else? Ponor (talk) 19:45, 29 November 2022 (UTC)
- This is going to be a hard no from me. I'm okay with the automated "disallow and report" options that allow AIV to be notified by a bot so an admin can block the account in question, but I still always, 100% of the time, with no exceptions, want an actual human to review the case and make the decision to block. It still requires human nuance to implement a block, IMHO. --Jayron32 19:20, 29 November 2022 (UTC)
- I'm with Jayron32 on this. Humans should do the blocking, after reviewing the situation. We have plenty of people here to address these kinds of issues; it's not like we're a small wiki with 40 editors and one or two admins. And we have so many edit filter creators and managers that there's a strong possibility of misuse (intentional or inadvertent). Risker (talk) 19:43, 29 November 2022 (UTC)
- There's an incorrect premise that's built into many of the comments here: If we just try really really hard, our filters will have precisely zero false positives. Sorry, Wikipedia is a complex place, and EFMs are only human. There will be false positives. With six million articles, and about 100 edits per minute, you just can't think of everything. Using Ponor's examples, the IP adding 100k of text will be the one reverting a page-blanking vandal. The IP "swearing" is quoting a song title. The one adding emojis is quoting a United States Senator. And it gets worse for the LTA filters. You can test the filter all you want, and then five minutes later, the LTA-who-says-"Ni" is now the LTA-who-until-recently-said-"Ni". You've got to modify the filter, and if you have to extensively test it again, you might as well not bother. If we're going to go forward with this, we need to acknowledge that a block is two things: (A) A technical measure preventing a user from editing, and (B) A social "stain" on that user's record. Maybe you don't judge people by the length of their block logs. Good for you. But Wikipedia, collectively, does do this. What I'd suggest is that AbuseFilter blocks do not appear in the user's block log, at least after they have expired. Whether that's accomplished through a software change, or a bot that revdels the log of User:Edit filter, I don't care. But consider me opposed otherwise. Suffusion of Yellow (talk) 19:47, 29 November 2022 (UTC)
- I realise that this is a place to be positive, but it's hard to think of an edit that is never constructive or at least tolerable. The furthest we should go is to make a list of edits for which an admin should consider a manual block, but I think the EF logs already provide that data. Certes (talk) 20:01, 29 November 2022 (UTC)
- Do we have any use cases in mind? That is, any specific filters that would be good candidates to eventually change to block? Perhaps a concrete example would help. –Novem Linguae (talk) 20:03, 29 November 2022 (UTC)
- @Novem Linguae I've written filters in the past which contained such absurdly specific or nonsensical strings of text that they ran for months and months without a single false positive, and the user responsible was always blocked. Some examples include "smoothest ashu" (Special:AbuseFilter/690) and "Brian Toussaint Thompson" (Special:AbuseFilter/674). Because those blocks had to wait for me as an administrator to notice that the filter was being tripped, the user in question had a fair chance to attempt to circumvent the filter. Blocking would reduce the effectiveness of that. That said, while I've advocated for exploring the use of the blocking function in the past, I understand the hesitance - it is easy to make mistakes with edit filters, and the testing and warning mechanisms aren't great at preventing them. While disallowing a few good edits might be an acceptable mistake, I agree that the risk of outright blocking good faith users is a scarier prospect. I'm personally open to an RfC on this but the policies around its use would need to be restrictive. Sam Walton (talk) 10:31, 1 December 2022 (UTC)
- I agree with Jayron32 and Risker. Human discretion is essential when blocking editors. Cullen328 (talk) 20:14, 29 November 2022 (UTC)
- Hearing y'all loud and clear on this — no need for a RfC. The main use case probably would have been very automated attacks from IPs/newly created accounts (the sorts of attacks where an IP will repeatedly attempt to do something very obviously disruptive) but thankfully we can (now) mitigate them well before they can even make an edit, so the usefulness of a blocking filter may be moot. As always, thank you all for the comments and intelligent suggestions — TheresNoTime (talk • they/them) 20:33, 29 November 2022 (UTC)
- Before doing this, I would like to see some proposed situations where this would apply. In the past I have blocked hundreds of spambot ips. But not every detection by filters was obviously a spambot. Some may have been due to some weirdness with web browsers. Filters were already very good at stopping the disruption being saved. Only occasionally did spambot edits get saved. Graeme Bartlett (talk) 10:14, 1 December 2022 (UTC)
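The staged enablement described in the initial idea (logging first, then disallow with an announcement at WP:EFN, and only then block) can be pictured as a small state machine. This is a minimal sketch under stated assumptions: `Stage`, `FilterRollout`, and the `announce` flag are hypothetical names invented for illustration, and nothing here reflects the AbuseFilter extension's actual configuration or any adopted enwiki policy.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical rollout stages, in the order proposed in the thread."""
    LOG = 1
    DISALLOW = 2
    BLOCK = 3


class FilterRollout:
    """Tracks how far a single filter has progressed through the stages."""

    def __init__(self):
        self.stage = Stage.LOG
        self.announced_at_efn = False

    def announce(self):
        """Record that the disallow action was announced for review."""
        self.announced_at_efn = True

    def promote(self):
        """Move to the next stage, refusing to skip the announcement step."""
        if self.stage is Stage.BLOCK:
            raise ValueError("already at the final stage")
        if self.stage is Stage.DISALLOW and not self.announced_at_efn:
            raise ValueError("announce the disallow stage before enabling block")
        self.stage = Stage(self.stage + 1)
        return self.stage


# Usage: a filter cannot reach BLOCK without passing through DISALLOW.
rollout = FilterRollout()
rollout.promote()   # LOG -> DISALLOW
rollout.announce()  # announcement for review
rollout.promote()   # DISALLOW -> BLOCK
```

The sketch only encodes ordering; it says nothing about how long each stage should run or how false positives would be reviewed, which were the open questions in the discussion.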
Podcast by Wikipedia
Wikipedia has been a very important part of my life, and it makes me sad that it's been struggling recently. I've been talking with some people at my college, and we have come up with a concept to make some money for the site using volunteers.
Wikipedia holds a great compilation of information on nearly everything you can think of. I think it could be worthwhile to look into creating a podcast that reads through some of the most popular pages. Podcasts get some great sponsorship opportunities and are cheap and easy to put together. I think you could put out an application asking for volunteers to record the podcast. Colleges are a great place to target for things like this as well: they have the right equipment, and something like this would look very good on a resume or higher education essay.
Not everyone has time to sit down and read a Wikipedia page, and I think it would be a great way for Wikipedia to continue evolving with the newer generations and a great and easy way to make money. There are many different platforms, and I've seen many podcasts move over to specific sites exclusively as a partnership.
The logistics could be hard to figure out at first, but I think this is a great thing to look into to help the financial stability of this site.
Thanks for your consideration. 97.115.228.152 (talk) 19:17, 1 December 2022 (UTC)
- Just a few things in response to your idea
- Wikipedia is not financially struggling. The Wikimedia Foundation (WMF), the organization that owns and runs Wikipedia among other properties, is actually doing quite well financially. As a side note, they still should ask for donations on the regular, as a continuing income stream from donations is the only way they can remain financially stable. Asking for handouts after you're already broke is a terrible way to budget. Being financially stable involves keeping a steady and reliable source of income, and pledge drives are how the WMF maintains that financial stability. So don't think that because the WMF asks for donations that means it is broke. Its only income stream is donations, so it asks for them regularly not because it is in financial trouble, but rather because it wants to avoid future financial trouble. That's what well-run organizations do from a financial point of view.
- A Wikipedia podcast sounds like a really great idea. If you want to read the direct contents of Wikipedia articles in such a podcast, that is feasible but requires some proper licensing of your podcast so it is compatible with Wikipedia's license. Wikipedia's so-called "copyleft" license allows reuse, and it isn't limited to print copying. See Wikipedia:Reusing Wikipedia content for more information.
- There is already a lot of media out there about Wikipedia and its culture. There is the Wikipedia Weekly podcast, which discusses issues Wikimedians may find interesting or relevant. There's also the Wikipediocracy website and probably lots of others I am missing, some run by or in conjunction with the WMF, and some entirely independent of it.
- I hope that helps! --Jayron32 20:11, 1 December 2022 (UTC)
- You should look at Wikipedia:WikiProject Spoken Wikipedia. Donald Albury 20:50, 1 December 2022 (UTC)
- I feel like this is, bluntly, a useless WikiProject. Articles change wildly over time, and audio clips are clunky to work with. Already, text-to-speech exists. The podcast OP suggests is definitely different than this, although I feel it would be too boring to listen to. Sungodtemple (talk • contribs) 02:37, 2 December 2022 (UTC)
- A fairly new podcast called m:WIKIMOVE was also launched earlier this year. –xenotalk 02:45, 2 December 2022 (UTC)
Ideas sought to arrive at a definition of "article creation at scale" (aka mass creation)
A recently closed RfC found consensus to create a definition of "article creation at scale" (sometimes called mass creation). We are still seeking an agreeable definition, and are inviting input here before taking next steps. I'm including below some of the input that has already been provided, however I'll collapse it in case anyone wants to comment unprimed. –xenotalk 23:48, 5 November 2022 (UTC)
Ideas from RfC (collapsed)
The following discussion has been closed. Please do not modify it.
Discussion (arriving at a definition of "article creation at scale")
- Pinging other participating editors not quoted. RfC participants (Vanamonde93—Espresso Addict—Red-tailed hawk—Paradise Chronicle—Devonian Wombat—Blue Square Thing—ONUnicorn—Scolaire—Dreamy Jazz—Peter Southwood). –xenotalk 00:00, 6 November 2022 (UTC)
- Second set of pings: More RfC participants (Rhododendrites—Aquillion—Lurking shadow—WhatamIdoing—Seraphimblade—LessHeard vanU—Ovinus—Nabla—Jontesta—TheCatalyst31) –xenotalk 00:04, 6 November 2022 (UTC)
- If someone wants to create a set of articles, they should be able to look at a given policy/guideline and determine the appropriate approach. They should not be expected to divine which of many interpretations [of a combination of almost-applicable policies] people will apply to them in the pursuit of "case by case". It's wild to me that so many people are arguing that more ambiguity is what's needed to avoid the dreaded wikilawyering. We also need to separate the definition of mass creation/article creation at scale from the processes, venues, and penalties associated with mass creation. We haven't even concluded whether (a) rate, (b) quality, (c) sourcing, and/or which combination thereof is what this definition should address. Maybe simply asking that is a good step, and then fleshing out whichever one or more apply? — Rhododendrites talk \\ 02:36, 6 November 2022 (UTC)
- Rhododendrites, if the answer to your question isn't primarily or exclusively a rate, then "mass creation" needs a new name. It does not make any sense to call the creation of one lousy stub a day "mass creation", and it makes a lot of sense to call a hundred FAC-quality articles uploaded within an hour "mass creation". WhatamIdoing (talk) 03:32, 6 November 2022 (UTC)
- I would just like to be sure that it is made clear that this applies to creation of new pages in article space. Mass creation in draft space should be of no concern, unless it is followed by mass movement of unimproved drafts to article space (the latter of which should be treated as problematic). I would also propose that this should not apply to creation of disambiguation pages (though I do not anticipate their creation in these numbers), as these pages do not require sources and are easy to check. BD2412 T 04:05, 6 November 2022 (UTC)
- If the definition rests in any part on sources then redirects need to be excluded too. Mass creation of these can sometimes cause issues, but those issues are not related to sourcing or depth of coverage in the way articles are. Thryduulf (talk) 12:44, 6 November 2022 (UTC)
- @Thryduulf: I would also agree with this. Articles need to be defined as articles for this purpose. BD2412 T 02:33, 8 November 2022 (UTC)
- I'm definitely repeating myself at this point, but: numerical thresholds are a terrible idea, IMO; they are gamed too easily. My view remains that mass creation occurs when a group of articles is created without the notability of each topic being separately evaluated, and instead the entire group being deemed notable. This isn't always an issue, but it's at the heart of our nasty AfD debates (athletes, villages, and roads are what come to mind). Vanamonde (Talk) 04:10, 6 November 2022 (UTC)
- I'm not sure I agree. The WP:3RR works pretty well, and it's usually pretty obvious when someone is gaming it over an extended period of time. With mass article creations it would be even more clear, since the ultimate purpose of "gaming" it would be to create a bunch of articles without review and there wouldn't be any way to do that without constantly pushing the limits to or near the max again and again over an extended period of time. I think the ideal solution is to say "X articles per day is a hard limit beyond which mass-article creation policies always apply without exception, but is not a guarantee and if you continuously approach this number again and again then that's going to be considered mass creation as well and you're expected to adhere to the relevant policies." You talk about how "mass creation occurs when a group of articles is created without the notability of each topic being separately evaluated, and instead the entire group being deemed notable", but the core problem is that people are "deeming" it so themselves or without the level of consensus required for such sweeping changes, then trying to force this through via WP:FAIT, which is completely inappropriate. Having a hard, indisputable "above this threshold you must follow these policies, without exception" coupled with "this is an upper limit, not a guarantee, and if you seem to be gaming this system you could face sanctions" would either force them to slow down due to hard sourcing requirements (or whatever system we decide on using this definition), or would provide a clear policy we could point to and a straightforward way to argue that they are gaming it so they can be sanctioned in order to force them (and anyone else who wants to do the same thing) to the table, as opposed to the current situation where they often rely on WP:FAIT and the fact that mass-article creations are very hard to reverse to try and force through their policy preferences without a clear consensus backing them. --Aquillion (talk) 04:55, 6 November 2022 (UTC)
- We do need some idea of the sort of order of magnitude we're talking about. Is this 10 articles over a year? I'd say no. Is it 10 articles over the entire time someone spends on wikipedia? I'd say absolutely not. Is it tens of articles a day or scores of articles a month? Ah, that's more like it, perhaps. Is it hundreds of articles a year? Yeah, probably. Just leaving it with the current "definition" of no one objected to a figure of 25ish is meh. Blue Square Thing (talk) 08:39, 6 November 2022 (UTC)
- @Vanamonde93, I think it would help me to understand which problem you're trying to solve.
- IMO the main problem that MASSCREATE is trying to solve is swamping the New Page Patrollers without warning. Can you agree with me that if we woke up tomorrow to discover that a million articles had been added to Wikipedia overnight that the sheer volume would be a problem, even if the individual articles were themselves 100% high-quality and on 100% notable subjects? WhatamIdoing (talk) 16:36, 6 November 2022 (UTC)
- Aquillon, 3RR works well, but it's a bright-line for edit-warring, which is the behavior being regulated. I don't have an issue with a similar bright-line for mass creation, but that can't be the definition, just as 3RR isn't the definition of edit-warring. The definition needs to be about repetitive bot-like creation of similar articles. @WhatamIdoing: I agree that NPP being swamped with a million notable articles at once would be a problem. However, I see no evidence that it's currently a problem, whereas this discussion began as the result of conflicts at AfD that are almost entirely about athletes, villages, and roads. Also: if you are checking each individual topic against WP:GNG, rather than deciding an entire group is notable, then you'd have to be superhuman to even produce 20 articles a day, let alone a million. Vanamonde (Talk) 16:48, 6 November 2022 (UTC)
- 20 stubs per day, checked against GNG, is not a superhuman task. You'd just need to work in a subject area that lends itself to the GNG. There are more than 20 red links in the List of ICD-9 codes and List of MeSH codes. Every WHO-recognized disease and approved treatment passes the GNG. Template:Reliable sources for medical articles will even link you straight to the sources that prove it.
- For non-GNG subjects, it's even easier. The question about whether a fish species is a notable subject is basically answered by saying "Does it have a Valid name (zoology)? If yes, then it passes WP:NSPECIES." If you know where to look, it takes maybe five seconds to determine this. WhatamIdoing (talk) 17:06, 6 November 2022 (UTC)
- NSPECIES is an essay, not an SNG. GEOLAND is a better example. BilledMammal (talk) 17:08, 6 November 2022 (UTC)
- And mass-creation under GEOLAND is currently a problem: we are in agreement there. But why is it a problem if someone is able to create 20 sourced articles on WHO-recognized diseases? Or the codes you list, which I'm unfamiliar with, but which do not provide scope for hundreds of articles? I think we're talking past each other a little bit. My point is that rate of creation is a symptom of the problem, not the problem itself. The (potential) problem is repetitive creation of a group of articles, and the problem is when that group is one for which no consensus exists on notability, or when the articles are consistently of a poor quality. Also pinging Aquillion, whom I mentioned above but failed to ping correctly. Vanamonde (Talk) 18:13, 6 November 2022 (UTC)
- IMO the point behind MASSCREATE is that, regardless of method (e.g., bot-like) or notability (e.g., obviously good, obviously bad, or not obvious) or article quality, editors who didn't create the article need a reasonable chance to check the article. We handle about 500–600 articles most days. Increasing that by 20 isn't really going to be noticeable. Increasing that by 200 will probably be a problem. Increasing that by 2,000 will definitely be a problem. The rate of creation itself is the problem for reviewers.
- It sounds like your concern has nothing to do with the "mass" aspect of mass creation. It's more like "I don't want people creating lousy articles, and therefore I especially don't want them creating a lot of lousy articles." That's not really what MASSCREATE's supposed to address. WhatamIdoing (talk) 23:50, 6 November 2022 (UTC)
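WhatamIdoing's figures above can be turned into a quick worked calculation. This is only an illustrative sketch: the function name `relative_increase` is hypothetical, and the baseline of 500 to 600 new articles per day and the increments of 20, 200, and 2,000 are taken directly from the comment, not from any measured data.

```python
def relative_increase(extra_per_day, baseline_per_day):
    """Extra articles expressed as a percentage of the usual daily volume."""
    return 100.0 * extra_per_day / baseline_per_day


# Compare each increment against both ends of the quoted baseline range.
for extra in (20, 200, 2000):
    low = relative_increase(extra, 600)
    high = relative_increase(extra, 500)
    print(f"+{extra} articles/day: about {low:.0f}% to {high:.0f}% more than usual")
```

On those numbers, 20 extra articles is roughly a 3% to 4% bump, 200 is roughly 33% to 40%, and 2,000 is more than three times the usual daily volume, which is consistent with the "not noticeable / probably a problem / definitely a problem" gradation in the comment.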
- I think a simple definition of "Multiple articles created based on boilerplate text" would cover most problematic instances of mass creation and cannot be gamed. I also don't think there would be an issue with multiple definitions, as mass creation can take different forms, and would suggest this allows an RfC with multiple questions, each asking "Does X constitute mass creation"; each question could involve multiple options if there are minor modifications to the same definition X (for example, a minor modification from my above proposal could be "At least ten articles created based on boilerplate text"). Every question that receives a consensus would then be added as a separate definition of mass creation. BilledMammal (talk) 05:20, 6 November 2022 (UTC)
- If we did the questions one or two at a time, possibly. But we know where too many questions end up.
- Now, "multiple". So, more than 2 then? That's so open to interpretation that it becomes useless. Blue Square Thing (talk) 08:39, 6 November 2022 (UTC)
- It can work, particularly when the questions are closely related like they would be here.
- I'm not certain where the threshold should be. I can see an argument for two (the boilerplate itself is the bright line that needs approval), but I can also see the argument for slightly broader latitude. Can you explain why it's "so open to interpretation that it becomes useless"? BilledMammal (talk) 09:24, 6 November 2022 (UTC)
- Multiple articles per lifetime? No, thank you. @Blue Square Thing is absolutely correct that "multiple" will be interpreted as meaning two, but even if you set it at a more reasonable number, then you need to be talking about a rate, not an absolute number. WhatamIdoing (talk) 16:32, 6 November 2022 (UTC)
- Editors aren't going to accidentally be reusing boilerplate text, which is why I'm not convinced we need to set a number. I also don't believe a rate is appropriate; 100 boilerplate articles created over a year should get consensus just as 100 boilerplate articles created over a week should. BilledMammal (talk) 16:49, 6 November 2022 (UTC)
- How about 100 articles per 20 years? How about 10 articles in the same month, and never repeated? Is 10 articles ever "mass" creation? Even if it's not "mass" creation, is creating 10 similar articles something you think is worth adding a bureaucratic pre-approval process for? WhatamIdoing (talk) 16:55, 6 November 2022 (UTC)
- Based on a boilerplate text - this goes beyond just similar - but yes.
- "is creating 10 similar articles something you think is worth adding a bureaucratic pre-approval process for" I believe that the pre-approval process will be similar in bureaucratic overhead to an AfD, which means I suspect we will reduce the overall bureaucratic overhead by implementing this. BilledMammal (talk) 17:03, 6 November 2022 (UTC)
- @BilledMammal, if someone wants to create 10 articles in a month, and those articles are expected to be similar – including if it's absolutely normal for those articles to be similar, because that's what happens if you follow Wikipedia:WikiProject Albums/Album article style advice – then you actually want people to get pre-approval for writing normal articles? WhatamIdoing (talk) 17:11, 6 November 2022 (UTC)
- I think you are misunderstanding my proposal. I'm not proposing that this applies to articles that are merely similar; I'm proposing it applies to articles that are based on boilerplate text. BilledMammal (talk) 17:18, 6 November 2022 (UTC)
- There's no real difference. The accepted start for a notable album appears to be this sentence:
Album is the nth {studio|live} album by the <nationality> <genre> {band|singer} <name>, released on <date>, by <label>.
followed by a track listing. Whether that is "merely similar" or "boilerplate" is in the eye of the beholder. WhatamIdoing (talk) 17:28, 6 November 2022 (UTC)
- That would be boilerplate text, proven by you being able to define the boilerplate text used to create the articles. However, I don't see it at WikiProject Albums style guide, and a review of the initial version of a dozen randomly selected albums doesn't appear to follow that text? But if you are right and it is the accepted start for a notable album and thus thousands or tens of thousands of articles are being created based on that boilerplate, then what is the issue with requiring consensus to be obtained for it, to ensure that the project as a whole approves of such actions rather than just WikiProject Albums? BilledMammal (talk) 17:38, 6 November 2022 (UTC)
- Historically, editors interested in a given topic area have worked out basic skeletons for new articles related to that area. We could mandate that such discussions should take place in a specific venue to facilitate someone watching for all of those discussions, and require that all previously established skeletons be reviewed. However it would create a central bottleneck and a long backlog of reviews which seems disproportionate to the benefits that would accrue. The problems with having many articles created rapidly have centred on a small number of editors, and not the vast many who have followed the same basic skeletons in different topic areas. isaacl (talk) 18:00, 6 November 2022 (UTC)
- Do you have some examples of these skeletons, and a rough estimate about how many exist? BilledMammal (talk) 18:03, 6 November 2022 (UTC)
- I know of examples for some sports. Nearly all articles, though, can be categorized with other similar articles. Editors will typically base the creation of a new article on existing ones of the same type. Mandating that creating a new article that mimics the skeleton of existing ones has to be centrally approved would affect virtually all of them. This would be possible to do, and would in essence be creating minimum stub standards for all topics. If we're going to pay that cost, though, personally I'd prefer to focus on the desired content to include, rather than a specific text layout. isaacl (talk) 18:19, 6 November 2022 (UTC)
- Can you link those examples? BilledMammal (talk) 18:37, 6 November 2022 (UTC)
- The football people certainly used to have them - very useful as well. There's also guides such as WP:UKCITIES. Blue Square Thing (talk) 19:32, 6 November 2022 (UTC)
- You're probably thinking of the guides like Wikipedia:WikiProject Football/Players. WhatamIdoing (talk) 21:28, 6 November 2022 (UTC)
- UKCITIES appears to be a style guide, not a boilerplate. The WikiProject football text could be a boilerplate, if editors are only using the introduction, but given the issues we have had with the creation of articles on football players I don't believe requiring editors wishing to create several or more sub-stubs based on that boilerplate to get consensus is a bad thing. BilledMammal (talk) 23:40, 6 November 2022 (UTC)
- re: open to interpretation - it doesn't provide any level of distinction between people creating a few articles about similar things (Danish cycle races, for example) and the creation of articles at a scale that becomes potentially problematic. It's just too open I'm afraid - 2 is probably multiple, 3 certainly is. Neither is problematic. Blue Square Thing (talk) 19:34, 6 November 2022 (UTC)
- This shouldn't impact editors if they are writing articles, rather than sub-stubs, even if they are on similar topics. BilledMammal (talk) 23:40, 6 November 2022 (UTC)
- BilledMammal, here are some examples:
- Moothedath Higher Secondary School is a high school in Taliparamba, Kerala, India. Founded in 1894...
- William L. Sayre High School (commonly referred to as Sayre High School) is a high school in Philadelphia. It was founded in 1949...
- John L. Forster Secondary School, often referred to as J.L. Forster or Forster, was a high school in the west end of Windsor, Ontario, Canada. Founded in 1922...
- Mamelodi High School, also called Mamelodi Secondary School, is a high school in Mamelodi township, Tshwane, South Africa. The school was founded in 1956.
- Grigore Moisil High School is a high school in Timișoara, Romania, founded in 1971.
- Spanish Fort High School is a high school in Spanish Fort, Alabama, United States that was founded in 2005.
- The West Technical College (Romanian: Colegiul Tehnic de Vest) is a high school in Timișoara. It was founded in 1946.
- Raoul Wallenberg Traditional High School is a high school in San Francisco, California, USA. It was founded in 1981.
- Los Gatos High School (LGHS) is a high school in Los Gatos, California. It was founded in 1908.
- Towson High School is a high school in Baltimore County, Maryland, United States, founded in 1873.
- I believe, but have not checked, that all of these articles were written by different people at different times. They all begin with the fill-in-the-blank pattern of "Name is a high school in <place>" followed by a statement about the year the school opened.
- I don't think that editors could look at these 10 articles and agree whether these are "merely similar" or an undesirable "boilerplate text". WhatamIdoing (talk) 20:58, 6 November 2022 (UTC)
- These are good examples of why we absolutely need more than just "uses boilerplate" as a definition. I like the idea below of also saying that the articles reuse the same few sources (generally one). That would clarify that an article doesn't count as mass created if it uses multiple/unique reliable sources. Maybe we even restrict it to "exclusively uses boilerplate and a single common source". Steven Walling • talk 21:39, 6 November 2022 (UTC)
- Using the "same few sources" is not a problem though. As you seem to suggest, there could be a problem if they use the same one source though. So I agree with your last sentence. But I think we still need a numerical threshold, since 2 or 3 articles that use the same single source are probably not a problem. Rlendog (talk) 22:08, 6 November 2022 (UTC)
- None of those are boilerplate articles, either now or when they were created.
- Old revision of Moothedath High School
- Old revision of William L. Sayre High School
- Old revision of J. L. Forster Secondary School
- Old revision of Mamelodi High School
- Old revision of Raoul Wallenberg Traditional High School
- Old revision of Los Gatos High School
- Old revision of Towson High School
- Old revision of Grigore Moisil High School (Timișoara)
- Old revision of Spanish Fort High School
- Old revision of West Technical College
- If you disagree, try to define a boilerplate that would allow you to create those articles - you won't be able to. BilledMammal (talk) 23:40, 6 November 2022 (UTC)
- Based on discussion, I believe "A single editor, creating several articles based on boilerplate text and referenced to the same group of sources" is better than my initial proposal. A slight alternative, that I believe could be discussed in the same section, would be "A single editor, creating dozens of articles based on boilerplate text and referenced to the same group of sources". BilledMammal (talk) 23:40, 6 November 2022 (UTC)
- At this point, I think we can safely conclude that what you think constitutes boilerplate text and what every single person who's responded to you so far thinks is a boilerplate text are not the same thing. WhatamIdoing (talk) 23:52, 6 November 2022 (UTC)
- Can you define a boilerplate text that would allow you to create those articles? BilledMammal (talk) 23:56, 6 November 2022 (UTC)
- Articles that are published with only boilerplate text are distinct from articles that just begin with or contain a boilerplate. The examples you provide all fit in the latter, while the hundreds of footballer microstubs that are wholly interchangeable if one removes the text from user entry fields are what BilledMammal is talking about. JoelleJay (talk) 22:07, 9 November 2022 (UTC)
- We already have an existing definition at WP:MASSCREATE which is effectively that it's the use of a bot or similar software to mechanically create batches of articles as a single task. The issue seems to be that some want to extend the definition to include manually created stubs. That's a different issue IMO. The problem with stubs is not their mass but their minimal nature. But it doesn't seem to be a big problem as the NPP queue seems to be under control and there are lots of existing ways of handling its entries. So, I'm not seeing any need for an expansion of existing guidelines and policies. If it works, don't fix it. Andrew🐉(talk) 11:17, 6 November 2022 (UTC)
- Speaking of WP:MASSCREATE, with the recent changes I'm starting to think it would benefit from being moved out of WP:BOTPOL to its own policy, as it's getting less and less relevant to only bots and automated editing. If nothing else, it not being in the "Bot policy" may reduce the chances of someone arguing that it somehow doesn't apply to their "manual" mass creation. Anomie⚔ 12:28, 6 November 2022 (UTC)
- @Andrew Davidson, MASSCREATE says "While no specific definition of "large-scale" was decided". Please explain why you believe "We already have an existing definition at WP:MASSCREATE" when MASSCREATE explicitly says there is no definition. WhatamIdoing (talk) 16:52, 6 November 2022 (UTC)
- WP:MASSCREATE exists and it's policy. Its explicit scope is "large scale" creation by bot and it indicates that the threshold for "large scale" is 25 to 50. That's more specific than WP:SIGCOV, say and, as our policies go, it seems quite clear. Andrew🐉(talk) 17:14, 6 November 2022 (UTC)
- @Andrew Davidson, where in MASSCREATE does it explicitly say anything like "the use of a bot or similar software to mechanically create batches of articles as a single task"?
- In your opinion, does MASSCREATE's suggestion refer to 25 to 50 articles ever, or 25 to 50 articles per some time period (your "single task" language)? WhatamIdoing (talk) 17:18, 6 November 2022 (UTC)
- MASSCREATE starts "Any large-scale content page creation task must be approved at Wikipedia:Bots/Requests for approval." And that seems fairly clear. Once approval is given then there doesn't seem to be a ceiling for the creations. The number and extent of the bot runs would presumably be part of the approval process. Have there been any or many cases where this approval has been sought since Rambot? By looking at these, we might see how the policy works in practise. Andrew🐉(talk) 17:36, 6 November 2022 (UTC)
- There were two discussions recently, Wikipedia:Bots/Noticeboard/Archive 17#Dams articles and Wikipedia:Village pump (proposals)/Archive 194#Mass creation of pages on fish species. Older examples are Wikipedia:Bots/Requests for approval/Rich Farmbrough (mass article creation), Wikipedia:Bots/Requests for approval/qbugbot, Wikipedia:Bots/Requests for approval/Qbugbot 2, and Wikipedia:Bots/Requests for approval/anybot. Others also exist. BilledMammal (talk) 17:53, 6 November 2022 (UTC)
- @Andrew Davidson,
- I'm still not seeing any reference to the use of "a bot or similar software", nor any reference to "mechanically creating batches of articles as a single task". Is that actually in MASSCREATE, such that non-software-assisted large-scale article creation is unrestricted?
- The question about number is about the floor, not the ceiling. What's the smallest number of articles that you'd call "large-scale"?
- WhatamIdoing (talk) 21:18, 6 November 2022 (UTC)
- WP:MASSCREATE is not a free-standing, separate policy. It's part of the WP:Bot policy which "covers the operation of all bots and automated scripts used to provide automation of Wikipedia edits". That section covers the use of such bots and scripts to create pages. See Context (language use). Andrew🐉(talk) 21:49, 6 November 2022 (UTC)
- What's the smallest number of articles that you'd call "large-scale"? WhatamIdoing (talk) 23:52, 6 November 2022 (UTC)
- Also, one of the problems we're having is that people are pointing at MASSCREATE to try to stop editors from creating articles using 100% manual methods, with no hint of a bot or script anywhere in the process. It sounds like, from your contextual reading, that you need bot approval if you want to use a script to create more than n articles, but if you do the work 100% by hand, then you can create them at a rate limited only by how fast you can type. Is that a fair summary of your interpretation? WhatamIdoing (talk) 23:54, 6 November 2022 (UTC)
- The main point of MASSCREATE is that creations by bot/script should be pre-approved. There's other parts of the bot policy that say things like "Note that high-speed semi-automated editing may effectively be considered bots in some cases (see WP:MEATBOT), even if performed by a human editor." So, botlike work that seems to be erroneous or inattentive may be shut down. But that's generally true for any manual work. For example, see Utterly horrendously written articles from an auto patrolled user. In this case, an editor has created over 1,000 articles which have been criticised as too garbled in some cases. There doesn't seem to be a particular policy issue beyond WP:CIR. People don't seem to need any additional policy to address such cases of manual incompetence. Andrew🐉(talk) 09:12, 7 November 2022 (UTC)
- What if it's neither "high-speed" nor "semi-automated"? I'm not asking to be picky. I'm asking because we just spilled some thousands of words about whether creating one or two articles per day for a year should be banned under MASSCREATE because that's more than the "25 to 50" listed in MASSCREATE.
- In your opinion, if someone creates just one or two stubs per day for a year, is that mass creation/large-scale creation/creation at scale? WhatamIdoing (talk) 22:39, 7 November 2022 (UTC)
- One or two stubs per day is not seen by MASSCREATE as a problem. MASSCREATE says "Alternatives to simply creating mass quantities of content pages include creating the pages in small batches..." and doing one or two per day would be such small batches. The idea seems to be that when the rate is low enough for individual human review then it's ok. One or two creations per day will obviously be getting individual attention from the primary author and is well within the capacity of our standard review processes like NPP. Andrew🐉(talk) 09:46, 8 November 2022 (UTC)
- I can only repeat my comment quoted in the ideas box above that trying to define something as arbitrary as "mass creation" is impractical. The definition will fail when an influx is needed and will be circumvented when it is not. The only solutions to the perceived problem are to (a) insist that at least one acceptably significant source is cited in each new article; and (b) consider each case of mass creation on its individual merits. For (b), is it a one-off and is it justified; or is there a pattern that amounts to disruption? From what I can see of the Lugnuts case, the basic objection was not that he suddenly produced a mass input to, for example, fill a new category. It was more a case of persistent creation of minimal stubs over a long period of time. The key factor in any question of "mass creation" is circumstance and you cannot impose a rigorous definition on a concept which has such wide variations. There is another side to this and, turning to a point raised earlier by another editor (this might be at the RFC), I think retrospective action would be morally wrong. It must be acknowledged that the proverbial goalposts have shifted and that when Lugnuts and others were churning out their stubs in years gone by, the creation of placeholders was not only acceptable but probably a necessity to get the encyclopaedia up and running. We can now insist on quality before quantity, which is great because it means progress has been made. For the older stubs that cannot be expanded, there is WP:ATD and redirection to a suitable list. If no suitable list exists, it's easy to create one with a few items (get the relevant project to help if necessary) and then expand it in due course. There really is no need for "mass create" to evolve into "mass delete". BJóv | talk UTC 14:06, 6 November 2022 (UTC)
- Comment - I realize that this will be slightly off topic, but from my perspective the issue goes beyond mass-creation and mass-deletion… the issue is doing anything “en masse”. Mass editing is disruptive no matter what you are doing. For example: while going through an article to conform it to our Manual of Style is considered commendable (and encouraged), we have sanctioned editors who do so at hundreds of articles at the same time. We start accusing the editor of “going on a Crusade” and “acting robotically”. Look at any individual edit, and there was nothing “wrong”, but we balk at edits done “en masse”. I suspect that the problem is that doing things (even “good” things) “en masse” overwhelms our community’s ability to mentally process the edits. We just can’t deal with so many edits at once. To relate this back to the topic at hand… perhaps what is needed is a broader WP:No mass edits guideline that makes it clear that: “Mass editing of any kind is considered disruptive” Blueboar (talk) 14:56, 6 November 2022 (UTC)
- Mass edits become a practical problem when watchlists get flooded. Mass article creation is a problem when the reviewers get swamped.
- Separately, I think we have a bias towards a certain style of editing that we have called "generalist" in the past. We want to see a certain amount of randomness, because that makes it look like you're a real human just like me. An editor who obsesses about a single thing (whether that's a subject area or a single typo) is not admired as much as someone who whimsically skips around editing articles they don't really know anything about. WhatamIdoing (talk) 17:15, 6 November 2022 (UTC)
- Making many similar edits that have community support is not inherently disruptive. There are a lot of editors making style changes in accordance with community consensus without anyone objecting. isaacl (talk) 17:50, 6 November 2022 (UTC)
- I fundamentally disagree that all mass editing is disruptive. This place is dodgy enough in places anyway. Remove all the people making minor fixes behind the scenes to things like MOS, and it'll be full of absolute garbage that's much harder to go and fix. It would also mean that even more of my selling errors would litter pages than already do. Blue Square Thing (talk) 20:06, 6 November 2022 (UTC)
- If I was asked to define "mass creation", I would say that the creation of articles is not "mass creation" unless the total number of articles created exceeds X large number. (This criterion would be a "lifetime" total and would not depend on rate or time period). At this point in time 5,000 articles is the absolute minimum number that I would be prepared to even consider for "X" (for manually created articles). I would prefer something on the order of 10,000 or more. (These seem like relatively large numbers now, but might be less applicable to someone who has been editing for 70 years on our 70th anniversary in 2071.) In my opinion the one-off creation of a single batch of 26 or 51 articles in 24 hours, or even in 24 minutes, is not mass creation, because 51 articles is actually a small number. If someone creates 24 or 49 articles every day, for months and years on end, that might well be mass creation. A definition of mass creation would not be acceptable to the community if it affected large numbers of editors, especially now that it is being said to require BRFA approval, and this kind of criterion is one way to avoid that, since it would confine the definition to less than one hundred article creators at the moment. [There were 29 editors with more than 10,000 articles in 2021. Lugnuts created more than 90,000 articles and Carlossuarez46 created more than 80,000 articles]. James500 (talk) 18:35, 6 November 2022 (UTC)
- 5,000 articles spread over 20 years is not much. That's less than one a day. But 5,000 articles in one year could be a challenge for the Wikipedia:New pages patrol work, and 5,000 in a month will be a problem. The definition needs to include a rate that takes into account the ability of the NPP to review the articles. WhatamIdoing (talk) 21:24, 6 November 2022 (UTC)
- A lifetime cap doesn't really get to the issue at hand, and is probably counterproductive. The issue is a lot of low quality, possibly non-notable articles dumped in a short period of time. I don't think a lifetime cap gets to that at all. — Preceding unsigned comment added by Rlendog (talk • contribs) 22:18, 6 November 2022 (UTC)
- I'd define mass creation as creating articles without giving individual attention to the articles; in practice, that looks like a combination of high-volume creation, similar article content, and similar sources (e.g. a single database or document). Of course, exact thresholds for that are hard to pin down, but that's where I'd start. TheCatalyst31 Reaction•Creation 20:33, 6 November 2022 (UTC)
- There needs to be some level below which article creation can't be "mass" creation, because there's so little volume involved. We really don't need editors accusing the m:100wikidays editors of mass creation. WhatamIdoing (talk) 21:21, 6 November 2022 (UTC)
- This trifecta of factors (large scale, boilerplate text, and repeated sources) is definitely better as a definition than any kind of numeric threshold. I'd maybe take it one stretch further and say that it would help to narrow the scope to articles created en masse from a single source, since that's the most problematic case by far. Steven Walling • talk 21:26, 6 November 2022 (UTC)
- The problem with not providing a numeric definition is that we already have editors claiming that one or two articles per day is "mass creation" (even when more than one source is provided, even when the subject is known to be notable). What Thryduulf says below about the basic desire here is to prevent people creating large numbers of articles that the person expressing the desire doesn't like resonates with me. Mass creation is being weaponized to stop things I don't like, rather than being about the "mass" creation of articles. WhatamIdoing (talk) 21:31, 6 November 2022 (UTC)
- I agree 100% with the goal of limiting bureaucracy. What about a definition that includes no sources or only a single source? It is legitimately risky for people to create a large number of articles based only on one source or without references at all, since we know that they can have errors. If we limit the scope to "a large number of stubs created using only boilerplate text and one (or no) source", then the problem can be addressed in ways other than deletion or merging, such as by adding more sources. There is definitely no consensus for making people ask for prior permission to create articles, so what that will produce is normal, healthy "should we delete this or can it be improved with better sources?" discussions at AFD. Steven Walling • talk 21:47, 6 November 2022 (UTC)
- @Steven Walling, what's "a large number"? If you do m:100wikidays for three years, you'll have created more than 1,000 articles. Is that "a large number"? WhatamIdoing (talk) 23:59, 6 November 2022 (UTC)
- I think it's clear from the RFC that no one likes trying to specifically define a numeric threshold. A single threshold only works in theory, not in practice. Lack of a single number that crosses the Rubicon from "normal" to "mass creation" prevents people from gaming it and also prevents people from going on a witchhunt looking for any author of more than N articles. Just saying "a large number" and then adding key attributes of the suggested minimum quality bar for mass creation is more effective. Steven Walling • talk 17:34, 7 November 2022 (UTC)
- @Steven Walling, I think it's clear from the RFC that a substantial number of editors, including me, think it's a good idea to define a numeric threshold. I'd settle for "If you come around crying 'mass creation' over one or two articles a day, we're going to ban you on the twin grounds of disruption and competence", but I think we really do need some numbers. We already have people claiming that one or two articles a day is a violation of "mass creation". WhatamIdoing (talk) 22:42, 7 November 2022 (UTC)
- As an adjustment to my previous proposal: "A single editor, creating many articles based on boilerplate text and referenced to the same group of sources"? I would agree that I don't think we need an explicit number of articles that need to be created; I don't see a benefit to having a bright line definition. BilledMammal (talk) 00:27, 8 November 2022 (UTC)
- If there is a "group of sources" then (assuming the sources are consistent with GNG guidelines) there should not be a problem. Boilerplate text might be a start, but if there are multiple reliable sources attached to the boilerplate text then I don't see an issue. The potential issue comes if there are a lot of articles using boilerplate text without appropriate sources, but then it comes down to how many. I strongly disagree that 10 in a month (or even a week) is any sort of problem (and even 10 in a day, if not repeated, I doubt is a real problem). Rlendog (talk) 00:41, 8 November 2022 (UTC)
- "Group of sources" means that they are all created from the same collective group of sources. Whether you use database A, database B, or database A and B, if they use the same boilerplate template they are all part of the same mass creation.
- I will add that the purpose of this definition isn't to define problematic mass creation, it is to define all mass creation, as the goal isn't to prevent such actions (the proposal to do that was overwhelmingly rejected), it is to give the community greater oversight of them. This means that appropriate mass creation, such as those that include sufficient sources to demonstrate compliance with GNG, should be included in the definition. BilledMammal (talk) 01:26, 8 November 2022 (UTC)
- I think you've identified one of the reasons we didn't agree on a definition there. There was too much "it's only mass creation if it doesn't have the right sort of sources" or "it's only mass creation if I can't figure out why anybody would care about this subject" and not enough "it's mass creation if there's a ton of it, and only after we've figured out whether there's a lot of it can we talk about whether it's a desirable or problematic mass creation".
- Along those lines, I think your proposal has too little attention on the key point (How many is "many"?) and too much on the desirable/problematic point (boilerplate text, wrong references). WhatamIdoing (talk) 01:45, 8 November 2022 (UTC)
- An editor making many articles isn't necessarily engaging in mass creation if the articles aren't related; the purpose of the focus on boilerplate text and reuse of sources is to establish that.
- Regarding the definition of many, I don't think we need a definition of that; WP:MEATBOT's definition is simply "high-speed or large-scale", and that ambiguity hasn't caused issues in the past, and I don't believe something similar will here. However, repeating that wording might be useful: "A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources". BilledMammal (talk) 01:55, 8 November 2022 (UTC)
- But you are saying "database A" and "database B" as sources. If the sources are clearly not in line with GNG then that may be a problem. But if the sources are in line with GNG then boilerplate text should not be a problem at all. The mere use of boilerplate text and the same group of reasonably good sources is not an issue. So maybe "A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources that are not reasonably consistent with GNG." Rlendog (talk) 14:56, 8 November 2022 (UTC)
- I would agree that an editor working through the Oxford Dictionary of National Biography to create boilerplate stubs on every individual currently lacking an article is not an issue, assuming the boilerplate is suitable, but they are still engaged in mass creation. What we are trying to do here is create a definition of mass creation; when we have this definition we can consider how to determine which mass creations are problematic and which are not. BilledMammal (talk) 15:09, 8 November 2022 (UTC)
- But mass creation wouldn't matter if they use the same sources. It would be mass creation if they used no sources. Once we have defined mass creation we can address when mass creation is problematic. I do think we need some sort of quantitative guidelines since "high speed or large scale" can be subjective. We can either define mass creation based on the high speed large scale creation based on boilerplate text, and then say that such mass creation is not problematic if the sources are acceptable, or we can define mass creation based on the high speed large scale creation based on boilerplate text with defined unacceptable sourcing, and then the mass creation is by definition problematic. Rlendog (talk) 17:59, 8 November 2022 (UTC)
- I included the line about "same group of sources" because I see finding and reviewing individual sources to not be a "mass" action, but I can see why editors can believe that the broader context means it still is mass creation and I have no objection to removing that line from the proposal: A single editor, creating articles at high-speed or large-scale, based on boilerplate text.
- I think it is better to leave "high speed and large scale" subjective, both because it will prevent gaming and because I don't see a benefit of a bright-line definition. In addition, I think including a bright-line definition would prevent this from finding a consensus. BilledMammal (talk) 18:46, 8 November 2022 (UTC)
- There are times when it is reasonably desirable to create a bunch of similar articles with basic content so that we have articles that will, naturally, be fleshed out over time. At the start many of these articles will look similar to the extent that they appear to be written based on a framework, and possibly they even were. The example that comes to mind for this is following an election to a body that confers notability to its members where many, sometimes most, were not notable previously (e.g. national parliaments). We don't want bureaucracy to get in the way of creating those articles. Thryduulf (talk) 21:22, 6 November 2022 (UTC)
- It seems to me that the basic desire here is to prevent people creating large numbers of articles that the person expressing the desire doesn't like. However, few people have been able to objectively define what separates articles of the type they don't like from ones that they do, and different people's definitions correlate poorly with each other. Thryduulf (talk) 21:22, 6 November 2022 (UTC)
- I don't think that is an accurate way to describe it. I would say that the basic desire is to give the community input into large scale creations, as the status quo can result in WP:FAIT issues. BilledMammal (talk) 04:09, 7 November 2022 (UTC)
- How does "give the community input into large scale creations" actually differ from "prevent people creating large numbers of articles that the person expressing the desire doesn't like", in practical terms? I agree that the one sounds a lot more friendly, but both of them mean "other people can vote to tell you that your contributions aren't wanted". WhatamIdoing (talk) 22:43, 7 November 2022 (UTC)
- The first means that the community can decide that your proposed contributions won't improve the encyclopedia; this is aligned with existing community processes such as AfD, with the only difference being that it discusses proposed contributions, in order to address WP:FAIT issues, rather than existing contributions. BilledMammal (talk) 02:50, 8 November 2022 (UTC)
- Both of these are talking about proposed contributions. Both of these are preventive. The comment from Thryduulf, which you disagreed with, actually contains the word prevent. It is illogical to talk about preventing "existing contributions". Thryduulf says that some folks want to prevent editors from creating articles that they (=editors who consider themselves to represent "the community") don't want – prevent, as in before the proposed contributions are made. You say you want "the community" to decide that some proposed contributions are unwanted, before those proposed contributions are made.
- There is no difference between these two statements. They both mean preventing editors from writing articles that you don't want. IMO the only difference is that you approve of preventing content contributions, and Thryduulf reports this as a view held by others (e.g., editors like you) rather than a view held by himself. WhatamIdoing (talk) 00:56, 9 November 2022 (UTC)
- I think there are two facets to the issue - quantity and quality. A definition of mass creation needs to get to the quantity, and either the definition itself or the guidelines around it would get to the quality. I would suggest several thresholds as to quantity to avoid gaming. My opening suggestion (but open to revisions) is 25 in a day, 75 in a week, 200 in a month. Not sure if we need to go onto a year. But even a lot of similar articles with multiple reliable sources are not a problem. So as to quality, I would suggest that once a group of articles on a similar topic trips the threshold we should insist on certain quality parameters, such as at least one or two sources that plausibly meet GNG, and possibly others (maybe length, although I am not sure that in itself is a problem if there are multiple appropriate sources). So the guideline could be "If a single editor creates more than 25 articles on a similar topic in a day, or 75 in a week, or 200 in a month, it is expected that each of the articles will have at least 1 (or 2) reliable sources that plausibly meet GNG." If not, the next step would still need to be figured out, but it could result in some sort of restriction on further creation and/or a bias towards deletion at AfD. Rlendog (talk) 22:18, 6 November 2022 (UTC)
- I find it highly paradoxical that basically the only meaningful result of the RFC was to recommend creating a definition, while every meaningful attempt to do anything that would use that definition was soundly defeated. So we're creating a definition of "article creation at scale" to do what, exactly? We're defining a term that, in the end, is pointless as most of the attempts to regulate "article creation at scale" have been soundly defeated. Of the 23 questions in the RFC, 3 were spun off into a new RFC, 11 failed either unanimously or "by a wide margin", 6 were unclear because they either received too few comments to judge consensus, or were phrased in such a way as to be confusing, 2 were too close to call, and this was the only one that passed clearly. Let's say we can come up with a definition. What are we going to use that definition to do if the community is so dead-set against regulating at-scale article creation? --Jayron32 13:39, 7 November 2022 (UTC)
- This is a fair point. Put simply, I'd say the problem to solve is that the current definition at WP:MASSCREATE ('While no specific definition of "large-scale" was decided, a suggestion of "anything more than 25 or 50" was not opposed.') is bad. The RFC showed clear consensus that a single numeric threshold was unhelpful and ineffective. The RFC rejected prohibiting or discouraging people from creating articles at scale, but there was more support for saying we should suggest a minimum quality bar for sourcing, etc. There's a related discussion kicking off about how to handle them at AFD which also will fall completely flat unless we have a shared working definition. Steven Walling • talk 17:42, 7 November 2022 (UTC)
- The problem with the old definition isn't that having a number is bad; the problem is that it's unclear what that particular number means. Editors currently disagree whether "anything more than 25 or 50" means "anything more than 25 or 50 in a short time period" or "anything more than 25 or 50 in your entire life". People holding the second view then make up their own extra restrictions (e.g., that only articles which are short, poorly sourced, of doubtful notability, similar to other articles, etc. 'count' towards the limit of 25 or 50). I assume that this is because they have some subconscious recognition that a plain reading of "25 or 50 per lifetime" means that anyone who qualifies for Wikipedia:Autopatrolled would have to "violate" MASSCREATE to reach that point. WhatamIdoing (talk) 22:53, 7 November 2022 (UTC)
- I don't think I agree with the paradox. Part of the reason many of the suggested "fixes" were opposed was because different editors were coming to the discussion with different ideas of what "mass creation" meant. I know that I opposed some proposals because they seemed to apply to creations that were not mass creations, and that I didn't think were appropriate outside that context. If we have a definition, then editors can participate in a more meaningful discussion of proposals to fix the issues generated by creations that meet that particular definition. If the definition is too broad most proposals will likely be defeated, based on the previous discussion. But if the definition is more narrowly focused I think we could get agreement on some of the proposals directed at the specific problem. Rlendog (talk) 00:15, 8 November 2022 (UTC)
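For readers who want to see how the multi-window quantity thresholds suggested above (25 in a day, 75 in a week, 200 in a month) would behave in practice, here is a minimal sketch. It is only an illustration of that one suggestion, not an agreed definition or an existing tool. The function name `exceeds_thresholds`, the treatment of a "month" as 30 days, and the use of rolling (rather than calendar) windows are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds from the suggestion in the thread: (window, limit).
# A "month" is assumed to be 30 days; windows are rolling, not calendar-based.
THRESHOLDS = [
    (timedelta(days=1), 25),
    (timedelta(days=7), 75),
    (timedelta(days=30), 200),
]


def exceeds_thresholds(timestamps, thresholds=THRESHOLDS):
    """Return True if more than `limit` creations fall inside any rolling window."""
    times = sorted(timestamps)
    for window, limit in thresholds:
        start = 0
        for end, t in enumerate(times):
            # Drop creations that are a full window (or more) older than t.
            while t - times[start] >= window:
                start += 1
            # Count of creations in the half-open window ending at t.
            if end - start + 1 > limit:
                return True
    return False


# Usage: one editor's creation times (made-up data for illustration).
base = datetime(2022, 11, 1)
one_off_batch = [base + timedelta(minutes=i) for i in range(26)]
print(exceeds_thresholds(one_off_batch))
```

Using several windows at once is what addresses the gaming concern raised in the thread: staying just under the daily figure every day still trips the weekly or monthly figure.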
The big picture
There's a good measure of article creations at Wikimedia stats. A snapshot to date is shown (right).
The number of creations peaked in 2007 with about 64,000 in July. The latest month of October 2022 was much lower with just 14,253. Note also the spike in October 2002 which was caused by Rambot.
What this seems to show is that mass creation is a declining issue rather than a pressing problem. The latest fuss was mainly about Lugnuts but he was quite exceptional and extreme. With his case resolved, I'm not seeing a need for an exact definition. There aren't lots of Lugnuts out there and the NPP queue of new articles seems to be under control.
If we want to provide some guidance to warn new editors then I suggest it be in the form of outcomes or case studies such as Lugnuts and Rambot which show when someone went too far and crossed a line. We know exactly what happened in those cases and so can give details. When we have a comprehensive list of such test cases, we might try to summarise them in an evidence-based way.
Andrew🐉(talk) 11:03, 8 November 2022 (UTC)
- This is my general feeling as well. - Enos733 (talk) 16:33, 8 November 2022 (UTC)
- This fact is what blows my mind about the whole debate. Article creation has slowed way down (in part due to things like WP:ACTRIAL) but for example, we still have tens of thousands of scientifically described species that lack any coverage in Wikipedia beyond a redlink. That's just one subject! We really need to be discussing ways to encourage article creation, not adding more rules that discourage or prevent it. Steven Walling • talk 17:24, 8 November 2022 (UTC)
- +1 ---Another Believer (Talk) 17:28, 8 November 2022 (UTC)
- -1. The encyclopedia doesn't need a separate page about every species in order to cover every species. Article count is a poor measure of topical coverage. Fewer stand-alone pages are easier to maintain than more stand-alone pages, so we should merge where we can. It's both inevitable and good that article creation has slowed down, and the current rate is still unsustainably high (as it always has been). Levivich (talk) 18:53, 8 November 2022 (UTC)
- The idea that it's easier to not have species articles is totally laughable. It's quite common that just one genus of plants, animals, or fungi can contain more than a thousand species with detailed scientific descriptions. To collapse species up to the genus or family level would require creating and editing a set of truly gigantic, unwieldy lists that would be super complex to edit if they actually contained even a summary of the verifiable information about each one. Not to mention the fact that we'd be making it significantly harder for readers to find encyclopedic information. Even if we got Google to index redirects as effectively as articles (which it doesn't), you'd have to scroll or search again within some huge list to find whatever information you were looking for. Steven Walling • talk 20:47, 8 November 2022 (UTC)
- This assumes that a general encyclopedia should summarize all the verifiable information on each species in the first place. Why should we exempt species from WP:INDISCRIMINATE but not astronomical objects or published mathematical lemmas or school council members? Why do we need a million individual articles on insects when most of them can only ever contain the same boilerplate infobox parameters? We explicitly do not want every verifiable detail or anything close to that on a subject, so if the only material with which we can expand a stub is a collection of uncontextualized and/or primary-sourced facts, a standalone is just not merited. JoelleJay (talk) 22:38, 9 November 2022 (UTC)
- Free access to the sum of all human knowledge, that's what we're doing. Steven Walling • talk 18:22, 10 November 2022 (UTC)
- And what qualifies as "the sum" is explicitly restricted by what Wikipedia is NOT. JoelleJay (talk) 20:17, 14 November 2022 (UTC)
- Jimbo's not a reliable source anyway. (And no, to collapse species up to the genus would not require creating a set of truly gigantic, unwieldy lists, that's just a straw man.) Levivich (talk) 21:28, 14 November 2022 (UTC)
- +1. I agree. We need more articles rather than less. There are so many notable topics which are not covered here. BeanieFan11 (talk) 20:13, 8 November 2022 (UTC)
- +1. One of my takeaways from looking at the detailed article creation stats that someone recently posted at one of the workshop threads is that mass creation is so rare that dealing with it on an editor-by-editor basis is preferable to trying to come up with a system of rules about it. Levivich (talk) 18:53, 8 November 2022 (UTC)
Sometimes if you can look at common sense / common practice and put it into words you get a good answer. If an editor spends a substantial amount of time writing text specific to the article and / or finding references for the specific article, and they do that individually for many articles, nobody is going to have a problem with that from a mass-creation standpoint. If an editor finds a way to make a large number of articles with very little time investment for each one, many will have a problem with that from a mass creation standpoint. So maybe, even though this will sound simple minded, it's "if you're creating a larger number of articles, be sure to make a substantial effort on each one to create text unique to that article". If a question arises, see if that practice has been followed. North8000 (talk) 19:09, 8 November 2022 (UTC)
"If an editor spends a substantial amount of time writing text specific to the article and / or finding references for the specific article, and they do that individually for many articles, nobody is going to have a problem with that from a mass-creation standpoint."
I wouldn't even consider that mass creation. I made a proposal above that I believe aligns to that with less subjectivity: A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources. An alternative, as an editor is not convinced by the sources aspect, is A single editor, creating articles at high-speed or large-scale, based on boilerplate text. BilledMammal (talk) 19:18, 8 November 2022 (UTC)
- To me the first bolded definition you have seems sufficiently clear in scope to be useful. I think it's worth having a followup RFC that compares support for that vs. a definition that also includes a minimum number for what large scale or high speed means, like WhatamIdoing and others seem to prefer. Steven Walling • talk 20:58, 8 November 2022 (UTC)
- So if the definition is approved in a first RfC, we hold a second RfC proposing to add quantitative values to "large-scale" and "high-speed"? I think that is a good idea. BilledMammal (talk) 21:49, 8 November 2022 (UTC)
- It would probably be simpler to just do a single runoff RFC where we propose one or the other. (First we need to discuss what the numeric thresholds might be.) Steven Walling • talk 23:44, 8 November 2022 (UTC)
- Are you suggesting we use instant-runoff voting, as discussed at the AfD at scale RfC, and ask:
Which proposed definition of mass creation should we implement?
Please rank your choices by listing, in order of preference from most preferred to least preferred. Preferences will be determined through IRV.
A: A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources.
B: A single editor, creating more than X articles per Y or Z articles overall, based on boilerplate text and referenced to the same group of sources.
C: Status quo
BilledMammal (talk) 00:33, 9 November 2022 (UTC)
- Yep, ranked choice voting would be fine I think. Either we do it that way, or we have to have a whole other fullblown RFC to solicit various definitions. Steven Walling • talk 05:00, 9 November 2022 (UTC)
- @BilledMammal: I would be very much in favour of your RfC proposal. If I might add a few comments: 1) should it say "creating stub-size articles"? 2) should it say "referenced to a single source or to the same small group of sources"? 3) would you envisage having it before the AfD RfC starts, or running it in parallel? 4) – me being pedantic – "high speed" and "large scale" should not have hyphens, unless they are adjectives, which they're not here. Scolaire (talk) 14:12, 9 November 2022 (UTC)
- Good clarifying questions. Including "stub" as a size qualifier makes sense to me. Other than that, we should say either one or add "only" to "the same group of sources", so that it's clear that just because one source happens to be reused, that's not a problem as long as there are other unique sources. Also you don't need to say large scale or high speed. If scale is the key attribute just say that. So end result is:
- A: A single editor creating a large number of stub articles based on boilerplate text and referenced only to the same sources.
- B: A single editor, creating more than [X] stub articles per [period of time], based on boilerplate text and referenced only to the same sources.
- How's that? Steven Walling • talk 16:38, 9 November 2022 (UTC)
- I think we need to stick with "the same group of sources", as small variations in what source is used (for example, an editor switching between using database A or database B or database A and B) shouldn't be enough to mean this is not mass creation, or that it is a different group of mass creation that would need to be discussed separately. BilledMammal (talk) 21:17, 9 November 2022 (UTC)
- I'm not certain it needs it; it adds subjectivity that will result in difficulties at the RfC, since we don't have a clear definition of what a stub is, and I doubt that anyone can create a boilerplate that would create an article beyond stub-size.
- The first option won't work, as mass creation can use multiple databases (for example, see some of Lugnuts' Olympic stubs, which use Olympedia and Olympics). The second might be better than the original, since "the same group" could be excessively large.
- I believe the plan is to run it before the AfD RfC starts, so that we have a definition of mass creation for the RfC.
- Good point, we should change that at WP:MEATBOT as well.
- BilledMammal (talk) 21:23, 9 November 2022 (UTC)
- They are used as adjectives at WP:MEATBOT, so they are correct there. Scolaire (talk) 12:29, 10 November 2022 (UTC)
- While I'm commenting, I'd like to address the oft-repeated criticism that setting X at, say, 50 articles will cause editors to "game the system" by creating 49 articles. Of course, by reductio ad absurdum, you can go on decreasing the value of X until it is two articles, and people will "game the system" by creating one. The question is not whether or how we can enforce a given number, but what sort of number patrollers or AfD can comfortably handle. If they can comfortably handle 50 articles in a given time period, who cares whether someone creates 49 or 51? Scolaire (talk) 14:45, 9 November 2022 (UTC)
- Let's set it to 10 a day then; that way, no matter what, the processes we have in place will work. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 18:02, 9 November 2022 (UTC)
- We need to choose a number that can be handled by AfD even if there are multiple editors engaged in such mass creation, and even if the mass creation isn't noticed for a few months. BilledMammal (talk) 21:17, 9 November 2022 (UTC)
- 10 a day, 20 a week, 40 a month. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 21:47, 9 November 2022 (UTC)
- @Scolaire, I don't think we need to define creation speeds in terms of what AFD can handle at one time. Consider:
- If I create a hundred articles today, I'm not being fair to NPP, but the AFDs could be spread out over the next month, or even the next year.
- If the articles are very similar, then you could nominate multiple related pages for deletion. The suggested process is to nominate one, see what happens, and then come back with more. Perhaps today you send the first to AFD (or maybe the first couple, in separate nominations), and next week you send a bundle of five or ten, and the week after, you send all the rest in a large bundle.
- Fairly often, the kinds of articles that folks are complaining about (e.g., substubs about early Olympic athletes) don't need to go to AFD anyway. The heaviest process needed is often Wikipedia:Proposed article mergers, to turn Rae Runner into a redirect to the List of Ruritanian Olympic athletes.
- Consequently, I suggest that article creation ought to be limited according to what the article creation-related review processes can handle, and not according to what might be needed if (and only if) AFD becomes relevant. WhatamIdoing (talk) 21:41, 10 November 2022 (UTC)
- WhatamIdoing I'm speaking as a layman here. My point is that we should pick a number that the system can comfortably handle, not any specific component of the system (I just used those two as examples). The responses above and below suggest that 40–50 articles a month is a reasonable figure. If we set it at that, then one or two creators "gaming the system" will not overwhelm the process. Scolaire (talk) 10:56, 11 November 2022 (UTC)
- Thinking about the system as a whole, I think we could safely set the limit-at-which-following-the-rules-isn't-disruptive at 100 articles a month without any part of the system being overwhelmed, so long as those 100 articles aren't all posted in less than a week.
- I fully agree with your reductio ad absurdum analysis of the "gaming" fear. WhatamIdoing (talk) 18:36, 11 November 2022 (UTC)
- Could we achieve this by splitting B, with B2 becoming "A single editor, creating more than 100 articles per week, based on ..."? That would give respondents a lower and a higher level to choose from. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 18:41, 11 November 2022 (UTC)
A single editor, creating articles at high-speed or large-scale, based on boilerplate text and referenced to the same group of sources
- As long as we say this doesn't go into effect until we've defined "high-speed" and "large-scale", that seems more or less ok. Might change "and" to "or" in the second sentence, though (or change "same" to "same or substantially similar"). You don't need the same group of sources to work with a boilerplate, after all. — Rhododendrites talk \\ 21:55, 9 November 2022 (UTC)
- The current proposal is to hold an RfC with three options: one without definitions for those terms, one with definitions for those terms, and one for the status quo. However, we still need a proposal on what definitions we should use for those terms.
Might change "and" to "or" in the second sentence, though (or change "same" to "same or substantially similar").
I agree with that second change, or something like it; the intent is to say that not all of the sources need to be used on every article for it to still be mass creation, but I think that might be being misinterpreted?
- I don't support changing it to "or", however, as I don't think merely using the same sources is enough for it to be mass creation, but if other editors disagree I won't object. BilledMammal (talk) 07:52, 10 November 2022 (UTC)
- I think BilledMammal is right. The "or" option will lead to someone claiming that if you use any overlapping source, or any source that's the same basic type, that it's culpable mass creation. This is just the nature of the systems we've set up. We've gone so far down the path of rules-lawyering that people feel like they have to claim that all conditions are fulfilled, to accomplish the end that they want.
- We used to say that in computing the three scariest things were a programmer with a soldering iron, a hardware tech with a compiler, and a user with an idea. In the current era, I'd change it to "users who can't do their jobs unless they break the security policies". If you tell a sales team that they can either follow the corporate policy about not uploading proprietary content to third-party websites, or they can get the deal-clinching, paycheck-producing document into the customers' hands, they're going to break the policy without a second thought. They will click any button, visit any site, and agree to any terms, so long as it accomplishes the goals. (See also efforts to stop people from uploading copyvios to Commons. Sure, Commons, I definitely took that picture of Queen Victoria myself, because if I don't claim that, the software* won't let me upload it.) [*depending on which software you're using]
- We have a similar thing here: We tell people they can only get articles deleted under certain circumstances. If they deeply believe that the article should be deleted, then they will do and say whatever is necessary to reach their goal. If submitting AFD required you to tick a box claiming that this nomination was endorsed by a Nobel Prize winner, people would tick that box.
- I don't think there is an easy solution to this, but so long as this problem exists, the rules should be written to defend against overblown claims. WhatamIdoing (talk) 22:00, 10 November 2022 (UTC)
- The picture that I am getting here is that the basic problem that actually needs solving is that some editors are putting a disproportionate burden on NPP by creating articles at a rate that NPP cannot comfortably handle. If every registered editor suddenly created a new stub on one specific day it would crash the system, but the probability is vanishingly low so we accept the risk. When we have a few editors who persistently create sufficient articles of dubious quality at a rate which taxes the capacity of NPP we do not want to accept the risk because it is known to happen often enough to be a problem.
There are two aspects to this recognised problem. Rate of creation, and quality of article after creation is finished. (articles are not necessarily created in one edit, and should not be reviewed until the initial creation is complete - In use and Under construction tags should keep them off the queue).
The total number of articles created over an editor's Wikipedia career is not relevant to this specific problem, but may be a separate problem where incompetence is involved (there are examples of this happening).
A limit to rate of creation may be the easiest way to deal with this. I think a running average limit, with a superposed peak daily rate, may be a suitable constraint. Ideally it would be automated to choke the flow when it gets too high, and ideally it would be a function of NPP backlog size at the time, but those are technical issues and we do not have the capacity at present (as far as I know), so a dumbed down system that is simple enough for the average editor to follow would be needed as a starting point.
Daily, weekly and monthly caps should be manageable (something like 10 per day, 30 per week, 50 per month, subject to change if they are found to be unmanageable. This could be linked to backlog status, with the understanding that there would be a hard bottom limit of 1 per day, whatever the backlog gets to, so that no-one ever gets shut off completely.)
These limits would only apply to articles that go through NPP.
If anyone wants to exceed the limits for a period, they apply for permission, and special conditions will apply (on a case by case basis, for the specified batch). If they keep within the limits, no special permission is required. If they inadvertently exceed the limits, they get notified by whoever notices, and are required to slow down. If they fail to respond (by slowing down), they get a 24 hour block, which will slow them down. This block would be preventative and effective as a brake, would attract their attention, and would be applied as often as necessary.
With an article creation rate limit system like this there is no need to define mass creation or creation at scale, which is basically a red herring. Cheers, · · · Peter Southwood (talk): 01:57, 10 November 2022 (UTC)
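The daily, weekly and monthly caps described in the comment above can be sketched as a sliding-window check. This is only an illustration, assuming the example figures floated there (10 per day, 30 per week, 50 per month). The function names, the window lengths of 1, 7 and 30 days, and the use of plain creation dates as input are all hypothetical, and the backlog-linked adjustment is left out.

```python
from datetime import date, timedelta

# Hypothetical caps, using the example figures from the comment above:
# window length in days -> maximum number of new articles in that window.
CAPS = {1: 10, 7: 30, 30: 50}


def count_in_window(creation_dates, today, days):
    """Count articles created in the `days`-day window ending on `today`."""
    start = today - timedelta(days=days - 1)
    return sum(1 for d in creation_dates if start <= d <= today)


def exceeded_caps(creation_dates, today, caps=CAPS):
    """Return the window lengths (in days) whose cap has been exceeded."""
    return [
        days
        for days, cap in sorted(caps.items())
        if count_in_window(creation_dates, today, days) > cap
    ]


if __name__ == "__main__":
    today = date(2022, 11, 10)
    # Eleven articles in one day breaks only the daily cap.
    print(exceeded_caps([today] * 11, today))
```

An empty result would mean the editor is within all three caps; a non-empty one is the point at which, under the proposal, someone would ask them to slow down.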
What I'm getting from the above, then is:
Which proposed definition of mass creation should we implement?
Please rank your choices by listing, in order of preference from most preferred to least preferred. Preferences will be determined through IRV.
A: A single editor, creating articles at high speed or large scale, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
B: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
C: Status quo
Is that about right? Scolaire (talk) 12:20, 10 November 2022 (UTC)
By the way, what is the status quo? That we continue to discuss mass creation without defining it? Scolaire (talk) 16:14, 10 November 2022 (UTC)
- The status quo would be continued discussion on coming up with a definition. Question 3B of the RFC passed by a wide margin, so this discussion can only come up with a definition, not decide not to define one. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 17:16, 10 November 2022 (UTC)
- Thanks, AD. Scolaire (talk) 20:12, 10 November 2022 (UTC)
- Well, you can't squeeze water out of a stone—if after significant attempts have been made there is no agreement, that's in essence a new consensus reversing the previous one. But what really matters are any new procedures that are agreed upon by consensus. Those will have to establish their own specific criteria on when they apply. isaacl (talk) 21:12, 10 November 2022 (UTC)
- We should make sure to ping all those who voted on Question 3B once the RFC wording has been decided. That way no one can later claim any shenanigans. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 21:36, 10 November 2022 (UTC)
- I still think A is a little too wordy, but yeah Scolaire I think those options are roughly worth having the followup definition RFC about. Steven Walling • talk 18:29, 10 November 2022 (UTC)
- I think we should word it as "and referenced to a subset of a small group of sources", but otherwise yes. BilledMammal (talk) 21:09, 10 November 2022 (UTC)
- @Scolaire, could option A be a little clearer about its purely subjective nature, by saying "creating articles at a rate that you subjectively believe should be called high speed or large scale"? WhatamIdoing (talk) 22:02, 10 November 2022 (UTC)
- I would disagree with both of those. @WhatamIdoing: "high speed or large scale" is objectively subjective (if you'll pardon the expression); anyone !voting for it will know that they are defining it as something that they "subjectively believe" is high speed or large scale. There is no need to spell it out.
- @BilledMammal: "a subset of a small group of sources" is confusing; a small group of sources is about as "sub" as you can get; what is meant by a "subset" of those, and why do we need to specify it? On reflection, I disagree with "or substantially similar" for the same reason. If an article creator uses the same two sources for half his new articles, and two "substantially similar" sources for the other half, that is still the same small number (four) of sources for the whole lot.
- I missed Steven Walling's earlier suggestion that the word "only" be added before "the same group of sources". That is kind of necessary, I think. I therefore suggest: "based on boilerplate text and referenced only to the same small group of sources." Scolaire (talk) 12:03, 11 November 2022 (UTC)
- I also see that Steven Walling suggested changing A to "creating a large number of articles based on boilerplate text..." I'd be inclined to agree: (a) it's plain English, and (b) "high speed" isn't really necessary, since anyone creating a large number of articles based on boilerplate text isn't going to do it over a period of years. Scolaire (talk) 12:39, 11 November 2022 (UTC)
- @Scolaire, please reconcile these things:
- You: "anyone creating a large number of articles based on boilerplate text isn't going to do it over a period of years"
- An editor creating more than 500 short, similar articles, individually, manually, after checking the sources (the editor reports having found errors in FishBase), spread out over the course of the last year. This is an average of about one and a half articles per day.
- Note, too, Wikipedia:Village pump (proposals)/Archive 194#Mass creation of pages on fish species, where multiple editors have claimed (and others have disagreed) that creating these articles is a violation of MASSCREATE. The editor started this discussion because @BilledMammal complained on the editor's talk page that creating one or two articles per day is a violation of MASSCREATE.
- Based on these facts, I have to assume that stopping this "high-speed or large-scale" article creation of just one or two articles per day, involving "boilerplate" (there are only so many ways to say that "<Species> is a <common name> in <genus>", so of course they're going to look like "boilerplate" articles), is exactly what BilledMammal wants to accomplish. Is this what you want to accomplish? WhatamIdoing (talk) 16:14, 11 November 2022 (UTC)
- Isn't this discussion about the RFC wording? Once the RFC is underway editors can express their opinions on the options. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 16:36, 11 November 2022 (UTC)
- Yes, and my goal here is to make it clear to RFC participants that if "A" is chosen, there will be disputes about whether one or two articles per day counts as "high-speed or large-scale", because (a) there is no agreed-upon definition for these terms anywhere, so each editor gets to make up their own numbers, and (b) we have already had disputes about what counts as "high-speed or large-scale", so we can't even pretend to be surprised when (not "if") it happens again in the future. WhatamIdoing (talk) 17:00, 11 November 2022 (UTC)
- That's why the second option that includes some kind of definition is a suggestion. Separately I'd argue that the fish articles fall under the first option due to "or large scale". The "or" would imply one or the other, so even if they were created at one or two a day (not high speed) there are a lot of them (large scale).
- So the first option would include them, while the second option would only do so if they were created at a speed reaching the mentioned limits. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 17:14, 11 November 2022 (UTC)
- And Scolaire incorrectly claimed above that nobody would create "a large number of articles [...] over a period of years", which is how we ended up with this sub-thread. Some editors have created a large number of articles over very long periods of time, and we also know that certain other editors, specifically including some editors active in these discussions, believe that creating 500 articles per year (=42 articles per month, 1.4 articles per day) should be banned as "large-scale".
- As for your suggestion above about proposing a limit of more than 100 articles per week, I think it's a good idea. There will certainly be editors who oppose it; for example, BilledMammal has previously said that "100 boilerplate articles created over a year should get consensus". I suspect that this is due to a personal dislike of "boilerplate articles" rather than any practical concerns about whether reviewers can handle two articles a week. (Boilerplate articles are generally easier to process than most, because you know exactly what to expect from them and their main sources.) But from the POV of ranked-choice voting, I think that providing people with a range helps them identify their real preferences. You want to have options at both ends of the spectrum that most people are willing to vote against.
- Another thing that would help people figure out their preferences is providing some background information, like the number of articles NPP handles in the same time period and the number of editors last year who might have been constrained by such a rule, and whether the rule would have moderated any notorious editors in the past. WhatamIdoing (talk) 20:19, 11 November 2022 (UTC)
- Maybe the first option should change from "high speed or large scale" to "high speed or large scale regardless of speed".
Also C should change to "Other" to give respondents a chance to give additional input, with D being "Do not create a specific definition"? Pinging all past participants from the original RFC as previously stated. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 17:20, 11 November 2022 (UTC)
- I like your suggested change to "A".
- "Other" does not work well in instant-runoff voting schemes. You can end up with lots of people voting for "Other" but none of them voting for the same thing.
- Your suggestion of "Do not create a specific definition" would be more accurately re-phrased as "Overturn the results of the WP:ACAS RFC, whose clearest result was that the community should create a definition". WhatamIdoing (talk) 20:24, 11 November 2022 (UTC)
- See my tweaked text below. I changed "Overturn the results of the WP:ACAS RFC", as it's overturning Question 3B, not the whole RFC. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 20:49, 11 November 2022 (UTC)
Arbitrary break (article creation at scale)
- New suggestion for RFC text.
Question 3B of the WP:ACAS RFC, "Should we create a definition of 'article creation at scale'?", passed, but no specific definition suggested in WP:ACAS managed to pass.
-- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 20:46, 11 November 2022 (UTC)
Which proposed definition of mass creation should we implement?
Please rank your choices by listing, in order of preference from most preferred to least preferred. Preferences will be determined through IRV
A: A single editor, creating articles at high speed or large scale regardless of speed, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
B1: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
B2: A single editor, creating more than 100 articles per week, based on boilerplate text and referenced to the same, or substantially similar, small group of sources.
C: Overturn the results of the WP:ACAS RFC Question 3B, and leave mass creation undefined.
- A is still a little bit of a word salad. I would really suggest keeping it simple and saying "A single editor rapidly creating a large number of articles based on boilerplate text and referenced mostly to the same few sources." How's that? Steven Walling • talk 04:58, 12 November 2022 (UTC)
- @Steven Walling, at least some of the folks here are trying to stop both The Tortoise and the Hare. That is, they want to prevent a single editor from slowly but persistently creating a large number of articles. WhatamIdoing (talk) 07:12, 13 November 2022 (UTC)
"A large number of boilerplate articles", not "a large number of articles". No one is trying to prevent editors from creating large numbers of non-boilerplate articles.
- The reason we are focused on scale, not speed, is because speed is irrelevant to AfD's ability to handle mass creation. AfD cannot handle an editor creating 1000 boilerplate articles, regardless of whether it takes them one day or one year to create those articles. BilledMammal (talk) 07:31, 13 November 2022 (UTC)
- I'm convinced that AFD can handle 1000 boilerplate articles much more easily than it can handle 1000 completely different articles. But the point is that when someone sets out to create 1000 articles (boilerplate or otherwise), if they're doing it slowly, you can stop them before there are 1000 articles available for AFD to handle. WhatamIdoing (talk) 22:02, 14 November 2022 (UTC)
- Almost all of the boilerplate articles need to be taken through at least part of the AfD process (at a minimum, they need to be taken through WP:BEFORE), which means that the 1000 articles are harder to handle than the 1000 completely different articles, almost all of which need to be taken through no part of the process.
But the point is that when someone sets out to create 1000 articles (boilerplate or otherwise), if they're doing it slowly, you can stop them before there are 1000 articles available for AFD to handle.
See Lugnuts for evidence of why that isn't the case. BilledMammal (talk) 23:30, 15 November 2022 (UTC)
- Instant runoff works well for scenarios where one of the provided options must be selected. In this case, since these aren't the only possible options, I think it would be best to establish which options have consensus support, and then using the instant runoff procedure amongst those choices. (In essence, this combines approval voting with ranked voting.) isaacl (talk) 23:22, 11 November 2022 (UTC)
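For readers unfamiliar with IRV, the counting procedure referred to in this thread can be sketched roughly as follows. This is a minimal illustration, not a description of how closers would actually assess an RfC. The function name, the ballot format (a list of option labels, most preferred first) and the alphabetical tie-break are all assumptions, and the approval-style first stage suggested in the comment above is not modelled.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return the winning option under instant-runoff voting.

    Each ballot is a list of option labels, most preferred first.
    Returns None if no ballot ranks any option.
    """
    remaining = {opt for ballot in ballots for opt in ballot}
    while remaining:
        # Count each ballot for its highest-ranked option still in the race.
        tally = Counter()
        for ballot in ballots:
            for opt in ballot:
                if opt in remaining:
                    tally[opt] += 1
                    break
        total = sum(tally.values())
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > total:
            return leader  # strict majority of the ballots still counting
        # Eliminate the option with the fewest votes this round.
        # Assumption: ties are broken alphabetically, purely for illustration.
        loser = min(remaining, key=lambda o: (tally[o], o))
        remaining.discard(loser)
    return None


if __name__ == "__main__":
    ballots = [["A", "B"], ["A", "B"], ["B", "A"], ["C", "B"], ["C", "B"]]
    print(instant_runoff(ballots))
```

In the sample ballots no option has a majority in the first round, so the option with the fewest first preferences is eliminated and its ballot transfers to the next choice listed.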
We seem to be going back and forth a bit here. None of my most recent comments were taken into account in this most recent draft. To summarise them: (1) Steven Walling's "A single editor rapidly creating a large number of articles" says the same thing as completely as, and more simply than, "A single editor, creating articles at high speed or large scale regardless of speed"; (2) "or substantially similar" is also unnecessarily complicated, since they don't all need to be the identical sources to be a small group, and a large group of non-identical sources would not fit the definition we're looking for; and (3) the word "only" should be inserted before "the same sources".
Re the latest proposed wording: "overturn 3B" would mean we don't want to define mass creation at all; surely a better alternative to "status quo" would be "none of the above"? I also agree with isaacl that IRV is not necessarily the best choice here; we would be better to leave that out of the question, and let the moderators/closers choose how to decide. One further thing: I would suggest that "which...should we implement" should be changed to "which...should we adopt". My alternative proposal, then, is:
Which proposed definition of mass creation should we adopt?
Please rank your choices by listing, in order of preference from most preferred to least preferred:
A: A single editor, rapidly creating a large number of articles, based on boilerplate text and referenced only to the same small group of sources.
B: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced only to the same small group of sources.
C: A single editor, creating more than 100 articles per week, based on boilerplate text and referenced only to the same small group of sources.
D: None of the above
Scolaire (talk) 14:51, 12 November 2022 (UTC)
- I'd still rather D was explicit in that it would overturn 3B, but it's not a sticking point for me. I'm happy with the text either way. BilledMammal, WhatamIdoing any thoughts? -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 18:17, 12 November 2022 (UTC)
- Another way of saying D could be "Leave WP:MASSCREATE as currently defined." Anyway, I'm supportive of Scolaire's suggested version too. Steven Walling • talk 19:58, 12 November 2022 (UTC)
- We need to keep options for defining it in terms of scale, rather than just rate, as myself and many other editors prefer such a definition. I suggest that editors who prefer defining it based on rate create no more than two definitions that they can agree on, and editors who prefer defining it in terms of scale should do the same.
- For scale, I suggest:
- A: A single editor creating a large number of articles based on boilerplate text
- B: A single editor creating more than 100 articles based on boilerplate text
- We need to discuss whether we want to add something similar to "using the same group of sources", as some editors appear to believe it is unneeded and adds unnecessary complexity. BilledMammal (talk) 02:22, 13 November 2022 (UTC)
- @BilledMammal: I'm not aware of any editors saying that "using the same group of sources" is unneeded. As far as I can see, using boilerplate text and using only the same small group of sources are both considered necessary for whatever proposals are put forward. On the other hand, to start talking about definition in terms of numbers alone seems to me to be going off on a tangent, just as we were close to agreement on a wording. It is a completely new idea to me, and I haven't seen "many other editors" showing a preference for it. 100 articles in a week is one thing, but 100 articles in a lifetime? How would we even count them? Obviously you are free to propose them, but as D and E, not as A and B. Too many people have said that they agree with the current three proposed definitions. Scolaire (talk) 14:44, 13 November 2022 (UTC)
- See the discussions above. Personally, I support their inclusion and have no objection to them being restored.
- I don't think we were close to an agreement on wording as your most recent wording varies significantly from previous wordings, such as the wording of A that you posted at 12:20, 10 November 2022, or the wording of A that ActivelyDisinterested posted at 20:46, 11 November 2022.
- The option of B (alternatively A2, to match the format proposed by ActivelyDisinterested) is because we are now including options for the rate of creation that include an explicit figure; to keep the RFC neutral I believe we need to provide an equivalent option for scale of creation. BilledMammal (talk) 14:48, 13 November 2022 (UTC)
- @Scolaire, these two options are materially different:
- "A single editor rapidly creating a large number of articles"
- "A single editor, creating articles at high speed or large scale regardless of speed"
- The first covers Alice making 500 articles in a week. It does not cover Bob making 500 articles over the course of ten years.
- The second covers Alice making 500 articles in a week plus Bob making 500 articles over the course of ten years.
- A few editors (e.g., see BilledMammal's comments) actually want to ban you from creating one article per week over the course of a decade, unless you get written permission to make so many similar articles. You can vote against them, but please don't claim these are the same thing. We really don't need to end up with the written rules saying "high speed" but editors claiming that you're just supposed to know that very slow rates of article creation count as being rapid if you don't quit soon enough. WhatamIdoing (talk) 07:19, 13 November 2022 (UTC)
- Again this is pushing the idea of having a VOTE rather than establishing consensus. Trying to mandate draconian rules by straight majority voting is divisive and often results in the opposite of consensus. What you get is trainwrecks like Brexit and the current Blue/Red split in the US. Andrew🐉(talk) 09:13, 13 November 2022 (UTC)
- Unless we can narrow it back down to one option we're going to need a series of options that editors order. However, I would suggest clarifying the question by saying "Preferences, weighted by strength of argument, will be resolved through IRV". BilledMammal (talk) 09:17, 13 November 2022 (UTC)
- We don't need this aggravation. The big picture above shows that mass creation is not a pressing problem requiring more complex and creepy rules. The real mass creation problem is the endless argumentation and tinkering with the rules. We have multiple policies discouraging this – WP:NOTLAW, WP:BURO, WP:IAR, &c.
- Now we had a complex RfC about mass creation and it's over and being closed, right? If it failed to arrive at a consensus about the exact nature of mass creation then it failed and that's that. Time to move on, not repeat the process.
- Andrew🐉(talk) 09:53, 13 November 2022 (UTC)
- From the top of this discussion:
A recently closed RfC found consensus to create a definition of "article creation at scale" (sometimes called mass creation).
BilledMammal (talk) 10:27, 13 November 2022 (UTC)
- That's not the consensus of the RFC, question 3B as per the closers notes passed by a wide margin. This discussion is about how to deal with that. There will be an option in the proposed RFC to overturn 3B and not create a definition. But the RFC consensus was very clear that one should be created. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 15:08, 13 November 2022 (UTC)
- @Andrew Davidson, IMO to stop the "endless argumentation", we need to define what mass creation is, or at least what it isn't. I am firmly in favor of having a definition that even the most motivated wikilawyer could not re-interpret as meaning that participation in m:100wikidays is "mass creation". Right now, editors such as BilledMammal have argued that creating one or two articles a day is "mass creation". Stopping (what I think are) incorrect claims about this would stop the endless argumentation. Leaving things as-is will perpetuate the arguing. WhatamIdoing (talk) 22:32, 13 November 2022 (UTC)
- The trouble is that producing a definition doesn't stop the argumentation; it feeds it as the fanatics proceed to argue about the definition, nibbling away at it to try to achieve their goal. Just look at a simple core policy like WP:V which was produced nearly 20 years ago and which still seems to be a constant battleground over issues like WP:ONUS, verifiability vs verification and whatever else. That policy's talk page now has 76 pages of archive. 76!
- The delusion is that making more rules solves problems. This is not a given. Maybe the rules will be counter-productive or dysfunctional. The only certainty is that they will generate more complexity, argument, game-playing and wikilawyering.
- To stop this, we already have the policy WP:CREEP. We just have to apply it.
- Andrew🐉(talk) 00:21, 14 November 2022 (UTC)
- Providing clear definitions stops arguments over what the definition means. Nobody who sees "Speed Limit 25" is going to start a fight about whether 12 is bigger than 25. They might complain about reckless driving, but they won't even try to claim a speed limit violation. WhatamIdoing (talk) 22:05, 14 November 2022 (UTC)
- As there are two versions of the wording with numbered limits, could there be two versions of the first question? One as is and one closer to BilledMammal's ideas. I realise it's becoming more unwieldy than the original text, but at least having the options would put these questions to rest. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 15:11, 13 November 2022 (UTC)
- In case you missed it, Scolaire's version of the first question is different from earlier proposals as it adds the requirement that the article creation be rapid; that didn't previously exist.
- My preference is for three options:
- A: A single editor creating a large number of articles based on boilerplate text and referenced to the same small group of sources.
- B: A single editor, rapidly creating a large number of articles, based on boilerplate text and referenced to the same small group of sources.
- C: Overturn consensus to create a definition of "article creation at scale"
- However, I believe some editors prefer to include options with explicit numbers. BilledMammal (talk) 15:18, 13 November 2022 (UTC)
- My point was that it didn't include an option for scale. There has been clear preference for options with numeric limits. So what I'm suggesting A, B as you stated, then options C/D would be with limits and E would be not define. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 16:11, 13 November 2022 (UTC)
- Agreed - many of the objections to proposals within the RfC this came from was that they lacked quantifiable limits so people didn't actually know what they were really voting for. Given that, I'd be dubious about voting for "large" because I'm not sure how that's being defined (which brings us back to the current definition that suggests 25 or more but gives no timeframe...). "Rapidly and large" I can understand and is a better option, but I imagine people would like to see something with actual values in it as an option at least. Blue Square Thing (talk) 18:50, 13 November 2022 (UTC)
Another suggested wording
Another go at a final wording.
Which proposed definition of mass creation should we adopt?
Please rank your choices by listing, in order of preference from most preferred to least preferred:
A: A single editor creating a large number of articles based on boilerplate text and referenced to the same small group of sources.
B: A single editor, rapidly creating a large number of articles, based on boilerplate text and referenced only to the same small group of sources.
C: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced only to the same small group of sources.
D: A single editor, creating more than 100 articles per week, based on boilerplate text and referenced only to the same small group of sources.
E: None of the above
-- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 16:15, 13 November 2022 (UTC)
- If we are including rate definitions with numeric limits we need to include scale definitions with numeric limits. I suggest "A single editor creating more than 100 articles based on boilerplate text and referenced to the same small group of sources." BilledMammal (talk) 16:21, 13 November 2022 (UTC)
- As a modification of option A? -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 16:23, 13 November 2022 (UTC)
- As an alternative to option A. BilledMammal (talk) 16:24, 13 November 2022 (UTC)
- That seems overkill; is there a specific reason you think it needs inclusion? Someone saying that 100 or more articles isn't a large quantity of articles doesn't seem likely to be accepted as a valid argument. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 16:36, 13 November 2022 (UTC)
- +1, also we are already coming towards way too many options. More than 3-4 options is going to result in lower RFC participation, which is not good. Steven Walling • talk 16:42, 13 November 2022 (UTC)
- For neutrality; editors who prefer definitions with numeric limits should not be limited to rate based definitions. If we are concerned that there are too many options I would suggest merging C and D, so that there are two rate based definitions and two scale based definitions. BilledMammal (talk) 17:04, 13 November 2022 (UTC)
- The issue with placing a limit of just 100 articles ever, is that it's entirely possible that someone who was editing in 2005 could easily have created 100 articles about really worthwhile stuff, well referenced but based on very similar formats. I'm really not sure that works. The concerns only seem to be valid when the rate of creation overwhelms the ability of systems to deal with it - e.g. NPP, RC etc... That, I imagine, is why we've focussed on rate. 100 articles since 2005 is, what, 1 every couple of months? That's not problematic.
- I can see the point of options B ("rapidly" is the key as it overwhelms), C and D here and could support any of them. Tbh, I'm not sure if the numbers in C are too low if we're talking applying this retrospectively, but the numbers can always be changed if there seem to be issues, or people can use common sense when applying things to articles created back in the early days of Wikipedia. I guess we have to have D, so I don't think it's possible to get this down to less than 4 options and still have a sensible set of possibilities. Blue Square Thing (talk) 18:45, 13 November 2022 (UTC)
- I don't think it's possible that someone who was not using a boilerplate would manage to create 100 articles that appear to be using a boilerplate, but that is a discussion that should be held during the RfC.
- The system that was overwhelmed resulting in the ArbCom case and these RfC's was AfD, and for AfD what matters is not the speed of creation but the scale. BilledMammal (talk) 18:53, 13 November 2022 (UTC)
- Not really? When article creation happens slowly, then the community can (and frequently does) intervene early. If I set out to create five thousand articles, at a rate of one per day, and I'm producing garbage, I'll likely be asked to stop within a month (frequently even within days). At that point, AFD only has to manage a couple dozen articles, which is not difficult. It's unlikely that my work will be completely overlooked for 5,000 days (more than 13 years). WhatamIdoing (talk) 22:36, 13 November 2022 (UTC)
- The level of disruption required to get the community to intervene is higher than the level required to cause issues at AfD, even when the article creation is happening rapidly. For example it took years to topic ban Lugnuts from article creation and that is the case for most problematic mass-creators. BilledMammal (talk) 01:58, 14 November 2022 (UTC)
- I don't think so. If I set out to create 5,000 articles, and you decide that article #16, created in day 16, is garbage, why wouldn't you send that lone article to AFD right away?
- Lugnuts' work was accepted at the time of creation. WhatamIdoing (talk) 22:19, 14 November 2022 (UTC)
- That's not the community intervening on your 5000-article plan, that's a single article being sent to AfD. That wouldn't prevent you from continuing to create those 5000 articles, and the community would not do anything whatsoever about you until a much larger number had been taken to AfD, by which point you very well could have 1000+ articles. JoelleJay (talk) 02:35, 16 November 2022 (UTC)
Revised suggested wording
Another go at a final wording.
Which proposed definition of mass creation should we adopt?
Please rank your choices by listing, in order of preference from most preferred to least preferred. Preferences, weighted by strength of argument, will be resolved through IRV.
A: A single editor creating a large number of articles based on boilerplate text and referenced to the same small group of sources.
B: A single editor creating more than 100 articles based on boilerplate text and referenced to the same small group of sources.
C: A single editor, rapidly creating a large number of articles, based on boilerplate text and referenced only to the same small group of sources.
D: A single editor, creating more than 10 articles per day, 20 articles per week or 50 articles per month, based on boilerplate text and referenced only to the same small group of sources.
E: None of the above
BilledMammal (talk) 18:58, 13 November 2022 (UTC)
- I could go with that. Scolaire (talk) 19:51, 13 November 2022 (UTC)
- Also fine with me, I'm for getting this over and done with. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 20:04, 13 November 2022 (UTC)
- I would like to see the option for "A single editor, creating more than 100 articles per week" restored. I can imagine someone balking at the 50 per month limit (that's less than two per day) and still want to set some sort of rate limit.
- I would also like to see an option that does not require "boilerplate text" or "same small group of sources". In fact, that could be separated out, which would result in two questions:
- What speed or volume qualifies as "mass creation"? (with all of the options)
- MASSCREATE has historically been limited to "automated or semi-automated content page creation". Should the rules around mass creation require pre-approval for "articles based on boilerplate text and referenced to the same small group of sources", or for all articles regardless of content? (If "yes", then the words "based on boilerplate text and referenced to the same small group of sources" will be appended to whichever choice in #1 is chosen; if "no", then they won't, and all articles, regardless of content, would count towards any limit that is adopted. Neither choice directly affects the existing wording about "automated or semi-automated".)
- Finally, I think we need to add some context. I suggest something along these lines:
- "Relatively few editors create more than one article in a day, and except for m:100wikidays, few have ever created more than 100 articles. However, people have had good-faith disagreements over whether certain editing patterns should be subject to the restrictions in WP:MASSCREATE. The Wikipedia:Arbitration Committee/Requests for comment/Article creation at scale RFC concluded with a recommendation that the wording be clarified, to help editors understand which situations are covered by this rule."
- This should give editors who haven't been following this for months some idea of why they're getting this question now. WhatamIdoing (talk) 22:54, 13 November 2022 (UTC)
- That first section appears to lead the question. I think editors should add any statistics to their replies. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 00:30, 14 November 2022 (UTC)
- I do agree about the removal of the higher limit option. There appeared to be a lot of discussion in the original RFC about where it should be set, and the low limit / high limit options gives that a voice. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 00:33, 14 November 2022 (UTC)
- The higher limit I removed because the list was getting too long and there is little difference between it and declaring that mass creation does not exist; over the past few years I believe it would only apply to two editors, Lugnuts and Estopedist1, both of whom would probably have been able to game it. If such an option is desired I would suggest instead asking "Should mass-create be abolished" as it will have the same result with less WP:CREEP. BilledMammal (talk) 01:55, 14 November 2022 (UTC)
- This is all still junk. Every definition seems to rely upon the phrase boilerplate text as if that is supposed to be clear. But all articles use boilerplate. We call them templates and they are routinely used for citations, infoboxes, navigation templates and more. And if an article is a stub, it is likely to have a fairly clichéd phrasing as we have a house style in which articles tend to follow a common pattern for that type of topic.
- And talk of a small number of sources is nonsense too as what matters is the quality of sources not their number. For example, see WP:ANYBIO which makes it very clear that "The person has an entry in a country's standard national biographical dictionary" then that's a sure sign of notability. One just needs a single good reference work like that to generate articles with guaranteed notability.
- Andrew🐉(talk) 00:34, 14 November 2022 (UTC)
- I'd be happy to hear suggestions for better wording, the current wording is a bit clunky. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 01:03, 14 November 2022 (UTC)
- Just asking about boilerplate/sources separately should address Andrew's concern. WhatamIdoing (talk) 22:25, 14 November 2022 (UTC)
- The intent is to only cover articles that are entirely based on boilerplate text. I believe it is clear and that most editors here have understood that, but I'm also happy to hear suggestions for better wording.
"And talk of a small number of sources is nonsense too as what matters is the quality of sources not their number."
This RfC is to determine what mass creation is; determining whether specific examples of mass creation are problematic is a different discussion. I would agree that an editor working through the Oxford Dictionary of National Biography to create boilerplate stubs on every individual currently lacking an article is not an issue, assuming the boilerplate is suitable, but they are still engaged in mass creation. BilledMammal (talk) 01:55, 14 November 2022 (UTC)
- No one is arguing the mere presence of templates automatically makes an article "boilerplate", that would be stupid. But if editors are making many stubs with only boilerplate text, using the same sources, why shouldn't that be considered mass creation? Also, this RfC is still under the umbrella of the 3B option, so, as explained to you earlier, articles that do not need to meet GNG are not even covered here. JoelleJay (talk) 02:48, 16 November 2022 (UTC)
- Why does the presence or absence of "boilerplate text" matter? Do we have a clear, relevant and unambiguous definition for boilerplate text in this context? Has it been accepted by the community as suitable for this purpose? If so, where is it? The Wikipedia article Boilerplate text is unsuitable as it may change and was not written for this purpose. If we do not have a suitable, accepted-as-policy definition, then trying to define mass creation in terms of boilerplate text is inherently futile. So "E: None of the above" is the only viable option. · · · Peter Southwood (talk): 13:34, 16 November 2022 (UTC)
- Of course we don't have a clear definition of boilerplate text. Also, what people mostly seem to care about is very short articles ("Alice Athlete was a Ruritanian gymnast in the 1928 Olympics."), where the "boilerplate text" is identical to the best practice as outlined in Wikipedia:Manual of Style/Lead section#First sentence. The actual, guaranteed-to-be-mass-created-boilerplate articles that we have approved a bot to create in the past (e.g., thousands of multi-paragraph articles on US cities) don't seem to be meant to be included. WhatamIdoing (talk) 17:29, 17 November 2022 (UTC)
- If a text and format formula is acceptable for one article, why would it not be acceptable for another article on the same class of topic? · · · Peter Southwood (talk): 13:52, 16 November 2022 (UTC)
- As per BilledMammal, "This RfC is to determine what mass creation is; determining whether specific examples of mass creation are problematic is a different discussion." The proposed definition encompasses both good and problematic mass creation, so there will be plenty of articles based on boilerplate text and few sources that have no threat of being challenged. JoelleJay (talk) 02:26, 17 November 2022 (UTC)
Clearly mass creation can be done without "boilerplate text" which makes all 5 of the choices not work. But I think that good core elements would be larger amounts of articles with very little content that is unique to the article. North8000 (talk) 18:02, 17 November 2022 (UTC)
- So if I have a database or two and just start copying simple facts from them, without using a boilerplate for the text, none of this applies? — Rhododendrites talk \\ 14:13, 21 November 2022 (UTC)
- Can you give a single example of a large number of articles that do nothing but copy simple facts from a database or two, but format the text differently in each article? Scolaire (talk) 11:06, 22 November 2022 (UTC)
- The structure of D is fine, but the levels are too low. 20 articles per week is less than 3 a day, and 50 articles per month is less than 2 a day. That is a ridiculously low threshold for "mass creation." 10 in a day is arguably reasonable, but if someone creates 15 in one day and then stops, that is hardly difficult to deal with, even if the articles are problematic. D should be reworded as 20 per day, 50 per week and 100 per month, or else that should be a separate option. Rlendog (talk) 16:21, 25 November 2022 (UTC)
- There was a separate option with a higher limit ("A single editor, creating more than 100 articles per week, based on boilerplate text and referenced only to the same small group of sources."), but it was removed. -- LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 22:36, 27 November 2022 (UTC)
- I think that was a mistake. I, for one, cannot support the limits in the proposal but would be able to support something like what I proposed. If we do not offer more liberal choices and if other editors feel as I do, this will likely be shot down, even though its structure has merit. Rlendog (talk) 00:04, 28 November 2022 (UTC)
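To make the numeric thresholds debated above concrete, here is a minimal Python sketch of a check against the limits in option D (more than 10 articles per day, 20 per week or 50 per month). The proposals do not say how the periods are measured, so the rolling windows, the 30-day month, and the names `OPTION_D_LIMITS` and `exceeds_rate_limits` are assumptions made purely for illustration.

```python
from datetime import date, timedelta

# (window length in days, maximum articles allowed inside that window)
# Assumed reading of option D: rolling windows, with a month taken as 30 days.
OPTION_D_LIMITS = [(1, 10), (7, 20), (30, 50)]


def exceeds_rate_limits(creation_dates, limits=OPTION_D_LIMITS):
    """Return True if any rolling window holds more articles than its limit."""
    days = sorted(creation_dates)
    for window_days, max_articles in limits:
        start = 0
        for end, day in enumerate(days):
            # Advance the window start until it spans fewer than window_days days.
            while (day - days[start]).days >= window_days:
                start += 1
            if end - start + 1 > max_articles:
                return True
    return False


first_day = date(2022, 11, 1)  # arbitrary example date
one_a_day = [first_day + timedelta(days=i) for i in range(100)]
burst = [first_day] * 15
print(exceeds_rate_limits(one_a_day))  # False: one per day stays under every limit
print(exceeds_rate_limits(burst))      # True: 15 in one day is more than 10
```

Under these assumptions, one article a day for 100 days never trips a limit, while two a day for a month does (60 is more than 50), which is the arithmetic behind the objection that the weekly and monthly figures are low. Passing a different `limits` list, such as `[(1, 20), (7, 50), (30, 100)]`, models the higher thresholds suggested in the thread.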
Close/open?
@Valereee and Xeno: This discussion has ground to a halt. Do you still intend to run an RfC? If so, do you want to declare this closed and decide on a question based on all of the above? It seems nothing is going to get universal support. Scolaire (talk) 16:02, 25 November 2022 (UTC)
- Since it’s the idea lab, I don’t feel particularly strongly about needing to formally close it. If the discussion well has run dry, it can be archived naturally. If Valereee or I (or anyone else, for that matter) can glean something useful to spin into another RfC then we will do so :) [I haven’t had a chance to read it in depth yet!]. I do want to thank everyone for contributing their thoughts here. –xenotalk 18:09, 25 November 2022 (UTC)
- Sorry for the delay. I think we probably do need to run an RfC on this, and it's probably necessary before we can run the RfC on article deletions at scale. We're in a pickle, here. Valereee (talk) 14:22, 1 December 2022 (UTC)
- @Valereee and Xeno: "Since it’s the idea lab, I don’t feel particularly strongly about needing to formally close it." By the same token, there's no need necessarily to try and ascertain what the "consensus" is. That's why I started this thread. There's nothing to stop the two of you from putting your heads together and deciding on a question. This could mean choosing one out of all the proposals or, more likely, framing a different question based on your reading of the discussion. Whatever question is asked, participants are going to suggest alternatives anyway. I'd say, just go for it. Scolaire (talk) 12:01, 2 December 2022 (UTC)
Deprecate "estimated hits" from search engines
The estimated hits from search engines are not reliable or informative, can be misleading, and should be deprecated. We should reword guidelines to explicitly discourage their use in move request discussions and elsewhere.
As an example:
- On Google:
  - "antisemitic tropes" ADL gives me 8,640 estimated hits (runs out on page 13 after 126 results).
  - "antisemitic myths" ADL: 2,850 estimated hits (runs out after only 81 results).
  - "antisemitic canards" ADL: 2,290 estimated hits (runs out after only 95 results).
- On Bing:
  - "antisemitic tropes" ADL: 233 million estimated hits (runs out on page 15).
  - "antisemitic myths" ADL: 166 million estimated hits (runs out on page 9).
  - "antisemitic canards" ADL: 259 million estimated hits (runs out on page 10).
Notice the lowest-used term became the highest. Search engines don't (and can't) survey their entire dataset every time a user completes a search. They give you just around a hundred results using fancy-schmancy algorithms. No search engine has ever claimed these estimates were reliable, and they were never meant to be used for the purposes some editors use them for. Even if they weren't complete spitballs, they're nonselective, and count SEO sites, forums, and junk, which are not relevant to us.
One research paper[1] found that estimated counts vary wildly over short periods of time, and don't fluctuate around a central value, making them practically meaningless. Past examples of unsupported use of search engine estimates on Wikipedia can be found here: [2][3][4] (peep that first one, where HTML tags and nonselectiveness led to a very misleading estimate). No doubt you've encountered other examples. DFlhb (talk) 18:00, 1 December 2022 (UTC)
- Neither "estimated hits" nor "runs out" are any good for anything. In particular the "runs out" number for Google is totally invalid. Google limits hits that it displays to 1,000 and then eliminates duplicates, so, for example, a search for "Donald Trump" only displays, for me at this time, 117 results. I don't know about Bing, but it seems that it uses a similar algorithm. Of course there are many reliable sources for "Apple Mac" as can be found by looking at the books found by this search, rather than just counting the results. Phil Bridger (talk) 19:29, 1 December 2022 (UTC)
- I stated that "runs out" was invalid when I said "can't survey their entire dataset" and "give you just around a hundred results". And you made the same mistake I pointed out for "Apple Mac"; the top books are self-published (don't count), and the search ignores formatting, returning many results that contain "Apple's Mac [...]", as well as non-context text like " Apple / Mac" and "Apple - Mac OS X" from these books' reference lists (also don't count). Which is more informative, that GBooks link, or this? DFlhb (talk) 20:24, 1 December 2022 (UTC)
- I mean, you can't stop people from making whatever argument they wish in such discussions. You can only counter their points with points of your own. The idea that we can make some type of argument verboten and that will somehow magically stop people from trying to make it is just silly. If the use of Ghits is a bad metric for deciding an article title, then just tell people that after they try to make the argument. Wikipedia already contains guidance that recommends against using Ghits as a metric for all kinds of purposes, see WP:GHITS "Although using a search engine like Google can be useful in determining how common or well-known a particular topic is, a large number of hits on a search engine is no guarantee that the subject is suitable for inclusion in Wikipedia. Similarly, a lack of search engine hits may only indicate that the topic is highly specialized or not generally sourceable via the internet.", WP:GTEST 'Raw "hit" (search result) count is a very crude measure of importance. Some unimportant subjects have many "hits", some notable ones have few or none... Hit-count numbers alone can only rarely "prove" anything about notability, without further discussion of the type of hits, what's been searched for, how it was searched, and what interpretation to give the results...A raw hit count should never be relied upon to prove notability.", WP:UCN "When using Google, generally a search of Google Books and News Archive should be defaulted to before a web search, as they concentrate reliable sources...Search engine results are subject to certain biases and technical limitations". Wikipedia already contains the guidance the OP seeks, as shown above, in multiple places. You can't actually force people to refrain from using bad arguments. They're going to make them. Wikipedia already tells people how to not do so. I'm not sure what you want to do other than that, or why you think that "deprecation" would change anything... --Jayron32 20:02, 1 December 2022 (UTC)
- WP:UCRN goes out of its way to suggest using Google, and WP:GTEST incorrectly states these numbers can "Confirm roughly how popularly referenced an expression is". While it also includes your quote, no one does "further discussion of the type of hits, [etc.]". It also makes me cringe that WP:UCN recommends Google Books, see my reply above. It should recommend Ngram instead; same dataset, but graphs instead of an estimated count.
- IMO, the point of Wikipedia proposals is to solve problems systemically, to avoid having to run around and tell people they're wrong in each individual discussion; I have better things to do than that. You're completely right that it won't be fully effective, but I don't see why WP:Requested moves shouldn't include something on this. DFlhb (talk) 20:24, 1 December 2022 (UTC)
- WP:UCRN goes out of its way to say avoid using raw basic Google hit numbers, though it does say that using some of the specialized Google products can be useful. If you're saying "No one should ever use Google to help find any information ever" I'm not sure that's a reasonable position to take. If you are trying to say "Avoid using raw "hits" data from a basic Google search when making decisions about things", Wikipedia already recommends that. See above. --Jayron32 12:13, 2 December 2022 (UTC)
Article Editing Reviews
As I'm sure many of you know, Wikipedia is often made into a meme because the site is considered an unreliable source. One of the main reasons for this sentiment is given in [5] (https://www.dw.com/en/fact-check-as-wikipedia-turns-20-how-credible-is-it/a-56228222#:~:text=So%20is%20Wikipedia%20a%20credible,manipulated%2C%20and%20sometimes%20almost%20undetectably.): "So is Wikipedia a credible source? Many of the entries are well-documented, checked for quality and — as opposed to reference books — often completely up-to-date, but, 20 years after its creation, the online encyclopedia is not 100% reliable, because information can be manipulated, and sometimes almost undetectably."
So here is a list of ideas I have, which relate to each other:
- A certain number of users must review an edit before it is included in the article.
- Have articles be reviewed by experts in their respective fields, and have an article be marked with a token of approval from one or multiple experts (i.e. "Expert-Certified"). I think we may need different certification tokens depending on HOW many approve of the article.
- Articles may also be reviewed constantly by others for grammar, vocabulary, etc. 198.175.205.71 (talk) 15:13, 30 November 2022 (UTC)
- Wikipedia:Pending changes, when used, requires that a pending changes reviewer review an edit before it will appear to other users. The community has rejected any use of pending changes except in exceptional cases where necessary to protect articles subject to high and continuing levels of vandalism. Due to the limited number of pending changes reviewers, it would be disruptive to apply pending changes to all articles. Remember, we are all volunteers, and we cannot expect that large numbers of experienced users would dedicate a large amount of their time on Wikipedia to approving pending changes. Various levels of protection, detection and reversal of obvious vandalism by bots, and the marking of suspect edits in watchlists, provide multiple layers of protection from vandalism. Detection of incorrect information in articles often requires scrutiny by users with particular knowledge of the subject area, and requiring that a subject-area specialist review pending changes would mean that most pending changes would not be reviewed for months or years, or maybe never. I will also note that the community has in the past rejected the idea of certifying experts in a field. As for your last suggestion, we do have a lot of Wikignomes constantly fixing all kinds of things. Donald Albury 15:48, 30 November 2022 (UTC)
- (1) there are two fundamental ways to make an encyclopedia reliable: make sure it's written by someone who knows what they're doing ("Britannica"), or make sure it's proof-read by as wide a range of experts as possible ("Wikipedia"). There's a risk that what you're proposing would combine the worst of both worlds: a situation where the article was written by someone who didn't know what they were doing, and appears in print because it's in a niche area with few reviewers, and it's accepted by one reviewer with a wildly non-neutral point of view, and no one else can be bothered to fight through the bureaucracy to deal with the reviewer's ownership issues.
- (2) In some fields it's hard enough to find someone prepared to handle an AfC, let alone provide any meaningful review based on expert knowledge; we don't want to end up like Citizendium, with its handful of articles that very few people read.
- (3) we don't really need to worry about people thinking Wikipedia is unreliable. Usually this is uttered from the mouths of frustrated teachers and lecturers who don't want 30 copies of the Wikipedia article handed in as homework. They'll feel just the same whoever wrote the article. If people think Wikipedia is unreliable, that's a good thing, because it might encourage them to go and check the references for themselves. And we do have references! So although you're right the debate is always worth having, I favour no change. Elemimele (talk) 17:46, 30 November 2022 (UTC)
- As a teacher, the reason I tell my students not to cite Wikipedia is not because the information can't be trusted. The reason is IT'S AN ENCYCLOPEDIA. I expect them to go deeper. They need to use the same sorts of sources we do here.--User:Khajidha (talk) (contributions) 21:22, 30 November 2022 (UTC)
- You wrote that "the community has in the past rejected the idea of certifying experts in a field..."
- Hmm, but for me it is obvious that one can get acquainted with the contributions of a user and get to know his/her field of expertise. Sometimes this process happens not on purpose, and I just know a user because I used to notice him/her here and there. I understand that I'm not a certification board, and can't give a certificate to the user. But what keeps me from drawing the attention of a user (whom I find more or less expert) to one or another article/edit? What do you think, as an expert in Wikipedia ethics, @Donald Albury? Tosha Langue (talk) 09:38, 1 December 2022 (UTC)
- Wikipedia editors are anonymous. Even when an editor appears to identify with a real life person, we should not rely on such identification, and outing, i.e., posting any personal identifying information about another user on Wikipedia, is defined as harassment, and may end up with the offender being blocked. Therefore, judgement about the expertise of an editor in any area is dependent on the editor's reputation based entirely on their record of editing on Wikipedia. We do not accept any claims as to expertise in real-life as relevant on Wikipedia. See Essjay controversy for one incident in which self-claims of expertise did not end well. This is not likely to change in the near future. The Wikimedia Foundation is pushing to warn new users to carefully maintain their anonymity, as some Wikipedia users in certain countries have been subject to harassment, and even arrest, for the content of their editing on Wikipedia. Donald Albury 17:09, 1 December 2022 (UTC)
- To comment further, the criteria for what the content of an article states is what is supported by reliable sources, and not what any user, 'expert' or not, says. Experienced and trusted users can help judge the reliability of sources, but, in any dispute, a consensus of such users trumps any single 'expert' on the suitability of content. Donald Albury 17:21, 1 December 2022 (UTC)
- There is also the issue of (often unintentional) ownership behaviour. I can think of a number of examples in scientific fields where a Wikipedia article has obviously been written by someone closely associated with the first people to develop an area. These articles are now fossils; the field has maybe developed, other people have brought different insights, and new stuff has happened, but the article cannot change, because any suggested addition is politely reverted with an edit summary that it's not an improvement because the addition is not as important as the original basic concept, or because the new development is subtly different to the original concept and therefore doesn't belong in the article. If you argue with this, the response will be that the original paper has been cited far more than the reference you're using to try to add further information (which is inevitably true, because anyone who's developed the original idea will have cited the original idea), and the subtle-difference argument cannot be refuted either, because of course the new material is subtly different; otherwise it wouldn't be new.
- There are other scientific articles where I'm aware there are multiple viewpoints, and I strongly suspect the article is written by someone from one of the camps; the author honestly believes that the other camp is debunked and that anyone still maintaining that view is adopting a fringe position, and what they write is genuinely an unreliable source. But unfortunately the other camp feel the same about the first. So in the worst case scenario, the wikipedia article can be gate-kept to reflect only one viewpoint by a well-intentioned expert author. Fortunately I can only think of a handful of such articles in wikipedia, because we're very good at having a big bust-up about it, and it takes considerable effort to WP:OWN an article. I suspect if we insisted on expert reviewing, our expert reviewers would often become Owners, and we'd have a lot more fossils as well as accurate-but-one-sided articles. Elemimele (talk) 10:30, 2 December 2022 (UTC)
- "Experienced and trusted users can help judge the reliability of sources, but, in any dispute, a consensus of such users trumps..."
- This is it, @Donald Albury! But my impression of Wikipedia is that attracting users to help with something is considered as bad manners. We all are volunteers, and we should stay volunteer till the end, and not disturb others, right? Tosha Langue (talk) 10:39, 2 December 2022 (UTC)
- It is allowed to advertise disputes in a neutrally worded notice on Wikipedia:Noticeboards, or on talk pages of users who have made more than trivial contributions to an article, or participated in previous discussions about disputes related to or similar to the current dispute, as long as it is done in accordance with the guidelines at Wikipedia:Canvassing. What will get you in trouble is canvassing only wiki-users you think will support your position, or canvassing off-wiki to bring in partisans of your cause. Donald Albury 14:23, 2 December 2022 (UTC)
Adding rel="me" tags
Would it be feasible to allow users to add rel="me" to links on their user pages? For context, this is prompted by trying and failing to add a link to my Mastodon to my user page, as that's how Mastodon verifies your account, and I'd like to be able to conclusively confirm that my Mastodon and Wikipedia accounts are the same person as I primarily discuss wikipedia there. I'm sure this would be useful in other situations as well.
This wikimedia extension implements this, but I'm not sure if or how this could be added to wikipedia. TheTranarchist ⚧ Ⓐ (talk) 02:23, 4 December 2022 (UTC)
- @TheTranarchist: afaik, that extension is likely to be deployed to Wikimedia sites in the Near Future™ — we may just have to wait for now.. (I "get around" this by linking to a verified web page which lists my Wikimedia account username) — TheresNoTime (talk • they/them) 02:29, 4 December 2022 (UTC)
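For readers unfamiliar with the mechanism discussed above: as generally understood, this kind of verification works by fetching the linked page and checking that it contains a link back to the profile carrying rel="me". The following is only a minimal sketch of that check using Python's standard library. The function names and the example URL are hypothetical, and it does not reproduce Mastodon's or MediaWiki's actual implementation.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values of <a> and <link> tags whose rel includes "me"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="nofollow me"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "me" in rel_tokens and href:
            self.links.append(href)


def rel_me_links(html):
    """Return every rel="me" link target found in an HTML string."""
    parser = RelMeParser()
    parser.feed(html)
    return parser.links


def links_back(html, profile_url):
    """True if the page has a rel="me" link to profile_url (trailing slash ignored)."""
    target = profile_url.rstrip("/")
    return any(link.rstrip("/") == target for link in rel_me_links(html))


# Hypothetical user-page fragment; the URL is a placeholder, not a real account.
page = '<p><a rel="me" href="https://mastodon.example/@someuser">Mastodon</a></p>'
print(links_back(page, "https://mastodon.example/@someuser"))
```

An ordinary external link without the rel="me" token would not satisfy this check, which is why the attribute has to be permitted in user-page markup for the verification to succeed.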
Proposals to improve corporation-related coverage
In September, I came across Amazon (company), which was in a pretty poor state (now significantly better). Currently, many of our corporation-related articles have significant issues, including inconsistent structures, excessive length, and relative confusion about the inclusion criteria. Our "Criticisms of ... Inc." articles are pretty uniformly bad, with poor structures, poor readability, and very unclear criteria for inclusion. These problems aren't isolated to specific articles, so I'd suggest we try to address them systemically, to reduce overall editor workload.
Two ideas I propose are:
- Creating a topic-specific manual of style guideline for company-related topics, which would give clear examples of what's due/undue, how to treat controversy sections (since WP:POVFORK is harder to apply to these articles), and would propose an outline (history, products & services, corporate affairs, etc.) to promote consistency. This should ideally cover both main articles on corporations, and criticism-specific splits. We already have equivalent MoS pages for television or music-related articles, for example.
- More tentatively: clarifying or expanding WP:POV (given the surprising obscurity of WP:PROPORTION among editors of corp-related articles) to clarify its application here. It's currently very clear on scientific issues and BLPs, but when it comes to corporations, many editors interpret WP:DUE to support inclusion of practically every controversy that's been covered in reliable sources. When a new controversy appears, many editors indiscriminately add it to the relevant Wikipedia page, resulting in ever-lengthening articles and posing a maintainability burden. It wastes a lot of editors' time to selectively clean up these additions, and potentially need to argue at length in favor of each removal, if anyone disputes them and the content has been in the article for a while.
- The policy's lack of clarity also means that editors who favor removing such content must base their arguments on inherently ambiguous and less well-known policies like WP:VNOT, WP:NOT, and WP:10YT. (I'll note that WP:RECENTISM is sadly not policy).
- Policy changes may not be required, but perhaps we could add article notices pointing to WP:PROPORTION that people would see when they click Edit; I'm sure others will have better ideas on systemic ways to minimize editor workloads for these articles.
(As a minor side-note, I want to invite editors to check out Google Chrome version history, iOS version history, and List of iOS and iPadOS devices, since those are somewhat corporation-related too. WP:INDISCRIMINATE relies mostly on editors using "common sense", which seems insufficient. Many of our version history articles are extremely long and copy release notes verbatim, which may constitute copyvio. And many "list of devices" articles are far too detailed and hard to browse (especially with accessibility software). Again, this may not require new policies, but we should likely clarify existing ones or make them more prominent. Please discuss this side-note in a separate subsection to maintain clarity). DFlhb (talk) 20:32, 3 November 2022 (UTC)
- One of the few things from WP:N that should apply to article content as well as article existence is WP:SUSTAINED, in particular if something only shows up in a 24 hour news cycle, but then is never mentioned again in any reliable sources; it's likely not WP:DUE. WP:DUE should deal both with how much mainstream sources cover a topic, but also should have a temporal element. If no one is writing about it after the fact, then it likely isn't worth reporting; it doesn't have any historical significance. I have no problem with Wikipedia using current events to add to an article, but I also think that one of the defenses for removing something from an article should be "It's been X years since this happened, and no reliable source has ever covered it since". They say that journalism is the first draft of history, but if it doesn't make it to a second draft, it's probably not worth keeping in an article... --Jayron32 12:03, 4 November 2022 (UTC)
- I would strongly support this; it would likely increase the signal-to-noise ratio of content additions. DFlhb (talk) 13:13, 4 November 2022 (UTC)
- This discussion really should be on VPI - it's got a lot of significant changes considered, which would need to be fully spelt out to be taken to VPP as fully formed proposals. Nosebagbear (talk) 13:29, 4 November 2022 (UTC)
- This angle really needs to be placed into WP:DUE, adding that viewpoints/coverage that only come from a short-term burst of news should not be weighed as heavily as views/coverage that come from the long-tail of some event. Effectively this is codifying parts of WP:RECENTISM that, in how many editors edit WP today, gets readily ignored as facts are added the second something changes without looking towards the longer-term narrative. Masem (t) 14:09, 4 November 2022 (UTC)
- The first and foremost concern I have with any changes to how we cover companies, is that we not permit any kind of entering wedge for corporate PR departments and other hired guns to cleanse and polish their clients' reps. --Orange Mike | Talk 14:04, 4 November 2022 (UTC)
- Yes, but on the other hand, we should not be a catalogue of every single time a company is mentioned in a newspaper. If there was a fire in an Amazon warehouse in 2009, even IF a newspaper wrote an article about it, we don't need to have that event in the Amazon article. How do we decide which events are important enough to cover? We discuss them. There's a middle ground between "allowing corporate shills to cleanse Wikipedia of all negative information" and "cataloguing every single time a company is mentioned in any reliable news article ever". We should not allow the latter to happen merely because we're frightened of the former. --Jayron32 15:00, 4 November 2022 (UTC)
- In my experience most articles on companies suffer from the opposite problem. Amazon is probably a bad example; being one of the largest organisations in human history, it probably attracts a bit more attention than the average company. Usually very few editors are motivated to edit these articles proactively – except for those with an interest in promoting the company, who we'd rather not. The result is that negative coverage gets swept under the rug in favour of peacocky corporate histories and overly-detailed product catalogues (some random examples from Category:Companies in the Nasdaq-100: Amgen, Microchip Technology, Xcel Energy, Intuit). The proliferation of "controversies", "legal issues", etc. sections is a symptom of this. Amazon's appalling treatment of its workers, for example, isn't some free-floating, subjective "criticism" of the company, it's a well-documented fact that is an essential part of its history. Facts like these should be presented where they're relevant, throughout the article text, not relegated to a few lines in a section at the bottom of the page because they happen to reflect poorly on the subject. – Joe (talk) 15:32, 4 November 2022 (UTC)
- What generally happens with "controversies" sections is that you get something that a company did once and got a lot of short-term criticism for, and some WP editor judges all of that a "controversy". For example, YouTube recently put 4K videos behind its subscription tiers temporarily, but reverted that two weeks later. Hypothetically, with the way some WP articles are written, that could have been presented as a "4K video controversy" on the WP article, but in reality the change itself likely merits no mention, and if anything, that could be incorporated as criticism related to YouTube's "changes on the fly" practices.
- Absolutely there are valid criticisms and controversies like Amazon's that are mentioned above, but that has the long-tail of coverage and didn't come out of an incident lasting only a couple of days. Masem (t) 15:37, 4 November 2022 (UTC)
- Joe, you do realize that it is quite possible for two things to be wrong at the same time, right? Yes, where controversial corporate practices have received widespread, sustained, coverage then it is likely that is worthwhile to mention in an article. No one, not anyone, zero people, are talking about that sort of thing. What we're talking about here is the sort of minor, one off, singular events that are minor blips in the history of a company getting out-sized attention in an article. It's quite possible for us to BOTH cover the sort of long-term covered controversial corporate practices that maybe aren't in these articles, AND to avoid the "minor incident of the day" reporting that these articles fill up with, often instead of the actual real stuff we should be covering. --Jayron32 15:43, 4 November 2022 (UTC)
- @Jayron32: You seem to be under the impression that my comment was intended to be a rebuttal of or challenge to you, and all I can suggest is that you read it again. – Joe (talk) 15:51, 4 November 2022 (UTC)
- Not at all. I don't form impressions. It's not worthwhile. --Jayron32 15:55, 4 November 2022 (UTC)
I will say that this goes both ways. At Sembcorp, for example, several years of UPEs Gone Wild had turned a fossil fuel company into a socially conscious innovator devoted to positive social change #wow #whoa. On the other hand, on Target there was once an entire section devoted to an incident of somebody sneaking into a Target and playing porno clips over the speakers, once, about a decade ago. jp×g 11:43, 17 November 2022 (UTC)
I'll let this discussion die after this, but others may be interested in Talk:Criticism_of_Apple_Inc.#Strays_away_from_WP:POVFORK. I'm wondering why WP:POVFORK seems to be so rarely applied to non-BLPs. It just makes for way worse articles than what we could have. DFlhb (talk) 09:01, 3 December 2022 (UTC)
Came here after seeing the move from Criticism of Apple Inc. to Practices of Apple Inc., which, as I just wrote at WP:NORN, strikes me as strange. It's not a POVFORK if DUE coverage would overwhelm an already long article, as with Apple, Amazon, Nestle, etc. That some people add UNDUE items to it doesn't mean the article itself is bad -- it means it needs standards for inclusion like any other article comprising smaller events. In some cases, it may make sense to spin out, say, "labor practices of Apple Inc" and a few others to separate articles and then include the parts that are smaller in scope in the main article, but I don't see a bigger problem with "criticism of [company]" articles than I do in company articles. We don't actually have very many of them (many of the pages in that category aren't criticism articles). I'd say this is a cause for improving those articles rather than getting rid of them. — Rhododendrites talk \\ 14:33, 20 December 2022 (UTC)
- I don't actually mind that large numbers of our articles on companies give somewhat WP:UNDUE coverage to negative incidents and stories, while at the same time being pretty bad at reporting financial results, changes in top management and even the basic business activities. The negative stuff quickly vanishes from other media coverage, while it is always easy to find the "official" version. Nobody in their right mind would pay attention to WP for anything except negative stories, & I don't mind if that stays the case. Johnbod (talk) 17:32, 20 December 2022 (UTC)
Don't reply to archives
Is there an easy way to suppress those tempting [reply] links on archives of talk pages? I just nearly replied to an old conversation I'd become engrossed in, forgetting that I was reading an archive, and I doubt I'm the first to make that mistake. I know I could create a bit of JavaScript to do it for me (or anyone else who runs or copies it) – CSS might even suffice – but I was thinking of a more universal solution. Certes (talk) 12:27, 20 December 2022 (UTC)
- Conceptually, __NOEDITSECTION__ should disable it. DMacks (talk) 06:58, 21 December 2022 (UTC)
- As this is a place for ideas, I wonder if we should add that when creating an archive. The page could still be edited using the tab at the top, where the title is visible as a reminder that one is editing an archive, but it might prevent accidental changes. Or perhaps the problem is sufficiently rare to ignore. Certes (talk) 12:24, 21 December 2022 (UTC)
- The archive header templates generally already add it. But the reply tool doesn't (yet) honor it. Anomie⚔ 12:32, 21 December 2022 (UTC)
Feedback for my proposal wanted
I have drafted a proposal called Wikipedia:NotReallySoroka's Law on spelling of currency names and symbols, and your feedback is appreciated. Thank you. NotReallySoroka (talk) 06:46, 21 December 2022 (UTC)
- Non-descriptive title. Content is either trying to say what WP:ENGVAR already says or it's trying to override it and/or MOS:FOREIGN, depending on which way you read it and whether the spelling is one used in English. Overall I'd say it's not necessary and if you want to keep it it should be moved into your user space. Anomie⚔ 12:40, 21 December 2022 (UTC)
- As Anomie says - ENGVAR handles the cases where you can say you have sufficient concurrence with your spelling that it's not just a personal choice (except where that lovely national relevance comes in). Your concept has one appreciable aspect beyond ENGVAR, which is that it suggests that currency spelling can be a truly personal choice. Which it can't. If your spelling is no more than your personal preference, despite what everyone in your country/linguistic grouping thinks, then it's functionally a spelling mistake, and thus should not be encouraged. Nosebagbear (talk) 10:21, 26 December 2022 (UTC)
Alt parameter for Maplink template
The maplink template doesn't have an "alt" parameter. Should it have one? I think the only relevant reading on this would be MOS:ALT. FacetsOfNonStickPans (talk) 19:13, 27 December 2022 (UTC)
A centralized place for feedback on comprehensiveness
I like to work on broad subject articles, so oftentimes, when I want feedback on an article, I don't need a formal peer review or a full evaluation. I essentially just need a few users to look at it and identify the major omissions. It would be immensely helpful if there were a page somewhere that users could make a post about whatever article they're working on, and people that participate on that page would give a few brief thoughts about where the article might need expansion or what aspects of the subject no one has thought to include. Thebiguglyalien (talk) 01:48, 31 December 2022 (UTC)
- @Thebiguglyalien: My experience is that the best place for quick feedback is the talk page of the article. You start a new section with an open discussion round, and ping those authors who are actively working on the page as well. It has the advantage that everyone who joins the writing process at a later point and visits the talk page, is informed too.
- If it's a larger scale project, it may help to organize a task force with a detailed to-do list like our team did for this article series. This solution works quite well.
- I hope these ideas help. Best wishes and a happy new year! Henni147 (talk) 11:31, 31 December 2022 (UTC)
- This would be my approach if it's a popular article that receives regular attention and updates, but that makes up a very small proportion of Wikipedia's articles overall. Thebiguglyalien (talk) 15:20, 31 December 2022 (UTC)
- Asking a relevant Wikiproject could help. That way you are more likely to get people who are familiar with the sources in the particular topic area than if you ask at a centralized page. Phil Bridger (talk) 17:16, 31 December 2022 (UTC)
WW2 : Lost Figures Task Force
My idea is to give the men and women who fought for liberty a page. The people who would meet the criteria would be Resistance leaders (from any country, but mainly France, Austria and the Low Countries) and low-ranking military staff who fought in battles of high importance.
My proposal would be an unofficial task force, which requires about 20 people to monitor and create these easy 30-minute pages. The point of this project is to allow people to more easily know of the people who actually died for our freedom.
Please RSVP on my personal talk page if you want. Multilingual people wanted.
Thanks — Preceding unsigned comment added by GeekyDave (talk • contribs) 14:59, 2 January 2023 (UTC)
- You may be interested in WikiProject Military history and its biography task force. Thebiguglyalien (talk) 16:25, 2 January 2023 (UTC)
Skip to Top Buttons
Wiki articles can be very long, involving much scrolling, especially when using a device with a small screen such as a cell/mobile phone or tablet. As these devices are increasingly popular, it would greatly benefit Wiki readers to have Skip to Top Buttons added to articles. — Preceding unsigned comment added by Jossdickie (talk • contribs) 15:36, 31 December 2022 (UTC)
- @Jossdickie: I've moved this to idea lab, as it would need more development before having a community proposal. But that being said, work on this is sort of already being done; in the new skin, Vector-2022, there is now a floating table of contents that has a return to top capability built in. See what this looks like by using this link: https://en.wikipedia.org/wiki/Saturn?useskin=vector-2022 . The (Top) on there should solve this concern? You can enable this skin for yourself in Special:Preferences#mw-prefsection-rendering. — xaosflux Talk 15:48, 31 December 2022 (UTC)
- Yeah, I use the Vector-2022 (Top) button all the time, very handy when I'm working my watchlist, and have scrolled way down a page to find what I'm interested in, then just scroll up the TOC and hit the (Top) button and then the watchlist button to move on. Donald Albury 20:35, 31 December 2022 (UTC)
I’ve been to your try it link but have no floating contents and only a Go to top at the bottom of the them. I’m using Chrome with iOS 15.3.1 on an iPhone. Could you post a link to the ideas lab? — Preceding unsigned comment added by Jossdickie (talk • contribs) 17:01, 1 January 2023 (UTC)
- @Jossdickie Korean Wikipedia has a gadget to do this. You can use this gadget by adding:
mw.loader.load("https://ko.wikipedia.org/w/index.php?action=raw&title=미디어위키:Gadget-scrollUpButton.js&ctype=text/javascript");
- to your common.js. This will add an up arrow to the bottom right of your screen which will take you to the top of the page. Terasail[✉️] 20:52, 2 January 2023 (UTC)
- Oh, that is a desktop skin; you are in mobile, not sure. — xaosflux Talk 21:16, 2 January 2023 (UTC)
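For readers wondering what a scroll-to-top user script involves, here is a minimal sketch. It is not the Korean Wikipedia gadget's code, which is not reproduced in this thread. The function name shouldShowTopButton and the 300-pixel threshold are assumptions made for illustration only.

```javascript
// Hypothetical helper: decide whether the button should be visible.
// thresholdPx is an assumed cutoff, not a value taken from any gadget.
function shouldShowTopButton(scrollY, thresholdPx = 300) {
  return scrollY > thresholdPx;
}

// Browser-only part: skipped when no DOM is available (e.g. under Node).
if (typeof document !== 'undefined' && typeof window !== 'undefined') {
  const button = document.createElement('button');
  button.textContent = '↑';
  button.title = 'Back to top';
  // Pin the button to the bottom right of the viewport, hidden at first.
  button.style.cssText = 'position:fixed;bottom:1em;right:1em;display:none;';
  button.addEventListener('click', () => {
    window.scrollTo({ top: 0, behavior: 'smooth' });
  });
  // Show the button only once the reader has scrolled past the threshold.
  window.addEventListener('scroll', () => {
    button.style.display = shouldShowTopButton(window.scrollY) ? 'block' : 'none';
  });
  document.body.appendChild(button);
}
```

Like the gadget, a script of this kind would go in a user's common.js, so it may only take effect in skins where user scripts are loaded.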
BOT makers Alert
- no text
- paywall
- free
- ....0mtwb9gd5wx (talk) 11:49, 3 January 2023 (UTC)
Series of RfC's on specific questions about Vector 2022
The general RfC was useful, but due to how broad it was I believe a series of specific yes/no questions, about areas that concerns were raised but insufficiently discussed, would be useful.
At the moment, I have three questions that I believe should be included. Work on wordsmithing these questions, as well as work on determining other relevant questions, would be useful.
- Should Vector 2022 be renamed Tensor?
- Should Vector 2022's default width be full width?
- Should the width toggle for Vector 2022 have permanence for logged out users?
BilledMammal (talk) 01:32, 13 December 2022 (UTC)
- I'm not sure that the third question is pointful. If it could be done, it already would have been. From what I'm hearing, the question amounts to "Should we re-write MediaWiki core from the ground up and double the size and complexity of the cacheing systems?", which is not something that non-technical people actually get to decide, unless they're on the Board and are prepared to appropriate tens of millions of dollars to make it happen. The devs tell us Wikipedia:Don't worry about performance except when they say so, and it sounds like this would be a serious performance problem. Whatamidoing (WMF) (talk) 20:10, 15 December 2022 (UTC)
- Based on this proposal I don't believe such an issue exists; if we can "hack" such a solution in, then the WMF can implement a proper solution. BilledMammal (talk) 00:39, 16 December 2022 (UTC)
- A cookie-based solution with Javascript is possible but would potentially result in the layout shifting after initial page load. Not sure which way would cause more dissatisfaction with non-logged in users. isaacl (talk) 00:47, 16 December 2022 (UTC)
- Pinging Alexis Jazz who, I think, had a method for storing logged-out users' preferences which has become buried in the verbose discussions. Certes (talk) 11:08, 16 December 2022 (UTC)
- Certes, there would be either a layout shift after initial load as isaacl described or the preference would be enforced through a URL parameter which is hackish and may impact cacheing complexity. So if a layout shift is unacceptable, sorry. — Alexis Jazz (talk or ping me) 15:00, 16 December 2022 (UTC)
- Btw, whenever the argument is made in discussions like these that "we should be encouraging people to make accounts", that argument should be countered by saying that if every reader actually did that, the servers would collapse as nothing could be cached anymore. The hackish URL parameter could cause a cache split (for a true/false parameter: double the number of cached versions), but every user actually registering an account would be infinitely worse. — Alexis Jazz (talk or ping me) 11:58, 20 December 2022 (UTC)
- I'm not entirely sure if I understand what a layout shift after initial load is - is it the same thing that happens with the WMF donation banners when they suddenly appear in an article? (Addition: if it's the same thing, then I oppose it for donation ads, but support it for width toggles.) 199.208.172.35 (talk) 21:27, 16 December 2022 (UTC)
- I think the answer is basically yes. It would be similar to a Flash of unstyled content. Whatamidoing (WMF) (talk) 23:53, 3 January 2023 (UTC)
@BilledMammal: Please, add a question for having a hybrid TOC system like Sushi or Song, which preserves the legacy TOC along with the new one.--Æo (talk) 01:10, 19 December 2022 (UTC)
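The cookie-based approach that isaacl and Alexis Jazz describe in this thread might look roughly like the sketch below. It is an illustration under stated assumptions, not how Vector 2022 works: the cookie name hypotheticalWidthPref and the class name hypothetical-full-width are invented here. The point it shows is that the cached page is served at the default width to everyone, and the script can only apply the stored preference after the page has loaded, which is where the layout shift comes from.

```javascript
// Hypothetical cookie name; not something MediaWiki or Vector 2022 defines.
const WIDTH_COOKIE = 'hypotheticalWidthPref';

// Parse a "k=v; k2=v2" cookie string and return the stored width preference,
// defaulting to 'limited' when the cookie is absent or has an unexpected value.
function readWidthPref(cookieString) {
  for (const part of cookieString.split(';')) {
    const [name, ...rest] = part.trim().split('=');
    if (name === WIDTH_COOKIE) {
      const value = decodeURIComponent(rest.join('='));
      return value === 'full' ? 'full' : 'limited';
    }
  }
  return 'limited';
}

// Build the cookie string that records the reader's choice for one year.
function buildWidthCookie(pref) {
  return `${WIDTH_COOKIE}=${encodeURIComponent(pref)}; path=/; max-age=31536000`;
}

// Browser-only: the cached HTML arrives with the default width, and the class
// is toggled afterwards, which is the layout shift discussed above.
if (typeof document !== 'undefined') {
  document.documentElement.classList.toggle(
    'hypothetical-full-width',
    readWidthPref(document.cookie) === 'full'
  );
}
```

The URL-parameter alternative mentioned above avoids the shift, but at the cost Alexis Jazz notes: each parameter value needs its own cached copy of every page.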
WP/WMF - WP Development approval process? NPP/New Article changes
What is our process for community suggested development? The WMF IT strategy for existing editors has been sustain, but a lot more community suggested development needs to be done. The NPP petition has led to discussion between growth and NPP (see January 2023 NPP newsletter). Well done! But they are also specifying changes to the New Article Wizard Process (also well done). So what is the process?
- Has to be on the community wishlist first?
- Should big dev changes have to go through ideas/proposals/RfC? Would that stop changes?
- What should our design philosophy be? One size fits all (like Vector 2022) or allow configuration by project/editor?
- Should we concentrate on building tools or specific changes? (For instance with the article wizard, do we need a better wizard/workflow that can access tables and defaults, ability to configure top of page button interactions/workflow, a drop down menu for all editors...)
- Should the RfC/approval process be for the high level design? Or should the RfC be iterative?
- Do we need a list of problems/issues registry?
@Novem Linguae @Kudpung Wakelamp d[@-@]b (talk) 09:27, 5 January 2023 (UTC)
- Right now there are two things we (the NPP coordinators) are collaborating on with the WMF: 1) improving the PageTriage software, which I am leading, and 2) the new landing page proposal, which @Kudpung is leading. We don't want to burden the wishlist, so we have requested and so far been successful in getting some resources assigned to these two things outside the normal WMF annual plan process, which is good. I think #1 is making progress and doesn't really need much community input. It's uncontroversial and just a lot of code writing and software improvements and technical discussions. #2 is still in the brainstorming stage and could perhaps benefit from more community input. If you're interested in participating in #2, I'd recommend attending our next video conference with the WMF. We may eventually need to get wider feedback or RFC some of the items in #2, but we're not there yet, it's still pretty early in the process. Hope this helps. –Novem Linguae (talk) 09:47, 5 January 2023 (UTC)
- @Wakelamp and Novem Linguae: (since I was pinged): With the arrivals in 2022 of a new CEO and a new CPTO, we have enabled dialogues with them creating almost a new precedent for direct collaboration. The greatly oversimplified Article Wizard made by a volunteer that replaced its useful predecessor without substantive discussion as to its utility is not more than a click-through of some very basic article creation principles. It is neither dynamic nor interactive and offers no help. This will be replaced with a truly interactive system which will be built on the UX expertise brought in by volunteers, the anecdotal experience of New Page Reviewers who have most contact with new users, and the Foundation's Growth Team who will provide the coding and any additional good ideas they might have. It is totally beyond the scope and capacity of the wishlist, and it needs a proper budget to get it done properly. It is unlikely there will be any iterative RfC as they tend to impede rather than help the progress. The otherwise excellent PageCuration system for which Novem Linguae is leading a major overhaul was developed without any RfC but with solid collaboration between the devs and the New Page Patrollers who knew exactly what was wanted. It is hoped that this, together with the ideas for a more appealing and welcoming landing page will ultimately relieve NPP, AfC and other editors from providing lengthy explanations to users who do not understand or do not wish to follow the rules. However it's a major development and in the very earliest stage of basic discussion, don't expect an ETA or even anything to show the community much before autumn (fall) this year, or even later. Kudpung กุดผึ้ง (talk) 11:05, 5 January 2023 (UTC)
- @Kudpung Thank-you for your response to the ping. My concern was that NPP have an ongoing IT budget (every time I see the phrase "community wishlist" I throw large mental rocks at this modern version of Santa's list), and that drafts go through some sort of community process. I've stated my thoughts often enough about new article :-), so I won't repeat them.
- @Novem Linguae Thank-you for inviting me to the Zoom, but I suspect I am not in WMF's good books. But if you wish, I am happy to review any documents. (I am old school though, and I work with use cases, although I have retired my IBM flowchart template :-)) Wakelamp d[@-@]b (talk) 10:54, 7 January 2023 (UTC)
A Thing and its Recent Events
Hi, it's not always easy to find info on current events when you start at the "top level". My personal experience is about Brazil and the recent riots, but this topic is not about Brazil or the riots. It could have been about The Faeroe Islands and the increasing amount of environmental pollutants in Minke whales. The problem is to find articles on current issues. To sketch my Brazil example: To find out more about the riots, I went to the Brazil main article. There was a subheading on politics, but that was not even updated to include the recent election, AFAICS. Clicking "Bolsonaro" = same thing. Clicking "Recent History since 1985" = same thing. At last I found the article I was looking for in the left hand menu bar under "Current Events", and I still don't know whether there exists a link to this somewhere deep down in some subarticle on Brazil affairs. So ... would it be possible to include something like "This Entity is subject to an article under Current Affairs" (a tag?), or "Featured in Current Events" (infobox?), or "Current events:/Ongoing developments" with a link to articles on current events or ongoing developments"? TiA! T 84.208.65.62 (talk) 05:41, 9 January 2023 (UTC)
- you are raising some important points. @84.208.65.62 as you note, we should be updating things more regularly. as a start, I suggest that you visit the article 2020s in political history, and update the section for Brazil. Sm8900 (talk) 13:59, 9 January 2023 (UTC)
- More directly nearly every major top level topic has "YYYY in <topic>" lists, like 2023 in Brazil, which are good places for key events like this. Masem (t) 14:25, 9 January 2023 (UTC)
- Hi, thank you both for proposals. I don't want to argue, really, but it seems to me that these proposals do not address the core of the problem. Structurally, they are equal to advising going straight to Current Events, or: "why don't you go to where the information is?" (this is not meant to be sarcastic, I just want to pinpoint clearly what the problem is). It still won't help the person who is not aware of the categories and articles you mention and who tries to find information starting with the Top article, in this case "Brazil". But perhaps it is as you say, the key is simply to update e.g. the Brazil article (while keeping in mind Blueboar's caveats). That would turn my problem into a non-problem, which is perhaps the most desirable solution. Thx anyhow. T 84.208.65.62 (talk) 15:45, 9 January 2023 (UTC)
- Hi, thx for your advice and kind invitation to WP. I've been IPing around WP since 2013 and am comfortable in that station. I'm not a native EN speaker and do not feel qualified to add extensive amounts of text, not to mention entire articles, instead I dabble in mopping up minutest errors like typos and such. Happy 2023 :) T 84.208.65.62 (talk) 15:58, 9 January 2023 (UTC)
- Although, please see WP:NOTNEWS. While we do want to cover recent events, we also need to hesitate a bit… to ensure that we have enough sourcing to place the events into proper context, and to give those sources a chance to separate rumor from fact. This can be difficult in fluid situations such as riots and similar upheavals. Blueboar (talk) 14:22, 9 January 2023 (UTC)
- Hi, thx. for the advice. Just to be as clear as can be, I'm not advocating creating new "Current Events" articles, merely to allow (perhaps inexperienced) users to somehow join or trace an existing CE article from or to its main entity article, in cases where said CE article is not included or linked from or to its main entity article. T 84.208.65.62 (talk) 15:49, 9 January 2023 (UTC)
Asking for the creation of new sources to cite
A Wikipedia namespace page facilitating Wikipedia editors to ask for the references that are needed for articles in the article/main namespace. Rather than wait passively for a usable reference to organically emerge, a reference can be actively sought (not pro-pro-actively), just actively sought, just listed. Feasibility basics apply such as one open request per user. FacetsOfNonStickPans (talk) 11:33, 10 January 2023 (UTC)
- A wonderful request mechanism is already implemented in a localized and effective manner at Wikipedia:Graphics Lab/Map workshop. However requesting for sources may be a tad different. FacetsOfNonStickPans (talk) 12:38, 10 January 2023 (UTC)
- I think it would be better if we make a page for it @FacetsOfNonStickPans. Maybe WP:Potential sources by topic would be a good page name? CactiStaccingCrane (talk) 14:12, 10 January 2023 (UTC)
- We (sort of) have this already… if you need a source, try asking at the appropriate WP:Reference desk. Blueboar (talk) 14:38, 10 January 2023 (UTC)
- Or, just go digging around in your favorite search engine. When I run across items that I cannot use immediately, but which look like they might be a potential source for something, I add them in a Template:Refideas section on the talk page of an appropriate article (see, for example, Talk:Shell ring), or, if an appropriate article doesn't exist yet, on a subpage of my user page. - Donald Albury 18:47, 10 January 2023 (UTC)
- We should clarify that we're not literally asking for the creation of new sources. Certes (talk) 17:03, 12 January 2023 (UTC)
Frivolous proposals at village pump
Is there something that can or should be done about “serious” (non-idea lab) proposals that fail the WP:SNOW test at the Village Pump? Recently we’ve had such unworkable and patently trivial proposals as “change the Wikipedia slogan” or “add percentage of women ministers in government to country infoboxes”. These would never have passed rough consensus at idea lab, let alone an actual vote. Dronebogus (talk) 01:56, 12 January 2023 (UTC)
- I understand your point, but in my opinion, the whole point of the "Idea tab" is to provide a non-judgmental forum, precisely to include ideas which might not be considered credible in other venues. Sm8900 (talk) 15:42, 12 January 2023 (UTC)
- I’m not talking about the idea lab, I’m talking about the proposals section of the pump. Dronebogus (talk) 23:40, 12 January 2023 (UTC)
- For proposals that have no chance of passing, the most efficient way to deal with them is to ignore them. Don't give them any oxygen. isaacl (talk) 05:06, 13 January 2023 (UTC)
need help with assessment
would anyone like to assist with assessing articles at WikiProject History? I just made some improvements to the assessment page for that Wikiproject. feel free to let me know. thanks!! Sm8900 (talk) 21:27, 12 January 2023 (UTC)
- @Sm8900 Side-note that you may want to check Wikipedia:Village_pump_(miscellaneous)#Improper_handling_of_assessment_for_inactive_WikiProjects. Piotr Konieczny aka Prokonsul Piotrus| reply here 05:53, 13 January 2023 (UTC)
- @Prokonsul Piotrus, thanks so much! I agree entirely with all of your points at that section. I appreciate you letting me know. thanks! Sm8900 (talk) 12:57, 13 January 2023 (UTC)
Making ANI less cesspit-like
Wikipedia:Administrators' noticeboard/Incidents as it stands right now is a free-for-all basically. How can we fix this? CactiStaccingCrane (talk) 10:19, 10 January 2023 (UTC)
- It will be fixed when people stop behaving badly in the rest of Wikipedia. --Jayron32 13:44, 10 January 2023 (UTC)
- That's not a really crazy thing to say as most of the long ANI threads involve only experienced users. I think there should be something that's needed to be done about this. CactiStaccingCrane (talk) 13:49, 10 January 2023 (UTC)
- I think a great start would be for somebody to propose a solution, which you seemingly haven't. I don't really have one either - I do think it's weird that a noticeboard with Administrators' in the title has a lot of regulars that aren't administrators. Do we need to change that? I don't know. casualdejekyll 13:46, 10 January 2023 (UTC)
- I've kinda done that here, but it doesn't seem to attract many comments. CactiStaccingCrane (talk) 13:47, 10 January 2023 (UTC)
- To illustrate the issue with trying to get editors to align on desirable behaviour: in the first item of your proposal at the talk page for the administrator's noticeboard, you propose using neutral language to discuss issues, and yet you repeated your use of "cesspit" in the heading for this discussion. Surely you could have found a word that describes the situation from your point of view that has more neutral connotations? "Free-for-all" in your lead sentence might not be perfect, but personally I feel it is more suitable. But with English Wikipedia's consensus-based decision-making traditions, no one person can decide what's neutral. Any proposal needs to do more than lay out end goals, but also address how to get there. isaacl (talk) 17:08, 10 January 2023 (UTC)
- What I meant by making the headers more neutral is to not make it provocative to the editors. Yes, it's kinda ironic that I've also made this header provocative with "cesspit" but there's something to be said when the people at ANI call the other person in the header like "Hounding and edit warring by X", "Papua Conflict, 3rd mediator, Nationalist Agenda", etc. It seems to me that this is not a good way to build a constructive discussion and rather a good way to build drama and tension. CactiStaccingCrane (talk) 23:36, 10 January 2023 (UTC)
- By the time something reaches ANI, there is already drama and tension. Thousands of discussions every day happen, and most of them don't end up at ANI, because the participants are already willing to discuss things in good faith with one another. By the time someone's asking for administrative intervention, those normal expectations of civil and reasonable discussion have, presumably, already broken down. So, how do you propose to ask an editor who comes to ANI because someone else called them an "asshole with the intelligence of a brick" not be upset at that point? (And how do you propose to handle it if the editor in question really was acting foolish, and the other person dealing with them finally made a nasty comment out of frustration with that?) I'm not asking rhetorically—if you have better ideas for calming situations like that, I think we would all love to hear them. But it's not as easy as "Just be cool, alright?". If it were that easy, we wouldn't need ANI in the first place. Seraphimblade Talk to me 23:44, 10 January 2023 (UTC)
- I actually think that most discussions shouldn't be in ANI in the first place. Plenty of drama can be averted if Wikipedia:Third opinion is used more frequently, and trivial matters such as blocking an IP for vandalism can be moved to Wikipedia:Administrator intervention against vandalism. ANI should only be for the more egregious cases when alternative venues have been exhausted. CactiStaccingCrane (talk) 23:50, 10 January 2023 (UTC)
- Generally, if it made it to ANI it's past the 3O threshold. Although there are some content issues like that, it's more often not. Also, almost all of the vandalism ends up at AIV, and the ones that make it to ANI are handled very quickly. The problem is sometimes people don't get along, and by the time it reaches ANI it's normally boiling over already. ScottishFinnishRadish (talk) 23:59, 10 January 2023 (UTC)
- Most discussions already aren't at the incidents noticeboard, and editors who (knowingly or unknowingly) skip the more appropriate venues are redirected to them. I don't think matters such as blocking IP addresses for vandalism are the sources of acrimonious discussion about which you are concerned. The third opinion process is explicitly limited to cases where there are exactly two editors involved in the discussion, and the third opinion has no special weight, so it won't resolve highly contentious disputes.
- My suggestion to try to dampen the escalation of contention is to ask participants to engage in a round-robin discussion phase, where a moderator decides when to move onto the next round, and when to exit the round-robin discussion phase. Participants can make one comment per round. Unfortunately, like many other proposals, this proposal relies on the good will of participants to agree to this procedure, and it also needs someone willing to act as moderator. To date, I have only seen a few (maybe a couple?) editors express interest in this approach, probably because many editors don't want their comments to be throttled in either number or time. I appreciate that it's difficult to keep a large group of editors in different time zones actively engaging in discussion, and it gets harder as the discussion extends further, so I understand why some want to be able to discuss matters as fast as possible. isaacl (talk) 02:41, 11 January 2023 (UTC)
So, how do you propose to ask an editor who comes to ANI because someone else called them an "asshole with the intelligence of a brick" not be upset at that point? (And how do you propose to handle it if the editor in question really was acting foolish, and the other person dealing with them finally made a nasty comment out of frustration with that?) I'm not asking rhetorically—if you have better ideas for calming situations like that, I think we would all love to hear them.
- This point really captures the crux of the issue for me. I have an idea for situations like that, and it's for Wikipedia to do the same thing that we humans do to handle these situations in the "real world": hire professional "police". I'm entirely serious: we should take some of those hundreds of millions of donated dollars and use them to hire, train, and pay people to be the first responders for conduct disputes like "someone called me an asshole". The professionals--who would be entirely uninvolved, unlike fellow editors who currently field such complaints--would process complaints in accordance with policies set by the volunteer community, and they would be supervised by and answerable to the volunteer community (who should retain the power to overrule/suspend/terminate any individual professional for poor performance). This would eliminate multiple problems: (1) complaints being handled by people who have an inherent conflict of interest because they're fellow editors, (2) complaints being handled by amateurs with no dispute resolution training, experience, and, sometimes, skill, (3) complaints being handled (or not handled) by people who don't care and have no reason to care, and (4) no real procedure to 'watch the watchers' or review whether the people handling complaints are doing it well (TBANs from ANI are rare; unhelpful participation, less so). These are some of the reasons why in the real world we've learned to hire professional police with civilian oversight, as well as professional judges, rather than just having a "mob court" run by untrained, unsupervised volunteers from the community. We have the resources (the money) to do the same. And, yes, I'm suggesting T&S should handle ANI, and the community should only get involved in setting the policy that T&S enforces and hearing appeals of T&S decisions. Levivich (talk) 20:59, 15 January 2023 (UTC)
- For reference, here's the thread when you also raised this topic last year: Wikipedia talk:Administrators' noticeboard/Archive 16 § Making ANI less toxic isaacl (talk) 05:03, 11 January 2023 (UTC)
- I have wondered whether we could re-structure ANI so that it's not so big. Perhaps automatically rotate between the days of the week, e.g., [[Wikipedia:Administrators' noticeboard/Incidents/Monday]] for all threads started on a Monday? The display could be the same (just transclude the pages; several large wikis have used this approach for their village pumps for years) and the archiving can be the same, but you'd have a smaller subset to work on at any given point in time.
- This began as a technical thought. There are 1.2 million revisions of that page. If we started fresh every now and again (e.g., [[Wikipedia:Administrators' noticeboard/Incidents 2023]]), there are some stats tools that could track those pages but currently give up because the history is too long.
- Now I wonder whether making the page smaller in some way (subject, date, etc.) would make the page seem more manageable to people who don't love the drama but would be willing to help out with a little. You could watchlist the one that you wanted, without having to see, and feel discouraged by, all the other incidents. Someone who wanted to help out, but not have to read everything, could choose to put one or two days' pages on their watch lists, and leave the other days to other people. Whatamidoing (WMF) (talk) 22:28, 11 January 2023 (UTC)
- In that vein, it used to be more common to create subpages for extended discussions or topics that generated repeat ANI threads; I don't know why that practice stopped. Legoktm (talk) 00:04, 12 January 2023 (UTC)
- I strongly support creating subpages for each individual complaint. Dronebogus (talk) 01:43, 12 January 2023 (UTC)
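The weekday rotation suggested above by Whatamidoing (WMF) can be sketched as a small mapping from a thread's start time to a subpage title. This is only an illustration of the idea. The function names are hypothetical, and using the UTC weekday is an assumption based on signatures on this page being in UTC.

```javascript
const BASE_TITLE = "Wikipedia:Administrators' noticeboard/Incidents";
const WEEKDAYS = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'];

// Return the hypothetical subpage for a thread started at `date`,
// using the UTC weekday (an assumption, see the note above).
function subpageForThread(date) {
  return `${BASE_TITLE}/${WEEKDAYS[date.getUTCDay()]}`;
}

// Wikitext that would transclude all seven subpages onto one display page,
// so the combined view could stay the same as today.
function transclusionWikitext() {
  return WEEKDAYS.map((day) => `{{${BASE_TITLE}/${day}}}`).join('\n');
}
```

An editor who only wanted to follow part of the board would then watchlist one or two of the seven subpages rather than the combined page.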
I think uninvolved administrators need to be given broader discretion to clerk discussions and tell editors leaving divisive or otherwise unhelpful comments to stop commenting on threads or on the board entirely. This seems to work at other venues that have tougher restrictions like WP:AE, ArbCom, SPI, etc. It seems that in most discussions non-admins are dominating the discussion; while I certainly would not support keeping them out of the discussion, it is somewhat ironic and counterintuitive for an admin noticeboard to operate that way. --Rschen7754 01:10, 12 January 2023 (UTC)
- This may be a rare case of more bureaucracy being better. The easiest way to restrict conflict is to restrict overall discussion in a formalized manner similar to ArbCom. ANI already is the equivalent of a civil dispute court if ArbCom is a criminal court, so why not make it similar to how ArbCom handles things? Dronebogus (talk) 01:47, 12 January 2023 (UTC)
- What do you mean "similar to how ArbCom handles things"? If you mean like how they do cases, the whole system there seems designed to strongly discourage discussion except among arbs, and makes any back-and-forth in the "preliminary statements" nearly impossible to follow after the fact. Anomie⚔ 12:52, 12 January 2023 (UTC)
Make the top navigation bar fixed
I noticed this when I'd scroll and want to get back to the top to perform a search. On large pages, this gets really tedious, and I use Wikipedia's search engine a lot. Can we please correct this by making the top navigation bar have a fixed position regardless of scrolling, both on mobile and desktop? This should be the case regardless of the user skin. — Python Drink (talk) 09:39, 15 January 2023 (UTC)
- No, it should not be the case regardless of the user skin. Some skins don't even have the search bar you're talking about in the top navigation bar (Monobook has it in the sidebar instead, for example). If you want the bar's position fixed but aren't using a skin like Timeless where that's already the case, you can try customizing your own user CSS to make it happen. Anomie⚔ 17:50, 15 January 2023 (UTC)
- @Python Drink You might be interested in my user CSS which does exactly this, for all screen widths. mw:User:Quiddity/Vector-2022-condensed.css. I.e. even on a phone, if you use the "Desktop" version (link in every page-footer) and Vector-2022, it will keep a fixed top-bar when you scroll. It does a lot of other things, and I've tried to add explanatory comments throughout, so you may prefer to only use parts of it, or may like to import the whole thing directly (my example) into your global.css. Hope that helps! Quiddity (talk) 20:13, 17 January 2023 (UTC)
- You will be happy to know that Vector 2022, the new skin that WMF has been developing, will have a fixed top bar. IznoPublic (talk) 19:13, 15 January 2023 (UTC)
- @IznoPublic, I hope it applies to mobile as well. — Python Drink (talk) 12:41, 16 January 2023 (UTC)
- Not yet, but maybe later. IznoPublic (talk) 16:51, 16 January 2023 (UTC)
Persistent incorrect use of a parameter of Template:Clarify
I have noticed that some users of Template:Clarify attempt to add a reason using an unnamed or numbered parameter instead of the correct reason=. I don't know how common this really is, but this came up at Hexachlorophosphazene (permanent link):
{{chem2|NH3 + [PCl4]+ → "HN\dPCl3" + HCl}}{{clarification needed|Incorrect chemical reaction! Where the + charge from the [PCl4]+ is missing at the right side of the reaction!|date=October 2022}}
- NH3 + [PCl4]+ → "HN=PCl3" + HCl[clarification needed]
which should be:
{{chem2|NH3 + [PCl4]+ → "HN\dPCl3" + HCl}}{{clarification needed|reason=Incorrect chemical reaction! Where the + charge from the [PCl4]+ is missing at the right side of the reaction!|date=October 2022}}
- NH3 + [PCl4]+ → "HN=PCl3" + HCl[clarification needed]
If you are using a platform with a tooltip, hovering over the clarify tag in the latter example should give the contents of the reason= parameter instead of the default message.
Since renaming the parameter to 1= might lead to the opposite issue, should we automatically correct this common mistake with a bot, or add built-in support for the unnamed parameter as an alias of reason=? I strongly suspect that other issue tags with a similar parameter (such as Template:Dubious) have the same problem, but again, I don't know how to verify this. –LaundryPizza03 (dc̄) 12:49, 16 January 2023 (UTC)
- @LaundryPizza03 Why not allow the use of both named and unnamed parameters in the template? You can use something like {{{reason|{{{1|The text near this tag may need clarification or removal of jargon.}}}}}}, so that if the template has a reason parameter set it uses that as the reason, if not it uses the first unnamed parameter, and if neither of those exist it uses the default text. 192.76.8.75 (talk) 11:30, 17 January 2023 (UTC)
- Using the first unnamed parameter as a reason when |reason= is not specified would boost usability at very little cost. Something as simple as {{{reason|{{{1|The text near this tag may need clarification or removal of jargon.}}} }}} should do the job. Certes (talk) 12:00, 17 January 2023 (UTC)
- I don't understand what specific problem this causes. It doesn't produce the tooltip, but maybe I don't want the tooltip to be produced (e.g., if I'm writing something snippy or that names a particular editor, which readers shouldn't see). WhatamIdoing (talk) 20:47, 17 January 2023 (UTC)
- If you specifically don't want the text to appear in a tooltip, you can always use a comment in the wikitext. This has the advantage of making it clearer that you don't intend it to be part of a tooltip through the reason parameter, so people are less likely to "fix" it in an attempt to be helpful. Caeciliusinhorto (talk) 21:14, 17 January 2023 (UTC)
- Do you use an unused unnamed parameter as a comment? If so then just write |comment=Is this the man born 1876 or similar with any other unused parameter but, as Caeciliusinhorto suggests, an HTML comment would be better. Certes (talk) 16:24, 18 January 2023 (UTC)
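- The fallback order that the nested expression {{{reason|{{{1|...}}}}}} suggested above would give can be modelled outside of wikitext. The Python sketch below is only an illustration of that lookup order, not the template's actual code; the function name resolve_reason and the dictionary representation of the parameters are assumptions made for the example.

```python
from typing import Mapping

# Default text quoted in the suggestions above.
DEFAULT_REASON = (
    "The text near this tag may need clarification or removal of jargon."
)


def resolve_reason(params: Mapping[str, str]) -> str:
    """Model the lookup order of {{{reason|{{{1|default}}}}}}.

    In wikitext, the fallback after the pipe is used only when the
    parameter was not passed at all; a parameter passed with an empty
    value still counts as defined. Key membership models that here.
    """
    if "reason" in params:
        return params["reason"]
    if "1" in params:
        return params["1"]
    return DEFAULT_REASON


if __name__ == "__main__":
    # Named parameter wins, then the first unnamed one, then the default.
    print(resolve_reason({"reason": "named", "1": "unnamed"}))
    print(resolve_reason({"1": "unnamed", "date": "October 2022"}))
    print(resolve_reason({"date": "October 2022"}))
```

One consequence of this order is that an explicitly empty |reason= would suppress the unnamed text, so a real template change might want to treat empty values as missing.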
Make an Internet 1.0~2.0 conservatory: Internet Stamps, Userbars, Avatars + Signatures, Referral buttons, other cultures from forums and more.
I was trying to find information about internet stamps, but I only found other things, like postal stamps. The Spanish Wikipedia has an article about userbars, but there is nothing in the English version. I find it a shame, because those things were part of 2000s internet culture and there is almost nothing about them online, even though these graphics followed a lot of rules that everyone observed, which made each one recognizably part of the group.
For example: avatars used to be 100x100 or 150x150, and in some odd cases 120x200. Buttons were always 88x31. Userbars were 350x19, with a diagonal stripe pattern, and the font used was Visitor at 10px. Stamps were 98x54 or thereabouts, always with the same border with some variations, animated or colored, with the Visitor font or custom text inside.
You wouldn't find anything that broke those rules. Even if there were a few exceptions among userbars and stamps, most were always the same size, or people made a couple of different sizes so you could combine them when you put a button and link from another page on your own page. (MSN Groups memories, anyone?)
There were other concepts too, like "png" or "render", which referred to removing the background of an image (usually characters from anime pictures or other things), and the term "recoloring", which referred to changing the colors of those images. There was also RPC culture (role playing character), which ALSO used other images as a base for making a character, mostly for roleplaying on forums (and a lot of people made bases for the same reason).
If you weren't around in those years, you will never know anything about this, and there are a few more things that I don't even remember. It's a shame that it isn't documented anywhere when it is a part of the internet's history that was quite strong at the time. Not for nothing, there's a person who made a page for storing the old net things, and also people who made a new "geocities" to maintain the old internet customs and let new people experiment and try it. (There are also people trying to save info about Geocities, like OoCities and Reocities.)
And yet, there's almost nothing about this part of the internet's history on Wikipedia. 181.28.91.80 (talk) 18:10, 18 January 2023 (UTC)
- If something is missing from Wikipedia that should be there, then by all means go ahead and add it. --Jayron32 18:11, 18 January 2023 (UTC)
- I would like to, but I'm not very experienced as a Wikipedia editor. I would need help with the article, and even more with the information, because I only know parts of the story; people in their 30s or 40s would remember more than me. I'm also not sure which other articles about that internet era it should be linked from so that people notice it. 181.28.91.80 (talk) 18:21, 18 January 2023 (UTC)
- Just in case; I'm not sure if this is the correct zone to post this, maybe I should have posted it here instead. 181.28.91.80 (talk) 18:15, 18 January 2023 (UTC)
- Anything that has significant coverage in independent reliable secondary sources can potentially be an article. I was there at the time (I was using the Internet before Tim Berners-Lee invented the World Wide Web), but I have to say that I don't remember there being standards for such things, formal or informal. Phil Bridger (talk) 18:58, 18 January 2023 (UTC)
- I know that those were very common between 2006 and 2014, before social media swallowed everything.
- Those things were mostly seen on forums oriented to games, anime, comics, roleplay, and other kinds of entertainment, or on sites full of creative people, like DeviantArt and similar social media that allowed customization.
- Some of them limited your image display to certain sizes, 100x100 being one of them, like a standard, though there were a few others too.
- It's hard to find those pages or forums as examples because they're gone. But there was a trend there.
- I remember that there were also people using custom cursors, with stars and things that fell from the cursor; that was a thing too. Cinni has preserved that pretty well. I would need more people to check those things. 201.177.253.223 (talk) 05:29, 19 January 2023 (UTC)
- You seem to be talking about things (phenomena) that have proven to be ephemeral. Borrowing from the guideline at Wikipedia:Notability (events)#Inclusion criteria, I think something like the following would apply to the phenomena you are describing:
- Phenomena are probably notable if they have enduring historical significance and meet the general notability guideline, or if they have a significant lasting effect.
- Phenomena are also very likely to be notable if they have widespread (national or international) impact and were very widely covered in diverse sources, especially if also re-analyzed afterwards (as described below).
- Phenomena having lesser coverage or more limited scope may or may not be notable.
- Routine kinds of phenomena – whether or not widely reported at the time – are usually not notable unless something further gives them additional enduring significance.
- So, if reliable sources establishing the notability of these phenomena you are talking about are no longer available, then they have not had the enduring significance that would earn them a place in an encyclopedia.
- Donald Albury 13:31, 19 January 2023 (UTC)