User:Eastlaw/Not all business articles are spam

From Wikipedia, the free encyclopedia
Like Wikipedia itself, this essay is a work in progress. I invite anyone reading this to help me improve it, or otherwise to leave some constructive criticism on the talk page.

Nobody likes spam. It is an annoyance, whether in e-mail, on discussion boards, or on a wiki such as this one. And since Wikipedia is such a popular website, it is understandable that those people with businesses, products, or musical groups to promote would want to use Wikipedia's visibility to hawk their wares. This, of course, is why there are procedures for eliminating spam from Wikipedia.

Unfortunately, hypervigilance over spam in Wikipedia, combined with an increasingly deletionist attitude among Wikipedia's editors, has led to a rather disturbing phenomenon. I am referring here to situations in which overzealous new page patrollers and recent changes patrollers decide to place speedy deletion tags (especially {{db-spam}}) on articles which are otherwise properly sourced and assert notability, solely for the reason that the article is about a private company or organization.

My purpose in writing this essay is to explain why such behavior can be a problem, and explore potential solutions which will allow the elimination of spam articles without disrupting the creation of new content or disturbing editors who contribute such content.

Why this is a problem

Discouraging contributors

Deletion and other cleanup tasks are an inevitable part of keeping Wikipedia functioning and reliable. But overreliance on speedy deletion, particularly right at the time when new articles are created and submitted, creates a sort of chilling effect on contributors of new material. It irritates longtime contributing editors, and it also discourages newcomers from staying and contributing more material. It sends the message that "your contributions are unwelcome, no matter how closely you followed the rules". This is a violation of the tenet "assume good faith", as it makes the automatic assumption that the contributor is out to spam Wikipedia, and treats the person as if he were guilty until proven innocent.

Being overly trigger-happy about speedily deleting articles suppresses the creation of good content, by presenting editors with the threat that all their hours of good work and edits will be destroyed by a single, anonymous person, whom they have never met and who probably has little to no knowledge of the subject about which they are writing. (Which, in and of itself, demonstrates why following process is so important, but I'll get more into that another time.)

Websites, including Wikipedia, are nothing without content. The people who do research and write articles are the ones who build Wikipedia--the English Wikipedia didn't get to 2.5 million articles by means of hyper-deletionism. Discouraging the creation of new content is damaging to this project, and any behavior which discourages such creativity must be minimized as much as possible.

Additional work for admins

The overuse of speedy deletion tags also creates more work for administrators, who already have a large backlog to work through. And while many articles which get sent to WP:CSD are truly crap which should be deleted (e.g. Myspace bands no one has heard of, advertisements for shady online pharmacies, stuff that kids made up in school, etc.), plenty of others are potentially viable, either as is or with a modicum of improvement. Nominating articles for speedy deletion without examination of the proper criteria simply results in more pages for admins to examine, when they have many other better things to do with their time.

How to solve it

This problem is solvable, but it will take effort from everyone involved--admins, longtime regular editors, and relatively new users alike.

When all else fails, read the policies

The first, and perhaps most important step in addressing this issue is for people to actually read and understand the criteria for speedy deletion and the Wikipedia:Notability guidelines, and make a good-faith effort to actually comply with them, in spirit if not in letter.

Since this essay is focusing mostly on spam, let me refer you to the basic notability criterion for speedy deletion, WP:CSD#A7 (emphasis mine):

[T]o avoid speedy deletion an article does not have to prove that its subject is notable, just give a reasonable indication of why it might be notable. A7 applies only to articles about web content or articles on people and organizations themselves, not articles on their books, albums, software and so on.

What this means is that, to survive A7, the article need only assert notability. Failing A7 is a lower bar than merely being "non-notable"--it means that the author hasn't even taken the time to explain why the subject of the article might be notable.

Now let's have a look at WP:CSD#G11, the anti-spam criterion:

Blatant advertising. Pages which exclusively promote some entity and which would need to be fundamentally rewritten to become encyclopedic. Note that simply having a company or product as its subject does not qualify an article for this criterion.

Further down the page at WP:CSD is a list of Non-criteria for speedy deletion:

Articles that seem to have obviously non-notable subjects are only eligible for speedy deletion if the article does not assert the importance or significance of its subject.

This sums up my earlier point quite succinctly. If the article makes some claim to notability, if it gives a "reasonable indication of why it might be notable", it is not an appropriate candidate for speedy deletion.

A related point is that articles about businesses or organizations which contain biased or promotional language may be salvageable, so long as they do not need to be "fundamentally rewritten".

Don't take the lazy way out

This applies both to contributors of content, and to those who patrol and examine it. If you are writing a new article, regardless of the subject, put indicia of notability right up front in the lead section (the heading of the article). Include reliable sources, and cite them, with footnotes where possible. And, even if only for the sake of appearances, use a spell checker and paragraph breaks (or section headings) whenever possible. You don't have to be a master of wiki markup language to do that.

Likewise--and this is absolutely critical--if you are reviewing new pages, don't just look at a page and say "Oh, this page is about a rock band/rapper/company/law firm/charity that I've never heard of, therefore it must be spam, therefore I must use {{db-spam}}". You need to actually look carefully at the article, see which (if any) sources are cited, and to what extent they confer notability. This isn't that hard if you think about it: is there at least one reputable third-party/independent source for the material? If so, and it is cited in the article, it probably isn't a candidate for speedy deletion. On the other hand, if the article consists mostly of self-published sources and/or original research, or is mostly just promotional puffery, it probably is. Remember also that just because a subject is not widely known outside a specialty field, that does not automatically render it non-notable.

Better means of detecting spam articles

Look at the overall appearance of the article

Is the article sloppy and unformatted? Is it riddled with mistakes in spelling, grammar, or punctuation? Or does it look like the creator actually put a bit of effort into it? While the overall layout and appearance of a new article is not in itself a determinant of whether or not it should be deleted, it can give you some idea of how important it is to the person who created it, and how potentially valuable it could be to the project. Like I said, no one likes to see good work flushed down the drain.

And of course, if the article is a blatant copyright violation, then it must be deleted.

Then, look at content and writing style

This is where the policy of NPOV comes in so handy: not only is it a good principle to follow when writing, but whether or not it has been followed can tell you a lot about both the subject of the article and its author.

Is the article an obvious puff piece, or is it a disparaging attack? Ideally, it should be neither. If it is either, it certainly warrants further scrutiny by new page patrollers and/or admins. The same holds true if the article was created by a person who has a blatant conflict of interest.

If the article offers a general description of the subject, without resorting to peacock terms, it is probably not spam. Likewise, if the article offers background about a subject, which, while properly sourced and verifiable, paints the article's subject in a less-than-flattering light, you can bet that it is probably not spam.

  • Example: Let's say User:John Doe creates an article about a company called Alpha Corporation. He describes the type of company it is, what products it makes, where it is located, etc. He also describes a lawsuit against the company by some of its former employees, and provides at least one reliable source about it. John Doe's article is probably not spam, because no spam advertiser would want to include material which reflects negatively on his company (especially if that material is true and documented by a third party).

Another thing to examine is what, if any, "inside" information the article offers about its subject. If the author gives details about the history of an organization, or its social/political connections, and does so in a neutral manner with reliable sources, this is also a sign that the article is probably legitimate and not spam. Even though notability is not necessarily "inherited", such content can form at least an assertion of notability which could save the article from speedy deletion, and give you a window into the intent of the author. Conversely, data such as telephone numbers, e-mail addresses of employees, pricing or ordering information, or other information which would be used to lure potential customers, would not constitute an assertion of notability, but rather a promotional come-on which does not belong in Wikipedia.

  • Example: User:Bob Barrister decides to write an article on the law firm of Smith, Jones & Brown. He describes the size and office locations of the firm and which areas of law it practices. He also gives a brief overview of the history of the firm, including a few famous politicians and law professors who had previously worked there, all backed with sources. This article is probably not spam, and should not be marked as such unless the rest of the language in the article is blatantly promotional.

Who is the contributor?

I'm not going to explain the finer details of the policy known as "WP:COI" here. You can read it yourself (in case you haven't done so already). However, I will say this: You would be amazed at how many times I have seen articles on companies or organizations written by editors who are not only members or employees of that organization, but explicitly identify themselves as such. For example, they may have a username which suggests a connection with the company they are writing about, or they may give information on their user page stating their employment or connections.

Conflicts of interest can be difficult to spot--but there are always people who are stupid enough not to hide their personal agendas.

Examine the sources (if any)

I have already mentioned the necessity of reliable sources above, but there is a further point I would like to make about sourcing.

Just because a source is not familiar to a general audience, that does not automatically render it biased, unreliable, or otherwise inadequate. Certain concepts, as well as the names of the specialists and companies who deal with said concept, may not be known to the general public, but that doesn't make them any less notable or reliable. This ties in with the specialist topics issue that I mentioned above. Certain laboratories or companies may not get much press outside of scientific literature, professional journals, or science-oriented websites. Certain lawyers, judges, or law firms may not get much mention outside of legal periodicals or law journals. (Seriously, how many non-lawyers have heard of American Lawyer magazine or the National Law Journal?) So long as the sources are generally reliable, and are being relied upon properly by the author (e.g., the author isn't misquoting the source or taking it out of context), it should, by rights, be considered an appropriate source.

This has consequences for assertions of notability. If someone makes reference to a company or organization being given a certain rating by a professional body, someone who has never heard of that professional body may have no idea what the author is talking about. This would be a good time to assume good faith and give the author the benefit of the doubt. You may want to consider asking the author some basic questions about his sources--that's what User Talk pages are for.

Check "Special:WhatLinksHere"

The procedure for administrators dealing with speedy deletion requires admins to check the "What links here" link at the left of each page before deleting an article. This is to make sure that 1) the article being deleted is not one which has many inbound links (as this could be disruptive to the encyclopedia) and 2) that the page doesn't link to other pages which may themselves be candidates for deletion.

This is a good idea for people who are nominating articles for speedy deletion or Wikipedia:Proposed deletion. Seeing how frequently an article is linked, or whether it is linked to one of the Wikipedia:Requested articles lists, may give page patrollers some ideas as to whether or not the subject of the article is notable or whether the article warrants speedy deletion.

Look at the categories that the article is in

Deletionists love to mention the adage that inclusion of one article doesn't warrant inclusion of similar articles. This is true, in a narrow sense, but there is a converse principle to it as well. If there exists a whole category of articles on similar companies or organizations, and someone comes along and writes another using the same format and similar types of sources, this does not in and of itself make the newer article a candidate for speedy deletion. Remember that "I don't like it" is also not a valid reason for deletion, and it certainly isn't a valid reason for nominating an article for a process as swift and drastic as speedy deletion.

A final thought

Spam is a problem in Wikipedia, and will likely continue to be for the indefinite future. We all need to use the appropriate processes and procedures to deal with it. This does not mean that we need to be so hypervigilant against spam that we begin alienating contributors of valuable content or deleting pages which, with a small amount of rehabilitation, could be decent articles.
