User:FearBot/EvalFunc
The Evaluation Function is a Java function that assigns an article a score; the lower the score, the better the article. Scores are interpreted as follows:
- Less than 0: Great
- 0-10: Ok
- 10-16: Shows Template:IdentifiedSpam
- 16-20: WP:PROD
- Greater than 20: WP:SPEEDY
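The bands above can be sketched as a small lookup. This is only an illustration: the class name ScoreAction and the method forScore are hypothetical, and because the listed ranges share their end points (10 and 16 each appear in two bands), the sketch assumes a shared boundary belongs to the lower band.

```java
// Hypothetical helper, not part of the bot: maps a score to the action listed above.
final class ScoreAction {
    // Assumption: a boundary value shared by two bands (10, 16) counts toward the lower band.
    static String forScore(int score) {
        if (score < 0) return "Great";
        if (score <= 10) return "Ok";
        if (score <= 16) return "Template:IdentifiedSpam";
        if (score <= 20) return "WP:PROD";
        return "WP:SPEEDY";
    }

    public static void main(String[] args) {
        // Print the action for one sample score from each band.
        for (int s : new int[] {-5, 0, 12, 18, 25}) {
            System.out.println(s + " -> " + forScore(s));
        }
    }
}
```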
Information
The parameters are as follows:
- MediaWikiBot mwb: The bot object from JWBF (Java Wiki Bot Framework); not used in this function
- SimpleArticle article: The article in question. article.getText() returns its text, article.getLabel() returns its label.
The variables are as follows:
- String[] lines: An array of the lines of the article's wikitext
- String text: The article text, converted to lowercase
- int p: The score
- int len: The article length in characters
- int numtemplates, int numlinks, int numimages, int numcomments: Self explanatory (if you don't get it, num = Number Of)
- int numelinks: Number of external links
The data arrays are as follows:
- String[] badwords, String[] goodwords: Arrays of words (from User:FearBot/Wordlists)
- String[] lupinBadwords: Some bad words taken from User:Lupin/badwords (does not include the regex entries)
- String[] langs: List of language codes used on Wikipedia, for spotting interlanguage links
The functions are as follows:
- int sOccC(String str, String substr): Returns the number of times substr occurs in str (sOccC = String Occurrence Count)
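The page does not show the body of sOccC, so the following is only a minimal sketch of how such a counter might look. It assumes non-overlapping matches (so "aa" occurs twice in "aaaa") and returns 0 for an empty substr; the wrapper class OccurrenceCount is hypothetical.

```java
// Hypothetical wrapper class; only the sOccC signature comes from the description above.
final class OccurrenceCount {
    // Counts non-overlapping occurrences of substr in str (an assumption; the
    // original implementation is not shown and may count differently).
    static int sOccC(String str, String substr) {
        if (substr.isEmpty()) {
            return 0; // guard: an empty pattern would otherwise never advance
        }
        int count = 0;
        int from = 0;
        while ((from = str.indexOf(substr, from)) != -1) {
            count++;
            from += substr.length(); // skip past this match
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(sOccC("[[a]] and [[b]]", "[["));
    }
}
```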
The Code
// The text is lowercased once, so every pattern searched for below must be
// lowercase too. (Patterns such as "[[Image:" or "Infobox" could never match.)
String text = article.getText().toLowerCase();

// Already tagged for speedy deletion: nothing to do.
if (text.contains("{{db")) {
    return 0;
}

// Short redirect pages are not scored.
String[] lines = text.split("\n");
if (lines.length > 0 && lines.length <= 2 && lines[0].startsWith("#redirect")) {
    return 0;
}

int p = 0;
int len = text.length();

// Length: short articles are penalized, long ones rewarded.
if (len < 100) { p += 15; }
if (len < 300 && len >= 100) { p += 5; }
if (len < 500 && len >= 300) { p += 2; }
if (len < 1000 && len >= 500) { p += 1; }
if (len > 1000) { p -= 5; }
if (len > 2000) { p -= 5; }

int numlinks = sOccC(text, "[[");
int numelinks = sOccC(text, "[http:");
int numimages = sOccC(text, "[[image:");
int numtemplates = sOccC(text, "{{");
int numcomments = sOccC(text, "<!--");

// Markup: signs of a maintained article lower the score; signatures,
// untouched editor placeholders and exclamation marks raise it.
p -= sOccC(text, "class=wikitable") * 5;
p += sOccC(text, "--[[user:") * 2;
p -= sOccC(text, "<ref>") * 3;
p -= sOccC(text, "{{cite") * 3;
p -= sOccC(text, "infobox") * 5;
p -= sOccC(text, "stub") * 10;
for (String lang : langs) {
    p -= sOccC(text, "[[" + lang + ":") * 5; // interlanguage links
}
p -= sOccC(text, "[[category:") * 5;
p -= sOccC(text, "{{reflinks}}") * 10;
p -= sOccC(text, "redirect") * 20;
p += sOccC(text, "'''bold text'''") * 5;
p += sOccC(text, "== headline text ==") * 5;
p += sOccC(text, "!") * 3;
p -= sOccC(text, "'''") * 3;
p -= sOccC(text, "disambig") * 15;
p -= sOccC(text, "|") * 3;
p -= sOccC(text, "==") * 5;
p -= sOccC(text, "<") * 2;

// Word lists: bad words must stand alone between spaces.
for (String bw : badwords) {
    p += sOccC(text, " " + bw + " ") * 3;
}
for (String bw : lupinBadwords) {
    p += sOccC(text, bw) * 5;
}
for (String gw : goodwords) {
    p -= sOccC(text, gw) * 3;
}

// Counts of templates, links, images and comments.
if (numtemplates < 5) { p += 2; }
if (numtemplates > 5) { p -= numtemplates / 2; }
if (numlinks < 3) { p += 10; }
if (numelinks > 1) { p -= numelinks; }
if (numimages == 0) { p += 2; }
p -= numcomments;

// A title written entirely in capitals is suspicious.
if (article.getLabel().equals(article.getLabel().toUpperCase())) {
    p += 5;
}

System.out.println("Article " + article.getLabel() + " scored " + p);
return p;
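To see how much article length alone moves the score, the length checks from the code above can be isolated. The sketch below copies those thresholds into a hypothetical helper (the names LengthScore and lengthContribution are not part of the bot); it also makes visible that a length of exactly 1000 characters contributes nothing.

```java
// Hypothetical helper that reproduces only the length-based part of the score.
final class LengthScore {
    static int lengthContribution(int len) {
        int p = 0;
        if (len < 100) {
            p += 15;
        } else if (len < 300) {
            p += 5;
        } else if (len < 500) {
            p += 2;
        } else if (len < 1000) {
            p += 1;
        }
        // The two bonuses stack, so anything over 2000 characters gets -10.
        if (len > 1000) { p -= 5; }
        if (len > 2000) { p -= 5; }
        return p;
    }

    public static void main(String[] args) {
        for (int len : new int[] {50, 250, 400, 1000, 1500, 2500}) {
            System.out.println(len + " chars -> " + lengthContribution(len));
        }
    }
}
```

Under these thresholds a 50-character article starts at 15, just inside the Template:IdentifiedSpam band before any other check runs, while a 2500-character article starts at -10.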