User:FearBot/EvalFunc

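The Evaluation Function is a Java function that assigns each article a score. The code adds points for signs of a low-quality page (very short text, bad words, few links) and subtracts points for signs of an established article (references, categories, infoboxes), so a higher score marks a more suspicious page.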

Information

The parameters are as follows:

  • MediaWikiBot mwb: The bot functions from JWBF, not used in this function
  • SimpleArticle article: The article in question. article.getText() is the text, article.getLabel() is the label.

The variables are as follows:

  • String[] lines: An array of the lines in the article's wikitext
  • String text: The article text, converted to lowercase
  • int p: The score
  • int len: The article length
  • int numtemplates, int numlinks, int numimages, int numcomments: Self explanatory (if you don't get it, num = Number Of)
  • int numelinks: Number of external links

The data arrays are as follows:

  • String[] badwords, String[] goodwords: Arrays of words (off User:FearBot/Wordlists)
  • String[] lupinBadwords: Some badwords gotten off User:Lupin/badwords (does not include regex)
  • String[] langs: List of language codes used in Wikipedia
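
As a rough illustration of how the word lists feed into the score, the sketch below applies the same weights as the code further down (+3 for each bad word surrounded by spaces, -3 for each good word). The list contents here are hypothetical placeholders, not the real entries from User:FearBot/Wordlists, and the `count` helper is an assumed stand-in for sOccC.

```java
import java.util.Locale;

final class WordListScoreSketch {
    // Hypothetical placeholder lists; the real ones live on the wiki pages named above.
    static final String[] BADWORDS = {"stupid", "dumb"};
    static final String[] GOODWORDS = {"references", "according to"};

    // Assumed stand-in for sOccC: counts non-overlapping occurrences.
    static int count(String str, String substr) {
        if (substr.isEmpty()) {
            return 0;
        }
        int n = 0;
        int from = 0;
        while ((from = str.indexOf(substr, from)) != -1) {
            n++;
            from += substr.length();
        }
        return n;
    }

    // Bad words raise the score, good words lower it.
    static int wordScore(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        int p = 0;
        for (String bw : BADWORDS) {
            // Padding with spaces means a bad word at the very start or end
            // of the text is not counted.
            p += count(lower, " " + bw + " ") * 3;
        }
        for (String gw : GOODWORDS) {
            p -= count(lower, gw) * 3;
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(wordScore("this is stupid and dumb stuff"));
    }
}
```

Note that because bad words are matched with a space on each side, a word at the very beginning or end of the text, or next to punctuation, is missed.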

The functions are as follows:

  • int sOccC(String str, String substr): Returns the number of times substr occurs in str (sOccC = String Occurrence Count)
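
The body of sOccC is not shown on this page. A minimal sketch of one way it could be written is given below, assuming it counts non-overlapping occurrences and returns 0 for a null or empty argument; both of those choices are assumptions, and the class name is hypothetical.

```java
final class OccurrenceCountSketch {
    // Counts non-overlapping occurrences of substr in str.
    // Returning 0 for null or empty input is an assumption of this sketch.
    static int sOccC(String str, String substr) {
        if (str == null || substr == null || substr.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        // indexOf returns -1 once no further match exists.
        while ((from = str.indexOf(substr, from)) != -1) {
            count++;
            from += substr.length(); // skip past this match
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(sOccC("[[a]] and [[b]]", "[[")); // prints 2
    }
}
```

With non-overlapping counting, five apostrophes in a row count as one match of `'''`, not three.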

The Code

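		// Lowercase once so every comparison is case-insensitive;
		// all search strings must therefore be lowercase too.
		String text = article.getText().toLowerCase();
		// Pages already carrying a {{db...}} deletion tag are skipped.
		if(text.contains("{{db")){
			return 0;
		}
		String[] lines = text.split("\n");
		// Short redirect pages are skipped as well.
		if(lines.length > 0 && lines.length <= 2 && lines[0].startsWith("#redirect")){
			return 0;
		}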
		int p = 0;
		int len = text.length();
		if(len < 100){
			p += 15;
		}
		if(len < 300 && len >= 100){
			p += 5;
		}
		if(len < 500 && len >= 300){
			p += 2;
		}
		if(len < 1000 && len >= 500){
			p += 1;
		}
		// >= so that a length of exactly 1000 does not fall between tiers
		if(len >= 1000){
			p -= 5;
		}
		if(len > 2000){
			p -= 5;
		}
		int numlinks = 0;
		int numelinks = 0;
		int numtemplates = 0;
		int numimages = 0;
		int numcomments = 0;
		numlinks = sOccC(text, "[[");
		numelinks = sOccC(text, "[http:");
		numimages = sOccC(text, "[[image:"); // lowercase literal: text was lowercased above
		numtemplates = sOccC(text, "{{");
		numcomments = sOccC(text, "<!--");
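		// Signs of an established article lower the score; signatures in the text raise it.
		// Literals are lowercase because text was lowercased above.
		p -= (sOccC(text, "class=wikitable") * 5);
		p += (sOccC(text, "--[[user:") * 2);
		p -= (sOccC(text, "<ref>") * 3);
		p -= (sOccC(text, "{{cite") * 3);
		p -= (sOccC(text, "infobox") * 5);
		p -= (sOccC(text, "stub") * 10);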
		for(int i = 0; i < langs.length; i++){
			p -= (sOccC(text, "[["+langs[i]+":") * 5);
		}
		p -= (sOccC(text, "[[category:") * 5); // lowercase literal: text was lowercased above
		p -= (sOccC(text, "{{reflinks}}") * 10);
		p -= (sOccC(text, "redirect") * 20);
		p += (sOccC(text, "'''bold text'''") * 5);
		p += (sOccC(text, "== headline text ==") * 5);
		p += (sOccC(text, "!") * 3);
		p -= (sOccC(text, "'''") * 3);
		p -= (sOccC(text, "disambig") * 15);
		p -= (sOccC(text, "|") * 3);
		p -= (sOccC(text, "==") * 5);
		p -= (sOccC(text, "<") * 2);
		for(int i = 0; i < badwords.length; i++){
			p += (sOccC(text, " "+badwords[i]+" ") * 3);
		}
		for(String bw : lupinBadwords){
			p += (sOccC(text, bw) * 5);
		}
		for(int i = 0; i < goodwords.length; i++){
			p -= (sOccC(text, goodwords[i]) * 3);
		}
		if(numtemplates < 5){
			p += 2;
		}
		if(numtemplates > 5){
			p -= numtemplates / 2;
		}
		if(numlinks < 3){
			p += 10;
		}
		if(numelinks > 1){
			p -= numelinks;
		}
		if(numimages == 0){
			p += 2;
		}
		p -= numcomments;
		if(article.getLabel().equals(article.getLabel().toUpperCase())){
			p += 5;
		}
		System.out.println("Article "+article.getLabel()+" scored "+p);
		return p;
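
The length tiers at the top of the function can also be read as a single if/else chain. The sketch below is one hedged way to express them; the helper name `lengthScore` and the class name are hypothetical, and treating a length of exactly 1000 as part of the long tier is an assumption.

```java
final class LengthScoreSketch {
    // Returns the contribution of article length to the score.
    // Shorter text scores higher (more suspicious).
    static int lengthScore(int len) {
        int p = 0;
        if (len < 100) {
            p += 15;
        } else if (len < 300) {
            p += 5;
        } else if (len < 500) {
            p += 2;
        } else if (len < 1000) {
            p += 1;
        } else {
            p -= 5; // assumption: exactly 1000 belongs to this tier
        }
        if (len > 2000) {
            p -= 5; // stacks with the previous penalty for a total of -10
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(lengthScore(50));   // prints 15
        System.out.println(lengthScore(2500)); // prints -10
    }
}
```

Using else-if makes it explicit that exactly one of the first five tiers applies to any length.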