Wikipedia:Wikipedia Signpost/2016-02-10/Special report
New internal documents raise questions about the origins of the Knowledge Engine
The 13-page grant agreement between the WMF and the Knight Foundation was released on the Wikimedia Foundation wiki on February 11, following the Signpost's inquiry to the Knight Foundation the day before.
The Knight Foundation grant has been a contentious topic in the Wikimedia community, and ousted WMF Board of Trustees member James Heilman has alleged that his initial opposition to the grant, which he ultimately voted in favor of, was a factor in his dismissal.
Numerous questions remain about the grant, which was intended to kickstart a project formerly called the Knowledge Engine – now referred to as Wikimedia Discovery. Chief among them is the question Andreas Kolbe asked last week in the Signpost, "So, what’s a knowledge engine anyway?".
Key players have repeatedly stated what the Knowledge Engine/Discovery is not – namely, a search engine intended to compete with Google. For example:
- The Discovery FAQ on MediaWiki states that "We are not building Google. We are improving the existing CirrusSearch infrastructure with better relevance, multi-language, multi-projects search and incorporating new data sources for our projects. We want a relevant and consistent experience for users across searches for both wikipedia.org and our project sites."
- In a November 4 email to all WMF staff, provided to the Signpost by several WMF staffers, executive director Lila Tretikov expressly stated that the Knowledge Engine "is NOT ... a search engine".
- Just hours before the release of the grant agreement, Jimmy Wales was even more blunt: "To make this very clear: no one in top positions has proposed or is proposing that WMF should get into the general "searching" or to try to "be google". It's an interesting hypothetical which has not been part of any serious strategy proposal, nor even discussed at the board level, nor proposed to the board by staff, nor a part of any grant, etc. It's a total lie."
However, these statements are flatly contradicted by the now-released grant agreement between the WMF and the Knight Foundation. Quotes such as the following make it abundantly clear that what is envisioned under the terms of the grant is indeed a search engine:
- "Knowledge Engine by Wikipedia, a system for discovering reliable and trustworthy public information on the Internet." (Page 1.)
- "Would users go to Wikipedia if it were an open channel beyond an encyclopedia?" (Page 2.)
- "Knowledge Engine by Wikipedia will democratize the discovery of media, news and information – it will make the Internet's most relevant information more accessible and openly curated, and it will create an open data engine that's completely free of commercial interests. Today, commercial search engines dominate search-engine use of the Internet, and they're employing proprietary technologies to consolidate channels of access to the Internet's knowledge and information. Their algorithms obscure the way the Internet's information is collected and displayed. ... Knowledge Engine by Wikipedia will be the Internet's first transparent search engine, and the first one originated by the Wikimedia Foundation." (Page 10.)
- "Proceed with the search engine project as deliberately as possible – which is what the Wikimedia Foundation is doing" (Page 13.)
Three internal WMF documents illustrating how WMF thinking about the project evolved have been leaked to the Signpost:
- An "April 2 – FINAL – Knight Search Presentation – 04.02.15"
- A "June 24 Attachment 1 of 2 – Knowledge Engine by Wikipedia"
- An "August 2015 – WMF Submission to Knight"
We describe the documents in detail in this week's "In Focus". The earliest document, dated April 2, 2015, is a 12-slide presentation marked "FINAL". While the phrase "Knowledge Engine" does not appear, it's clear that even at this early stage, the "Wikipedia Search" referred to here was a well-developed concept. The presentation contrasts the ideals and motivations of commercial search engines – they "highlight paid results, track users' internet habits, sell information to marketing firms" – with those of "Wikipedia Search", which will be private, transparent, and globally representative. It repeatedly stresses that "No other search engines carry these ideals".
Several well-designed examples of search results follow, including the one pictured above. They prominently brand Wikipedia and feature multimedia content and multiple Wikimedia projects such as Wiktionary and Wikivoyage. The results include non-wiki sources like Fox News and Open Maps.
The June 24 document is a draft proposal for the project, by then referred to as the Knowledge Engine, which promises to be "a new global project that will once again change the way people access knowledge on the Internet", fully leveraging Wikipedia's and the WMF's resources, values, and reputation. The Knowledge Engine is described as "a federated knowledge engine that will give users the most reliable and most trustworthy public information channel on the web" that "will make the Internet’s most relevant information more accessible and openly curated, and it will create an open data engine that’s completely free of commercial interests". Knowledge Engine "will be the Internet’s first transparent search engine, and the first one that carries the reputation of Wikipedia and the Wikimedia Foundation."
The proposal divides the plan into four stages, each lasting 16–18 months. Interestingly, the first stage is called Discovery, which is the term the WMF currently uses to refer to the Knowledge Engine project. The proposal asks for US$6M from the Knight Foundation over three years. It pledges $2.4M of the WMF's own resources to the project for the current fiscal year, including eight presumably full-time engineers and two data analysts.
The final document, dated August 5, 2015, resembles the publicly released current grant agreement in many ways, including much of the same language. The grant amount has dropped to its current $250,000, but this amount is only for the first Discovery phase of the larger Knowledge Engine project. Both the amount and its designation for phase one appear in the current grant agreement.
These documents raise significant questions about how much the Knowledge Engine has actually evolved since April 2015 and what the technical and social implications of this project will be for Wikimedia.
These questions are at the heart of the current debate regarding transparency, accountability, the relationship between the WMF and the Wikimedia community, and the uncertain direction of the Wikimedia movement.
Discuss this story
The thing is, to be honest, does any of this drama matter? We all know what the end results are going to be. The WMF is going to do whatever the heck they want no matter what anyone else says, they'll spend a ton of time and money on a technical project in the face of opposition, they'll release "in beta" a broken, buggy version that sort of resembles what they promised to release, and years later it will still not be done and will get quietly shut down. For a non-profit that so desperately wants to be a "tech" startup, they have a terrible track record at actually producing usable software projects, much less managing their PR cleverly. --PresN 05:52, 15 February 2016 (UTC)
"However, these statements are flatly contradicted ..." - Uh-oh Mister Ranger isn't gonna like this, Yogi! (n.b., that's older US slang which means "The authorities shall be displeased by your bold action"). The above is excellent work overall, my compliments. But there's a bit of background context which would have avoided a slight misstep there. When they talk about NOT competing with Google, they mean they aren't building a complete-web database, funded by advertising, to try to get a piece of that amazing money-machine that's been mastered by Google (the amounts involved are enormous). Rather, they're focusing on a restricted segment, and going for a different strategy for support. Now, the following remarks are purely speculative and the product of a very jaded and cynical person. Given Wales's previous Wikia Search project, and the extensive Google connections with the current Wikimedia Foundation Board, I would be extremely wary that this project exists to help Google in further improving its search results (that's indeed not competing with Google!). The spam and junk battle is ongoing. If Google can get Wikipedians to "volunteer" to mostly work for free in refining algorithms and curation, aiding it even more than they do already, that's advantageous to both Google and the Wikimedia Foundation people (who will likely somehow eventually end up with tangible reward, while you will get the joy and happiness of having oiled the amazing money-machine, excuse me, helped distribute knowledge to the world). Perhaps the proponents of the project will say I am an idiot for such thoughts, but always ask, "Who benefits?". -- Seth Finkelstein (talk) 08:00, 15 February 2016 (UTC)
Great idea.. terrible management
The more I see of this, the more I like it actually. Step 1, let's improve search, discovery and exploration within our own websites, then slowly pull in more stuff (everything that is open), then see if we can do some open model to even pull in the rest of the world. As a 10 year vision it's actually something that I have been waiting for.... Everyone knows that there will be left and right turns along the way, and marketing bullshit speak, and what not. Maybe we will get there, maybe not, whatever, at least it's a point in the future to strive for (and actually a pretty achievable one I suspect). But once again, it's a total F'up of communication towards the community. It is pathetic. It's shameful. And the worst part is that it's apparently equally badly handled internally, causing staff to feel the exact same way. (addendum: and yes. also managed badly towards the Knight Foundation of course). --TheDJ (talk • contribs) 09:23, 15 February 2016 (UTC)
Data sources
If Fox News or TeleSUR have the slightest chance of appearing as data sources of this searching project, I will campaign to stop it. --NaBUru38 (talk) 14:04, 15 February 2016 (UTC)
Copyright
I would very much like to know if the copyright owner of the "April 2 - FINAL- Knight Search Presentation - 04.02.15.pdf", i.e. the Wikimedia Foundation, decided to make a public release of this document, that seems to be an internal document. Pldx1 (talk) 16:24, 15 February 2016 (UTC)
What if it works?
There's a lot of negativity in the comments above, about the competence of the WMF to manage the creation of a good, unbiased, advert-free search engine (or whatever they would like it called). This is hardly surprising, after Visual Editor and Media Viewer. But supposing they get everything right this time? Who would lose most? I think I now understand the recent appearance of several Google employees on the WMF board. Maproom (talk) 00:06, 16 February 2016 (UTC)
The shadow
A shadow hangs over the WMF. The board should have never exercised its "rights and privileges", like some tyrant, in such a fashion as they did, to police our representatives like some sovereign. This public act by the board has as much tact as calling a person a cunt. And since, so many of us have viewed the WMF board as corrupt (as in bribery), and viewed this corrupt product with extreme bias.
The logic of a simpleton I admit. But please tell, would it not be equally OK to pass an Act of the People of Florida dissolving the WMF and taking possession of its property? (You better believe the courts in San Francisco, California (as in the WMF terms of use) would give full faith and credit to such an act, and I don't think Jimbo would be able to successfully re-incorporate his trusteds.) Would that not be at least equally acceptable as this nefarious, recent act of this trusted board, in terms of the excuses we've been given?
We have identified a flaw in our government, and it is time that, just as the people of Florida rule Florida, that the community regain trust in this trusted board. Or whatever negative feelings we have towards this project will grow in time into general contempt for all things WMF. int21h (talk · contribs · email) 07:01, 17 February 2016 (UTC)
Wikidrama