User talk:JAllemandou (WMF)

From Wikipedia, the free encyclopedia

Filtering redirects and disambiguation pages

Do you know about the redirect table in the database and the 'disambiguation' pp_propname record in the page_props table? EllenCT (talk) 01:17, 6 July 2016 (UTC)

@EllenCT: I have not used them, but I know they exist. That field and table are present in our MySQL databases, whereas the pageview data is available in our Hadoop cluster. Keeping them synchronized on a regular basis is a challenge, which is why we don't correlate that information as of today. --JAllemandou (WMF) (talk) 09:31, 7 July 2016 (UTC)
Are they too big to mirror across? EllenCT (talk) 13:48, 7 July 2016 (UTC)
@EllenCT: Most tables are not too big to be imported from a Hadoop perspective. The concern is more about MySQL's capacity to handle regular large exports and the cost of automation for 800+ databases. Also, milimetric is starting to work on a one-off task that involves joining Hadoop and MySQL data (see T139324). --JAllemandou (WMF) (talk) 09:44, 8 July 2016 (UTC)
Nice! EllenCT (talk) 02:47, 9 July 2016 (UTC)
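
For illustration only, the filtering discussed in this thread could look roughly like the sketch below. It is not the WMF pipeline: it uses an in-memory SQLite database as a stand-in for MySQL, a plain dict as a stand-in for the Hadoop pageview data, and a trimmed-down version of the page, redirect and page_props tables. The function names `filter_pageviews` and `build_demo_db`, and the sample rows, are hypothetical.

```python
import sqlite3


def filter_pageviews(conn, pageviews):
    """Keep only pageviews for pages that are neither redirects
    nor disambiguation pages.

    `pageviews` is a hypothetical {page_title: view_count} dict.
    """
    rows = conn.execute(
        """
        SELECT p.page_title
        FROM page AS p
        LEFT JOIN redirect AS r
               ON r.rd_from = p.page_id
        LEFT JOIN page_props AS pp
               ON pp.pp_page = p.page_id
              AND pp.pp_propname = 'disambiguation'
        WHERE r.rd_from IS NULL   -- no row in redirect: not a redirect
          AND pp.pp_page IS NULL  -- no 'disambiguation' page property
        """
    )
    keep = {title for (title,) in rows}
    return {t: v for t, v in pageviews.items() if t in keep}


def build_demo_db():
    """Build a tiny in-memory database with simplified tables (assumption:
    only the columns needed for this example are included)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
        CREATE TABLE redirect (rd_from INTEGER, rd_title TEXT);
        CREATE TABLE page_props (pp_page INTEGER, pp_propname TEXT, pp_value TEXT);
        INSERT INTO page VALUES
            (1, 'Mercury_(planet)'), (2, 'Mercury'), (3, 'Planet_Mercury');
        INSERT INTO redirect VALUES (3, 'Mercury_(planet)');
        INSERT INTO page_props VALUES (2, 'disambiguation', '');
        """
    )
    return conn
```

In this toy setup, 'Planet_Mercury' is dropped because it has a row in redirect, and 'Mercury' is dropped because of its 'disambiguation' page property. The hard part raised in the thread, exporting these tables regularly from 800+ databases, is not addressed by the sketch.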