User:Triddle/stubsensor

Please do not adjust the stubsensor.

Stubsensor is a part of wpfsck that analyzes the English Wikipedia and tries to identify out-of-the-ordinary stub articles, such as articles that carry a stub template but have grown beyond stub size. The software never edits an article; only humans do. Statistical analysis and Bayesian filtering are currently used to detect improper stubs.
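As a rough illustration of the statistical side only, the sketch below flags stub-tagged articles whose length is far above the average for stubs. It is not stubsensor's actual code: the function name `flag_unusual_stubs`, the character-count input, and the threshold of 2.0 standard deviations are all assumptions made for the example, and the Bayesian filtering step is not shown.

```python
import statistics


def flag_unusual_stubs(stub_lengths, threshold=2.0):
    """Return titles of stub-tagged articles that look unusually long.

    stub_lengths: dict mapping article title -> length in characters
                  (hypothetical input; how stubsensor measures articles
                  is not described on this page).
    threshold:    number of standard deviations above the mean at which
                  an article is flagged (an assumed value).
    """
    lengths = list(stub_lengths.values())
    if not lengths:
        return []

    mean = statistics.mean(lengths)
    stdev = statistics.pstdev(lengths)  # population standard deviation
    if stdev == 0:
        # Every article has the same length, so nothing stands out.
        return []

    # Only report the titles; a human decides whether to remove the tag.
    return sorted(
        title
        for title, length in stub_lengths.items()
        if (length - mean) / stdev > threshold
    )
```

Like the real tool, the sketch only produces a list for human review and never changes an article.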

Reports

Stubsensor award

For your help with the Stubsensor cleanup project you are given the Stubsensor award.

The coveted Stubsensor award is given for bravery, commitment to service, and fearlessness in the face of even the longest stub. For a list of current award holders, see the image page. To give the Stubsensor award to another Wikipedian, place a message on their talk page that includes the following wikicode:

[[Image:Stubsensor award.jpg|200px|frame|right|For your help with the Stubsensor cleanup project you are given the Stubsensor award.]]

Suggestions

Feel free to leave ideas here.

Questions and observations

What happens if an article you come across from this page is really a stub? Is the idea to retag all stubs to something else (i.e. cleanup, expand, unreferenced, etc.)? Or do we leave stub stubs in the open section? EvokeNZ 01:25, 16 April 2007 (UTC)

What is the current status of the stubsensor project? Is it dead, or just on hiatus?

Here are a couple of stub observations that I've made as I've tried to organize things for the Southern California WikiProject: 1) In looking at the local radio stations, all of them had {{US-bcast-stub}}, but at least a quarter of them were large enough that they didn't need it. 2) In looking at the User:Rambot-created articles on cities and communities, almost all of them did NOT have a stub template, although somewhere between 1/2 and 2/3 of them had fewer than 10 sentences that were not created by Rambot and therefore probably should have {{california-geo-stub}}. BlankVerse 15:55, 2 August 2005 (UTC)

Stubsensor is going strong; it's been integrated into wpfsck, and the next cleanup project will be organized shortly after the corresponding database dump. Can you create a list of the articles you are referring to? I'll go through the list manually and tidy things up; the new stubsensor runs off statistical analysis of the entire Wikipedia. If I can get those articles cleaned up before the next database dump, the stubsensor should be even more accurate. Triddle 01:00, August 3, 2005 (UTC)
For all the LA-area radio stations, I've already gone and removed {{US-bcast-stub}} from the largest articles, but left it on the borderline articles that I know can be greatly expanded. I haven't yet looked at the San Diego-area radio stations, but I assume that I will probably find similar statistics: the broadcast stub on almost all of them, with a fair number of them already grown beyond stub size. The radio stubs are probably a big problem for the stub sensor, since most (~80%) have infoboxes and other miscellanea which will inflate their size. You can find the rest of the articles with {{US-bcast-stub}} at Category:United States broadcasting stubs.
As for the Rambot-generated articles, there are roughly 140 cities in Los Angeles County, roughly 60 unincorporated communities, and at least 250 districts and neighborhoods within the City of Los Angeles (although not all of them have articles yet). You can see my worksheet at [[1]]. BlankVerse 06:27, 3 August 2005 (UTC)
Hello, the edit summary that volunteers are asked to use when they destub an article, "Stubsensor cleanup project; you can help!", doesn't mention that the article was, umm, destubbed. What about changing it to something like "Destubbed by a volunteer from the Stubsensor cleanup project; you can help!"? S Sepp 20:39, 29 April 2007 (UTC)