So, if you haven’t noticed, we have a new make blog, and a post on it that I think will be relevant to your interests.
http://make.wordpress.org/meta
Although today’s XKCD is relevant, I thought the idea of having more clarity in the star ratings (like Amazon’s) would be good. So I coded that up real quick, and it’s live now.
So instead of getting just the count, you now get the rating as a number (3.8 out of 5 stars, for example), and you get a 5-bar chart showing the counts for each rating made.
Go see it on any plugin page.
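As a rough illustration of the display described above (an average such as "3.8 out of 5 stars" plus a five-bar breakdown), here is a minimal Python sketch. It is not the code running on WordPress.org; the function names and the sample ratings are made up for the example.

```python
from collections import Counter


def summarize_ratings(ratings):
    """Return (average rounded to one decimal, counts per star) for 1-5 star ratings."""
    counts = Counter(ratings)
    # Always list all five levels, highest first, even when a level has no votes.
    per_star = {star: counts.get(star, 0) for star in range(5, 0, -1)}
    total = sum(per_star.values())
    average = sum(star * n for star, n in per_star.items()) / total if total else 0.0
    return round(average, 1), per_star


def render_bars(per_star, width=20):
    """Render one text bar per star level, scaled to the total number of votes."""
    total = sum(per_star.values()) or 1
    lines = []
    for star, n in per_star.items():
        bar = "#" * round(width * n / total)
        lines.append(f"{star} stars | {bar} {n}")
    return "\n".join(lines)


# Made-up sample data, purely for illustration.
sample = [5, 5, 5, 4, 1, 5, 3, 1, 5, 4]
average, per_star = summarize_ratings(sample)
print(f"{average} out of 5 stars")
print(render_bars(per_star))
```

Showing the per-star counts next to the average is what lets a reader tell a plugin with uniformly middling votes apart from one that splits between 5s and 1s.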

This is rad.
Otto rocks the house as usual
Otto – this is bad ass!!! Dramatic improvement.
Nice job sir!
Kudos. Amazing what a difference one small change already makes. Now get to work on reviews!
Brilliant! Can I make one small suggestion? Not sure what others think, but it would be nice to know the total number of votes without having to do some mental arithmetic. If space is at a premium, IMHO the ‘out of 5 stars’ is fairly clear. In any case, much much better, thanks Otto!
Well done sir.
Rock on. Well done.
Fantastic!
Nice work Otto.
Good information for users and plugins authors alike.
Interesting discussion on HN about our topic at hand: http://news.ycombinator.com/item?id=4416864
I saw that article earlier today. Their choice of a scatter plot makes sense to me as a math nerd, but I don’t think it’s a great visualization overall. It’s confusing to the non-nerd, although that’s a matter of knowing your audience.
Also, tracking multiple choices like that doesn’t seem like a good idea to me. Asking users to perform a complex rating system really means that most of the data you get from those secondary questions will be garbage or useless.
I think it’s possible to do better than just the ‘average’ for an overall score, though. Perhaps you could track user histories, look at how each user’s ratings are distributed, and then weight the users whose ratings conform to a standard bell curve more than the others. Basically, if a user has a wide range of 1-5 ratings, his opinion probably should count more than somebody who only ever rates anything 1 or 5 alone.
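To make the weighting idea concrete, here is a small Python sketch. It swaps in a much cruder measure than a bell-curve fit: a user's weight is simply the share of the five star values they have ever used. That proxy, the function names, and the sample users are all assumptions for illustration, not anything WordPress.org does.

```python
def user_weight(history):
    """Weight in (0, 1]: the share of the five star values this user has ever used.

    A crude stand-in for "conforms to a bell curve": someone who only ever
    gives 5s gets 0.2, someone who uses the whole 1-5 range gets 1.0.
    """
    distinct = {r for r in history if 1 <= r <= 5}
    return max(len(distinct), 1) / 5


def weighted_average(votes, histories):
    """votes: {user: rating for this plugin}; histories: {user: [past ratings]}."""
    num = den = 0.0
    for user, rating in votes.items():
        w = user_weight(histories.get(user, []))
        num += w * rating
        den += w
    return num / den if den else 0.0


# Hypothetical users and rating histories.
histories = {
    "alice": [1, 2, 3, 4, 5, 3, 4],  # uses the full range
    "bob": [5, 5, 5, 5],             # only ever rates 5
    "carol": [1, 5, 1, 5],           # only ever rates 1 or 5
}
votes = {"alice": 3, "bob": 5, "carol": 1}

plain = sum(votes.values()) / len(votes)
weighted = weighted_average(votes, histories)
print(f"plain average: {plain:.2f}, weighted average: {weighted:.2f}")
```

With these numbers the weighted score moves toward alice's vote, because her history suggests she discriminates between plugins.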
This is also very handy info:
http://youtube-global.blogspot.com/2009/09/five-stars-dominate-ratings.html
Would a simple like/dislike system be better? YouTube clearly thought so, since that’s what they do now.
Pandora uses a similar system, although in their case they are using it for user-correlation-matching. If you and somebody else both dislike/like the same things, then they can correlate that data and play you each other’s liked songs.
Great work, Otto, this is a really fantastic improvement! I’ll be sure to bring this to everyone’s attention at our next WordPress Meetup too.
So, it sounds like the next step is to add reviews and tie them to profiles. Otto, do you need help? I’d be happy to wireframe out an idea.
No need for wireframes. Working on it.
Did anybody else notice that like the vast majority of stars are 5 or 1? What use are 2, 3, and 4 with this sort of distribution?
I’ll figure out some stats soon, I think a thumbs up/down would make more sense given what I’ve seen.
Not in the case of The Events Calendar, but we have had a long life, with some years of total neglect and some with lots of love: http://wordpress.org/extend/plugins/the-events-calendar/
Even in that case, 65% of the votes are 5 or 1. Those two numbers make up the majority on every plugin.
That’s true. That said, while a 65% majority is informative, the nuance in the other 35% is valuable enough to me to be worth sharing. Although at the end of the day, it is the reviews that will really help me decide.
I would be interested though in seeing it have a 6 month or whole life filter on ratings.
Recently I’ve noticed that a number of plugins on WPORG, written by top authors with high-quality code, have received ratings of 1. I have serious doubts about the relevance of these ratings. IMO a mandatory comment for every rating (even an anonymous one) would be much better than some random anti-rating that could even be bot-based.
If you’re at WordCamp Portland, there will be an unconference session continuing this discussion: 10:30 in the Bergen Dining Hall (downstairs).
Here are the unedited notes that I took during the presentation:
https://docs.google.com/a/get10up.com/document/pub?id=1ZfmvfI6nzlnmU_FDAuY0o-oJe6cGzk1KRvVDggkFRBA
Hi Taylor, I get a message that I don’t have permission to access that document – any chance of setting it to public? Thanks. Stephen
Sorry about that, let’s try this link.
https://docs.google.com/document/d/1GnN5O6ltCL5VTxDB4r_6GncGTig-XSBHh-lpOOWmvI4/edit
My notes from this afternoon’s discussion:
My notes from the afternoon to complement Erick’s, they’re a bit shorter:
My 2 cents:
Pre-review is going to be very hard to scale (at least well) and will require a huge investment in human time. It won’t keep bad guys out because we’re not covering updates as well, and probably couldn’t without 15-20 full-time developers dedicated to doing reviews.
Many good ideas were discussed, but if we did them all at once we’d create a franken-directory.
It’s important to curate the ideas and figure out what’s best to try first, and then observe the impact once it’s live, and then evaluate the rest of the ideas in that context.
The two things I personally think would move the needle the most are breaking out ratings, and allowing reviews. To expand on each of those a bit:
Breaking out ratings
http://cl.ly/image/0z2V2N1R2H3f/content
Look at all the awesome things Amazon is doing here:
(It would be interesting to think about the weakness of a 5-star system versus three levels like recommended, okay, not recommended, or even a binary rating thumbs up or thumbs down.)
Reviews
Right now we allow people to create support forum threads easily from the plugin page, it wouldn’t be hard to extend that to creating reviews as well.
The subject could be formatted [Plugin Name 1.2.3] **** Review Title — version makes it specific, pretend the asterisks are nice unicode star symbols, and then the title and content (and the thread that follows it) we get for free. Authors could reply to things, and hopefully lower reviews could lead to bugs being fixed or plugins being improved.
A hidden taxonomy can allow us to do structured queries on the data later. We already have code to pull in forum threads on plugin pages, you could see all reviews in one place on the forums.
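The subject convention above could be built and parsed along these lines. This is only a sketch in Python (the forums themselves are not written in it), the function names are hypothetical, and it assumes purely numeric dotted version strings.

```python
import re

# Matches subjects shaped like "[Plugin Name 1.2.3] ★★★★ Review Title".
SUBJECT_RE = re.compile(
    r"^\[(?P<plugin>.+) (?P<version>\d+(?:\.\d+)*)\] (?P<stars>★{1,5}) (?P<title>.+)$"
)


def format_subject(plugin, version, stars, title):
    """Build a review thread subject from its parts."""
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")
    return f"[{plugin} {version}] {'★' * stars} {title}"


def parse_subject(subject):
    """Recover the structured fields from a subject, or None if it is not a review."""
    m = SUBJECT_RE.match(subject)
    if m is None:
        return None
    return {
        "plugin": m["plugin"],
        "version": m["version"],
        "stars": len(m["stars"]),
        "title": m["title"],
    }


subject = format_subject("Plugin Name", "1.2.3", 4, "Review Title")
print(subject)
print(parse_subject(subject))
```

In practice the hidden taxonomy would carry the structured fields, so parsing the subject would only be a fallback; the sketch just shows that the format is unambiguous enough to round-trip.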
Reviews go in your WP.org profile, like favorites (and not-favorites) on steroids, and because they’re tied back you get reputation factors built in. The “vetting” people asked for could happen organically through core dev accounts or WP.com VIP accounts leaving reviews on plugins, essentially curating their own recommended lists, or publicly sharing the results of their audits and reviews, giving the plugin author an opportunity to learn and improve from it. (Imagine web hosts doing the same.)
Priorities
Reviews as described above would be really powerful, and not too hard to implement. (Famous last words.) Breaking out ratings is trickier, but might be good as a phase II, especially if we decide to make it 100% ratings driven rather than naked stars.
Adding these reviews with specific versions of the plugin targeted is a great start. And as you said Matt, this will encourage organic reviews and requests for new reviews as developers and users learn who in the community is trustworthy.
I also really like the star breakout idea. Would this work on a per version level, or would the stars still persist across all versions? Maybe all versions by default with a filter?
I have felt from the beginning of this conversation that simply creating a bigger vetting process for new plugins would not be a good way to do it as it will require a lot of manpower, and then how do we decide _who_ is worthy of doing the reviews? Tapping into the power of social media to create a social review process is going to be far more powerful and useful, and much less stressful on Automattic.
Good ideas all. Great conversation at #wcpdx too
I agree that these are achievable, practical first steps that at least make the repository useful to those who want to take the time to study their choices.
I think being able to see a review by “Automattic” or Nacin (or Matt!) or other high profile developers would be immediately and immensely useful to those “in the know” (who will spread the word to those less in the know). If we can – as Amazon does – also mark reviews as useful, we’ll be on our way to having a list of top reviewers that could, in practice, be a form of indirect vetting. And since the community will “rate the ratings” (and indirectly the raters), issues of favoritism and cronyism that might accompany an “ordained” vetting team fall away. I’m picturing an average rating from “top WordPress plug-in reviewers”.
As the woman whose name I can’t remember in the back kept belaboring (and I agree) it’s really important that we engineers not lose sight of what the casual user (Matt’s mom?) wants in a plug-in repository that’s baked right into the software itself: plug’n’play, not plug’n’pray. It’s my hope that as we amass this information and iterate improvements into the repository, we can get to a plug-in install screen that really raises the best to the top, and perhaps even conceals (or at least visibly diminishes) plug-ins that haven’t attained a certain standing.
What are the next steps? Is it time for a meeting with Otto to talk about folding in a review system?
Next step is probably just to try the above and see how it feels, then we can iterate based on how people use it and how it looks and feels to us when it’s running with live data. There are probably some gotchas we haven’t thought of yet.
Summary of ideas for an advanced vetting process
1) Have a subset of plugins that are “vetted”, “flagged”, “badged” or some verbiage to differentiate them as peer reviewed.
2) Do so via a peer reviewed plugin review team, similar to the theme review team. Seemed to be some acceptance to my concept of the academic paper review process.
2a) Plugins utilize some sort of process for submission for peer review. Could be a form submission that includes blog posts, code reviews done by devs outside of the dot-org process, etc. Could also be based on a certain number of star ratings or downloads. Could also be based on the plugin author’s standing in the repository (aka is it a Mark Jaquith or a first-time plugin submitter). Could be some combination of these.
2b) Once plugins are deemed vet-worthy, perhaps a small group of people review the plugin and go through a theme-check-esque review process for the plugin, similar to peer review of an academic paper.
3) Assume a plugin is vetted. Now how will it be displayed, found / searched for and maintained? Some sort of tagging / badging system could be utilized. Ability to remove approval or badge would be important.
3a) Very difficult task here. How and where do we “promote” vetted plugins and give them an edge vs standard plugins. Make them sticky for standard plugin tags? Have the ability to limit searches to only vetted plugins? All of the above? Something else? This process needs quite a bit of brainstorming yet imo.
Might be wise to consult the vip team from .com to see how they go about this process, given the understanding that the goals are not the same.
One of the ideas we were tossing around for Renku (http://renku.me/) was a tagging/badge system. The idea we had was to have different people give their seal of approval to products. Given that it’s a commercial service, the idea here was “pay $x to get a review, and if you pass, you get a badge”.
For WP.org, I believe the same sort of system could be implemented, minus the commercial side of things. Rather than just having a general vetting, badges could be applied for different areas: say, a security badge, a usability badge, etc. These would all be handled through separate queues, with a team assigned to each type of badge. To start out, you could have just one badge (e.g.) and scale up from there.
That would also tie into the social aspect. Mark Jaquith, for example, is well-known for his security audits, so he’d be a great person for the security team. On the other hand, he might not be the best for the usability team (he’d be great regardless of the team; this is just an example), whereas someone from make/ui might be better.
The problem then of course comes down to: how do you handle a giant queue like this? Surely everyone will want their plugins reviewed. I think the best way would be that you can only be in one queue at once. Some plugins might want the security seal of approval more than a usability one, so this would help to spread out the load.
One other issue is, how do you handle updates? I think you could show the badge, but have it greyed out to indicate that the current version hasn’t been vetted fully, but that previous versions have been. A separate, expedited queue could be made for updates, where the diff of the versions is checked, rather than rechecking the entire plugin.
Regarding UI, I’d add a checkbox filter into the plugin search (both on wp.org and in the plugin installer) to filter to just plugins with a security badge, e.g. In the list, you could also have little icons next to plugins to see at a glance if they’ve been vetted. I’ll see about doing a mockup of this if it sounds good.
Really like this in contrast to the general “vetted” option. The SEO plugin I need for a project with 5 MM visits a day doesn’t necessarily have the same requirements as an SEO plugin I want for my mom’s blog.
Don’t know that I agree there need to be multiple “exposed” metrics for vetting code, aside from separating “prettiness” from “well engineered” – and you could even argue those should both be part of a vetting process.
End users don’t want to have to look for 4 badges. They want to know that it passed the “WordPress acceptable standard” or failed it (or wasn’t vetted yet). A plug-in should be secure, clean up after itself, not break anything, and not fail under load (etc).
Do we really want to say “vetted for security! but, hey it may break WordPress”?
If it may break WordPress, it should be out of the repo altogether.
But if I need to send my mom to look for a plugin, I don’t care about scale, but I want the best UX possible so she doesn’t call me 30 times a day. If this is for a corporate website with a tech-savvy webmaster, I want the best scalability I can get, even if it doesn’t come with the best UX.
My concern about this “vetted” approach is what will happen with all the plugins that don’t make this exclusive list. Maybe because they are in queue, maybe because they are _great_ for scale, but suck at UI, etc.
In attempting to fix this, we shouldn’t let the repo stagnate or frustrate new devs.
I think there may be a way to make it clear that a “vetted” badge doesn’t mean other plug-ins failed a vetting test – just that they haven’t been vetted yet. But yes, it’s sticky, and risks discouraging new players.
This is, in large part, why I originally felt that this vetted repo needed to be independent of the official repo. A “third party” vetting / review site (even if it’s the usual suspects) would be understood to only review plug-ins they get around to reviewing. Making it part of the official repo, and influencing the browsing of said repo, makes this politically stickier.
If this ends up being a private effort, things get _so_ much easier. We can make a website for that in 4 hours, and each one of us gets to invite their ~5 most trusted persons in the community. Or only big name companies / freelancers / core contributors get to participate and rate plugins.
Here you won’t have 15 plugins a day, just a curated list with the best of the best.
The question is what to do with plugins that fail vetting. Leave them in the repo? How do you differentiate between not vetted vs failed?
Daniel – I agree, and I’m tempted to go forward with this as a starting point. But I do want to see the official repo improve, and don’t want to be the proverbial guy who took my toys and went home.
Shane – glad you raised this. It’s been in the back of my mind. Even for the official repo – what if it’s been in the repo, submitted for vetting, and failed? Do we have a “Failed Vetting” badge? Do we recommend it for removal? Perhaps it gets an “under review” badge, and we ask them to make changes?
I think that “failed” vetting brings on a key part of this debate. Are we trying to totally transform the repository or are we trying to showcase the cream of the crop?
If we’re going for complete transformation, a “failed” vet may include removal, or at least “hiding” (like is done with some old plugins now), from the repository.
If we’re just showcasing the cream of the crop, I think a soft fail is pretty acceptable, especially if the other two parts of this discussion are successful (meaning better plugins will get in to start with and have better reviews in general). That way a plugin that isn’t vetted can still be downloaded and used, and who knows, maybe non-vetted plugins are okay for a lot of normal sites.
I personally think there is still enormous value in broad plugin acceptance to the repository (even if it’s pretty strictly security benefits from not having loads of third party repositories and sketchy plugins – ie – like with themes already). So to me, a soft fail is preferred. If it’s a harder fail, then the process of vetting *really* needs to be thorough and very involved w/ the dev, otherwise it’ll be brutal to be denied a vetted status.
Perhaps the best approach is to display the outcome of the review process, and then if it earned it, a vetting badge. The plugin could fail to earn vetting for a reason that many users might not care about (say scaling for example) but shine in other aspects that matter.
@shane, I think that makes sense. A prominent “vetted” badge. A less prominent “tested” notice if it’s been tested but not vetted, with visual cues as to sections of pass/fail or scores or whatnot, in order to give some reasoning to a user as to what happened in that scenario.
Re the private effort conversation – it might be easier and it might help your clients out (fair enough), but 99% of WordPress users are going to keep on using the official repo / WordPress admin area for plugins. Far better if we can improve the official repo.
100% agree. But maybe a private effort can lead the way and prove that some strategies work and others don’t.
It’s like when you propose a new feature to core. The standard(?) response is: go make a plugin. If that proves to have adoption and work well, it may get integrated in core.
One way of handling fails would be to have a “Vetting History” tab added somewhere on the plugin page. Hide it if there is no history but show it if there is a history. Show who/when vetted the plugin and the results. A plugin may not be “vetted” for minor reasons or a major reason, this lets all users see exactly why in an open/transparent manner. Especially if there are minor reasons something doesn’t pass I’d hate to see a failed badge on it.
This way if I see a plugin that has gone through the process but didn’t get the vetted badge I can see “OK it’s just not that great for load” or that there are huge security risks because nothing is escaped, no nonces, etc.
I keep thinking of the Twitter “verified” type badge with a different application. Not being verified doesn’t mean @pmgarman isn’t me, but if I have that verification it’s proven for sure it’s me.
I think we’re all on the same page with this (again, I’m just now playing catch up), but I don’t think the ultimate goal is to kill plugins.
No matter how experienced a developer you are, every time you build for a new platform, you have to cut your teeth on something. WordPress is no different. Keeping the doors wide open for new plugins – sort of the “Use at your own risk” for lack of a much friendlier term – is key.
But having this tier of vetted (or approved or stamped or labeled or starred or whatever arbitrary indicator is decided) is also important. And I don’t think we need to clutter the plugins page with 5 badges indicating security, code quality, UI, UX, and whatever else.
I think it needs to be simple (simple, not simplistic) – it either passes or it doesn’t. I don’t think it’s fair for us to complain about having higher standards of quality for plugins if we’re not willing to go all in.
By that, I mean that it’s not fair to say: “I want higher quality plugins – I’d aim for the badge of security and code quality but not be too concerned with UI/UX.”
I think it’s all or nothing.
Tom: How does a single binary yes/no help? It replaces a bad rating with a rating that tells a user it works for somebody but not necessarily for them, and it provides no additional context for them to decide if the rating is relevant to them. It’s also likely to be like Dmoz vs. Google; fiefdoms will emerge and the plugins vetted will be those developed by the friends of the vetting committee because of access. And this means many good plugins won’t get a reasonable chance to get vetted.
Instead, let everyone who wants to rate and review plugins do so, let everyone who reads reviews say if a review is helpful, and let people build their reputations based on the quality of their reviews. Then let the reviews of those with higher reputations have more weight than reviews by those with low or no reputation.
Finally, allow reviewers to freeform rank attributes so that the rankings can, in aggregate, provide those who want to do more evaluation with more details about the plugin.
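One way the reputation weighting proposed here might look, sketched in Python. The smoothing formula, the class, and the sample reviewers are assumptions chosen to keep the example short; nothing here reflects an actual WordPress.org design.

```python
from dataclasses import dataclass


@dataclass
class Review:
    reviewer: str
    stars: int
    helpful_votes: int = 0
    unhelpful_votes: int = 0


def reputation(reviews_by_user):
    """Share of "helpful" votes across a reviewer's reviews.

    Smoothed with one imaginary helpful and one imaginary unhelpful vote,
    so a reviewer with no history starts at 0.5 instead of 0 or 1.
    """
    helpful = sum(r.helpful_votes for r in reviews_by_user)
    total = helpful + sum(r.unhelpful_votes for r in reviews_by_user)
    return (helpful + 1) / (total + 2)


def weighted_score(plugin_reviews, all_reviews):
    """Average the plugin's stars, weighting each review by its author's reputation."""
    by_user = {}
    for r in all_reviews:
        by_user.setdefault(r.reviewer, []).append(r)
    num = den = 0.0
    for r in plugin_reviews:
        w = reputation(by_user.get(r.reviewer, []))
        num += w * r.stars
        den += w
    return num / den if den else 0.0


# Hypothetical reviewers: one with a helpful track record, one without.
reviews = [
    Review("ann", stars=4, helpful_votes=18, unhelpful_votes=0),
    Review("drive_by", stars=1, helpful_votes=0, unhelpful_votes=8),
]
score = weighted_score(reviews, reviews)
print(f"plain average: 2.50, reputation-weighted: {score:.2f}")
```

The drive-by 1-star still counts, just far less than the review other readers found useful.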
@Mike – you’re right. I wasn’t intending to say that this particular rating should indicate whether or not it works across the board of hosts or that it’s compatible with every theme and that it encompasses every single issue we have around plugins.
I meant only to suggest this around the code that goes into a plugin. For example, we’ve been talking about:
Coding standards
UI guidelines (which I know are still up to discussion)
Security
etc.
I meant that, as developers, if we’re having a discussion about plugin quality, then we need to target *all* of the criteria. It doesn’t make sense to me to have a discussion about improving the quality of the plugins, but only go in half-way.
If we truly care about the entire plugin experience (which I think everyone here does!), then we need to care about it on all fronts.
So this badge/indicator/whatever-its-called is meant only to say that it meets a certain set of rules for evaluation of development. It’s not a guarantee that it works on all hosts.
To me, that’s an entirely other rating (which is still something that needs to be covered).
Anyway, didn’t mean to be unclear in my initial comment.
Random question: Shouldn’t we vet plugin _versions_ instead of plugins in general? “The 1.3 version of Plugin XXX is vetted”. Because I’ve seen lots and lots of plugins f’k up really bad with an update.
It’s another great point. Especially with the “recommended” system. I think any badges / recommending has to be version specific…. which of course, might discourage people from upgrading their code… eek.
As I noted in my comment, I think it’s a good idea to link vetting to a specific version. I think the UI can show that it was previously vetted, but the current version hasn’t been yet. The question is, how do we convey that without causing user confusion?
And maybe let them install the vetted version… ? If there’s no way to know if the upgrade broke something, of what value is the past vetted version, really? I’d still need to audit it. Maybe show diff of changes since audit… ? Not useful to non-devs…
Something like “The current version (1.3.1) has not yet been vetted. Install 1.3 instead?”
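A small Python sketch of how such a prompt might be chosen, assuming version strings are purely numeric and dot-separated and that the set of vetted versions is already known. The function names are hypothetical.

```python
def version_key(version):
    """Turn "1.3.1" into (1, 3, 1) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def install_prompt(current, vetted_versions):
    """Pick the message to show, offering the newest vetted older version if any."""
    if current in vetted_versions:
        return f"The current version ({current}) has been vetted."
    older = [v for v in vetted_versions if version_key(v) < version_key(current)]
    if not older:
        return f"The current version ({current}) has not yet been vetted."
    fallback = max(older, key=version_key)
    return (
        f"The current version ({current}) has not yet been vetted. "
        f"Install {fallback} instead?"
    )


print(install_prompt("1.3.1", {"1.2", "1.3"}))
```

Comparing tuples of integers rather than raw strings matters here: as text, "1.10" would sort before "1.9".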
That’s the idea. And maybe a “What Changed” tab that can include developer’s changelog and a tab with diffs for developer types? Maybe % of code change for non-devs?
I don’t think that recommending older versions is aligned with the views of the repo. In most(?) cases an update is better / safer.
Daniel: I agree, and that’s why I think revetting updates should be a fairly high priority (and have a separate queue).
@otto42 , how many plugin updates does the repo get each day?
I think that having to vet versions can be a pretty monumental task. If it goes that route, I think it’d be important to get plugins to adopt the WP main project methodology for tagging. As in, versions 1.0 and 1.1 are major versions, and would go in a vetting queue, but versions 1.0.1 or 1.1.2 would be minor and perhaps not require vetting. Still a bit of the honor system on a system like this, but I think that will be necessary no matter what. Plus, if someone was behaving badly and tagging major updates as 1.x.x, then once they are flagged they could lose that status anyway. But vetting *every* update could be ridiculous. WP SEO, I know off the top of my head, sometimes updates multiple times in a week. Not feasible at scale.
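The major/minor rule described here can be stated in a few lines of Python. This is a sketch under the assumption that versions are numeric and dot-separated; the function names are made up.

```python
def parse_version(version):
    """Pad "1.1" to (1, 1, 0) so two- and three-part versions compare cleanly."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def needs_vetting(previous, new):
    """True when either of the first two components changes (1.0 -> 1.1).

    A change in the third component only (1.1.1 -> 1.1.2) is treated as a
    minor release that skips the vetting queue.
    """
    return parse_version(new)[:2] != parse_version(previous)[:2]


for old, new in [("1.0", "1.1"), ("1.1.1", "1.1.2"), ("1.0", "1.0.1")]:
    queue = "vetting queue" if needs_vetting(old, new) else "no vetting"
    print(f"{old} -> {new}: {queue}")
```

The rule only works if authors number releases honestly, which is the honor-system caveat raised in the comment.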
Vetting every update would be the best solution, but I’m guessing it’s just not feasible. Vetting the plugin originally, but not the updates, has issues as discussed and is still a huge job.
So, what if we don’t vet the plugins, what if we vet the authors? Some sort of accreditation scheme where they have to demonstrate all the concepts required and earn an official ‘score’. You still let everyone in, so as to not discourage new plugin authors, but users can easily see the dev’s score, which is a signal of quality.
It’s not as good as vetting every version of every plugin, but it may be more feasible. Still loads of issues with it (what about old plugins from before they were accredited), but might be worth looking at.
We all have a few authors that we trust without even opening the code, but if we need to make it a general rule, I don’t trust people; I trust code. Even with my own code
After all, anyone can have a really bad day.
While I love the idea of vetting authors rather than plugins (easier), the reality of it makes me uncomfortable.
Take a beginning dev. They submit a plugin for review. It goes into a 3 month loop as they get coached. Finally it passes. Are they vetted? If they tackle a new plugin which is far more complex and it fails, should it really get auto-vetting? I don’t think so. Even the best of us who are pushing boundaries will create a mess every so often. Peer review of the plugin will create quality. I work with some amazing devs, and sometimes, they have off weeks (or months).
Second, the issue is that plugins are often created with a specific purpose and audience in mind. Some might be made to scale, others to improve a confusing process. How do we make these decisions transparent?
how many plugin updates does the repo get each day?
Well, it’s 17:14 now, and there have been 602 changes so far today. So, guesstimate would be around 800-1000 on average.
http://plugins.trac.wordpress.org/timeline
Not all changes result in a new version to a plugin. That’s harder to measure.
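For what it is worth, the guesstimate can be reproduced by straight-line extrapolation from the figures quoted above (602 changes by 17:14). The Python sketch below assumes commits arrive at a uniform rate all day, which is almost certainly not true, so treat the result as a ballpark only.

```python
def projected_daily_changes(count_so_far, hour, minute):
    """Extrapolate a full-day total, assuming a constant rate since midnight."""
    elapsed_hours = hour + minute / 60
    if elapsed_hours <= 0:
        raise ValueError("need some elapsed time to extrapolate from")
    return count_so_far / elapsed_hours * 24


estimate = projected_daily_changes(602, 17, 14)
print(round(estimate))  # 838
```

That lands inside the 800-1000 range mentioned, though it counts commits, not released versions.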
Otto. That metric is extra useless because many devs don’t use the repo svn to dev, as they prefer git. For example we dev elsewhere and then simply commit on version change. Any way to figure out the number of version changes?
No, I don’t have that information, nor know where to start even trying to figure it out. It’s not something we track.
However, if the majority of developers did as you describe, then that metric would actually be extra-useful, since most commits would be complete version updates. Since many people *do* dev via the SVN, the number is inflated.
I don’t think we’d be doing anyone any good vetting a plugin but not keeping track of the versions. Maybe have two types of “badges” on the plugin (not sure what exactly we are calling them). Using either semi-transparency or grayscale vs color, show whether the current version is vetted or only a previous version was.
Another thing to think about is silent updates. Say that v1.1.0 gets pushed out and vetted. After being vetted the dev decides to go into the 1.1.0 tag and make some nasty changes. Something would need to stop this from happening. Or what about cases like the malicious SVN commits made by someone other than the dev? We will need to lock down vetted versions one way or the other.
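One possible way to detect a silently changed tag is to record a digest of the tagged files at vetting time and compare it later. The Python sketch below is an assumption about how that could work, not a description of anything the repository does; the function names and the temporary-directory demo are made up.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root):
    """SHA-256 over the relative path and contents of every file under root.

    Files are visited in sorted order so the digest is stable across runs.
    """
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def still_matches(root, recorded_digest):
    """True if the tag's files are byte-for-byte what was vetted."""
    return tree_digest(root) == recorded_digest


# Demo with a throwaway directory standing in for a tagged version.
with tempfile.TemporaryDirectory() as tmp:
    tag = Path(tmp) / "1.1.0"
    tag.mkdir()
    (tag / "plugin.php").write_text("<?php // vetted code\n", encoding="utf-8")
    recorded = tree_digest(tag)  # stored when the version is vetted
    unchanged = still_matches(tag, recorded)
    (tag / "plugin.php").write_text("<?php // changed after vetting\n", encoding="utf-8")
    tampered = not still_matches(tag, recorded)

print(unchanged, tampered)
```

A digest only detects the change; actually preventing commits to a vetted tag would need enforcement on the SVN side.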
I shared some of the plugin code review I currently do for my clients in the “Locking Down WordPress” Code Poet ebook. This was a security related response (as opposed to a performance related response), but I think it is worth sharing here:
If the client requests a plugin that I have never used before I review the plugin files and the plugin developer(s). When I review the plugin files I specifically look for WordPress Plugin API hooks, actions, and filters, properly sanitized data and MySQL statements, unique namespace items, use of the Settings API for any plugin settings or options, and nonces instead of browser cookies. I review the developer to verify reasonable response times to support items and that the plugin is actively developed.
Some additions from my plugin review process:
1. Are scripts and styles enqueued correctly? Related: Is the plugin enqueueing yet ANOTHER version of jQuery?
2. Will front end scripts/styles only load in views where needed?
3. Verify the plugin can be activated, used and deactivated without throwing an exception or triggering an error.
4. Upon deactivation, does the plugin clear any database tables or rows added during use?
5. When uninstalled does the plugin remove any added database tables/rows and all files?
6. Does the use of the plugin cause memory leaks/spikes?
7. Does the use of the plugin increase the number of database queries required to load different views? If so, is the increase expected? If the plugin functionality was expected to increase database queries is the increased load reasonable?
Interesting. I think all those steps could be implemented in an automatic tester like we have for themes.
+1 for an automated tester. I know there’s a WP Coding Standards sniffer, but I’m not sure if there’s any tool to check for any of these things. I think it *could* be done though.
Ryan, There is a Code Sniffer 1.3 plugin for WordPress standards: https://github.com/mrchrisadams/WordPress-Coding-Standards that in conjunction with xDebug and some sort of profiler would be helpful to start automating this…
That’s the one I was thinking of. That’ll check things like indentation, braces, etc, but it won’t check the things you noted above, which I think could be automated.
+1 for the points above.
I think there is *some* leeway in a few of these points. For example, not all plugins will need to clean up after themselves.
For example, I’ve got one plugin that, when activated, adds data to the end of comment content based on user input. If I were to back that stuff out upon plugin deactivation, it would literally remove files and break previous comments.
So I see this as kind of being a “Theme Check” style set of rules where there are:
Errors
Warnings
Recommendations
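A three-tier checker along these lines could be prototyped as below. This Python sketch is not Theme Check and does not use its rules: the three patterns, their severities, and the PHP sample are all hypothetical, and line-by-line regex matching is far cruder than a real code sniffer.

```python
import re
from enum import Enum


class Severity(Enum):
    ERROR = "error"
    WARNING = "warning"
    RECOMMENDATION = "recommendation"


# Hypothetical rules: (severity, pattern, message). A real tool would need
# many more, and should parse the PHP instead of matching text per line.
RULES = [
    (Severity.ERROR, re.compile(r"\beval\s*\("), "eval() call"),
    (Severity.WARNING, re.compile(r"\bquery_posts\s*\("), "query_posts() call"),
    (
        Severity.RECOMMENDATION,
        re.compile(r"jquery[.-]?\d[\w.]*\.js", re.IGNORECASE),
        "bundled copy of jQuery",
    ),
]


def check_source(source):
    """Return a list of (severity, line number, message) findings."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for severity, pattern, message in RULES:
            if pattern.search(line):
                findings.append((severity, lineno, message))
    return findings


def passes(findings):
    """Only errors block a pass; warnings and recommendations are advisory."""
    return not any(severity is Severity.ERROR for severity, _, _ in findings)


# Made-up plugin snippet used as input.
sample = """<?php
query_posts('cat=3');
wp_enqueue_script('jq', plugins_url('js/jquery-1.7.2.min.js', __FILE__));
"""

findings = check_source(sample)
for severity, lineno, message in findings:
    print(f"{severity.value}: line {lineno}: {message}")
print("passes:", passes(findings))
```

Keeping the tiers separate is what allows leeway: a plugin that deliberately leaves its data in place on deactivation could draw a recommendation rather than a failure.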
Great list. Your having such a list and indicating you’ve done this for clients already also reinforces my sense that great developers like yourself are ALREADY doing this vetting… and we’re not capturing it. Sure you can vote, but 20 other people whose credibility means nothing to me got mixed in with your vote.
Yes. I have been thinking about solving the original problem you brought up – which seems more of an aside to the general Plugin Repo. A place where other developers/development companies can share the plugins they use, plugins created (that perhaps are not meant for non-developers) and even share code.
I would love to talk to you (Jake) and any other developers that have interest. Many of us doing client development work are doing double work, and I am interested in finding a way for us to work smarter.
I think an interesting question is… how are you documenting each one of this reviews so your team can always get back to them? Do you have a system / process for that?
Also, do you re-review each time the plugin gets an update?
This seems like something outside the repo to me and would totally make sense to collaborate on. Invite only, for people we trust. Simple CPT called review – 1 per plugin. We are already doing the vetting consistently for projects.
Rachel / Shane – you really hit on the spirit of my initial tweet / idea.
I’m glad Otto flagged it, and he’s right that we should try to help improve the official repository, which (I’m going to say it again) is broken. I want clients and users to be able to see this list as much as other developers.
But in the meantime, upgrading the repo is a bit of a hornet’s nest politically, and incremental change is probably necessary and inevitable. A third party site can avoid this.
I even think that something like this would fit nicely into Code Poet. Code Poet seems like a way to create a social community around developers, and I think this would align quite well with that.
good call ryan, I sent pete davies a quick email on the topic
I tend to think Code Poet is a little too geared toward smaller businesses (that may have a lot of plug-ins that would fail vetting) and a little too official (an Automattic project)…
Would also think checking for things that aren’t necessarily “wrong”, but generally inadvisable. query_posts() comes to mind. Caching third-party API requests, etc.
To the thought of a third-party, independent ‘review’ repository – why not? Perhaps this can be hashed out and built at the dev day tomorrow?
Wondering about the effect it can have on a new WP dev to have his first plugin marked as not-vetted. Because we can argue that is a great opportunity to learn.. but…
Not-vetted or failed vetting?
“Not vetted” I’m less worried about – it’s a bonus, and we can explain how we pick which plug-ins we vet.
“Failed vetting” – this needs more exploration, but I think most developers would welcome the opportunity for free feedback from expert devs.
I was referring to failed vettings.
I think most developers would welcome the opportunity for free feedback from expert devs.
I’ll reply to this when I stop laughing… it may be some time.
Didn’t work that way for the theme review team. Lots and lots and lots of theme authors didn’t like having to go through the process. Of course, that may be a different situation, since a “fail” in that case meant the theme didn’t go in the directory. Many theme authors left, but since these tended to be spammy authors or just making pretty poor themes in general, no big loss. Eventually a large number of theme authors improved their code, and started getting in. So the standards went up a bit. Rinse, repeat.
Nobody likes criticism, really. Even when you couch it in “expert advice” or something, people tend to react badly. Expect it.
Otto – I really think there’s a difference between “sorry, you’re rejected” and “to earn the vetted badge, here are some tips.” Maybe I am naive on this point, and sure, some will have bruised egos, but when they realize the tips are valuable and nothing has been taken away from them, and assuming the reviewers have a positive style, I think there will be some level of respect / appreciation. Think of it as catching a bug for another developer, and making a suggestion for a fix.
It’s not my case, but I know for a fact that there are some amazing project managers here. Can anyone take a stab at how much manpower do we need to achieve this? Not talking about creating the tools to do it. I mean to have a good chunk of the repo vetted.
@otto – I need some stats and could probably think this through.
How many plugins over 100k downloads are in the repo?
How many plugins in the repo have been actively updated in the last 2 years?
How many new plugin submissions a day?
How many updates a day
Anything else you think relevant.
How many plugins over 100k downloads are in the repo?
494
How many plugins in the repo have been actively updated in the last 2 years?
15,922
How many new plugin submissions a day?
20-30, on average.
How many updates a day
Probably around 800-1000 check-ins a day to the repo, on average.
Thanks otto – anyone know the pace by which the theme team reviews? how many themes / day on avg?
Even with a couple of plugins a day it would take a year just to do the top plugins. We could probably set up a few hack days to hammer through the big ones?
They used to track that information themselves, however I don’t think anybody has updated the numbers in a while:
http://make.wordpress.org/themes/about/weekly-stats-trends/
Hey guys, deeply sad I missed that the Google Hangout was happening.
I just watched the entire video and I’m a bit concerned about this “vetting” process where some people get to be trusted reviewers and some plugins get to be vetted. It sounds like what you are trying to build is the Yahoo directory of the 90s when we know that Google’s algorithmic search was a lot more scalable.
Can I propose that we implement a reputation system for community members and for plugins based on numerous signals instead of a manual selection process, and that we enable everyone to become a reviewer, so that the best reviewers’ reviews can bubble up to the top? We can then weight the reviews based on the reviewer’s reputation and how helpful the review is voted to be (like Amazon.com’s “was this review helpful?”).
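To make the weighting idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Review` fields, the idea of a numeric reputation score, and the choice to multiply reputation by a smoothed helpfulness ratio are all hypothetical, not anything WordPress.org has.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Review:
    stars: int                  # 1-5 rating given by the reviewer
    reviewer_reputation: float  # hypothetical reputation score, >= 0
    helpful_votes: int          # "was this review helpful?" yes votes
    unhelpful_votes: int        # "was this review helpful?" no votes


def review_weight(review: Review) -> float:
    """Weight = reputation times a smoothed helpfulness ratio (an assumption)."""
    # Add-one smoothing: a review with no votes yet gets a neutral 0.5 ratio.
    helpfulness = (review.helpful_votes + 1) / (
        review.helpful_votes + review.unhelpful_votes + 2
    )
    return review.reviewer_reputation * helpfulness


def weighted_score(reviews: List[Review]) -> Optional[float]:
    """Weighted mean of star ratings; None if no review carries any weight."""
    total_weight = sum(review_weight(r) for r in reviews)
    if total_weight == 0:
        return None
    return sum(r.stars * review_weight(r) for r in reviews) / total_weight
```

With one 5-star review from a high-reputation reviewer that was voted helpful and one 1-star drive-by from a low-reputation account that was voted unhelpful, the plain average is 3.0 but the weighted score stays close to 5.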
This type of system would encourage companies like Jake’s 10up to capture their efforts when they rate plugins because good reviews that are voted up would improve 10up’s brand on WordPress.org. This way lots of people will be motivated to write really helpful reviews because it will help them raise their standing in the community and help them present themselves as having a high reputation to promote their services to prospective clients.
I’ve also got some thoughts about what plugin reviews might look like which I’ll post in topic #2.
I think we’re already headed to a review system that exposes reviewers and (hopefully) allows for up / down ratings on reviews, which will take us along this path (see my reply to Matt, way up in this P2).
I also agree that a positive indirect consequence of this system would be more constructive involvement in the repository by community members, including 10up.
My original thinking was that this could happen in parallel with vetting, but my thinking is changing.
I’m not clear on what you mean in the last sentence, can you clarify? Also, I don’t see any comments from Matt on this page so I can’t find the comment you mention. Got a link?
I absolutely agree with you, Mike. The “vetting” idea actually really concerns me, both from an implementation and management perspective and from a user’s point of view.
Thanks for posting the video, wish I could have made it. Loved the discussion and the way it developed. I think Jake nailed it toward the end, proposing the addition of a social aspect to .org, enabling .org users to endorse other .org users. I think there’s a tremendous amount of value in being able to search/browse all of the favorited plugins of developers who you’ve endorsed. Also, if .org profiles would show who a user has endorsed, it would be a nice way of discovering other great developers who you may not be familiar with yet. Also, I agree with Shane that there’s a lot more value in showing the # of active installs of a plugin over downloads and would love to see this implemented in .org.
+100. I might be endorsing the endorsement of my own idea here, but I really think this is one of the most immediately actionable, useful ideas that we came up with.
And, as a side note, this might also discourage dishonest or lazy voting.
Also, we may want to intentionally not display the # of endorsements a user has to avoid FB & Twitter-like popularity contests.
A lot was said on this topic. Here are some of the ideas I’ve picked up from rewatching the video:
1) reviews for ratings (perhaps forced reviews with poor ratings)
2) connect ratings / reviews to actual users (perhaps favorites integrations)
3) relate ratings on a time scale in order to prevent older-is-better idea
4) have ability to change review / unfavorite
5) figure out way to make reviews not be worthless (rate reviews themselves, amazon style?)
What have I missed, and which of these are “sure thing” good ideas?
Regarding reviews: buddypress.org has supported plugin reviews for BuddyPress plugins for >1 year. Take a look at http://buddypress.org/community/groups/achievements/reviews/ for an example of an approach of this.
Is it possible to have a full discussion on these reviews though? I personally think anything more than a one-line response should be split out into a forum topic (linked from the review).
@Ryan Agreed. Rotten Tomatoes is probably the most useful review system out there and their reviews are just a sentence or two. I think the other component to that review system that makes it useful is that the reviews are by film critics, not just anyone. In this case, instead of critics, the reviews written by .org users who I’ve endorsed could be separated from the reviews written by users I haven’t endorsed. Similar to how Flixster has reviews by Flixster users separated from the reviews by Rotten Tomatoes critics.
@otto – is there a pattern right now for .org to have reviews or should I wireframe something up? We have great ideas floating around this forum. I’m staring at zappos right now and this is a really solid experience > http://cl.ly/Inf4 and could easily see it applied to plugins.
You can if you like, but it’s a bit early in the discussion to be making wireframes I think. It still has to be decided what exactly is going to be done. I expect such a discussion to take the better part of a month or two.
Some initial thoughts discussed on the call.
Metrics: # of downloads is deceiving.
It hints at popularity but is deeply biased by age and users just trying things out with no commitments. Working with george at presstrends, I was able to get a side by side comparison of downloads vs active installs (across about 80k sites – so it’s not perfect) for event calendar plugins in the repo. http://cl.ly/ImfT. The left column is the # of downloads as of a few months ago and the right column is the plugin’s rank among all active plugins tracked by presstrends. You’ll notice that “Events Calendar” has more than 2-3x the downloads of the top plugins and yet is the least used of the bunch. The statistic is deceiving.
Suggestions :
1) show # active installs – that is a real number people can count on
2) if we keep showing downloads – maybe limit within X time period (6 months?) along with total
Metrics: Star has no transparency.
Admit it, you’ve given yourself 5 stars. And maybe even, in a fit of frustration, given a plugin 1 star that it might or might not have deserved. The challenges with stars are well known: people value a star differently; people judge based upon different criteria; people’s ratings are not worth the same… so here is the question. What are things we can do to bring better meaning?
Second, you can’t rate from the dashboard. 100% of the time I abandon a plugin, I personally do not go hunt it down on .org to give it a low rating. Only a few plugins get a high rating from me and that is after a lot of repeated use. Why? Simply because it is inconvenient.
Stars can be gamed. Not that many people are doing it, but Otto mentioned it is a legit issue.
Suggestions:
1) Ditch the stars. Replace it with favorite count, which will integrate with profiles and core.
2) Integrate stars vote option upon update or uninstall (increases problem with gaming the system)
3) Do the research to see what the big bad world outside of WP is doing to address this challenge (volunteer please)
Promotion: Screenshots, FAQ & Good Content
When I go to the apple app store, the first thing I look at is screenshots. Same thing with the .org repo. Difference is on our repo I am continually disappointed. Far too often I have to download a plugin and figure out how to make it work, just to see what it does. Strong screenshots, FAQ and description play a huge role in building trust and expressing value. While we can’t require people to create great marketing content, it will be the difference between initial impression and mistrust. Look at the awesome impact something as simple as the top banner had.
Suggestions:
1) I’m not sure we can require screenshots, but I wouldn’t be opposed to it.
2) Find a way to increase the prominence of plugins which have taken the time to create a strong presence. Reward the effort.
Reviews:
One of the primary reasons I shop at Amazon is to read reviews. It puts ratings into perspective. It gives a story. The ratings are measured by “Was this review helpful: y/n” and then the best and worst rated reviews with high helpfulness are displayed first. The only thing I wish they did was find a way to display the credibility of the reviewer. The cool thing is that we have the ability to do that with author profiles (# of reviews, # of plugins, core contributions and other cool badges).
Suggestions:
1) get reviews going and tie them to rating & user
An extension of activation in the downloads section – not all installs are created equal. I have no idea how to do this, but what if there was a way to show that a plugin is used by sites / brands people respect? If CNN is using a plugin on their site, that is a lot more meaningful than if joe’s bar and pub is. Just curious if anyone has any ideas aside from self reporting in a review.
ok, I have more but need to crash…
downloads
I completely agree that active installs would be hugely beneficial and a concrete indicator. +1 for that.
I do think downloads over a period of time can be useful. Maybe it could be integrated into the current stats page. But so much can go into downloads, like counting it as a download every time the plugin updates, which could be a lot.
Stars
I believe stars can add value. Here’s how the Mac app store displays them: http://cl.ly/image/2U0j053X2c2G
Just showing the # of each rating is helpful. The app store shows stars for the current version. I think this can create enormous issues. A dev could update to eliminate poor ratings, or may not update in order to preserve good ratings. I think total stars in a span of time is the way to go here, perhaps with the ability to view star history too (dreamy wish).
favorites
I view favorites differently than stars. To me, favorites are like my own little vetted list. They’re my “go-to” place. I think favorites could even be a big part of the socialization of the repository, and be one of the metrics used to measure when a plugin is ready to be vetted.
Screenshots and plugin documentation
I completely agree screenshots should be mandatory. I’d like to see nearly every admin page a plugin has (if any) in the screenshots section.
For documentation, I found that to be a difficult part of submitting a plugin. I’d like to see documentation pages for every plugin auto-generated that are basically the same as the codex. This way, even if a plugin author does a poor job, users can document the plugin as they see fit. Plus, a lot of devs do a poor job of understanding what a user needs to know about a plugin anyway.
The app store shows stars for the current version. I think this can create enormous issues. A dev could update to eliminate poor ratings, or may not update in order to preserve good ratings.
It’s not just that this can happen; it actively does happen on the App Store. I think a better way would be to have the ratings not count after x months after an update. So, a new update could be brought out to try and game this, but the previous ratings would still last for a while before disappearing.
Another possibility could be that the ratings decay over time. So, a 3 month old rating would be worth less than a rating made 1 second ago. I know Hacker News (http://news.ycombinator.com/) uses a rating system for comments based on this: http://amix.dk/blog/post/19574
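As a rough illustration of decaying ratings, here is a small Python sketch. It uses a plain exponential half-life, which is my own assumption for the example and is not claimed to be the formula Hacker News or anyone else uses; the 90-day half-life is likewise arbitrary.

```python
from typing import List, Optional, Tuple

# Assumed half-life: a rating loses half its weight every 90 days.
HALF_LIFE_DAYS = 90.0


def decay_weight(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay: 1.0 for a brand-new rating, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def decayed_average(ratings: List[Tuple[int, float]]) -> Optional[float]:
    """Average star rating where each (stars, age_in_days) pair is age-weighted."""
    total_weight = sum(decay_weight(age) for _, age in ratings)
    if total_weight == 0:
        return None
    return sum(stars * decay_weight(age) for stars, age in ratings) / total_weight
```

Old ratings never vanish outright here; they just count for less and less, so pushing out an update does not wipe the slate clean.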
A bit late to the party, but +1 on including some sort of history type system for ratings. Maybe a two color chart line would help. Show ratings based on the last 1 (or 2) years in the primary color, and then the total stars in secondary color.
Specifically regarding “# of downloads is deceiving”:
Our stats tracking for active installs isn’t quite up to par yet to get those numbers. And that’s a hard problem, but one that we might be able to solve in the long run.
But, real *numbers* aren’t something we’d be willing to share, because raw numbers are usually crap without solid interpretation. Percentages, sure.
Here’s the thing: if I said 1000 sites ran your plugin, then all you know is that I said 1000 sites ran your plugin. You have no idea that that is a correct number. You don’t know my measurement methods. You don’t know my crazy assumptions. You have no way to gauge the reliability of my data. But, if I say that your plugin runs on 2.5% of measured sites, then that’s at least reliable, when compared against other plugins. It’s a useful metric.
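To spell out the arithmetic behind a figure like 2.5%, here is a tiny Python sketch. The function name and the sample size used in the note below are hypothetical; only the 1000 sites and the 2.5% come from the comment above.

```python
def share_of_measured_sites(sites_running_plugin: int, sites_measured: int) -> float:
    """Return the percentage of measured sites that run a given plugin."""
    if sites_measured <= 0:
        raise ValueError("need at least one measured site")
    return 100.0 * sites_running_plugin / sites_measured
```

If the measured sample happened to be 40,000 sites, then 1000 of them running a plugin works out to 2.5%. The point of the percentage is that every plugin is divided by the same denominator, so the shares are comparable even if the raw count is off.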
Think about this problem: how many sites run WordPress.org software? Well, how do you define “run”? How do you define “site”, even?
If a site is setup on a server and nobody ever visits it, is it really running? At what point does a site disappear from our “count”? Not an easy question to answer, is it? This sort of thing is why raw numbers are misleading.
What about something along the lines of “report anonymous usage data to WordPress.org” which could be a setting on the general settings page even. This simply pings back to .org every so often with what WP version, plugins/theme+version.
Hits the following
Sites with no visitors never report because the process does not run
Easy opt-out
Greater reliability of tracking: maybe when users opt out, one last ping goes back to .org recording that. This way opt-outs are tracked and you know about what % of the user base you are getting usage info from
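A minimal sketch of what such a ping could carry, in Python. The field names and the JSON shape are made up for illustration and no real WordPress.org endpoint or API is implied; the sketch only builds the payload and estimates coverage from opt-out counts.

```python
import json
from typing import Dict


def build_usage_ping(
    wp_version: str,
    theme: Dict[str, str],
    plugins: Dict[str, str],
    opted_out: bool = False,
) -> str:
    """Build a JSON payload (hypothetical format) for an anonymous usage ping."""
    if opted_out:
        # Opted-out sites report only the opt-out itself, nothing about the site.
        return json.dumps({"opted_out": True})
    payload = {
        "opted_out": False,
        "wp_version": wp_version,
        "theme": theme,      # theme slug -> version
        "plugins": plugins,  # plugin slug -> version
    }
    return json.dumps(payload, sort_keys=True)


def coverage_percent(reporting_sites: int, opted_out_sites: int) -> float:
    """Estimate what % of pinging sites actually share usage data."""
    total = reporting_sites + opted_out_sites
    if total == 0:
        raise ValueError("no sites have pinged yet")
    return 100.0 * reporting_sites / total
```

Because the ping would be sent by the site itself, a site that never runs never reports, which covers the first point in the list.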
regarding @patrickgarman’s comment, getting that kind of data would be a huge boon to plugin authors as well as users. Many of us are operating blind and good data would really have an impact. For plugins that affect the front end, knowing the top XX themes used in conjunction with the plugin would help smooth layout conflicts. Same with knowing the top XX plugins to test for conflicts.
@otto, I totally hear you that the #s get convoluted.
% could be a reasonable approach although I wonder how much even the largest niche plugins have. Seeing 4% could imply low adoption when it is in fact the largest in the niche by 50%.
Maybe something like plugin install rank could be another. Especially if it could be cut by some type of category or tag. I.e., like Brian was asking: show me all SEO plugins sorted by active install base (rank). That would dramatically improve search. Only challenge is that it will naturally bury anything new and noteworthy.
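A quick Python sketch of ranking within a tag. The plugin records, slugs and install counts are invented for the example; it assumes each plugin carries a set of tags and an active install count, which .org does not expose today.

```python
from typing import Any, Dict, List, Tuple


def rank_by_active_installs(
    plugins: List[Dict[str, Any]], tag: str
) -> List[Tuple[int, str]]:
    """Return (rank, slug) pairs for plugins carrying `tag`, most installs first."""
    tagged = [p for p in plugins if tag in p["tags"]]
    ordered = sorted(tagged, key=lambda p: p["active_installs"], reverse=True)
    return [(rank, p["slug"]) for rank, p in enumerate(ordered, start=1)]
```

Showing the rank rather than the raw install count also sidesteps the question of how trustworthy the absolute numbers are.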
There is the interesting challenge of MU installs: does a 10k instance MU build count as 1 or 10k?
Also, how long does a site stay a credible statistic? If your plugin was installed 3 years ago and hasn’t been updated, nor has the site been updated and it still runs WP 2.8, is this a useful & legit count? We know that very few WP sites ever upgrade.
Can you weigh one site to be “worth” more than another? Personally, knowing that someone like TechCrunch runs your plugin means more to me than joe’s diner. Perhaps that is more about exposing top users or something. Not sure how to do this but it would be a big deal in terms of plugin credibility.
I think that low %’s are just something that each niche will need to weigh for itself. In a specific niche the active developers will probably know that 2% is actually really high compared to all the others getting 0.2%.
I think that MU sites should be treated as if each site was its own though, as far as stats go. Each site would at least have a unique home_url.
Even old sites that have not been updated are still technically credible which is why version numbers are so important. We don’t like seeing sites running WP 2.X but it happens, and if the site is active it should still be counted but counted appropriately. Just like the plugins currently do with active versions, although I’m not sure where this metric comes from currently? – http://cl.ly/image/353w0r2b3o09
Weighing some sites more might be asking for too much information from a site, though. Part of reporting anonymous data is that it’s being reported anonymously. It’d be awesome to know that kind of information but I’m not sure how well it would play out.
Usage rank is invaluable for plugin / theme developers. Bonus: show us the trend plotted over time. But not sure it’s a good measure of quality either. Some of the most used plugins are well known for their lack of quality. Also, plugins that do too much will tend to be better ranked than plugins that do just one thing but very very well. Citing Jake: “Britney Spears is really popular”.
Agreed that usage != quality, I think this is going to be more along the lines of tracking popularity and stats for developers more so than a usable stat for finding a good plugin.
Maybe once a vetting/review process is in place a plugin that gets released or has substantial growth over X period of time automatically gets flagged to check out.
A couple thoughts on this… Some already covered by others here.
I don’t see the importance of number of downloads as a measure of quality. It could be a measure of age or the reach of the plugin developer more than anything, neither a good way to tell if something is coded well. Another reason against “vetted” authors. Even if every other plugin was great, their latest one might be lacking.
Star ratings have never had much meaning to me unless there is a corresponding review explaining why. A Yelp-like review and comment system directly attached to the plugin would be amazing. When we find a plugin that breaks something we could leave a comment saying “This broke my site because of x, y, z.”
However, all of these things may matter to different people. So instead of trying to figure out the one thing to sort plugins by or build an algorithm that takes it all into account, wouldn’t it be better to have sortable listings? You could sort by star ratings, author rating, # of downloads, etc…
Oh.. And screenshots… Some plugins simply add functionality like a simple filter and have no ui or anything that COULD be a screenshot. Encouraging them is great. Requiring them does not make sense.
I think one of the major problems with ratings is, as discussed, their one-dimensionality. Ratings give you an overall picture of the plugin, but a one-star could mean “makes your server explode” or simply “I didn’t like this plugin but I can’t be bothered thinking of a proper rating”.
Someone mentioned in the Hangout about adding subratings to give more granularity to this. While this could work, I think it would add unneeded complexity to the system. I think a better way would be to introduce reviews and link them to a rating, with the review part optional.
How do you handle reviews though? Personally, I think reviews would serve a dual-purpose: to inform others of the plugin’s qualities, and also to start a conversation with the plugin author. A “helpfulness” metric would help here, much like Amazon’s: http://www.amazon.com/Kindle-Wi-Fi-Ink-Display-international/product-reviews/B0051QVF7A/ e.g. – Reviews that are rated “helpful” the most would bubble to the top, and those could help users make an informed decision on the plugin.
Trying to tie that into a social aspect, a flair system similar to Amazon’s system could be incorporated. Amazon shows “top 100 reviewer” for example; WP.org could show “core contributor”, “core committer”, “plugin author (42 plugins, average rating 4.9)” for example. This would introduce a trustworthiness metric into the reviews.
(Tying this into the badges idea: when a plugin is submitted for review, the reviewer could post their findings back as a review, with their flair including “official security review team”, e.g.)
I like the idea of showing “core contributor”, etc on reviews.
If you extend that to show it against the plugin authors on the plugin page itself, that would a) give users another useful signal about the quality of the plugin and b) give plugin authors additional motivation to contribute to core. Everyone wins.
Bingo.
All of these above suggestions are great (Shane – awesome list). That said, I can’t underscore enough that popularity is NOT a measure of quality. I don’t even care whose Calendar plug-in is the most *used* (vs. downloaded). I care which one is best.
I really don’t check CNET for reviews much but we could do something along the lines of what they have. There are user reviews and then separate core contributors or however you want to separate those users out.
Although I do agree with Jake there’s a couple things going on in the entire conversation here. Enhancing user reviews and that functionality is great, and I think a lot is already planned for this. I still would check (albeit not as deep) with plugins that have good “core contributor” reviews. This combined with a “vetted” status would be good though. Of the 50 plugins in a niche, 6 are vetted, and 2 have core contributor reviews. This leaves you with a variety of information to go off.
BestBuy has some cool UX for its user reviews. Easy to filter for the things you’re interested in: http://screenshots.mzaweb.com/fqPi
@Ryan: +1 for “Helpfulness”
@Daniel: +1 for Best Buy’s description of relevant attributes.
@Patrick: IMO a “vetted” status is too binary and will favor the common use-cases which still won’t be relevant enough when the use-case requires more detailed evaluation.
@Jake: “Best” is too subjective. I’d suggest a review system that lets the motivated 1% give detailed reviews that 5-star rank relevant attributes, and use those reviews/ratings to drive reviewers’ reputations, which drive how much their ratings count in the overall score presented to the end user. This will allow a single simple number for end-users but allow advanced people to drill down more. Here’s a mockup (sorry for pasting this URL a third time: http://screenshots.newclarity.net/skitched-20120817-224118.png)
Of the three topics, I feel that this one is the most important, and also the most realistic, since it doesn’t rely on a team of reviewers.
While reviews can be just as unhelpful as star ratings, they also have the potential to hugely improve things. There have been a huge number of times that I, as a user and developer, have wished I could leave a review on a quality plugin, in part to help it stand out from the crowd and in part just because I thought the developer deserved it.
By tying the reviews into .org user accounts and having their stats (# of plugins, core contributions, etc, etc) displayed (as badges or some other form), we can dramatically improve the worth of reviews. These stats won’t necessarily mean a lot to the typical WP user that doesn’t even know what the concept of Core is, but it will be a huge step in the right direction.
With reviews, there is always the possibility that users will leave abusive reviews, even when the plugin doesn’t deserve it, so I’d suggest a system for flagging reviews as poor also be implemented. Once a review has three negative flags, it gets hidden.
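The three-flag rule is simple enough to sketch in a few lines of Python. The class and field names are hypothetical, and counting each user only once is my own assumption about how flags should work.

```python
from dataclasses import dataclass, field
from typing import Set

FLAG_THRESHOLD = 3  # from the suggestion above: hide after three negative flags


@dataclass
class FlaggableReview:
    text: str
    flagged_by: Set[str] = field(default_factory=set)

    def flag(self, username: str) -> None:
        # Stored as a set, so one user flagging repeatedly still counts once.
        self.flagged_by.add(username)

    @property
    def hidden(self) -> bool:
        return len(self.flagged_by) >= FLAG_THRESHOLD
```

Tying flags to .org usernames means it takes three separate accounts to hide a review, which makes it a little harder for one annoyed person to bury honest criticism.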
“Favorites” is not really a good seal of approval. You may add a plugin to favorites to remember to review it later. If the label were “Recommended” it’d imply much more for me. If I see that, for instance, Otto has a plugin in his “Recommended” list, for me it’s a no-brainer decision to use it. Then, we can have a curated list of the “Recommended” plugins by the best devs (best as in # of core contributions, popularity of their plugins, etc). That seems fairly easy to implement, and seems to be a good solution for what @jakemgold proposed in the first place.
In short, if 5 core devs and 3 big names in the community “Recommended” a plugin, just go for it.
This will also not discriminate against new devs. Because even though I can trust that “Recommended” list, it’s not an official dotorg label. It also eliminates the sense of favouritism altogether.
I think a “Recommend this Plugin” is an excellent idea.
+1 on recommend as well. This is a wonderful idea. Really it is just a favorite but with a different label.
Agreed.
+1 I think if each .org user was able to maintain a list of “Recommended” plugins and a list of other .org users they’ve “Endorsed”, this would make searching and browsing the .org repo a pleasure.
Agree that Favorites is not good, but “Recommend” is too simple. People recommend plugins for criteria that differ from the criteria I might care about. They may love its UI whereas I might care that it uses Custom Post Types and provides hooks I can integrate with. Here’s a mockup to consider, a potential rating module that I propose anyone should be able to use, like anyone can review a product on Amazon: http://screenshots.newclarity.net/skitched-20120817-224118.png
The trick here is knowing who recommended a plug-in.
Knowing who recommended a plugin is useful, but knowing *why* they recommended it is even more useful. I’d be happy to recommend a plugin to one client and recommend against the same plugin to another, depending on the situation. Also, knowing *who* isn’t useful unless we have some validated way for people to establish their reputation as being a good reviewer. Unless we just want to create a cult-of-personality, which I sincerely hope is not the plan.
Mike – I think that’s where user profiles or “badges” come into play. Reviews from core contributors, for instance, might carry more weight, and it’s an earned credential.
I don’t really follow the “cult of personality” point. People will respect certain reviewers based on earned “cred”, not, say, personal likability (I think).
I think our views are coalescing. By “cult-of-personality” I was referring to what would happen if selected people were “blessed” to be the reviewers, not because of their cumulative reputation score created by their ongoing efforts but because they were hand picked.
Being a core developer doesn’t mean you will be a great plugin reviewer. I believe if we have something similar to StackExchange’s reputation model but for reviews then awesome reviewers will emerge, people who will take a lot of their own time to write great reviews. The core developers will never have time for that, and shouldn’t. Yes they can have an opinion and people may care about it, but we shouldn’t build a review system around their opinions.
The idea of badges is perfect which assumes people will be able to drill down to see all the different reviews and who made them but also have an aggregate rollup of reviews. So if Mark Jaquith’s opinion is important to someone they should be able to see that, but my point which I think we agree on is we should not limit the reviewers to a hand selected group; we should empower everyone in the community to review if they are motivated to.
In topic #3 I suggested a reputation system based in part on plugin reviews that were voted to be helpful. Here’s a mockup of what such a plugin review might look like:
Jack people said the rating system needs to be simple, but I think we won’t get valid ratings until the ratings capture more valuable information. End users need a 9-of-10 stars but people like me need to know: does it use custom post types when it should, does it have hooks I can use, can you override styling with your own CSS, how many HTTP requests does it make, does it run 10 query_posts() on every page load, etc. Having ratings based on criteria (any one of which a rater could ignore) would give WordPress.org the ability to see what other developers have said and thus make it easier for us to evaluate and to rate the same plugins ourselves.
We could create a handful of “vetting” criteria and then let reviewers add any criteria they want, and over time the core community could vet more criteria based on what people are using a lot in their plugin ratings.
This will be better than the 5 star ratings because only motivated people will rate, and ones who aren’t motivated will get poor reputations for their poor ratings and thus either won’t rate or their ratings won’t affect the single score displayed for the end-user.
Capturing this kind of criteria-based rating would allow WordPress.org to present weighted rankings of plugins for different personas; a non-technical end-user blogger will care about “Easy to Install”, “Visually Appealing UI” and “Unlikely to Break Site”, whereas an agency building a site for a Fortune 100 will probably care a lot more about “Highly Secure”, “Able to Scale” and “Unlikely to Break Site.” Criteria-weighted rating algorithms could be pre-developed for common personas, and if a query UI is created, it would let the motivated among us do custom queries with our own weights, based on what we care about for any given site we might be working on.
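To make the persona idea a bit more concrete, here is a rough sketch in Python of how criteria-weighted rankings might be computed. Everything in it is an assumption for illustration: the plugin names, the 0-10 criterion averages, the persona weights, and the function names are all made up, and nothing like this exists on WordPress.org today.

```python
from typing import Dict, List, Tuple

# Hypothetical average ratings per criterion on a 0-10 scale (made-up data).
PLUGIN_RATINGS: Dict[str, Dict[str, float]] = {
    "plugin-a": {
        "easy_to_install": 9, "visually_appealing_ui": 8,
        "unlikely_to_break_site": 6, "highly_secure": 5, "able_to_scale": 4,
    },
    "plugin-b": {
        "easy_to_install": 5, "visually_appealing_ui": 4,
        "unlikely_to_break_site": 9, "highly_secure": 9, "able_to_scale": 9,
    },
}

# Hypothetical persona weightings: how much each persona cares about a criterion.
PERSONA_WEIGHTS: Dict[str, Dict[str, float]] = {
    "blogger": {
        "easy_to_install": 0.4, "visually_appealing_ui": 0.3,
        "unlikely_to_break_site": 0.3,
    },
    "enterprise_agency": {
        "highly_secure": 0.4, "able_to_scale": 0.3,
        "unlikely_to_break_site": 0.3,
    },
}


def weighted_score(ratings: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average over the criteria that were actually rated.

    Criteria a rater ignored are skipped, so they do not drag the score down.
    """
    total = 0.0
    used_weight = 0.0
    for criterion, weight in weights.items():
        if criterion in ratings:
            total += weight * ratings[criterion]
            used_weight += weight
    return total / used_weight if used_weight else 0.0


def rank_plugins(
    plugin_ratings: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (plugin, score) pairs, best score first, for one set of weights."""
    scored = [(name, weighted_score(r, weights)) for name, r in plugin_ratings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for persona, weights in PERSONA_WEIGHTS.items():
        print(persona, rank_plugins(PLUGIN_RATINGS, weights))
```

A custom query would just be a call to `rank_plugins` with your own weights dictionary instead of one of the canned personas.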
Oops, P2 ate my <img> tag. Here’s the URL to the screenshot: http://screenshots.newclarity.net/skitched-20120817-224118.png
That looks perfect to me.
Yet another idea that would push this all to a more social atmosphere… I’d love to see a “Request a review” option where a plugin author could request a review from others. I imagined this on a small scale where I request it from people I know… However, I could see it getting ugly for the more well-known developers, who could get inundated with requests. But maybe a dev could opt in to be a reviewer and set a number of requests? I could see this getting messy, but the idea of requesting a review is interesting.
I like the sentiment but agree it could get ugly, fast. Everyone would ask @nacin and @markjaquith to review their plugin and they would get overwhelmed and very grumpy very quickly (and rightly so!)
Instead of requests for reviews, why not provide incentives for people who write great reviews by adding a reputation system, so the best reviewers and their reviews could bubble up to the top?
Agreed – lots of people will want reviews from credible devs, for the exposure if nothing else.
Yep. But I also think great reviewers could emerge like great question answerers did on WordPress Answers if a reputation system incents them properly. Those emergent high-rated reviewers would probably have more clout with the 90%+ of the user base who don’t pay enough attention to know who the core developers even are, especially since most end users struggle with things that are easy for the core developers.
How about a “support available” flag? Let users know that if they download this plugin, they’re on their own. Conversely, the developer can indicate that they will support / maintain their creation.
Regarding submissions, I’ve got a couple major takeaways:
1) Integrate some sort of plugin-check script in order to aid approval process.
2) Differentiate submission vs vetting. Manpower demands are too high to fully vet at submission stage.
This is the section where I couldn’t grab a lot of tangible changes from the conversation. Would love some additions to this.
I’m not sure what all really needs to be changed during the initial submission. It sounds like Otto+crew already beefed up their standards quite a bit recently. It would be great to require some form of “quality control” right out of the gate, but at the same time we do not want to break the openness of the repo.
I think the vetting submission process could become a community deal though. A simple form with the plugin name/url and a textbox for comments could add plugins to a queue. Those who want to join in on the plugin review team can use that textbox to link to or include their own review of the plugin. As they review a few plugins the current plugin review team can group up and invite them to join?
I think one thing to help the submission process would be a steps + completed score. Like when I joined LinkedIn, they had things I needed to do, and a % complete. We could have things like:
30% Pass Auto-Security Test
30% Pass Code Quality Test
10% Plugin description
5% Plugin image
15% Screenshots
5% 2 ratings and review
….
whatever the steps might be to having a plugin demonstrate quality and value to users
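As a quick sketch of how that could be tallied (Python, purely illustrative): the weights below are the ones listed above, which only add up to 95 because the remaining steps were left open. The step keys and the function name are my own invention, not anything that exists on the repo.

```python
from typing import Iterable

# Weights copied from the list above; they total 95 because the rest of the
# steps were left unspecified. The key names are hypothetical.
STEP_WEIGHTS = {
    "pass_auto_security_test": 30,
    "pass_code_quality_test": 30,
    "plugin_description": 10,
    "plugin_image": 5,
    "screenshots": 15,
    "two_ratings_and_review": 5,
}


def completion_score(completed_steps: Iterable[str]) -> int:
    """Sum the percentage points for every completed step (each counted once)."""
    completed = set(completed_steps)
    unknown = completed - set(STEP_WEIGHTS)
    if unknown:
        raise ValueError(f"Unknown steps: {sorted(unknown)}")
    return sum(STEP_WEIGHTS[step] for step in completed)


if __name__ == "__main__":
    done = ["pass_auto_security_test", "plugin_description", "screenshots"]
    print(f"{completion_score(done)}% complete")
```

The score could then drive whatever the community settles on, whether that is a progress bar for the author or a label shown to users.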
I like this idea of an initial-completion score a lot. I think if some metrics can be agreed upon, it would immediately encourage plugin developers to do complete documentation around their plugin.
Whatever process goes into place needs to be created carefully so that new developers are not discouraged into quitting or even moving to another platform. I think getting in the door should still be a fairly simple process, but maybe with some help on keeping to standards and avoiding major security flaws.
This isn’t ever going to happen due to manpower issues, but it would be great if, when you submit your first plugin, the reviewer mentored the developer more on how to do things properly. I just don’t think there are many metrics that will actually be agreed on for auto testing to deny a plugin.
Giving nn% rewards is a great idea. Especially if those percentages are used in some way, like impacting where the plugin appears in search results.
@patrickgarman’s point is very important though. We shouldn’t achieve better quality at the cost of discouraging new developers.
I hear your concern patrick.
That is why I liked the % idea. Until you hit 100%, you aren’t a real plugin yet (little Pinocchio). You’re just a pet project.
You can host your pet project on the repo, but we should lower its presence and value. We could even label it as “experimental” or something. And once you get your act together and hit 100% we’ll give you the bar mitzvah and introduce you to the world.
Just to play devil’s advocate, while we do want to encourage learning and participation, is it fair to the user community to have poorly executed, documented and displayed plugins in the repo? I would personally be ok with seeing us go from 15-20 new plugins a day (according to Otto) down to 5 if those ones are of better quality and presentation.
Where do you think we should draw the line? Requiring screenshots & a basic writeup in English that explains your plugin is just a baseline for quality in my eyes.
I think a lot of this is getting a little too mixed with more of a ‘vetted’ plugin though. I’m all for wanting better quality, but the fact is if a plugin isn’t any good most people will delete it and just move on to another. Too many hurdles will start pushing people away. Even screenshots I would only consider a nice-to-have. If an author wants their plugins used they will put the effort into making the plugin great; if it isn’t great then it can sit there dormant like so many plugins already do. Don’t get me wrong, I want to see improvements in new plugins, but I just want to do it very, very carefully. Maybe instead of requiring these for submission it could just be a “plugin completion” bar that appears when users look at it, or even just a notice to bug the developer. Using LinkedIn as the example, you still get your profile even though it is only X% complete.
Requiring English I’d disagree with entirely though. Sometimes this just isn’t relevant; Alipay, for example, is a Chinese payment gateway. Requiring an English description would get in the way of plugins like these… http://wordpress.org/extend/plugins/tags/alipay
It’s an interesting idea, but I think we need to be a little conscious of making the plug-in repo process tedious. Do some of our best developers want to go through a wizard every time they want to share some well constructed code?
I think there could definitely be improvements here in the form of automated testing. As a baseline, an automated test for errors, warnings, notices, etc. could be incorporated. Errors and warnings would cause the plugin to fail, while notices would simply show a notice (similar to the theme checker).
A score could also be generated based on the WordPress coding standards, more as an informative tool for users. I know that many plugin authors don’t seem to use any coding standard, let alone WordPress’ (I myself avoid a few of the parts of the WP coding standards that seem insane), and I think at least giving an indication of this would improve the quality of plugins. A score (as a percentage, e.g.) would give developers a goal to work towards.
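The pass/fail rule suggested above (errors and warnings fail the plugin, notices are only reported) could look something like this. It is a Python sketch under assumptions: the `Finding` type and the idea that some checker hands back a list of findings are hypothetical, and this is not how the theme checker is actually implemented.

```python
from dataclasses import dataclass
from typing import Dict, List

# Severities that block approval in this sketch; notices never block.
BLOCKING_SEVERITIES = {"error", "warning"}


@dataclass
class Finding:
    severity: str  # "error", "warning", or "notice"
    message: str


def evaluate(findings: List[Finding]) -> Dict[str, object]:
    """Split findings into blocking problems and informational notices."""
    blocking = [f for f in findings if f.severity in BLOCKING_SEVERITIES]
    notices = [f for f in findings if f.severity == "notice"]
    return {"passed": not blocking, "blocking": blocking, "notices": notices}


if __name__ == "__main__":
    # Made-up findings, as a hypothetical automated checker might report them.
    report = evaluate([
        Finding("notice", "Undefined index in settings page"),
        Finding("warning", "Deprecated function used"),
    ])
    print("PASS" if report["passed"] else "FAIL")
    for finding in report["notices"]:
        print("Notice:", finding.message)
```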
Took me a bit longer than I’d have liked, but they’ll be up momentarily.
Thanks for putting this up; I had a schedule conflict that meant I missed this. Seems like a very interesting discussion in the hangout, looking forward to the further discussion here.
I just want further discussion. By all means, please listen to the YouTube, comment on the discussion posts that Erick kindly made, and let’s hash it out. We can’t know what the community wants until the community speaks up. Tell your friends. Tell your enemies. Tell everybody. Spread the word:
WE WANT INPUT.
Let that be the word spread. More ideas, more input, excellent.
My 2 cents:
dotorg could be more like github
enable “social” coding, pull requests/patches (with ability to do code review), and more visible issue tracking (forum integration good, but could be more visible). This will encourage meritocracy. Plugins with traction will be obvious by the activity around their repos.
WP needs more plugins that provide “APIs” that other plugins can build on top of.
There are lots of doubled efforts happening in the plugin space, and so much duplicate plumbing. Drupal does this pretty well, with many modules providing the “guts” and no UI. Other modules then can build on a solid architecture but go nuts with the options, UI, and other bells/whistles that some devs like but others abhor. I’ve always wondered why I haven’t seen much/any of this in WP. There are some framework-y plugins/themes but nothing akin to one plugin providing the data model for a “voting API” and other plugins providing the UI: one for “+1/-1” style voting, one for “5 star” ratings, etc. Quality APIs that are outside of the scope of core would allow less experienced devs to focus on parts of the stack they might be more comfortable with, and might be able to mitigate some of the more egregious code I’ve seen on dotorg.
Take up the invitation by Otto and help him review plugins.
If you’ve got the chops and the interest the best way to help others find vetted plugins is to vet plugins. If you don’t have the chops, figure out how to get them. A good way to learn is ask others to review your code. Another way to learn is to do code reviews. It’s easier than you think to spot “code smell” in other’s work even when you may be guilty of the same errors. Seriously, this is the one thing that would make the most impact — more eyes means that we can move past having stuff vetted for spam/phishing by 2 or 3 people to vetting code for quality, efficiency, performance, best practices, security AND spam/phishing.
+1 for Eric’s idea of social coding. Often I see plugins that could use tweaks, or they’re _doing_it_wrong(), and being able to contribute a patch back quickly would be great. At the moment though, it’s a lot of effort to checkout the plugin, create a patch and submit it to the (broken) plugin Trac, especially when plugin authors don’t monitor this. I’d love to see something GitHub-style with editing in the browser instead.
+1 indeed. Lots of times we need to fix broken plugins for client projects. It’d be really cool to be able to contribute those changes back to the main plugin in an easy way.
+1 for the idea of social coding as well
While more social coding and github functionality might be nice, I’m going to again bring this conversation around to *non-technical* users. The repository should be a simple, consumable, *trusted* source for plug-ins for general WordPress users who could care less about how socially a plug-in is built, and are immensely confused when they visit github.
Let’s not make this more complex for Joe WordPress Publisher in our attempt to make a repository.
+1 for easy patches/pull-requests to other plugins. Making it easy to submit patches to other plugins will raise the standard of quality in plugins.
Jake’s comments considered, this feature doesn’t need to confuse Joe WordPress Publisher either – he never needs to see it. It can be hidden away for developers looking for it, in much the same way Joanne WordPress Publisher never sees http://core.trac.wordpress.org/
+1 on Brent’s point, it can all be hidden away under the Developer tab on WP.org. I think filling out that tab with more tools would be great.
Again – all for it. I just think the primary objective right now is separating the wheat from the chaff. Part 2 is how do we more actively improve the chaff (and the wheat, I guess).
Focusing on “separating the wheat from the chaff” is also more achievable in the short term.
I’ll start another thread for the idea of achieving this through a better search algorithm.
In separating the wheat from the chaff… who’s doing the gardening? By what rules do these gardeners operate? That’s the discussion I want to see.
You have free rein to argue it here, and I would *love* to do more hangouts when asked. So please, self-organize. Come up with a set of guidelines. Build them up. Revise them. When I see a team of self-organized folks ready to take on the challenge, then by all means, I will pass on advice, provide a location to organize, provide the needed powers to get the job done, etc.
Build that team of volunteers willing to get the job done, come up with a set of principles by which the job will be done, show that you’re capable of doing the job, and I will absolutely assist in any way possible.
Take charge. Get ‘er done.
Social Coding *is* for non-technical users. Other than a review team, this is the most direct way, I can think of, to improve quality of the plugins in the repos for *everyone*. Yes, only devs will participate in the coding, but that’s already how it is now.
@Eric – I agree but just to a point.
Social coding is for end users in that that’s who it generally benefits, but most end users don’t get those services. GitHub is built for developers by developers – trying to get the average company or Joe Publisher to figure out how to interact on there isn’t a good idea (this is coming from experience – even with newbie developers).
I think there should be two segments, so to speak: one for developers to communicate with developers (just like GitHub) and one for users to communicate with developers (like a forum of some sort). I’d even go as far as to say that the Stack Exchange model can detract a bit because a number of people are more concerned with getting their questions answered or bugs resolved than they are with obtaining reputation or badges.
If there are to be any plug-in reviews, or vetting of plug-ins, there first need to be guidelines in place to vet them against. I think these guidelines should be worked on now – since even if vetting isn’t put into practise yet, they will serve as the ‘go to’ place for authors wanting to do it right. They will also act as a good indication of what any future plugin review team would look at.
So while how vetting might / might not be done is still being thrashed out, should we start looking at the actual guidelines for plug-ins? This will be useful for authors now. If anyone has drawn up a rough draft list, then please post it. Otherwise I can start one up on Google Docs, similar to the UI guidelines being worked on: https://docs.google.com/document/d/1ZWPeUSFVYlMxClmHFjuAXuekXcZsLso49G3bDRquHcs/edit?pli=1
@seiharris a Google doc looks like a good way to get into more meaty content around guidelines.
@seiharris – that is a good point. Now that the core contributor handbook is done, I know that there is interest and intentions to create a plugin contributor handbook. Not sure (@nacin or @otto …) who is going to own it, but it makes a lot of sense that this should be a paired effort with the current conversation.
My apologies for not making it to the Hangout, I also had a schedule conflict. Had to drive my wife into work. Thanks for recording it to YouTube though, I’ll watch back over it.
Just noticed that you can’t see the text chat when re-watching. One of the questions was how do we measure and determine quality as a user.
metrics of quality:
base review (security, what it says it does, no errors / notices)
peer dev review (vetted by respected community members)
peer usability review (vetted by respected community members)
community usage (active installs, downloads, updates)
community rating (favorites, works, rating, reviews)
support level (# of threads answered)
anything else?
Which metrics to use for quality is a very important question.
Each of those sounds great, except for community usage. My most popular plugin is by far my worst code. Many of the most popular plugins were also written many years ago, before many important WP APIs and before the WP Coding Standards.
Usage is a better measure of demand than quality.
The only thing I would add is “WP Coding Standards” to the base review.
I’m a bit behind in chiming in on this and I’m doing my best to catch up so hopefully these thoughts will all be captured in the right thread. If not, my pre-emptive apologies.
I like @Shane’s initial metrics of quality – I’d just like to add that there needs to be a line drawn between the technical areas that we’re discussing (GitHub-esque and ala Stack Exchange) and user-centered features.
First, I think that *all* developers should be using the Developer plugin. I’m not sure of a good reason why a user shouldn’t be using this.
Secondly, I think that the criteria that are built into many of those sub-plugins (Log Deprecated Notices, Debug, etc.) provide a solid foundation of basic rules to follow. I think that plugins – before being released to the repository – should be evaluated against the rules of those plugins. If a plugin passes, perhaps that’s good enough; otherwise, it isn’t approved.
Finally, I’m a fan of developers providing PHPDoc-based comments of their functions and variables. This definitely plays more into the developer-centric features, but I think that if we’re at all going to be hacking each other’s plugins, we need to make sure we understand what one another’s functions are doing.
My thinking is that something that could be implemented and experimented with in the short-term is a way to prioritize search results by the number of the plugin author’s core contributions. If their code is trusted enough for WP core – to me, that’s a massive indicator of being able to trust the code.
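As a rough illustration of that kind of prioritization (a Python sketch, not the actual WordPress.org search code): take results that are already in relevance order and float authors with more core contributions to the top. The result fields, the sample data, and the function name are hypothetical.

```python
from typing import Dict, List


def prioritize_by_core_contributions(results: List[Dict]) -> List[Dict]:
    """Reorder search results so authors with more core contributions come first.

    `results` is assumed to already be in relevance order. Python's sort is
    stable, so plugins whose authors have the same contribution count keep
    their original relative (relevance) order.
    """
    return sorted(
        results,
        key=lambda result: result["author_core_contributions"],
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up search results in relevance order.
    results = [
        {"slug": "calendar-one", "author_core_contributions": 0},
        {"slug": "calendar-two", "author_core_contributions": 12},
        {"slug": "calendar-three", "author_core_contributions": 0},
    ]
    for result in prioritize_by_core_contributions(results):
        print(result["slug"])
```

A hard sort like this is the bluntest version of the idea; in practice the contribution count would more likely be one factor blended into the relevance score.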
Agreed. This is exactly the kind of change we need – and this information is already being captured. In fact, we might just encourage more core contributions if they influence plug-in popularity.
What I think we need to figure out, however, is how to also elevate plug-ins those leading contributors / community members have vetted, but haven’t contributed to.
This goes back to – I think – the best idea in our conversation: users should be able to favorite WordPress.org users, and view plug-ins favorited by their favorite users. Think what “Facebook Like” does for websites, for plug-ins… a feed / timeline of activity by favorite users. “Nacin just favorited ________ plug-in.”
This would be really cool. And I think BuddyPress has this functionality? Only thing I’d like to argue again is for the difference between “Fav” and “recommends”. Actually I think Fav is a weird concept. It’s much clearer to have “bookmark” and “recommend”, or something like that.
I agree that “Recommended” might be a better term. Though I’d like to think elite developers wouldn’t favorite poorly engineered plug-ins.
+1 for prioritizing search results based on code. As mentioned in this comment, I think adherence to the basic Developer Plugin rules would be one place to start.
I think we need to be careful with a plugin author’s core contributions. Case in point: I love WordPress and I’ve been trying to actively contribute more to core *but* I’ve had some patches merged, some dismissed, and some I just don’t have time for because of outstanding work (all WordPress-based – themes, plugins, etc).
Perhaps a badge indicating that the author is a Core Contributor would be a way to help enforce this?
So glad you guys have video of this, sorry I couldn’t make it – had some family stuff that needed doing.
What sort of ranking algorithm is used for plugin search atm?
Many of the suggestions from the video aimed at “separating the wheat from the chaff” could be achieved through a better search algorithm.
IMHO as a WP user that’s become sophisticated over time – low hanging fruit – enhance the plugin repository with better info, sorting, tagging. Social coding is a great idea but the “ownership” of the developer needs to be considered. Don’t cater for the lowest common denominator, instead try a rating for user level to work with a plugin. Better RSS feeds of plugin releases. **Most of my ideas are structural and cost little and require no maintenance.** One more controversial idea – add advertising to the repository to pay for the enhancements.
I’m slowly pulling the Tweets together. If I missed any, send the URLs and I’ll add them.
Thinking there's a market for trusted #WordPress pro devs to build a repo / "app store" experience for independently vetted plug-ins…—
Jake Goldman (@jakemgold) August 15, 2012
@jakemgold someone is already working on that.—
Norcross (@norcross) August 15, 2012
@norcross curious to know who. chops of said reviewers crucial to my trust of said repo. are they core dev quality?—
Jake Goldman (@jakemgold) August 15, 2012
@jakemgold nvmd, I was thinking of the WP App Store. what do you think this repo would do?—
Norcross (@norcross) August 15, 2012
@jakemgold I've been on a plugin building kick lately, in part because I can't find what I'm looking for more often than not.—
Norcross (@norcross) August 15, 2012
@norcross review & grade notable plugins, be them free or $. 10k+ official repo has lost credibility as trusted source; this is a problem—
Jake Goldman (@jakemgold) August 15, 2012
@ryancduff @Otto42 @jakemgold @norcross Same here, and my first releases were total crap. But I learned and now I know. Needs to be open.—
Erick Hitter (@ethitter) August 15, 2012
@Otto42 @ryancduff @ethitter @norcross totally agree. my 1st plugin blew. hence my sense theres incompatible mission vs what I'm picturing—
Jake Goldman (@jakemgold) August 15, 2012
@jakemgold @ethitter there's no rhyme to the featured list. It's largely what I think is cool. I'm the only one who has updated it in a year—
Samuel Wood (Otto) (@Otto42) August 15, 2012
@Otto42 @jakemgold @ethitter @norcross Didn’t we all? Just need to highlight cream of the crop. How to do and not discriminate?—
Ryan Duff (@ryancduff) August 15, 2012
@ryancduff @jakemgold @ethitter @norcross true, but we also don't want to discourage new devs. I started out by writing plugins too.—
Samuel Wood (Otto) (@Otto42) August 15, 2012
@Otto42 @ethitter maybe I'm off on the bias, but I find the "featured" list "curious" sometimes—
Jake Goldman (@jakemgold) August 15, 2012
@ryancduff @jakemgold @Otto42 @norcross Totally get that. Not suggesting totally closed, just a way to denote who's _doing_it_right().—
Erick Hitter (@ethitter) August 15, 2012
@jakemgold @ethitter @Otto42 @norcross It’s the “here’s something I created, maybe it’ll help you too” aspect. Not the best, but fosters—
Ryan Duff (@ryancduff) August 15, 2012
@jakemgold @Otto42 No Automattic bias here, I've been saying this for years. Realized it with what @ra8 had to do to get a module listed.—
Erick Hitter (@ethitter) August 15, 2012
@ethitter @Otto42 @norcross @ryancduff problem is official repo no longer trusted by those in know; probably shouldn't be by those not!—
Jake Goldman (@jakemgold) August 15, 2012
@jakemgold @ethitter not sure what bias you refer to. Only one of the usual reviewers works for Automattic.—
Samuel Wood (Otto) (@Otto42) August 15, 2012
@norcross no offense taken, it's just a big job, with few highly qualified volunteers.—
Samuel Wood (Otto) (@Otto42) August 15, 2012
@ethitter @Otto42 @jakemgold @norcross Agree on that. Unity is what I’d prefer, but .org has a stigma of sorts. Trying to be fixed.—
Ryan Duff (@ryancduff) August 15, 2012
@Otto42 @jakemgold @norcross @ryancduff Definitely requires an investment on someone's part, but I think it's a strength Drupal modules have—
Erick Hitter (@ethitter) August 15, 2012
@Otto42 trust me, I don't want to seem as though I'm accusing of anything.—
Norcross (@norcross) August 15, 2012
@Otto42 @jakemgold @norcross @ryancduff way of indicating such. Maintain one canonical, trusted source for plugins and updates.—
Erick Hitter (@ethitter) August 15, 2012
@norcross there's like 5 people doing it in their spare time. We mostly confine ourselves to security and spam for simple time reasons.—
Samuel Wood (Otto) (@Otto42) August 15, 2012
@Otto42 @jakemgold @norcross @ryancduff I'd honestly rather see a vetting/review process in the .org repo than something new, with a flag or—
Erick Hitter (@ethitter) August 15, 2012
@Otto42 and nothing indicating that it's even (a) been reviewed or (b) current—
Norcross (@norcross) August 15, 2012
@jakemgold @norcross @ryancduff have you considered emailing plugins@ with some reviews, or suggestions? Just a thought.—
Samuel Wood (Otto) (@Otto42) August 15, 2012
@ryancduff @otto42 @jakemgold @ethitter @norcross or a list of "community voted 'good' plugins"—
Pippinsplugins (@pippinsplugins) August 15, 2012
@pippinsplugins @ryancduff @otto42 @jakemgold @ethitter @norcross @justlikeair #wpchat like I said, expand and encourage ratings/reviews
Ben Lobaugh (@benlobaugh) August 15, 2012
@benlobaugh @pippinsplugins @ryancduff @Otto42 @ethitter @norcross @justlikeair STEPPING BACK FROM MY FIRESTORM TO EAT! will be back
Jake Goldman (@jakemgold) August 15, 2012
@benlobaugh @pippinsplugins @ryancduff @otto42 @jakemgold @ethitter @norcross showing #active rather than #downloaded might be a good start—
Shane Pearlman (@justlikeair) August 15, 2012
@justlikeair @pippinsplugins @ryancduff @otto42 @jakemgold @ethitter @norcross that would be really cool. Add a ping to WP core for that
Ben Lobaugh (@benlobaugh) August 15, 2012
@jakemgold @pippinsplugins @ryancduff @Otto42 @ethitter @norcross how do we decide who is the expert?—
Ben Lobaugh (@benlobaugh) August 15, 2012
@jakemgold @pippinsplugins @Otto42 @ethitter @norcross Best of the best, cream of the crop… best calendar plugin? this one here. etc…—
Ryan Duff (@ryancduff) August 15, 2012
@ryancduff @jakemgold @otto42 @ethitter @norcross maybe not community, but "vetted by pro developers"—
Pippinsplugins (@pippinsplugins) August 15, 2012
+1 @jakemgold @pippinsplugins @ryancduff @Otto42 @ethitter @norcross—
(@codyL) August 15, 2012
@otto42 Shouldn’t this conversation be on http://make.wordpress.org/plugins/ instead?
I didn’t create this site, somebody else did. I just posted it here because it was already here.
Consider the conversation today to be preliminary, involving most of the people who were on twitter last night. It got too fast to keep up with, so I figured a live chat would work better. I plan to mostly listen anyway. Conversation can grow from whatever comes of this.
Note that I do expect that whatever happens to affect make/plugins. That would be a good place for a plugin review team to be based.
Daniel Bachhuber 7:28 pm on October 29, 2012
Neat
Can you add a “Subscribe via Email” to the right rail of that new site?
mikeschinkel 9:20 pm on October 29, 2012
Ditto what Daniel said.