As many of you know, I have officially ended my contributions to Mozilla for the time being. Since I published that post, several people have asked me why I left, and how I think Mozilla could change to be more friendly to Triage. So, I thought I would write a blog post to try to explain a few things, and just maybe spark a bit of discussion on why Triage needs to be improved.
Update: Please read this post in conjunction with this one.
First off, I did not leave because of rapid release. I love the idea of rapid release, and I think the recent spurt of posts to the planet on how Rapid Release will be beneficial in the long run does a great job of explaining it. Rapid Release is going to be awesome if done properly. I have always been so frustrated by the continual late releases that hold back awesome new features from the web. I think Rapid Release should have been planned out better before it was put into effect: a plan for the Enterprise should have been figured out, the update process should have been made smoother BEFORE jumping into it, and the change could have been explained better. But I think that excellent progress is being made on all these fronts, and hopefully by the end of the year, many of the quirks of the system will be ironed out. I wish it had been done earlier, but that is neither here nor there.
Nor did I leave because of a recent conversation concerning the removal of version numbers; at least, it was not the main reason I left. I feel that the whole conversation could have been handled much better on both sides of the board. (For those who were calling for murders over a SOFTWARE change, you seriously need to get a life. Put down the computer, go camping for a weekend, then come back when you can act reasonably. Even though I firmly believe in Mozilla’s mission, software is not important enough to justify threatening people’s lives and, frankly, acting like a spoiled 2-year-old.) But it was not the determining factor in my decision. It would be silly to leave a project that I have contributed to for 3+ years over a misunderstanding.
Why I Left
I left because of a general lack of interest in doing anything substantial to improve the Triage process on BMO outside the QA community and a few others. Triage as we know it today is NOT ready to handle the Rapid Release process. While Releng, Devs, QA, SUMO, etc. have been able to transition to the new process relatively smoothly, Triage was caught off guard. And not through a lack of trying. I have pushed for change, and I know others have as well. Change has been made. But it is not enough. I understand that change takes time, and there is always a delay between planning a change, and the implementation. But with Triage, time is our enemy. We currently have 2598 UNCO bugs in Firefox that haven’t been touched in 150 days. That is almost 2600 bugs that have not been touched since Firefox 4 was released. And how many more bugs have been touched, but not really triaged or worked on? Every day this number grows. 27 UNCO bugs have been filed in Firefox alone in the last 24 hours, and that over a weekend.
Triage is Broken. Period.
With the old model of releasing a new major version once a year, triage had a bit more time to go through a massive pile of bugs and find regressions and issues. There was a pretty good chance that most bugs would get caught, just because we had time on our side: we could afford to miss a bug for 6 weeks, because we would most likely get around to it. Or so went the theory. Even this process failed us.
Organization, Community, Politeness and Professionalism
In Spring 2010, we hit roughly 13,000 UNCO bugs in the Firefox product on BMO. 13,000!!! We currently have 5934. While this is an improvement, that is still roughly 6000 bugs in Firefox whose fixes could be shipping today, and enhancements that could be making the web better (of course it isn’t that high, but the potential is there). This is several thousand contributors that we have told “Thank you for filing a bug report with us. We don’t really care about it, and we are going to let it sit for 6 months and just ask you to retest when you know it isn’t fixed, but thank you anyway. Oh, and Mozilla is run by the community.” Even though nobody means this, that is what we tell an end-user who just submitted their first bug and is ignored. We can’t do this. We just can’t. Now, I realize that it is impossible to touch every bug immediately, but, to those who have gone the extra mile to create a BMO account, fill out the bug report form, and navigate BMO emails just to report something they feel is important, don’t we owe at least a little common courtesy?
Perhaps, if a bug isn’t replied to immediately, an automated email could be sent to say “hey, we haven’t forgotten you, we appreciate the bug report. All the Triagers are busy, but here are some steps you can take to test if this bug goes away:” or something along those lines. Building a friendly relationship with our community not only gives them confidence that their report will be taken care of, but it also helps them want to come back.
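To make the idea a bit more concrete, here is a minimal sketch in Python of how such a reminder could be chosen and worded. Everything in it is an assumption for illustration: the `BugReport` record, its field names, and the `max_wait` cutoff are hypothetical and do not reflect BMO's actual data model or any existing Bugzilla feature.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BugReport:
    # Hypothetical record; these fields are illustrative, not BMO's schema.
    bug_id: int
    reporter: str
    filed_at: datetime
    first_reply_at: Optional[datetime] = None  # None means nobody has replied


def needs_acknowledgement(bugs: List[BugReport], now: datetime,
                          max_wait: timedelta) -> List[BugReport]:
    """Return bugs that have waited longer than max_wait without any reply."""
    return [b for b in bugs
            if b.first_reply_at is None and now - b.filed_at > max_wait]


def acknowledgement_message(bug: BugReport) -> str:
    """Build the friendly 'we have not forgotten you' note for one reporter."""
    return (
        f"Hi {bug.reporter}, thanks for filing bug {bug.bug_id}. "
        "We haven't forgotten you. All the triagers are busy right now, "
        "but we appreciate the report and will get to it."
    )
```

A scheduled job could call `needs_acknowledgement` once a day and send `acknowledgement_message` to each reporter it returns. How long to wait before sending is a judgment call for the triage community to make.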
We also need more coordination as a community so we can touch all these bugs in an effective and professional manner. That means better tools for the Triage community to find bugs that haven’t been replied to within a certain time period, and better communication between developers and triagers, and between triagers themselves. I want a developer to be comfortable asking the triage group “I have this list of 200 bugs I would like triaged in this way. Thanks”, and for it to be done, without a bug day. And I want developers who, when they are pinged on a bug by a triager, reply and assist with the process as needed. Along these lines, THIS IS AWESOME! More of this, lots lots lots more please! I also want the triage group to have an easy way to be able to say “Hey, I noticed that Bug XXXXXX seems to have lots of dupes, let’s keep an eye on it.” Or “I think it is about time that we go through and clean up some old bugs.” With multiple eyes looking at the same list, we can help keep bug numbers down.
I realize that triagers are volunteers, and thus may not be held to the same standards as a Mozilla employee. But I see no reason that they shouldn’t be. As a Triager, I have a unique opportunity. I get to be the person that most people will associate with Mozilla, at least at first. This is incredible, and terrifying. Everything I do, and how I treat people’s bugs, is going to influence their perception of Mozilla for a long time. Giving one-sentence replies and being rude and thin-skinned gets you nowhere. I have learned this the hard way many times. We need to have a set of guidelines for how to treat specific bugs, and how to treat their reporters, and somehow ensure that Triagers are following these guidelines. While there are spammers, and people who are just trying to cause problems, the vast majority of BMO bug creators are just trying to make the software they use a little bit better.
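As a rough illustration of the “lots of dupes” watch list, the sketch below counts how many reports have been marked as duplicates of each bug and surfaces the ones above a cutoff. The input format (pairs of duplicate id and original id), the function name `dupe_watch_list`, and the default `threshold` are all assumptions made up for this example; this is not a BMO query or an existing tool.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def dupe_watch_list(duplicates: Iterable[Tuple[int, int]],
                    threshold: int = 3) -> Dict[int, int]:
    """Map each original bug id to its duplicate count, if count >= threshold.

    `duplicates` holds (duplicate_bug_id, original_bug_id) pairs, a
    hypothetical export format chosen for this sketch.
    """
    counts = Counter(original for _dupe, original in duplicates)
    return {bug: n for bug, n in counts.items() if n >= threshold}
```

A shared list like this, refreshed regularly, is one way multiple triagers could keep an eye on the same heavily duplicated bugs.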
We need the Right Tool
Even if we create all the best tools in the world, if we are disorganized, it isn’t going to do any good. We might have 5000 UNCO bugs instead of 6000, but that isn’t an improvement. On the other hand, we can be organized, but if we don’t have good tools, we are going to be trying to cut down a tree with a pocket knife.
Right now, there is no real way to triage except “Here is a list of 1700 bugs. Start at the top and work your way down.” We need a way to mark bugs that need triage (Like Getting Bugs Done and Triage Flags). But we also need to remember, BMO is being used for things it was never created for. BMO is either an end-user bug submission forum, or a developer bug submission database. It can’t be both, but that is what it is trying to do. It worked when it was small, it kinda worked when we took a year or two to release a new version, but it doesn’t work with Rapid Release.
This is why I have envisioned a separate BMO product of UNCO bugs. That way, we can separate End-user bug submission from the development process, at least in the beginning stage. We can make a simplified interface for those who are just starting out with Mozilla. We can add flags to bugs, that while useless for developers, are incredibly useful for Triagers. We can have a different set of rules for the UNCO product than we do for the Developers. While there is no perfect answer, while we can never make everyone happy, we can do what is better for the community as a whole.
I also strongly feel that Mozilla needs at least 1-3 people to just handle Triage in an official, full-time capacity. Pretty much all the community Triagers I know have jobs, families, lives outside of Mozilla. Mozilla is just one of their hobbies. While they are passionate about what they do, they don’t have the time it takes to triage full-time. And with Rapid Release, this is what we need. Plus, a Mozilla Employee carries a lot more weight in getting changes made, organizing and revitalizing the community, and helping continually improve Triage.
I know this was a long and rambling blog post, and I covered a lot of things with little cohesion. I felt it was important to get my thoughts out in one final attempt to maybe get some traction going on fixing Triage for the good of Mozilla. Without long-term changes, I feel that triage is going to be facing a larger and larger pile of bugs that doesn’t go down, and eventually may just go away out of frustration. And this has nothing to do with the community. When I see what triagers have to go through to do a good job, and they still stick with it, I know that if Triage goes away, it won’t be the fault of the community. In fact, the community is the only thing keeping it together. I hope to see some sort of improvement in Triage, and when I do, I will jump right back in. But as it is, I see no use in contributing to a project that seems to have gained little attention from Mozilla. No, there is no glamor in Triage. But neither is there any glamor in 20,000 UNCO bugs.
I am not looking for zero bugs, I am not even looking for fewer than a thousand. I am looking for timely responses to all or nearly all bugs reported, and better tools and organization in the triage community.
I should also say, for the past two weeks I have been in discussions with several Mozilla Employees and Contributors on how Triage can be improved. I simply wanted to be fair to those in the community that were not aware of all my concerns with Triage, and perhaps to get more feedback. So, this is my in-depth explanation of last week’s post.
(These are my personal opinions, and are not meant in any way to attack Mozilla or any Employee of Mozilla or Contributor. It is simply my frustrations, and personal opinions of what can be done to improve triage. I am not saying my ideas are perfect, but I want them more to serve as a catalyst of HOW to improve triage for the long-term. Please, feel free to contact me for further discussions.)