Why we Need to Radically Change Bugzilla, Part 3

Well, this is the meat of my series, where I stop talking about hypothetical solutions, and start to think about some real, achievable fixes. The main thing I will focus this post on is the “Staging Area”, and the most practical way to handle it. I may not cover all the bases, so please, in the comments, point out edge-cases I have missed and I’ll do my best to address them.

A New BMO Product

So far, of all the possible ways to create a staging area I have thought of, a new BMO product seems the easiest, most cost-effective, and relatively easy to implement. While there are both pros and cons, I think that we can create a very effective staging area with a new BMO product.

The Vision

A Product called “Un-triaged Bugs”.
Components inside the product for every individual product that needs triaging (Firefox, Thunderbird, Seamonkey, Etc.)

A reporter without CANCONFIRM privs files a bug. They either go to BMO, or to the Staging Area Front-End (issues.mozilla.org for example, different UI, ties into the same backend). Using a redesigned Guided Bug entry form, they file their issue. Along the way they are asked if their issue has already been filed, or encouraged to go to SUMO. If they decide to file an issue, then it is automatically sent to the Un-Triaged Bugs Product.

In this product, all Bugs are reported as UNCO, all bugs are very much simplified. The Reporter is only exposed to fields they need to see. Blocking flags are gone, CC list is gone. Whiteboard, milestone, priority, product, component, importance, blocks and depends on are all hidden from the user. Users with CANCONFIRM can see the fields, users with EDITBUGS can edit all the fields. But to the first time bug reporter, a much simpler, easier to read, and friendlier bug page is used. All those fields are not likely to be needed by a first-time bug reporter, and only serve to confuse them (of course, this can be turned on and off in the preferences).

Then, Triage jumps on the bug. They know that nearly all bugs sent to the staging area are end-user bugs, so this makes their life much easier. Bugs that are actually support questions are sent to SUMO. Bugs that need to be directed to an extension developer are sent to AMO in a similar way. Feedback is sent to Hendrix. After sorting through the bugs, then Triage digs into the “try with a fresh profile” “try with a recent Firefox version” “update flash” “give a stacktrace/testcase” questions.

Once enough information has been gathered, then the bug is morphed into the proper product and component. Example, Bug XXXXXX is filed against Firefox, and concerns the location bar. After triage gathers information, they move the bug to Firefox:Location Bar. At this point, the bug is marked as NEW, and developers can jump on it and begin work. The developers can make decisions such as WONTFIX, WFM, INVALID, or fix the bug.

Pros

  • Unconfirmed bugs and bugs that are not bugs are totally separated from the bugs that are really valid. This eases the tension on BMO between devs and end-users
  • Eased workload on triage. Instead of having to separate developer bugs from end-user bugs, Triagers know all bugs in the Un-Triaged product need attention, while those in specific products have already been triaged and are ready for a dev.
  • End-users no longer have to try to figure out products or components. They select the product on the guided bug entry, and triage takes care of the rest.
  • BMO is no longer balancing end-user bug reporting and development in the same place. It is separated out without too much additional work

Cons

  • Even with improvements, Bugzilla is still just not very pretty for end-users. Perhaps this post can help that somewhat.
  • This will require significant changes in how triagers work, instead of being spread out, everything is dumped into one place. This will require changes and more efficiency or this pile will get out of control.

In my next post I plan on discussing what needs to be done to make this work more in detail, as well as specifics of how this component would work.

Why we need to Radically Change Bugzilla, Part 2.1

Just a quick post to clear up something that I’ve noticed in the comments. I am not proposing that BMO be any more locked down than it is. I don’t want to make it less open to the public, and I feel that keeping BMO open is what will help drive Mozilla forward. That is why I am proposing this change. So then end-users who have legitimate bugs can have them put right in BMO, while those who just need support and help can get it without feeling that they are being brushed off. By treating EVERY report as important, we can go far in helping increase the goodwill of end-users towards Mozilla.

Also, the Staging area will not be required. I envision it as something like the Guided bug reporting form. Enabled by default for new users, but can be turned on and off at will. If someone who has been using BMO for years wants to enable/disable it, that is totally possible. Bugs can also be moved in and out of the staging area. For instance:

  1. Issue is reported to staging area. At first glance it is a support issue, triage kicks it to SUMO.
  2. Sumo works with the reporter and determines there is an underlying bug in Firefox that is causing the issue. SUMO moves the bug back into Staging.
  3. Triage determines the proper product and component to move the issue into, and does so.
  4. Developers in BMO Fix the bug.

In that case the bug went from Staging to SUMO, from SUMO to staging, and then from STAGING to BMO. If the SUMO people knew what component the bug could go into, they could have moved it directly there as well, skipping step 3.

Why we need to Radically Change Bugzilla, Part 2

First, let me clarify a few things. By “End-User Bugzilla” and “Developer Bugzilla” I didn’t mean two totally separate Bugzilla installs. While that could be made to work, there are much easier ways to go about it that are much better. I will continue to call it “End-User Bugzilla” for lack of a better term however. The End-user Bugzilla could take the form of a product on BMO, a separate database on another website, a whole range of ideas, I’ll lay those out in a future post.

Current Bugzilla

While slightly out of date, this chart does give the basic idea of how every bug in Bugzilla goes through life. Developer bugs miss the UNCO state usually, but all end-user bugs follow this pretty much from beginning to end. In theory. In reality, end-user bugs sit in one of two states for years. UNCO and NEW. Very few become ASSIGNED or FIXED. The majority end up becoming INVALID, DUPLICATE or INCOMPLETE. Reducing that number, and raising the number of FIXED bugs is the challenge here.

Proposed Bug Process

For Developers, the bug process remains largely the same, a bug is reported, assigned (or resolved WONTFIX, etc.), then FIXED. For End-Users, the bug is put into a staging area. This staging area is where triage, QA and Support can all work together to improve bug reporting for out end-users. A user reports an issue (ticket, bug, whatever we want to call it). Triage jumps on it, ensures that the issue has all the information it needs, filters out spam, and if it is ready, sends it off to the proper “department”. Support questions go to SUMO, Extension bugs are tied into AMO somehow (feedback on an extension perhaps?), and legitimate bugs are sent to BMO, where they are then put into the proper product and component (this is all a one-step process, much like changing products on BMO today). Issues where the reporter never gets back are marked INCO within a certain period of time.

Why this is Better

For Developers, the excess wading through bugs submitted by end-users that may or may not be real bugs is avoided. BMO is devoted to only real bugs, and triage doesn’t have to worry about tripping over a developer and visa versa.

For End-users, this ensure more personal attention. Instead of having their bug ignored, marked INVALID (how uncaring does INVALID sound? It is a valid issue to them, or they wouldn’t have raised it) or some other confusing resolution. Instead of simply being brushed off (which, no matter how we mean it, it appears that way to the general public), users will now get their bug moved straight into the proper place. If they need support, they are happy because boom, there it is. If they did find a bug, it can be an entry into more bug reporting in the future because we move it straight into BMO. If it is another issue like an extension, then the information can be moved where it needs to be. Perhaps even feedback can be taken, then submitted to Hendrix as higher quality information.

A Clearing House

So in essence I envision the “End-User Bugzilla” as a central place for input and feedback from users. Taking the loose ends and pieces of Mozilla’s carious portals, and tying them all into an easy place for end-users to get help, report issues, and make a difference.

Why we need to Radically Change Bugzilla, Part 1

Since April 7th, 1998, bugzilla.mozilla.org has served Mozilla. Over 657,000 Bugs have been filed in the multiple products and components covered in the vast database. Over these 13+ years, Bugzilla’s users have been divided into two groups.

  1. End-users: Those people who have found what they think is a bug in a Mozilla product. These may or may not be legitimate bugs in the product and the likelihood that the original reporter will ever reply on their bug is unfortunately low. However, many very real bugs are reported by general end-users, this is a very very important function of BMO.
  2. Developers: Mozilla employees, contributors, etc. Those who actually are working on the Code of Mozilla products. They submit patches, they review changes, etc. They use Bugzilla for both reporting bugs, and for reviewing new changes and features to a product.

These two groups are both essential to the Mozilla ecosystem. Without end-users reporting bugs, software quality would tank, because lets face it, there is simply no possibility that Mozilla’s QA team can catch every bug. The more eyes on the software, the better. Without developers, well, there would be very little change going on in Mozilla.

These two groups use BMO in very different ways. That is why we have the UNCO and NEW bug resolutions. Developers and established community members can file a bug as NEW without having to go through the confirmation process. End-users submitting a first-time bug need to have the bug double-checked and confirmed by another person before it changes state. Developers usually know right where they want their bug to be, end-users often need help navigating Bugzilla’s complex interface. Triagers exist to help end-users, we hardly need to do anything to help developers.

While there have been attempts to improve Bugzilla, the main problem is, Bugzilla is broken. Yes, it is busted, broke, shot, however you want to say it. Now, before you start hyperventilating until your pocket-protector breaks, let me clarify. Bugzilla is broken for end-users. For developers it is an extremely powerful tool. A painfully complicated tool from time to time, but it is a wonderful tool that will continue to serve Mozilla for many years. The recent upgrade to Bugzilla 4.0 makes it much easier to use and quite a bit better. However, for end-users it is a broken tool that only serves to confuse, intimidate and create bad feelings ( I won’t give specific example to protect the innocent, but I’m sure you’ve all seen this).

I think that the best overall solution to this problem is two Bugzillas. One to be dedicated to Developers, the other for End-Users. Yes, this is a very very complicated solution. However, I don’t believe it is possible to “fix” bugzilla in its current state. It is being used for things it wasn’t ever meant to be used for and right now is juggling being an end-user support site and a developer bug database. This brings quite a few benefits to end-users, as well as triagers, developers and QA. While it may seem like too large of a solution to the problem, at this point I think we are standing between major changes, or seeing BMO become unusable and resulting decrease in software quality. I’ll flesh this idea out more over the next week, so please bear with me and tolerate my insanity. I think we still have a few more years of use out of bugzilla, but we do need to seriously change how we handle bugs.

Firefox Bug Versions

As of tonight, all UNCO Firefox:General bugs have been marked with the correct version field. There are approx 50 bugs that did not have the original Build ID, so they are still marked as Unspecified. I was able to clear out around ~1800-2000 bugs that had the Build ID, just had never been marked against the proper version. These bugs have been spread from Firefox 1 to Firefox 5 and trunk. Hopefully this makes it easier for us in the future to triage EOL bugs and ensure that the bugs we are working on are against recent Firefox versions.

Unfortunately, this won’t last too long. While BMO does auto detect platform and OS when a new Firefox bug is filed, it fails to detect versions. Until Bug 577561 is fixed, the Firefox bug pile will continue to grow with bugs that are incorrectly marked unspecified. This makes it difficult to update bugs that are filed against versions of Firefox that are EOL. We have to go off of dates, not versions, which is much less accurate, as on any given date we can have several version of Firefox floating around.

Rapid Release: How to make it work

(apologies for the break in blogging. Taking another stab at ye ole’ blog)

With 103,000,000+ downloads and counting, Firefox 4 has been shipped to the world, hopefully providing a noticeable improvement in their web browsing experience. Speed, interface improvements, and a multitude of fixes and improvements, users should at least notice something different with their browser. They should also notice something different in how their browser is released in the future. Goodbye to major versions with massive changes spanning a year-plus of development. Rapid Release is how Mozilla plans to get fixes and enhancements out to users more quickly and efficiently.  This plan will either succeed mightily, getting rid of the release delays and slow development process of the past, or brilliantly self-destruct and result in little progress and multiple, out of date versions.

I see two things that need to be done to ensure that Rapid Release is at least viable. This is a very very broad over-simplified version, and ignores the nuances and specifics of releasing a new browser. You may also ask “what does this have to do with Triage?”. Well, it has alot to do with triage. Roughly 8000 of the Open bugs in firefox are with versions of Firefox other than Firefox 4 (there is some room for error in this number, due to bugs being submitted with the wrong version number, etc.). This isn’t bad per-say, but Triagers have to approach different versions differently. Someone reporting a bug using Firefox 3.5 will most likely be asked to attempt to reproduce with Firefox 4 for example. So version numbers and development cycles mean alot to a triager.

Under-Promise / Over-Deliver

The worst thing that can start happening to Rapid Release cycles is that developer A comes out, says “Look at my cool widget prototype. It shines and has buttons. It will be released with Firefox 6″. News Outlet B begins to spread the word about this wonderful, new, revolutionary and magical feature. Time to release Firefox 6 comes around, Developer A hasn’t had enough time to finish his widget, the release drivers are forced to either drop the feature, or delay the release. Either way they get burned. Delay the release, have every news outlet on the internet say how Firefox is falling behind, or drop the feature, and hear the same thing.

This whole scenario could have been avoided with a little prior thinking. Problems come up, issues can’t always be foreseen, and development gets slowed down. With Rapid Release, unless a feature is 99% certain to be finished and working on time, nothing should ever be promised for a certain version. I think that so far, the Mozilla team has done a good job of qualifying goals with “If it is ready” and not promising by a certain ship date. It is a better thing to surprise people with more features than none.

Make it Noticable

User Susan starts her computer, opens Firefox, begins her morning surfing. Susan has been using Firefox since the old 2.0 days, when an update comes across, she is used to major changes. She gets a message saying “Firefox 5 is released, update now”. Thinking to herself how quickly this new version has been released, she updates. After restarting, she looks for changes, but there are none. Wondering if the update even worked, she goes and checks her version. Yep, it shows Firefox 5. Puzzled, Susan decides these must be marketing hype and doesn’t update next time Firefox warns her to.

While we can’t let Firefox releases be delayed for a new feature, we also should not let a new Firefox be released with no changes at all. This I think will be one of the biggest pitfalls of rapid release. Each developer has their projects they are working on, but when the time rolls around to release, none of them are ready. sure a few things have slipped in, bug fixes, minor changes and tweaks, but nothing substantial. Too many releases like that and you might as well not release at all, you are wasting user’s bandwidth. On a somewhat serious note, if there are no major changes, what is there for the news media to write about?

Unfortunately, until we make version numbers totally irrelevant, we are going to have to make sure to include major changes in every release of Firefox. I do see in the near future, relegating versions numbers to something only developers care about. With 100% silent updates, pushed through to the user quickly, I think then we can start to relax a little bit in how releases are made. I think Chrome has come pretty close to being able to safely ignore Version numbers. Eventually they will be a thing of the past, only needed for developers to differentiate between releases. But for at least the next few releases, Firefox is still going to be relying on version 5, 6, 7 etc.

So Mozilla, I think that Rapid Release could be a good thing, or a terrible thing. I think the world is ready to be surprised though, the ball is in our court.

End-User Bugs on Bugzilla

Mozilla offers an opportunity normally not available to users of most software. The ability to submit bugs directly to the developers of the software and watch the progression of the bug. With this comes a responsibility normally not faced by software companies. Because users can report these bugs, it means that unless they are looked at, the reporter and other user’s experiencing the issue quickly become frustrated that they are being ignored. The whole benefit of having end-users be able to watch the bug process can bite Mozilla in the tail. It frequently seems that the trend in Bugzilla is to divide bugs into quickly fixed and never fixed.

Bug Species

Bugs reported by regular end-users fall into one of two species.

  1. Not a real Firefox bug. Extension issues, spam bugs, support bugs, etc. These bugs are usually directed to either SUMO or AMO to talk to those who can properly assist.
  2. A real Firefox Bug. These are then split down further.
  • A bug that is already reported, is on a developer’s radar, etc. These bugs are typically resolved relatively quickly. Either as DUPE, FIXED, etc.
  • A bug that has not been reported before, or is not affecting a developer’s “pet project”. These bugs are typically ignored and shoved aside. They may be fixed in later years when the component they are filed against is rewritten, or some developer stumbles against the bug later on and fixes it. These bugs may be edge-case bugs, or really minor issues, but they are still bugs.

Now I realize developers have to have priorities. P1, P2, etc. I’m not expecting every bug to be treated the same. not all bugs were created equals, and there is no way to fix every bug within the first week it is filed. And fixing bugs isn’t very glamorous work. It looks a lot better with a new release to say “Redesigned component X with New Features A, B and C” compared to “Fixed 20 bugs in print preview”.

Scenario

But which makes a greater impact on our users? Ensuring that our users have a solid product with minimal bugs, or giving them all the bling that “the Other Browser” has. I love new Features. I think all the hard work in Firefox 4 like the redesigned Addons Manager, Theme, etc are great and help make a competitive product. But, running through triage I see so many bugs marked as NEW, and then never looked at again (As I commented on in another post). This is disheartening to users. Imagine the scenario:

John is IT tech at a local community bank. He finally convinced the IT Dept. Head to let him run a test run of Firefox on all the bank’s computers. Everything seems to be running great. Until, a few days later, John starts to notice complaints that their proprietary web-based plugin site does not allow the use of navigation keys when the plugin form has focus. This is a major issue for the bank because the tellers are trained to only use the keyboard for sake of speed. Some terminals don’t even have mice, meaning they essentially become trapped once focus is captured by the plugin. With no easy answer to the issue and no patch forthcoming in over 10 years, John is forced to switch to a browser that doesn’t have the same issue. All the great features, speed, and functionality came to nothing when a relatively minor bug caused a major breakdown in the bank’s operations.

All of this could have been avoided if a developer had taken notice of the bug and it had been addressed years before.

Solutions?

Now as someone I respect very much in Mozilla has said “If all bugs were security bugs, they would be fixed very quickly”. And major regressions, and common bugs are frequently fixed very quickly (ie. within a month or so). However, the sad reality is that many bugs reported by end-users go ignored for a very long time. If it is not acceptable for a company making a physical product to have a 2% failure rate, how can we ship a product with over 7000 bugs in NEW, ASSIGNED and REOPENED? I know it is impossible to reduce that number to even 1,000. But, I think we should be able to cut those NEW bugs at least. Maybe what is needed is a developer (or three) who is assigned for a while to go through, reproduce old NEW bugs and fix them. I think that the Papercuts push was a great idea, the unfortunately kinda flopped. Maybe a push like this on every major release of Firefox would help, so long as the follow through was there. Ideas?

Follow

Get every new post delivered to your Inbox.