Did your parents ever tell you something like this: “Back when I was a kid, I had to walk 3 miles to school, in the snow!”? Well, this article is going to be something like that since I’m going to say that software engineering was easier when I started. I’m going to share an extreme story about how I barely do what I call software engineering these days. Much of the problem is all the tools we have to use and poor management.
Back in the Day of Software Engineering
I have stated many times at conferences and in my writing that…
“Coding should be the least part of your job. If you spend all day coding, you are doing it wrong!”
The point of this statement is that there is a lot of work that should be done before software engineers sit down and start banging out code. This includes meeting with customers, writing feature requirements, doing good architecture, writing design documents, building prototypes, and other related work. Far too many managers expect software engineers to attend a meeting and go back to their desks and start writing the code. This is 100% wrong and will lead to failed projects or projects so brittle (like the one I will discuss in this article) that it will be very costly to add features and fix bugs.
When I was a beginner, we never wrote code until the feature requirements and design documents were written and signed off. These detailed documents would include the architecture, which projects needed to be written, and sometimes would even go down to the class level. I had a template for these documents that I would use at every company I worked at. I will admit that writing these documents was not my favorite thing to do since I love writing code. But these documents were vitally important for a successful project. Now, I can’t even remember the last time I wrote one. Today, I blame much of this skipping of steps on the way teams use Agile.
Skipping these steps is very costly to the company as shown in this chart.
As you can see, finding issues in the feature and design steps costs the least. By the time coding starts it’s already at 10X the cost, and if the issue is found in production it’s a whopping 150X the cost to fix. Since managers care about cost a lot more than developers do, I encourage attendees of the conference session where I show this chart to print it out and glue it to their managers’ door at work!
At Mitchell International here in San Diego, California, I wrote their first open API so that partners could have access to backend data and provide apps and services for Mitchell customers. I have written more about this in my article A Look at 20 Years of Microsoft .NET: My First Enterprise Application and More! Mitchell purchased the application I worked on from a business manager in New Mexico. This person knew a lot about the business but was far from a good coder. The database design was one of the worst I have ever seen. Because of this, it was very important to me that our partners did not have to deal with the bad design. I wanted to make sure that the data structure for the API was simple to understand and use. To accomplish this, I spent a lot of time talking with the original developer and came up with a very detailed design that included UML diagrams for the data structure. I worked on this design off and on for six months.
I wrote the API using Entity Framework, which was newly released right when I started working on this project. The partners found this REST-based API easy to use, and when I left the company, I was in testing with partners that allowed Mitchell to participate in profit sharing in any product or service sold using the API. After I left, one of the contracts I worked on was from one of the partners, and I wrote code against my own API!
An Extreme Example of Software Engineering Today
Since I have worked mostly as a contractor for the past 10 years, many teams hire me to fix the issues in their codebase including memory issues, performance issues, and quality. I do this more than I add features to projects. In a recent project, I decided to keep track of how much time I spent on all the major tasks I do for a two-week period. As you can see, I spent less than an hour on actual new coding!
Waiting for Builds
I spent a whopping 22.5 hours sitting around waiting for builds and the unit tests to be run! This team uses Team Foundation Server (TFS) for their code repository. Each build (on the build server) takes almost an hour, and I must wait to make sure it is completed successfully before I can continue doing any work. If it fails, I must fix it before moving on. Their workflow made it impossible to work on different Jira tickets at the same time. This codebase is so brittle that much of the time my commits break the build, and I must spend a lot of time trying to fix it. The biggest issue with the builds is that most of the 9,000+ unit tests will not run on my local development virtual machine. I get PTSD just thinking about committing code changes. How can I make sure the code works properly when I can’t even run the full suite of unit tests? To me, this is over half a week of work wasted dealing with these issues. Building on my local VM takes about 20 minutes, not including unit tests since most will not run locally. Even their developer build script did not include all the projects it needed to, so I had to create my own.
Merging Branches
This solution had the most issues I’ve seen stemming from not properly disposing of objects and not implementing IDisposable correctly or at all. This was the first major thing they wanted me to work on since DevOps had to restart servers or services throughout the day. I spent well over 3 months on the first pass of this work. They had me create a new repository folder and work off that. Since I am not a TFS expert, I kept asking for help with keeping my folder up to date. I never got it. When they wanted me to start putting my changes into the main repository, my folder was so out of date that merging it was impossible. Even restoring from a shelveset would not work reliably.
In the end, I had to move most of the code into the main branch manually or just redo it from scratch. This doubled and sometimes tripled the time the work took compared to doing it once. Why do it once when you can do it two or three times, right? This also caused a lot more work that I will describe in the next sections.
Build Issues
It’s so easy to break the build for this 82-project solution! The major reason is that we could not run most of the unit tests on our development VMs. We had to submit a build and see if the tests broke there. Most of the time they did, especially for the work I did to fix the memory issues. On top of that, some issues would not show up until the build was deployed! So basically, I was not confident that the code worked when I committed it.
In this two-week period, I spent almost two days just dealing with these issues. Other build issues were:
- NuGet packages and/or other DLLs not working the same as they did on our development VMs.
- Other developers overwriting our changes.
- Visual Studio and .NET were not kept up to date on the build server.
Many of the build issues that I ran into were caused by using Fakes for unit testing. At some point, they decided that the server build was taking too long and moved most of their unit tests to use Fakes instead of hitting the database. The issue was that if I made a change to a class, it would break the Fakes. I asked many times for help with how the Fakes worked so I could get around these build issues. I never got that help. Due to this, I had to back out a lot of the memory fixes that I tried to implement.
Reimplementing TFS Rollbacks
When a build would break, most of the time I had to back out my changes so others could build. Then I had to figure out the issue and reimplement those changes from a code review shelveset in TFS. Again, this was so painful since the build worked fine on my VM, and since I could not run the unit tests where most of the issues were, I had to make a best guess at what would fix it, commit it, and wait an hour for the build, over and over. Because of the way they used TFS, as you can see, I wasted 14 hours. Doing this in TFS is a nightmare!
Waiting for Code Reviews
Every change I made had to be code reviewed, which is normal. But the way this team used TFS, I could not work on anything until the code review was done. Normally this took around 3 hours or more! So that is 3 hours I could not work on code. What made this worse was the different time zones the team worked in. If it was past 2 pm PST, then none of the code reviews would be done until the next day. Even with sending messages on Teams to the reviewers, I could never get that time down to less than three hours.
Visual Studio Issues
This company forced all their developers to use virtual machines for development, which I avoid if I can since VMs are always slower than a physical box. I would say around 20% slower. Once I got my VM to a point where I could build (this took a month, not kidding) and ran Visual Studio 2019, it was painfully slow. I then found out that they only allotted 8GB of memory for these VMs! Windows takes about that much just to run!!
Even on a good day, Visual Studio would hang and crash on me, sometimes more than 5 times. On top of that, the way they used mapped drives made things much worse with Visual Studio. I had them bump up my VM to 16GB of memory, and while that helped, it did not prevent these issues from happening every day. I spent almost a day dealing with this in just a two-week period! Visual Studio 2019 already has a lot of issues that slow down development, and these VMs just compounded them.
Changing Just 1 Line of Code
Let’s say I had to make a very minor change to one line of code. This is an estimate of the actual time this change could take:
| Description | Time |
| --- | --- |
| Code Change | 0.5 hours |
| Local Build | 0.5 hours |
| Review Code Changes | 0.5 hours |
| Code Review | 3 hours |
| Build on Server including Unit Tests | 1 hour |
| Total: | 5.5 hours |
This estimate assumed everything worked perfectly, which was very rare. Then you would have to add time to roll back the change, wait for the server build, and then start this process again, sometimes feeling like I was stuck in an endless loop of pain and frustration! Also, if I had to wait for a deployment to a development server, that could take many hours or even wait until the next day!
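To make the arithmetic concrete, here is a minimal Python sketch of the estimate. The step times come straight from the table. Everything else is an assumption for illustration only: the function names are hypothetical, the rollback time is a made-up parameter, and the idea that each failed server build costs a full pass plus a rollback plus one more server build is my rough reading of the loop described above, not a measurement from this project.

```python
# Hours per step, copied from the table above (everything works the first time).
STEPS = {
    "Code change": 0.5,
    "Local build": 0.5,
    "Review code changes": 0.5,
    "Code review": 3.0,
    "Build on server including unit tests": 1.0,
}

# Steps where the developer is blocked, waiting on reviewers or the build server.
WAITING_STEPS = ("Code review", "Build on server including unit tests")


def happy_path_hours(steps=STEPS):
    """Total hours for a one-line change when nothing goes wrong."""
    return sum(steps.values())


def waiting_share(steps=STEPS, waiting=WAITING_STEPS):
    """Fraction of the happy-path time spent blocked rather than working."""
    return sum(steps[name] for name in waiting) / happy_path_hours(steps)


def total_hours(failed_attempts, rollback_hours, steps=STEPS):
    """Hours for one change when the server build fails `failed_attempts` times.

    Assumption (not from the project data): each failure costs one full pass,
    plus `rollback_hours` to back the change out, plus one more server build
    to confirm the rollback, before the whole process starts again.
    """
    full_pass = happy_path_hours(steps)
    server_build = steps["Build on server including unit tests"]
    per_failure = full_pass + rollback_hours + server_build
    return failed_attempts * per_failure + full_pass


if __name__ == "__main__":
    print(f"Happy path: {happy_path_hours():.1f} hours")
    print(f"Share spent waiting: {waiting_share():.0%}")
    # Hypothetical scenario: two failed builds, half an hour per rollback.
    print(f"Two failures: {total_hours(2, 0.5):.1f} hours")
```

With the table's numbers, 4 of the 5.5 hours (about 73%) go to waiting on the code review and the server build. Under the assumed failure model, two broken builds with a half-hour rollback each turn a one-line change into 19.5 hours.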
Odds & Ends
Here are a few more items that slowed down development at this company.
Jira
Jira is used by software teams for product management, handling basic software-development tasks, and bug tracking. It seems, from what I read on Twitter, that developers either hate it or love it. On this project, I hated it. Well, the way this team implemented it and used it is what I hated. It really bogged down the process. What made matters worse was the person from QA who was on our team.
In the past, when I used software like this, I was done with the ticket once I finished coding and passed it on to QA. When QA was done, they would pass it on to the next step in the process. But not on this team. Each developer was responsible for keeping track of the ticket and changing its status until it was completed. This shifted too much of the work and blame onto the engineer. We were responsible for making sure the ticket was part of the correct deployment, which changed a lot, and much more.
Our QA person was a major source of frustration when it came to our Jira tickets. Not only by the way she talked to us in a negative, bitter manner but in the way she demanded that everything was perfect on the ticket before she would be okay with it. Many of these items, to me, are not an engineer’s responsibility, but more of the responsibility of managers, scrum masters (if we even had one), etc.
When I started programming, we did not have applications such as Jira to manage projects. That was the sole responsibility of the project manager and they typically used Microsoft Project.
Microsoft Teams
This team seemed to use Teams to give direction on what needs to be worked on or looked at, especially our QA person. To me, besides hosting online meetings, it’s a chat program! Direction should never be done in a program like Teams. Either use Jira or an email to document the work that needs to be done. With many chats going on and chat rooms, how are we supposed to keep track of everything? What’s worse is trying to find something someone might have said in the past. It got to the point where if work was not documented in Jira or email, I would ignore it. I had too much else to do to keep track of what the heck is going on in Teams.
Contractor Hiring Practices
This company is what I call “contractor happy”, which means they hired a lot of them from different parts of the world. It felt to me they had a lot more contractors than permanent employees. This is a bad trend I have been seeing for a long time. What made matters worse, they would often let go of many of them, then in a short time afterward, hire a new batch to fill the void. I don’t know if they did this to make their financial status look better or what, but this does not work.
Not only did it take a month for a new contractor to get their development environment set up so they could code, but each one then had a long learning curve. All that knowledge was then lost whenever the company needed to decrease the headcount. The full-time developers would be so busy that getting any help from them was next to impossible. The developer there with the most experience would not even show up to scrum meetings most days.
Contractors are there to help on a temporary basis. They should not be a permanent part of the hiring practice for the reasons I discussed above and a lot more. Some states in the US have laws against this. Maybe that’s why they would hire them and let them go often. There also is a lot more management that needs to be done when hiring contractors. Something I’ve never seen done correctly.
Permanent hires hold the “brain trust” of the project. Teams need more of them, not fewer. I can guarantee that companies like this one end up paying a lot more for the same amount of work a permanent hire can perform. My team had over 17 people in it, far more than any other team I have worked on. I would say because of this practice and others I have discussed, there are about three times more people on this team than they really need. I wonder if the investment firm that owns this company realizes this. It does not seem like it to me since this firm has followed the same practice with other software companies they have purchased.
Scrum Meetings
Most teams I work on do not really use scrum meetings for what they are intended to be, this team included. For this team, it was a way to maybe change or verify the status of a ticket. Rarely did anyone get help for obstacles they were facing. Usually when someone would ask “who can look into an issue” or “who can work on this ticket”, there would be silence on the call. Most of the time when I asked for help, there would also be silence on the call, even from management. Even with 17 people on the call, the meetings were short but, in the end, were a waste of 15 to 20 minutes. Many days after I gave my status, I would drop the meeting so I could get back to work.
Take Away
I have spoken and written many times that coding should be the least part of your job. But what I outlined in this article is not what I’m talking about. The issues I described in just two weeks of work show that too many tools and practices are getting in the way of producing good quality software in a timely manner. This company’s hiring practices, poor architecture, poor standards, and much more resulted in a codebase that is difficult to fix and add features to.
Code analysis for this solution reveals that there are over 48,000 code violations. There are over 217,000 lines of cloned code and code that isn’t in use anymore. There are over 32,000 spelling issues. 80 NuGet packages are out of date. It’s using versions of .NET that are not supported any longer. There are almost 2,000 places in the code that need to be fixed to deal with the memory issues. None of the classes that implemented IDisposable did it correctly, and they had a lot more problems. I worked on their memory issues for most of the 6 months I was on this team, and I would estimate that there are another 6 months of work needed. It’s that bad. Now that I am gone, I doubt if these issues will ever be fixed.
Working with all these issues on this team created a dramatic amount of stress and anxiety which also hurts software projects. Now you might realize why it takes so long to add features to the software you use and to fix bugs.
My goal with this article is to start a conversation on how we can make the software development life cycle easier on software engineering teams. Have you run into these issues where you work? Have you found a solution? I would like to hear from you. Please comment below.