Sohan's Blog

Living the Developer's Life

What We Learned About Feature Flags in Five Years

Looking at our git logs from Cisco AMP for Endpoints Console, I see that we introduced feature flags back in January 2014. I got interested in this history because, even after all these years of use, today I had to build a new concept on top of our feature flag code. If you’re already using feature flags or thinking about adding them to your project, this experience report may be helpful.

switchboard

Photo credits to Michael Newton

Back in 2014, we were growing as a team, but wanted to keep working on a single shared codebase. We perceived that the productivity gain of multiple teams working on a shared codebase would outweigh cross-team dependency issues. As we started working on multiple features in parallel, mostly independent and with different release dates, we saw that unfinished work on one feature was blocking the release of a completed one. After some research, we decided to introduce feature flags in our code.

First, we read Martin Fowler’s article on this topic as a guideline. Today, we have 195 feature flags in production. Over time, we have extended the use of feature flags with new concepts, and I wanted to document them here for everyone. Fowler’s blog also published a more detailed and updated post later. The taxonomy used here is different from Fowler’s because I find the following to be more relevant for our product.

  1. Database stored: We store the feature flags in the main database so that the features can be toggled without needing a code deployment.
  2. Cached: Feature flag lookups are cached for performance.
  3. Temporary vs. permanent: We mark some feature flags as temporary when the primary goal is to incrementally release code to production. Temporary feature flags are regularly cleaned once the feature is complete. 13/195 currently used feature flags are marked temporary.
  4. Self-serve: We tag some feature flags as self-serve where users need to opt-in to use the feature.
  5. Limited availability: For self-serve feature flags, we tag some features as limited availability. It allows us to release self-serve features to selected customers.
  6. Globally enabled: We have a mechanism to globally enable or disable a feature flag. 131/195 feature flags are currently marked globally enabled. This number varies by deployed environments.
  7. Enabled for all, but: We have a mechanism for enabling a feature flag for all but some specific targets.
  8. Multi-target: Sometimes we attach a single feature flag to multiple domain objects such as tenant, user, subscription tier, etc.
  9. Hierarchical: We use a fallback mechanism for feature checks. For example, to check whether a user has file upload permission, we check it for the specific user, then fall back to the tenant the user belongs to, and finally fall back to the feature itself being globally enabled.
  10. Code generator: We use a single-command code generator to introduce a new feature flag to our code. It takes care of the database migration, seed entry, and code references.
  11. Circuit-breaker: For integration with external services, we’ve used feature flags as a circuit-breaker to gracefully handle third-party downtime.
  12. Environment flags: We deploy the product to multiple geographic environments, including a private cloud model. Certain features behave differently based on the deployment. Using feature flags makes it easy to develop and test such differences before deploying to each target environment.
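
A minimal sketch of how the hierarchical, globally enabled, and "enabled for all, but" concepts above could fit together. The `FeatureFlag`, `Tenant`, and `User` names are hypothetical, the store is an in-memory stand-in for the database, and caching is left out. It is an illustration under those assumptions, not the actual implementation.

```ruby
require "set"

# Hypothetical in-memory stand-in for a database-backed feature flag.
class FeatureFlag
  attr_reader :name

  def initialize(name, globally_enabled: false)
    @name = name
    @globally_enabled = globally_enabled
    @enabled_targets = Set.new   # explicit opt-ins
    @disabled_targets = Set.new  # "enabled for all, but" exclusions
  end

  def enable_for(target)
    @disabled_targets.delete(target)
    @enabled_targets << target
  end

  def disable_for(target)
    @enabled_targets.delete(target)
    @disabled_targets << target
  end

  # Checks the most specific target first, then each fallback target.
  # The first explicit setting found wins; with none, the global value applies.
  def enabled?(*targets)
    targets.each do |target|
      return false if @disabled_targets.include?(target)
      return true if @enabled_targets.include?(target)
    end
    @globally_enabled
  end
end

Tenant = Struct.new(:name)
User = Struct.new(:name, :tenant)

acme = Tenant.new("acme")
alice = User.new("alice", acme)
bob = User.new("bob", acme)

file_upload = FeatureFlag.new(:file_upload)
file_upload.enable_for(acme)  # tenant-level opt-in
file_upload.disable_for(bob)  # user-level exclusion

file_upload.enabled?(alice, alice.tenant) # => true, falls back to the tenant
file_upload.enabled?(bob, bob.tenant)     # => false, excluded explicitly
```

Passing the targets from most to least specific keeps the fallback order visible at the call site.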

There are reusable libraries and services such as LaunchDarkly that provide rich APIs and user interfaces for feature flags. At this point, even with all the aforementioned concepts, our custom implementation of feature flags is quite straightforward and easy to evolve. It has been a key ingredient for our frequent iterative deployments with 6 teams working on diverse features in parallel on the same product.

Software Architecture Is All About Ugly Boxes and Lines - My Wishlist

In my last post, I claimed software architecture is all talk and no show. When we have a visible one, it’s a bunch of poorly drawn boxes and lines. I don’t have a problem with boxes or lines, but I do like beautiful drawings.

Despite many standards, we still mostly use whiteboard drawing of boxes and lines for sharing software design as we build new systems or introduce new team members. Where it sucks is the lack of evolution and context of the rest of the system that’s not drawn on the board.

A digital reproduction of software architecture diagrams often happens in PowerPoint or similar tools that allow us to draw boxes and lines. This process is so rough that people just give up.

At work, I have been using WebSequenceDiagram. While it’s still not eye candy, I like the fact that you can draw a diagram from plain text. Consider this input, which creates the accompanying diagram:

title Toilet Flush System
User -> Flush Lever: Push
Flush Lever -> Outlet Valve: Open
Outlet Valve -> Toilet Bowl: Water
Outlet Valve -> Inlet Valve: Open

Sequence Diagram

While this text to sequence diagram is a great achievement for a tool, I don’t see such tools for software architecture diagrams. Here’s my wishlist of features that I’d want in a software architecture tool:

  1. Text input: Allows us to easily create the diagrams and use all the version control features.
  2. Map-like UX: Allows us to easily transition between higher and lower level components.
  3. Beautiful.

Do you know any? Do these requirements make sense?

All Talk No Show: Software Architecture

We have a problem with software architecture. Let’s face it. Find the architecture diagrams of the products you’re working on and answer these questions:

  1. Did you find it?
  2. Does everyone in your team know where to find it?
  3. Is it up-to-date?
  4. Can you see how this system scales, handles failover, monitors performance, or how it’s secured?
  5. Can you see how it evolved over time?
  6. Can you train a new team-member using this diagram?

This is the first micro-post in a series in which I aim to build a compelling case for fundamentally changing software architecture diagrams.

Play at Work

paper plane

Image source: Kalvis

“I am a mango”.

Mangoes were in season, and I still remembered the juicy sweetness of the mango I had eaten just the night before. So, when the coach asked each of us to be a fruit, I didn’t think twice. One of my coworkers was a kiwi, another one an apple, and so on. The idea was to group us by color, then by size. This got us, twenty people in the class, moving and engaged. It was part of a two-day design thinking course. The coaches used the fruit game to bring some energy into the room as well as to pave the way for the next exercise: grouping a bunch of ideas by cost and the level of innovation.

I find that professional trainers bring play to work, especially for sessions that span hours or days. However, in typical day-to-day business, I don’t see much play activity at work. I’m hoping to bring some play activities into my work going forward. Here are some ideas for play activities based on what I have seen so far:

Paper planes: Small groups build paper planes, and the group with the most planes crossing a line wins.

Catch and throw: A ball or a ball-like object changes hands, and the person catching it must throw it to someone who hasn’t caught it yet.

Portrait: Everyone draws a portrait of another person while looking at that person’s face, without looking at the paper.

Quiz: An online quiz that maintains a leaderboard throughout a session.

Exercise: Getting everyone out of their seats to do a quick one-minute exercise.

Internal Trainings

design_thinking

Design Thinking course, Cisco Calgary Office, AB, 2019

One benefit of working at Cisco is access to the many learning and development resources. Our learning and development org arranges hundreds of courses throughout the year. Moreover, we have a reimbursement program for external courses, conferences, books, and subscriptions to online learning programs and publications.

In the past 6 years, I have immensely benefitted from these resources. Here’s a list of the learning resources I’ve used.

  1. Safari Books Online, aka O’Reilly learning: In the past 3 years, I have taken 3 online trainings and read 7 books on this platform so far.
  2. Design thinking, 2019: A two-day course taught by consultants about how to apply the principles of design thinking to communicate and derive solutions to complex problems.
  3. Mindfulness, 2019: A 5-week program taught by consultants, with one hour per week, where I learned about staying mindful and effective at work amid all the chaos that surrounds it.
  4. Tufte one-day course, 2019: Attended a course taught by Edward Tufte on data visualization and learned about the principles that make compelling data visualization.
  5. SANS incident response, 2018: A packed week-long program where I learned to think like a hacker by learning about and then hacking some interesting vulnerabilities in systems.
  6. Cisco R00tcamp, 2017: A packed week-long program where I learned hands-on pen-testing techniques to build more secure software.
  7. Cisco threat-hunting, 2017: A daylong course on finding the root cause that triggered a security threat using integrated Cisco tools, including the product I build with our team.
  8. RailsConf, 2015: It was a big opportunity to meet the community and bring some of their practices in-house. For example, we started using BugSnag after learning about it at the conference.

Polyglot YYC 2019: My First Unconference

polyglot_yyc Source: Polyglot YYC

This weekend I went to Polyglot YYC 2019. It’s a gathering of tech people from around Calgary. The event was open to all possible topics, hence the name Polyglot. It was the first unconference style meetup I went to and this post is a summary of my day.

About the attendees, I don’t have the official count, but I imagine there were close to a hundred people. Quite a few people represented the sponsor companies. The sponsors advertised for hiring new employees, mostly software engineers. On the other hand, I met a few people who joined this venue to talk to prospective employers. Some of the attendees were enrolled in a coding bootcamp to switch careers. I was quite happy to see the community praising their enthusiasm towards the bootcampers. Moreover, I met a few regular tech meetup people after a long time. I used to go to all tech meetups I could find in town before having kids, and I felt great to be among the self-motivated crowd after a long break.

About the event, I was fascinated by how the unconference took its shape. At check-in time, everyone got a couple of forms to write down topics of interest, either as a host or a participant. Everyone could vote for five such topics. I didn’t prepare beforehand. So, I put up a topic that I’m presently curious about, “Writing for Developers”. A handful of people voted for me, but it didn’t make the cut. I found some ideas better than mine and was happy that those got voted to the top. In hindsight, I should’ve proposed the topic of “Why Are You Not Innovating?”. I did an internal presentation on this topic at Cisco and it was generally praised by my colleagues.

My other observation is, between technical and soft-skill related topics, I liked the soft-skill ones better. For example, I found the topics on hiring, choosing a technical vs. managerial career path to be more interesting than the topics such as GraphQL and ReactJS. A few years ago, I’d just choose the technical topics without thinking twice. This is also a pattern in my recent blog posts or reading list.

In the hiring session, I saw a positive attitude towards hiring remote workers and treating them as equals. This is a major mindset shift among the community.

In the individual contributor vs. management career path session, the attendee list included both kinds as well as people that had transitioned in either direction. The one take-home message I got from this session was that, when unsure about a career choice, individual contributors and managers should take the step to switch roles. And if things don’t work out, it’s totally possible to revert later.

I’m looking forward to the 2020 edition of this event and may even prepare to come up with a good idea for the unconference. If you can, please join us for the next round.

Micro Design Critic: Microsoft Word vs. Apple Pages

MsWordVsApplePages Showing a Screenshot of Microsoft Word for Mac and Apple Pages

This is what you get by default when you open a blank document in both editors. I love that Apple Pages puts a deep focus on the content. If you haven’t used it, I’d recommend trying it out. I know Microsoft Word has a familiar face, so you’re likely used to all the distractions that surround the content. But, if you can, give Apple Pages a try.

Exception Handling Anti-patterns

confusing road sign

Source: Henry Burrows

Whenever faced with a production issue, I find exceptions to be an extremely useful information source. A careful look at an exception has often led to quick discovery of the source of trouble. On the flip side, I have also faced a lot of chaotic debugging sessions because of poor exception handling. Here, I present the common anti-patterns that I recommend fixing while reviewing pull requests. Most programmers are already familiar with the mechanics of exception handling. Yet, I see these anti-patterns every day.

These anti-patterns are primarily related to control flow or logging, as shown below:

Control-flow Anti-patterns

Unhandled. When an exception is unhandled, it often leaves both the end user and the developer without a clue about what went wrong.

def notify
  post.email! #May fail due to configuration, network, or authentication
end

Catch-all. With catch-all errors, it’s often difficult to quickly detect the original problem. For the same reason, the end users don’t get specific and actionable error messages.

def create
  post.save! #May fail due to database issues
rescue => error
  # handle
end

If-else Exceptions. Exceptions mean something unexpected took place. If-else is used for logical known code paths. For example, when accepting an API request, invalid input data is often a known logical path. Using exceptions for it will trigger false alarms.

def create
  post = Post.new(params)
  post.save!
rescue ValidationError => error
  log_exception(error)
end
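
As a contrast to the if-else case above, here is a minimal sketch that treats invalid input as a regular code path. The `Post` class is a hypothetical stand-in whose `save` returns true or false, in the style of ActiveRecord's non-bang `save`. The returned hashes are also assumptions for illustration.

```ruby
# Hypothetical stand-in for a model whose save returns true or false.
class Post
  attr_reader :text, :errors

  def initialize(text:)
    @text = text
    @errors = []
  end

  def save
    @errors = []
    @errors << "Text can't be blank" if text.to_s.strip.empty?
    @errors.empty?
  end
end

# Invalid input is an expected path, so it is handled with if-else
# and never reaches the exception log.
def create(params)
  post = Post.new(text: params[:text])
  if post.save
    { status: :created }
  else
    { status: :unprocessable_entity, errors: post.errors }
  end
end

create({ text: "hello" }) # => { status: :created }
```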

Wrapped Exception. A new exception is raised, hiding the original exception. In such cases, if the exception is handled by the caller, critical context information is lost since the original stack trace is no longer available.

def create
  post.save! #May fail due to database issues
rescue SaveError => error
  raise CustomSaveError.new('Failed to save the post')
end
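
In Ruby 2.1 and later, an exception raised inside a `rescue` block keeps a reference to the original one in `Exception#cause`, so the original context is recoverable only if the handler goes looking for it. The sketch below shows one way to walk that chain when logging. The error classes and the `save_post` and `describe` helpers are hypothetical.

```ruby
class SaveError < StandardError; end        # hypothetical low-level error
class CustomSaveError < StandardError; end  # hypothetical domain error

# Stand-in for a call that fails at a lower level.
def save_post
  raise SaveError, "connection reset"
end

def create
  save_post
rescue SaveError
  # Raising inside a rescue block records the original error as #cause.
  raise CustomSaveError, "Failed to save the post"
end

# Walks the cause chain so the log shows every wrapped error.
def describe(error)
  lines = []
  while error
    lines << "#{error.class}: #{error.message}"
    error = error.cause
  end
  lines.join("\ncaused by ")
end

begin
  create
rescue CustomSaveError => error
  puts describe(error)
end
```

Each error in the chain also carries its own `backtrace`, which can be logged the same way.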

Useless Custom Exception. Introducing a new exception type when a pre-defined exception suits just fine.

def create(text:)
  if text.nil?
    # Could just use the pre-defined ArgumentError
    raise EmptyTextException.new("Text can't be empty")
  end
  #...
end

Leaky Handler. Handling an error without cleaning up system resources, such as file handles or open network connections, can cause a cascading system outage.

def create
  # Will leak this file handle if open succeeds, but write fails
  file = File.open('/some/new.txt', 'w')
  file.write('some text')
rescue FileNotFoundError, FileSaveError => error
  log.warn('...')
  raise error
end
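
One way to avoid the leak above is the block form of `File.open`, which closes the handle when the block exits, whether it returns normally or raises. This is a minimal sketch. The `write_note` name, the `LOG` constant, and the choice of rescued error classes are assumptions for illustration.

```ruby
require "logger"

LOG = Logger.new($stderr)

# The block form of File.open closes the file when the block exits,
# even if file.write raises.
def write_note(path, text)
  File.open(path, "w") do |file|
    file.write(text)
  end
rescue SystemCallError, IOError => error
  LOG.warn("failed to write #{path}: #{error.class}: #{error.message}")
  raise # re-raises the same error with its original backtrace
end
```

For resources without a block form, an `ensure` clause that closes the resource serves the same purpose.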

Logging Anti-Patterns

Silent Handler. Makes it very difficult to debug problems.

def create
  post.save! #May fail due to database issues
rescue
end

Debug-only Handler. Similar to silent handler since most production apps run in non-debug log level.

def create
  post.save! #May fail due to database issues
rescue SaveError => error
  log.debug "failed to save post #{error.message} #{error.backtrace.join}"
end

Custom Message-only Handler. Some exception handlers only log a custom message, leaving out the details of the exception. As a result, critical information that can be very useful for debugging is lost.

def create
  post.save! #May fail due to database issues
rescue SaveError
  log.warn "failed to save post"
end

Message-only Handler. Without a stack trace, it gets very difficult to trace the root of a problem, since exception handlers often wrap several lines of code.

def create
  user = User.find(params[:user_id])
  post = Post.find(params[:id])
  comment = post.comments.create!(name: user.name)
rescue NotFoundError => error # Could be raised by more than one line above
  log.warn "failed to save post #{error.message}"
end
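
A minimal sketch of a log line that keeps the error class, the message, and the backtrace together, so the failing line can be identified. The `format_error` helper is hypothetical, and the `Integer` call is only a stand-in for the failing operation.

```ruby
require "logger"
require "stringio"

# Formats the class, message, and backtrace of an error for logging.
def format_error(context, error)
  backtrace = (error.backtrace || []).join("\n  ")
  "#{context}: #{error.class}: #{error.message}\n  #{backtrace}"
end

output = StringIO.new # captures the log in memory for this example
log = Logger.new(output)

begin
  Integer("not a number") # stand-in for the failing call
rescue ArgumentError => error
  log.warn(format_error("failed to save post", error))
end

puts output.string
```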

Sneaky Handler. Some exception handlers return nil or a value. The caller can’t distinguish between a successful vs. exception case and fails in subsequent steps.

def create
  post.save! #May fail due to database issues
rescue SaveError => error
  log.warn "failed to save post #{error.message} #{error.backtrace.join}"
  return nil
end

There are times when you intentionally have to use some of these anti-patterns. But those are rare. It’s critical for the developers to think about the information that’d help in swiftly debugging a production problem. As such, developers must avoid the noise and provide all context information for errors to help diagnose potential system problems.

Happy coding.

“Ah, How Good It Is to Be Among People Who Are Reading.”

Rainer Maria Rilke

All creators take a deep interest in the creations of others. All filmmakers watch a lot of movies, all good writers are also the most prolific readers, all artists can talk at length about the smallest pieces of art they have seen.

We, software developers, if we want to claim ourselves as the artists of this craft, must be prolific readers of code. There has been no better time than today. We have immediate access to millions of lines of carefully written code out there on the internet. Just like artists of any craft, I’ve had so much fun spending time with my fellow developers who read code for the pure joy of learning something new.

I just had so much fun reading this Ruby code today from the Ruby on Rails project:

class Array
  # Wraps the array in an +ArrayInquirer+ object, which gives a friendlier way
  # to check its string-like contents.
  #
  #   pets = [:cat, :dog].inquiry
  #
  #   pets.cat?     # => true
  #   pets.ferret?  # => false
  #
  #   pets.any?(:cat, :ferret)  # => true
  #   pets.any?(:ferret, :alligator)  # => false
  def inquiry
    ActiveSupport::ArrayInquirer.new(self)
  end
end
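
To show how an API like this can work in plain Ruby, here is a simplified sketch built on `method_missing`. `SimpleInquirer` is a hypothetical class written for illustration. It is not the Rails source and does not cover everything `ArrayInquirer` does.

```ruby
# A simplified, hypothetical inquirer for illustration only.
class SimpleInquirer
  def initialize(items)
    @items = items.map(&:to_s)
  end

  # With no arguments, reports whether there are any items at all;
  # otherwise reports whether any candidate is present.
  def any?(*candidates)
    return @items.any? if candidates.empty?

    candidates.any? { |candidate| @items.include?(candidate.to_s) }
  end

  private

  # Answers any method ending in "?" by checking membership.
  def method_missing(name, *args)
    return super unless name.to_s.end_with?("?")

    @items.include?(name.to_s.chomp("?"))
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?("?") || super
  end
end

pets = SimpleInquirer.new([:cat, :dog])
pets.cat?                        # => true
pets.ferret?                     # => false
pets.any?(:cat, :ferret)         # => true
pets.any?(:ferret, :alligator)   # => false
```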

The beauty of this rich API is an art. I love it. You may have different opinions. But I hope you find your love of art in code. There’s plenty of art in code out there for everyone to enjoy.

How Am I Developing the People I Support as a People Leader?

After being a people leader at work for the past six years, I’m now going through a phase of introspection. Essentially, I’m trying to understand my own philosophy about people leadership so that I can clearly communicate it to the people that I support.

My most important realization is, people leadership is all about developing people. What I mean by this is, for everyone I’m supporting, I must carefully build a plan that provides them with the opportunities to stay motivated so that they can thrive. With this goal, once I wrote down my understanding of what motivates each of the people I support, it was quite eye opening to see the differences among people. Going through this process, I also realized how unprepared I was in terms of providing them with a clear career path to achieve their best.

If you’re interested, I’ve shared a template of the people development document here.

Name: ___
Date: ___
Current role: ___
Motivation: ___
Upcoming opportunity in the next three months: ___
Upcoming opportunity in the next year: ___

I consider this to be a living document as people often develop new interests and the opportunities at work change with time. But keeping a clear log of each individual’s career is a great way to establish and manage expectations. You can build this with the individuals directly and update it when you meet for one-on-one feedback. As a lead, when you collaboratively build this, you empower them and build a trusting relationship as you both see how the motivations align with the work.

In my past 13 years in the industry, working for 5 different companies, I’ve always felt a little under-informed about how my leaders planned career development for me. Through my introspection as a people leader, I realized I hadn’t honestly appreciated the need for such clarity among the people I supported. So, I wanted to change it. I found the written document to be a simple yet surprisingly powerful tool to fill this void.

Now, if you’re a people leader, I’d recommend doing this exercise with your people. You’ll be pleasantly surprised by the outcome.

If you aren’t a people leader, you can write it down for yourself and ask your leader to collaborate on it. This way, when you have a one-on-one, you both will have the same reference document to focus on.