---
title: 11 principles for building and scaling feature flag systems
description: >-
  Build a scalable, secure feature flag system with 11 key principles. Improve
  DevOps metrics, ensure reliability, and enhance developer experience.
keywords: 'principles, building, scaling, feature flags, unleash'
'og:site_name': Unleash Documentation
'og:title': 11 principles for building and scaling feature flag systems
max-toc-depth: 2
---

Feature flags, [sometimes called feature toggles or feature switches](/get-started/what-is-a-feature-flag), are a powerful software development technique that allows engineering teams to decouple the release of new functionality from software deployments.

With feature flags, developers can turn [specific features or code segments on or off at runtime](https://www.getunleash.io/feature-flag-use-cases-software-kill-switches) without needing a code deployment or rollback. Organizations that adopt feature flags see improvements in key [DevOps metrics](https://www.getunleash.io/blog/dora-metrics-in-2023-5-ways-to-measure-devops-performance) like lead time for changes, mean time to recovery, deployment frequency, and change failure rate.

At Unleash, we've defined 11 principles for building a large-scale feature flag system. These principles have their roots in distributed systems design and focus on security, privacy, and scalability, all critical needs for enterprise systems. By following these principles, you can create a feature flag system that's reliable, easy to maintain, and capable of handling heavy loads.

These principles are:

1. [Enable runtime control](#1-enable-runtime-control)
2. [Make flags short-lived](#2-make-flags-short-lived)
3. [Prioritize availability over consistency](#3-prioritize-availability-over-consistency)
4. [Ensure unique flag names](#4-ensure-unique-flag-names)
5. [Choose open by default](#5-choose-open-by-default)
6. [Protect PII by evaluating flags server-side](#6-protect-pii-by-evaluating-flags-server-side)
7. [Evaluate flags as close to the user as possible](#7-evaluate-flags-as-close-to-the-user-as-possible)
8. [Scale horizontally by decoupling reads and writes](#8-scale-horizontally-by-decoupling-reads-and-writes)
9. [Limit feature flag payload](#9-limit-feature-flag-payload)
10. [Prioritize consistent user experience](#10-prioritize-consistent-user-experience)
11. [Optimize for developer experience](#11-optimize-for-developer-experience)

Let's dive deeper into each principle.

## 1. Enable runtime control

A scalable feature management system evaluates flags at runtime. Flags are dynamic, not static. If you need to restart your application to turn on a flag, that's configuration, not a feature flag.

A large-scale feature flag system that enables [runtime control](https://www.getunleash.io/blog/so-what-exactly-is-runtime-control) should have, at minimum, the following components: a [service to manage feature flags](https://www.getunleash.io/blog/feature-management), a database or data store, an [API layer](/get-started/unleash-overview), a [feature flag SDK](/sdks), and a continuous update mechanism.

Let's break down these components.

### Feature flag system components

* **Feature Flag Control Service**: A [service that acts as the control plane for your feature flags](https://www.getunleash.io/feature-flag-service), managing all flag configurations. The scope of this service should reflect the boundaries of your organization.
* **Database or data store**: A robust, scalable, and highly available database or data store that stores feature flag configurations reliably. Common options include SQL databases, NoSQL databases, or key-value stores.
* **API layer**: An API layer that exposes endpoints for your application to interact with the *Feature Flag Control Service*.
  [This API should allow your application to request feature flag configurations](/get-started/unleash-overview).
* **Feature flag SDK**: [An easy-to-use interface for fetching flag configurations and evaluating feature flags at runtime](/sdks). When your application checks a feature flag, the SDK call should read from the local cache, while the SDK asks the central service for updates in the background.
* **Continuous update mechanism**: An update mechanism that enables [dynamic updates to feature flag configurations](/get-started/unleash-overview) without requiring application restarts or redeployments. The SDK should handle subscriptions or polling to the *Feature Flag Control Service* for updates.

![The SDK holds an in-memory feature flag configuration cache which is continuously synced with the Feature Flag Control Service. You can then use the SDK to check the state of feature flags in your application.](https://files.buildwithfern.com/unleash.docs.buildwithfern.com/896105ff4e0b2954a08fe454f6e6ede7af07269c2fb6604cb932431dc691b923/assets/feature-flag-scalable-architecture.png)

## 2. Make flags short-lived

The most common use case for feature flags is to manage the [rollout](https://www.getunleash.io/feature-flag-use-cases-progressive-or-gradual-rollouts) of new functionality. Once a rollout is complete, you should remove the feature flag from your code and archive it. Remove any old code paths that the new functionality replaces.

Avoid using feature flags for static application configuration. Application configuration should be consistent, long-lived, and loaded during application startup. In contrast, [feature flags should be short-lived, dynamic, and updated at runtime](/guides/best-practices-using-feature-flags-at-scale). They prioritize availability over consistency and can be modified frequently.

| | Configuration system | Feature flag system |
| --- | --- | --- |
| **Lifetime and runtime behavior** | Long-lived, static during runtime | Short-lived, changes during runtime |
| **Example use cases** | Database or server credentials, server port, CORS headers, API base URL | [A/B or multivariate testing](https://www.getunleash.io/blog/ab-testing-feature-flags-how-to), [gradual rollout](https://www.getunleash.io/feature-flag-use-cases-progressive-or-gradual-rollouts), permission flags (e.g., beta testing programs), operational testing in production |

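
To make the distinction concrete, here is a minimal Python sketch. It is an illustration only, not part of any Unleash SDK: the names `AppConfig`, `FlagCache`, `new-checkout`, and the `SERVER_PORT` and `API_BASE_URL` variables are hypothetical. Configuration is read once at startup and never changes, while flag state lives in an in-memory cache that can be replaced while the application runs.

```python
import os
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Static configuration: read once at startup, immutable afterwards."""
    server_port: int
    api_base_url: str

    @classmethod
    def from_env(cls, env=os.environ) -> "AppConfig":
        # Hypothetical variable names and defaults, chosen for this example.
        return cls(
            server_port=int(env.get("SERVER_PORT", "8080")),
            api_base_url=env.get("API_BASE_URL", "http://localhost:4242"),
        )


class FlagCache:
    """In-memory flag state that can be replaced while the app is running."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._flags: dict[str, bool] = {}

    def update(self, flags: dict[str, bool]) -> None:
        # In a real system, a background sync would call this.
        with self._lock:
            self._flags = dict(flags)

    def is_enabled(self, name: str, default: bool = False) -> bool:
        with self._lock:
            return self._flags.get(name, default)


config = AppConfig.from_env({"SERVER_PORT": "9000"})
flags = FlagCache()

print(flags.is_enabled("new-checkout"))  # False: unknown flag uses the default
flags.update({"new-checkout": True})     # flag state changes without a restart
print(flags.is_enabled("new-checkout"))  # True
print(config.server_port)                # 9000
```

Changing `server_port` would require restarting the process with a new environment, whereas `new-checkout` flips as soon as the cache is updated.
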
### Strategies for large organizations implementing feature flags

To succeed with feature flags in a large organization, [follow these strategies](/guides/best-practices-using-feature-flags-at-scale):

* **Set flag expiration dates**: Assign expiration dates to feature flags to track which flags are no longer needed. A good feature flag management tool will alert you to expired flags, making it easier to maintain your codebase.
* **Treat feature flags like technical debt**: Incorporate tasks to remove outdated feature flags into your sprint or project planning, just as you would with [technical debt](/concepts/technical-debt). Feature flags add complexity to your code by introducing multiple code paths that need context and maintenance. If you don't clean up feature flags in a timely manner, you risk losing that context as time passes or team members change, making the flags harder to manage or remove.
* **Archive old flags**: When feature flags are no longer in use, archive them after removing them from the codebase. This archive serves as an important audit log of feature flags and allows you to revive flags if you need to restore an older version of your application.

While most feature flags should be short-lived, there are valid exceptions for long-lived flags, including:

* **Kill switches**: [Kill switches](https://www.getunleash.io/blog/kill-switches-best-practice) act as inverted feature flags, allowing you to gracefully disable parts of a system with known weak spots.
* **Internal flags**: Internal flags enable additional debugging, tracing, and metrics at runtime that are too costly to run continuously. Engineers can enable these flags while debugging issues.

## 3. Prioritize availability over consistency

Your application shouldn't have any dependency on the availability of your feature flag system.
Robust feature flag systems avoid evaluating flags through real-time calls to a remote service, because if that service becomes unavailable, your application can suffer downtime, degraded performance, or even complete failure.

If the feature flag system fails, your application should continue running smoothly. Feature flagging should degrade gracefully, preventing any unexpected behavior or disruptions for users.

You can implement the following strategies to achieve a resilient architecture:

* **Bootstrap SDKs with data**: Feature flagging SDKs should work with locally cached data, even when the network connection to the *Feature Flag Control Service* is unavailable, using the last known configuration or defaults to ensure uninterrupted functionality.
* **Use local cache**: Maintaining a local cache of feature flag configurations helps reduce network round trips and dependency on external services. The local cache can periodically synchronize with the central *Feature Flag Control Service* when it's available.
* **Evaluate feature flags locally**: Whenever possible, the SDKs or application components should evaluate feature flags locally without relying on external services, ensuring uninterrupted feature flag evaluations even if the feature flagging service is down.
* **Prioritize availability over consistency**: In line with the [CAP theorem](https://www.ibm.com/topics/cap-theorem), design for availability over strict consistency. In the face of network partitions or downtime of external services, your application should favor maintaining its availability rather than enforcing perfectly consistent feature flag configuration caches. Eventually consistent systems can tolerate temporary inconsistencies in flag evaluations without compromising availability.

## 4. Ensure unique flag names

Ensure that all flags within the same *Feature Flag Control Service* have unique names across your entire system.
[Unique naming](/guides/best-practices-using-feature-flags-at-scale#reusing-feature-flag-names) prevents the reuse of old flag names, reducing the risk of accidentally re-enabling outdated features with the same name.

![Creating a feature flag with a unique name referencing the project and related Jira ticket.](https://files.buildwithfern.com/unleash.docs.buildwithfern.com/0611e4c3a7a5894d3d894c731d2f99c6b01b38cfc80fe161928bdf537c80d23f/assets/feature-flag-unique-name.png)

### The benefits of unique naming

Unique naming has the following advantages:

* **Flexibility over time**: Large enterprise systems are not static. Monoliths may split into microservices, microservices may merge, and applications change responsibility. Unique flag naming across your organization means that you can reorganize your flags to match the changing needs of your organization.
* **Fewer conflicts**: If two applications use the same feature flag name, it can become difficult to identify which flag controls which application. Even with separate namespaces, you risk toggling the wrong flag, leading to unexpected consequences.
* **Easier flag management**: Unique names make it simpler to track and identify feature flags. Searching across codebases becomes more straightforward, and it's easier to understand a flag's purpose and usage.
* **Improved collaboration**: A feature flag with a unique name in the organization simplifies collaboration across teams, products, and applications, ensuring that everyone refers to the same feature.

## 5. Choose open by default

Making feature flag systems open by default enables engineers, product owners, and support teams to collaborate effectively and make informed decisions. Open access encourages productive discussions about feature releases, experiments, and their impact on the user experience.


[Access control and visibility](/guides/user-management-access-controls) are also key considerations for [security and compliance](https://www.getunleash.io/security-and-performance). Tracking and auditing feature flag changes help maintain data integrity and meet regulatory requirements.

While open access is key, it's equally important to integrate with corporate access controls, such as SSO, to ensure security. In some cases, additional controls like feature flag approvals using the [four-eyes principle](/concepts/change-requests) are necessary for critical changes.

![Preview of a change request to enable gradual rollout in production.](https://files.buildwithfern.com/unleash.docs.buildwithfern.com/4088c051797c2be7dad8fb9cdf4b4fb268625a50f27d9890ed025d4ec82e5bab/assets/feature-flag-change-requests-preview.png)

### Open collaboration strategies for feature flag management

For open collaboration, consider providing the following:

* **Access to the codebase**: Engineers need direct access to the codebase that contains the feature flags. This allows them to quickly diagnose and fix issues, minimizing downtime and performance problems.
* **Access to configuration**: Engineers, product owners, and even technical support should be able to view feature flag configuration. This transparency provides insights into which features are currently active, what conditions trigger them, and how they impact the application's behavior. Product owners can also make real-time decisions on feature rollouts or adjustments without relying solely on engineering resources.
* **Access to analytics**: Both engineers and product owners should be able to correlate feature flag changes with production metrics. This helps assess how flags impact user behavior, performance, and system health, enabling data-driven decisions for feature rollouts, optimizations, or rollbacks.

## 6. Protect PII by evaluating flags server-side

Feature flags often require contextual data for accurate evaluation, which could include sensitive information such as user IDs, email addresses, or geographical locations. To safeguard this data, follow the data security [principle of least privilege (PoLP)](https://www.cyberark.com/what-is/least-privilege), ensuring that all [Personally Identifiable Information (PII)](https://www.investopedia.com/terms/p/personally-identifiable-information-pii.asp) remains confined to your application.

To implement the principle of least privilege, ensure that your *Feature Flag Control Service* only handles the configuration for your feature flags and passes this configuration down to the SDKs connecting from your applications.

Let's look at an example where feature flag evaluation happens inside the server-side application. This is where all the contextual application data lives. The flag configuration (all the information needed to evaluate the flags) is fetched from the *Feature Flag Control Service*.

![Evaluating flags on the server side without exposing sensitive information.](https://files.buildwithfern.com/unleash.docs.buildwithfern.com/9a428b53e3e1e564e875af8c03668bda02929be2bd17a42ff59736481d052d0f/assets/feature-flag-server-side-evaluation.png)

Client-side applications, where the code resides on the user's machine in browsers or mobile devices, require a different approach. Avoid evaluating flags on the client side: doing so raises significant security concerns by exposing potentially sensitive information such as API keys, flag data, and flag configurations. Placing these critical elements on the client side increases the risk of unauthorized access, tampering, or data breaches.

### Evaluate within a self-hosted environment

Instead of performing client-side evaluation, a more secure and maintainable approach is to evaluate feature flags within a self-hosted environment.
Doing so can safeguard sensitive elements like API keys and flag configurations from potential client-side exposure. This strategy involves a server-side evaluation of feature flags, where the server makes decisions based on user and application parameters and then securely passes down the evaluated results to the frontend without any configuration leaking.

![In client-side setups, perform the feature flag evaluation on the server side. Connected client-side applications receive only evaluated feature flags to avoid leaking configuration.](https://files.buildwithfern.com/unleash.docs.buildwithfern.com/b807e4d7cdaf0d01ea2990710431860dbe4fb2b413890c771a229e6ef99b8312/assets/feature-flag-architecture-client-side.png)

### Server-side components

In [Principle 1](#1-enable-runtime-control), we proposed a set of architectural components for building a feature flag system. The same principles apply here, with additional suggestions for achieving local evaluation.

For client-side setups, use a dedicated evaluation server that can evaluate feature flags and pass evaluated results to the frontend SDK.

### SDKs

[SDKs](/sdks) make it more convenient to work with feature flags. Depending on the context of your infrastructure, you need different types of SDKs to talk to your feature flagging service.

Backend SDKs should fetch configurations from the *Feature Flag Control Service* and evaluate flags locally using the application's context, reducing the need for frequent network calls.

For frontend applications, SDKs should send the context to an evaluation server and receive the evaluated results. The evaluated results are then cached in memory in the client-side application, allowing quick lookups without additional network overhead. This provides the performance benefits of local evaluation while minimizing the exposure of sensitive data.
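
The split between local evaluation and evaluated results can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Unleash implements it: the configuration shape (`enabled`, `rollout_percent`, `allowed_emails`), the SHA-256 bucketing, and the function names `bucket`, `evaluate`, and `evaluate_for_frontend` are all hypothetical.

```python
import hashlib

# Hypothetical flag configuration, as a backend SDK or evaluation server
# might hold it in memory after fetching it from the control service.
FLAG_CONFIG = {
    "new-checkout": {"enabled": True, "rollout_percent": 100},
    "beta-dashboard": {"enabled": True, "allowed_emails": ["ada@example.com"]},
    "dark-mode": {"enabled": False},
}


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def evaluate(flag_name: str, context: dict) -> bool:
    """Evaluate one flag locally; unknown flags are treated as disabled."""
    rule = FLAG_CONFIG.get(flag_name)
    if rule is None or not rule["enabled"]:
        return False
    if "allowed_emails" in rule:
        return context.get("email") in rule["allowed_emails"]
    percent = rule.get("rollout_percent", 100)
    return bucket(flag_name, context.get("user_id", "")) < percent


def evaluate_for_frontend(context: dict) -> dict:
    """Return evaluated results only: no rules, allow lists, or percentages."""
    return {name: evaluate(name, context) for name in FLAG_CONFIG}


result = evaluate_for_frontend({"user_id": "42", "email": "ada@example.com"})
print(result)
# {'new-checkout': True, 'beta-dashboard': True, 'dark-mode': False}
```

A frontend SDK would cache the returned dictionary in memory and answer flag checks from it. The email allow list and rollout percentages stay on the server, and the user context is evaluated inside infrastructure you control.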