Why?
A few months ago I came across The Twelve-Factor-App, preaching best practices for building and delivering software. Nothing really new, but a good central place with many good practices for people to refer to and compare against. Recently I saw an implementation of it in an environment where the basic concerns were already handled, so the solution implemented was redundant, an extra cost, and to some level low-grade.
What?
Actually, what 12FactorApp is... it is a good set of ideas around a basic set of concerns. The concerns are right; the solutions suggested are situational, and the situation assumed is the default/basic setup. With teams following good DevOps-y practices, the solutions don't turn out to be exactly the same.
So to avoid the confusion for more people, and foremost to save me the pain of explaining myself at different places at different times for my views against 12FactorApp..... here is what the concerns are and what the solutions turn into when following a proper DevOps-y approach.
Where @12FactorApp doesn't suit DevOps-y solutions at all
- ~
- Dependencies
[+] Obsolete: 'If the app needs to shell out to a system tool, that tool should be vendored into the app.'
Changed-to: Make your automated configuration management system handle it (see the sketch after this list).
- Configurations
[+] Obsolete: The twelve-factor app stores config in environment variables, changing between deploys without changing any code.
Changed-to: That is not a solution that scales well or holds up under disaster management. With configuration management handling the node-level deterministic state, the non-developer-box configuration is no longer in the application code. Keeping configuration in files is a more verifiable, cleaner and more broadly available solution, and there is no more noise of different environment-level configurations in the same configuration file (also covered in the sketch after this list).
- ~
- Build, Release, Run
[+] Obsolete: The resulting release contains both the build and config.
Changed-to: Packaging configuration along with the build ties it to one set environment. Any disaster-resistant or scalable architecture would be crippled by it, as every change requires creating new packages. Make your automated configuration management solution intelligent enough to infer the required configuration and deploy the build.
- ~
- ~
- Concurrency
[+] Obsolete: Twelve-factor app processes should never daemonize or write PIDfiles.
Changed-to: PID files help some automated configuration management solutions easily identify the 'service' checks placed on them. There are operating-system-level process managers that also support PID files. Having a PID file eases up lots of other custom monitoring plug-ins too, and is not a bad practice to have.
- ~
- ~
- ~
- ~
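To make the Dependencies and Configurations 'Changed-to' points above concrete, here is a minimal Chef-style sketch (assuming Chef; Puppet or any other tool works equally well). The cookbook, package, template and attribute names are made up for illustration; the point is that the node's configuration management code, not the application or its environment variables, installs the dependency and renders the config file.

```ruby
# cookbooks/myapp/recipes/default.rb -- hypothetical cookbook
# Dependency handled by the platform package manager, not vendored into the app.
package 'imagemagick'

# Node-level configuration rendered to a file, not injected as env vars.
template '/etc/myapp/settings.yml' do
  source 'settings.yml.erb'
  owner  'myapp'
  mode   '0640'
  variables(
    db_host: node['myapp']['db_host'],   # assumed node attributes
    db_port: node['myapp']['db_port']
  )
end
```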
Cumulative Correct Concerns 3C@12FactorApp and DevOps-y Solutions
Overall aiming to achieve an easy-to-setup, clean-to-configure, quick-to-scale and smooth-to-update software development ambiance.
The 12 Concerns+Solutions:
- Problem: Maintaining Application Source Code
Solution:
a. Use a version control mechanism, preferably a distributed VCS like git, with a privately hosted (or at least private-account) code repository.
b. Keep a unique application~to~repository mapping, i.e. a single application's or independent library's source code in a single repository.
c. For different versions of the same application, depend on different commit-stages (not even branches, in general cases) of the same code repository; a Gemfile-style sketch of pinning to a commit follows.
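As a small illustration of depending on a commit-stage rather than a moving branch, a Ruby project's Gemfile can pin an internal library straight to a commit. The repository URL and SHA here are placeholders.

```ruby
# Gemfile -- pin an internal library to an exact commit-stage (hypothetical repo/SHA)
source 'https://rubygems.org'

gem 'billing-core',
    git: 'git@git.example.com:acme/billing-core.git',
    ref: '9f2c1ab'   # exact commit, not a branch that keeps moving
```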
- Problem: Managing Application Dependencies
Solution:
a. Never manually source-compile any dependent library or application. Always depend on the standard package manager for the intended platform (like rpm, pkg, gem, egg, npm). If there is no package available, create one; it's not really difficult. As a standard practice, I'd suggest utilizing something like FPM (maybe even the fpm-cookery gem if you like), which gives you the elasticity to change your platform easily without worrying about re-creating packages. Even creating rpm, gem and other packages is not too much pain compared to the stability it brings to the infrastructure setup.
b. Make your automated configuration management utility ensure all the required dependencies of your application are pre-installed in the correct order, at the correct version, with the correct required configurations.
c. The dependency configuration should be specific enough to ensure the installed & configured dependencies are the ones actually used. So when compiling a binary, use static library linking; if you are loading external libraries, fix the load path. The same configuration management tool can even be run in solo/masterless (no-server) mode. A sketch of version-pinned dependencies follows.
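A minimal sketch of points (b) and (c) in Chef-style Ruby; the package names and versions are illustrative, and the same resources converge fine under chef-solo/masterless runs.

```ruby
# cookbooks/myapp/recipes/dependencies.rb -- hypothetical recipe
# System-level dependency pinned to a known version via the platform package manager.
package 'libxml2' do
  version '2.9.13'     # illustrative version
  action  :install
end

# Language-level dependency installed as a gem, again version-pinned.
gem_package 'nokogiri' do
  version '1.15.4'     # illustrative version
end
```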
- Problem: Configuration in Code, Configuration at all Deploys
Solution:
a. Ideally, configuration details such as a node's IP/name, credentials, etc. shall not be a part of the application's codebase. If such a configuration file is locally available in the developer-box repository, on a non-alert & non-gitignore day it might get committed to your repository.
b. Make your automated configuration management tool generate all these configuration files for a node, based on the node-specific details provided to the configuration management tool, not to the application.
c. The suggested practice of keeping these configurations with the configuration management tool also requires utilizing a proper data-store separate from the normal configuration statements. It could be CSVs, Hiera or a dedicated parameters manifest for a tool like Puppet; for a tool like OpsCode's Chef, there is already the data bag facility available. Wherever available and required, the info should be encrypted with a secret key that is not available in the repository. A data-bag-driven sketch follows.
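For instance, with Chef the node-specific secrets can live in a data bag (ideally an encrypted one) and be rendered into a file at converge time; the bag, item and template names below are made up for illustration.

```ruby
# cookbooks/myapp/recipes/config.rb -- hypothetical recipe
# Credentials come from a data bag, not from the application codebase or env vars.
db = data_bag_item('myapp', 'database')   # for real secrets, prefer encrypted data bags / Chef Vault

template '/etc/myapp/database.yml' do
  source 'database.yml.erb'
  owner  'myapp'
  mode   '0600'
  variables(
    host:     db['host'],
    username: db['username'],
    password: db['password']
  )
end
```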
- Problem: Backing Services
Solution:
a. Whatever other application services are required by the application in order to serve can be included in the 'Backing Services' list. These will be services like data-stores (databases, memory caches and other such supporting services), smtp services, etc.
b. Every piece of information required for these backing services should be a configuration detail like node-name, credentials, port#, etc., maintained as a configuration file loaded via the configuration management tool.
c. If it's a highly complex application broken into several component applications supporting each other, then for any one component application all the other component applications are also 'Backing Services'. A sketch of loading such a file follows.
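On the application side this boils down to reading the file that configuration management rendered and building the client from it. The file path, keys and the Redis client below are illustrative assumptions.

```ruby
# config/backing_services.rb -- hypothetical bootstrap code
require 'yaml'
require 'redis'   # assumes the 'redis' gem is a declared dependency

# This file is written by configuration management, never hand-edited on the node.
SERVICES = YAML.load_file('/etc/myapp/backing_services.yml')

CACHE = Redis.new(
  host: SERVICES['cache']['host'],
  port: SERVICES['cache']['port']
)
```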
- Problem: Build, Release, Run
Solution:
a. The development-stage code gets pushed to the codebase and, after passing the intended tests, is pushed to the Build Stage to prepare deployable (compiled, dependencies included) code. Read up on the Continuous Integration process for a better approach at this.
b. The deployable code is packaged ready to deliver in the Release Stage and pushed into the package-manager repositories. The configuration required for the execution environment is provided to the automated configuration management solution.
c. The Run Stage involves releasing the application package from the package manager and applying the intended system-level configurations via the configuration management solution. A rough Rakefile-style split of these stages follows.
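A rough sketch of keeping the Build and Release stages free of environment configuration, as Rake tasks; the fpm invocation and the repository push command are placeholders for whatever your pipeline actually uses, and the Run stage is deliberately left to configuration management.

```ruby
# Rakefile -- hypothetical build/release tasks; 'run' is handled by config management
APP     = 'myapp'
VERSION = ENV.fetch('BUILD_VERSION', '0.0.1')   # stamped by CI; illustrative

desc 'Build Stage: produce a deployable package with no environment config inside'
task :build do
  sh 'bundle install --deployment'
  sh "fpm -s dir -t rpm -n #{APP} -v #{VERSION} -C ./ --prefix /opt/#{APP} ."
end

desc 'Release Stage: push the package to the internal package repository'
task release: :build do
  sh "scp #{APP}-#{VERSION}-1.x86_64.rpm repo.example.com:/srv/yum/incoming/"  # placeholder push
end
```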
- Problem: Processes
Solution:
a. No persistent data related to the application shall be kept along with it. All user-input & computed information required for the service shall live in the 'Backing Services' available to all instances of the application in that environment, helping the application stay stateless.
b. Get the static assets compiled at the 'Build Stage', served via a CDN and cached at the load-balancing server.
c. Session-state data is a good candidate to be stored and retrieved using a backing memory-powered cache service (like memcache or redis), providing full-blown stateless servers where losing/killing one and bringing up another doesn't impact the user experience. A sketch follows.
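As a small sketch of that last point: session state kept in a backing Redis service instead of on the local node. The key scheme, TTL and class name are made up for illustration.

```ruby
# session_store.rb -- hypothetical helper keeping session state off the app node
require 'json'
require 'redis'

class SessionStore
  TTL = 3600  # seconds; illustrative expiry

  def initialize(redis)
    @redis = redis
  end

  def write(session_id, data)
    # Expiring key in the backing cache, so any app instance can serve the next request.
    @redis.setex("session:#{session_id}", TTL, data.to_json)
  end

  def read(session_id)
    raw = @redis.get("session:#{session_id}")
    raw && JSON.parse(raw)
  end
end
```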
- Problem: Port Binding
Solution:
a. Applications shouldn't depend on any run-time injection in order to be utilized by 'Backing Services', but should instead expose their interaction over a RESTful (or similar) protocol.
b. In a standard setup, the data/information store/provider opens up a socket and the retriever contacts that socket with the required data-transaction protocol. Now this data/information provider can be a 'Backing Service' (like a db service) or could be the primary application providing information over to a 'Backing Service' (like an application server or load balancer).
c. Either way, they get configured with the primary application via automated configuration management, with the URL, port and any other service-specific required detail being provided. A port-binding sketch follows.
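A tiny sketch of an app exporting itself over a port that comes from the config-management-rendered file rather than the environment. The path and key are illustrative, and it assumes webrick is available (stdlib on older Rubies, a gem from Ruby 3.0 onward).

```ruby
# server.rb -- hypothetical minimal service exporting itself over a bound port
require 'webrick'
require 'yaml'

# The port comes from the file rendered by configuration management.
PORT = YAML.load_file('/etc/myapp/settings.yml')['port']

server = WEBrick::HTTPServer.new(Port: PORT)
server.mount_proc '/health' do |_req, res|
  res.body = 'ok'
end
trap('INT') { server.shutdown }
server.start
```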
- Problem: Concurrency
Solution:
a. Here concurrency mainly refers to a horizontally scalable (on-requirement) process model, which is almost equivalent to how the libraries in use manage internal concurrent processes.
b. All application & 'Backing Service' processes should be managed such that the process count of one doesn't affect another, as in, say, access via a load balancer over multiple http processes.
c. All the processes have a process-type and a process-count. There should be a process manager to handle continuous execution of that process at that count. Now it could be a ruby rack server run with multiple threads on the same server, or multiple nodes with nginx serving an indecent amount of users via a load balancer. A unicorn-style sketch follows.
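For the rack-server case, the process-type/process-count idea (and the PID file defended earlier) looks roughly like this unicorn configuration; the counts and paths are illustrative.

```ruby
# config/unicorn.rb -- hypothetical process-manager configuration
worker_processes 4                               # process-count for the 'web' process-type
working_directory '/opt/myapp/current'

listen '/var/run/myapp/unicorn.sock', backlog: 64
pid    '/var/run/myapp/unicorn.pid'              # PID file used by monitoring / config management checks

timeout 30
preload_app true
```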
- Problem: Disposability
Solution:
a. Quick code & configuration deployment. The configuration management solution makes sure the latest (or required stage) application code & configuration changes cleanly & quickly replace the old application, exactly as desired.
b. The application (and 'Backing Services') architecture shall be elastic; spawning up new nodes under a load balancer and destroying them when the process-load is low must be smooth.
c. The application's data transactions & task list should be crash-proof. The data & tasks shall be managed so that those processes are re-scheduled in case of an application crash; a rough sketch follows.
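The crash-proof task list in (c) is often implemented with a "reliable queue" pattern on a backing store. A rough Redis-based sketch is below; the queue names are made up and perform stands in for whatever the real task handler would be.

```ruby
# worker.rb -- hypothetical crash-proof task loop using the Redis "reliable queue" pattern
require 'redis'

redis = Redis.new   # connection details would come from the backing-services config file

# Placeholder handler, only for illustration.
def perform(task)
  puts "working on #{task}"
end

loop do
  # Atomically move a task to a 'processing' list before working on it,
  # so a crashed worker leaves the task recoverable instead of lost.
  task = redis.rpoplpush('tasks:pending', 'tasks:processing')
  unless task
    sleep 1
    next
  end

  begin
    perform(task)
    redis.lrem('tasks:processing', 1, task)       # done: drop it from the processing list
  rescue StandardError
    redis.lpush('tasks:pending', task)            # failure: re-schedule the task
    redis.lrem('tasks:processing', 1, task)
  end
end
```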
- Problem: Dev/Prod Parity
Solution:
a. Keep the dev, staging, ..., and production environments as similar as possible. If not in process count and machine-node count, then necessarily similar in the deployment tasks. You could utilize 'vagrant' in coordination with the configuration management solution to get quick production-like environments on any development box.
b. Code manages both the application and the configuration; any developer (with considerable system-level expertise) could get a hang of the configuration management frameworks and manage them. Using 'Backing Services' as mentioned would allow different service providers per environment.
c. Adopting Continuous Delivery would also ensure no new change in code or configuration breaks the deployment. A Vagrantfile sketch follows.
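The vagrant + configuration management combination from point (a) can be as small as this Vagrantfile; the box name and cookbook are placeholders, and the same cookbooks that converge production converge the dev box.

```ruby
# Vagrantfile -- hypothetical production-like dev box provisioned by the same Chef code
Vagrant.configure('2') do |config|
  config.vm.box = 'centos/7'                        # illustrative base box
  config.vm.network 'forwarded_port', guest: 8080, host: 8080

  # Converge the dev box with the same cookbooks used for production nodes.
  config.vm.provision 'chef_solo' do |chef|
    chef.cookbooks_path = 'cookbooks'
    chef.add_recipe 'myapp'
  end
end
```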
- Problem: Logs
Solution:
a. All staging/production environments will have the application and 'Backing Services' promoting their logs to a central log hub (like syslog, syslog-ng, logstash, etc.) for archival, backed up if required. It can be queried there for analyzing trends in application performance over past time.
b. The central log solution is not configured within the applications; the log solution takes care of what to pick and collect. You can even have a look at log routers (fluentd, logplex, rsyslog).
c. Specific log trends can be set to alert everyone affected whenever they are captured again at the Central Log Services (like graylog2, splunk, etc.). A syslog sketch follows.
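On the application side the contract stays tiny: emit to the local syslog (or stdout) and let the externally managed router/collector do the shipping. The program name below is illustrative; Syslog::Logger ships with Ruby.

```ruby
# logging.rb -- hypothetical logger setup; shipping/aggregation lives outside the app
require 'syslog/logger'

LOG = Syslog::Logger.new('myapp')   # writes to the local syslog, which the log router picks up

LOG.info  'checkout completed'
LOG.error 'payment gateway timeout'
```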
- Problem: Admin Processes
Solution:
a. Application-level admin processes (like db-migrations, specific-case tasks, a debug console, etc.) shall also pick up the same code and configuration as the running instances of the application.
b. The admin task scripts related to the application shall also ship with the application code and evolve with it, like the db-management rake tasks in RubyOnRails applications, run using 'bundler' to pick up the required environment-related library versions.
c. Languages with a REPL shell (like python), or providing one via a separate utility (like 'rails console' for rails), give an advantage in easily debugging an environment-specific issue (which might be arising due to the library versions of that environment, data inconsistency, etc.) by directly playing around with the objects seemingly acting as the source of the flaw. A rake-task sketch follows.
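A one-off admin task shipped with the application might look like this Rails-style rake task; the namespace and model are made up, and the :environment prerequisite is what loads the same code & configuration the running app instances use.

```ruby
# lib/tasks/admin.rake -- hypothetical one-off admin task shipped with the app code
namespace :admin do
  desc 'Backfill the normalized email column (illustrative one-off task)'
  task backfill_emails: :environment do        # :environment loads the app's code & config
    User.where(normalized_email: nil).find_each do |user|
      user.update!(normalized_email: user.email.to_s.downcase.strip)
    end
  end
end
```

Running it as `bundle exec rake admin:backfill_emails` keeps it on the same bundler-pinned library versions as the running application.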