Why?
A few months ago I came across The Twelve-Factor-App, preaching best practices for building and delivering software. Nothing really new, but a good central place with many good practices for people to refer to and compare against. Recently I saw an implementation of it in an environment where the basic concerns were already handled, so the solution implemented was redundant, an extra cost, and to some level low-grade.
What?
Actually, what 12FactorApp is... it is a good set of ideas around a basic set of concerns. The concerns are right; the solutions suggested are situational, and the situation assumed is the default/basic setup. With teams following good DevOps-y practices, the solutions don't turn out to be exactly the same.
So to avoid the confusion for more people, and foremost to save me the pain of explaining myself at different places at different times for my views against 12FactorApp..... here is what the concerns are and what the solutions turn into when following a proper DevOps-y approach.
Where @12FactorApp doesn't suit DevOps-y solutions at all
- ~
- Dependencies
[+] Obsolete: 'If the app needs to shell out to a system tool, that tool should be vendored into the app.'
Changed-to: Make your automated configuration management system handle it (see the sketch after this list).
- Configurations
[+] Obsolete: The twelve-factor app stores config in environment variables, changing between deploys without changing any code.
Changed-to: That is not a solution that scales well or holds up under disaster management. With configuration management handling the node-level deterministic state, the non-developer-box configuration is no longer in the application code. Keeping configuration in files is a more verifiable, cleaner and more broadly available solution, and there is no more noise of different environment-level configurations in the same configuration file (also covered in the sketch after this list).
- ~
- Build, Release, Run
[+] Obsolete: The resulting release contains both the build and config.
Changed-to: Packaging configuration along with the build ties it to one set environment. Any disaster-resistant or scalable architecture would be crippled by it, as every change requires creating new packages. Make your automated configuration management solution intelligent enough to infer the required configuration and deploy the build.
- ~
- ~
- Concurrency
[+] Obsolete: Twelve-factor app processes should never daemonize or write PIDfiles.
Changed-to: PID files help some automated configuration management solutions easily identify the 'service' checks placed on them. There are operating-system-level process managers that also support PID files. Having a PID file eases up lots of other custom monitoring plug-ins too, and is not a bad practice to have.
- ~
- ~
- ~
- ~
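To make the Dependencies and Configurations 'Changed-to' points above concrete, here is a minimal Chef-style sketch (assuming Chef; Puppet or any other tool works equally well). The cookbook, package, template and attribute names are made up for illustration; the point is that the node's configuration management code, not the application or its environment variables, installs the dependency and renders the config file.

```ruby
# cookbooks/myapp/recipes/default.rb -- hypothetical cookbook
# Dependency handled by the platform package manager, not vendored into the app.
package 'imagemagick'

# Node-level configuration rendered to a file, not injected as env vars.
template '/etc/myapp/settings.yml' do
  source 'settings.yml.erb'
  owner  'myapp'
  mode   '0640'
  variables(
    db_host: node['myapp']['db_host'],   # assumed node attributes
    db_port: node['myapp']['db_port']
  )
end
```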
Cumulative Correct Concerns 3C@12FactorApp and DevOps-y Solutions
Overall aiming to achieve an easy-to-setup, clean-to-configure, quick-to-scale and smooth-to-update software development ambiance.
The 12 Concerns+Solutions:
- Problem: Maintaining Application Source Code
Solution:
a. Use a version control mechanism, preferably a distributed VCS like git, with a privately hosted (or at least private-account) code repository.
b. Keep a unique application~to~repository mapping, i.e. a single application's or independent library's source code in a single repository.
c. For different versions of the same application, depend on different commit-stages (not even branches, in general cases) of the same code repository; a Gemfile-style sketch of pinning to a commit follows.
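As a small illustration of depending on a commit-stage rather than a moving branch, a Ruby project's Gemfile can pin an internal library straight to a commit. The repository URL and SHA here are placeholders.

```ruby
# Gemfile -- pin an internal library to an exact commit-stage (hypothetical repo/SHA)
source 'https://rubygems.org'

gem 'billing-core',
    git: 'git@git.example.com:acme/billing-core.git',
    ref: '9f2c1ab'   # exact commit, not a branch that keeps moving
```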
- Problem: Managing Application Dependencies
Solution:
a. Never manually source-compile any dependent library or application. Always depend on the standard package manager for the intended platform (like rpm, pkg, gem, egg, npm). If there is no package available, create one; it's not really difficult. As a standard practice, I'd suggest utilizing something like FPM (maybe even the fpm-cookery gem if you like), which gives you the elasticity to change your platform easily without worrying about re-creating packages. Even creating rpm, gem and other packages is not too much pain compared to the stability it brings to the infrastructure setup.
b. Make your automated configuration management utility ensure all the required dependencies of your application are pre-installed in the correct order, at the correct version, with the correct required configurations.
c. The dependency configuration should be specific enough to ensure the installed & configured dependencies are the ones actually used. So when compiling a binary, use static library linking; if you are loading external libraries, fix the load path. The same configuration management tool can even be run in solo/masterless (no-server) mode. A sketch of version-pinned dependencies follows.
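A minimal sketch of points (b) and (c) in Chef-style Ruby; the package names and versions are illustrative, and the same resources converge fine under chef-solo/masterless runs.

```ruby
# cookbooks/myapp/recipes/dependencies.rb -- hypothetical recipe
# System-level dependency pinned to a known version via the platform package manager.
package 'libxml2' do
  version '2.9.13'     # illustrative version
  action  :install
end

# Language-level dependency installed as a gem, again version-pinned.
gem_package 'nokogiri' do
  version '1.15.4'     # illustrative version
end
```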
- Problem: Configuration in Code, Configuration at all Deploys
Solution:
a. Ideally, configuration details such as a node's IP/name, credentials, etc. shall not be a part of the application's codebase. If such a configuration file is locally available in the developer-box repository, on a non-alert & non-gitignore day it might get committed to your repository.
b. Make your automated configuration management tool generate all these configuration files for a node, based on the node-specific details provided to the configuration management tool, not to the application.
c. The suggested practice of keeping these configurations with the configuration management tool also requires utilizing a proper data-store separate from the normal configuration statements. It could be CSVs, Hiera or a dedicated parameters manifest for a tool like Puppet; for a tool like OpsCode's Chef, there is already the data bag facility available. Wherever available and required, the info should be encrypted with a secret key that is not available in the repository. A data-bag-driven sketch follows.
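For instance, with Chef the node-specific secrets can live in a data bag (ideally an encrypted one) and be rendered into a file at converge time; the bag, item and template names below are made up for illustration.

```ruby
# cookbooks/myapp/recipes/config.rb -- hypothetical recipe
# Credentials come from a data bag, not from the application codebase or env vars.
db = data_bag_item('myapp', 'database')   # for real secrets, prefer encrypted data bags / Chef Vault

template '/etc/myapp/database.yml' do
  source 'database.yml.erb'
  owner  'myapp'
  mode   '0600'
  variables(
    host:     db['host'],
    username: db['username'],
    password: db['password']
  )
end
```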
- Problem: Backing Services
Solution:
a. Whatever other application services are required by the application in order to serve can be included in the 'Backing Services' list. These will be services like data-stores (databases, memory caches and other such supporting services), smtp services, etc.
b. Every piece of information required for these backing services should be a configuration detail like node-name, credentials, port#, etc., maintained as a configuration file loaded via the configuration management tool.
c. If it's a highly complex application broken into several component applications supporting each other, then for any one component application all the other component applications are also 'Backing Services'. A sketch of loading such a file follows.
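On the application side this boils down to reading the file that configuration management rendered and building the client from it. The file path, keys and the Redis client below are illustrative assumptions.

```ruby
# config/backing_services.rb -- hypothetical bootstrap code
require 'yaml'
require 'redis'   # assumes the 'redis' gem is a declared dependency

# This file is written by configuration management, never hand-edited on the node.
SERVICES = YAML.load_file('/etc/myapp/backing_services.yml')

CACHE = Redis.new(
  host: SERVICES['cache']['host'],
  port: SERVICES['cache']['port']
)
```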
- Problem: Build, Release, Run
Solution:
a. The development-stage code gets pushed to the codebase and, after passing the intended tests, is pushed to the Build Stage to prepare deployable (compiled, dependencies included) code. Read up on the Continuous Integration process for a better approach at this.
b. The deployable code is packaged ready to deliver in the Release Stage and pushed into the package-manager repositories. The configuration required for the execution environment is provided to the automated configuration management solution.
c. The Run Stage involves releasing the application package from the package manager and applying the intended system-level configurations via the configuration management solution. A rough Rakefile-style split of these stages follows.
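A rough sketch of keeping the Build and Release stages free of environment configuration, as Rake tasks; the fpm invocation and the repository push command are placeholders for whatever your pipeline actually uses, and the Run stage is deliberately left to configuration management.

```ruby
# Rakefile -- hypothetical build/release tasks; 'run' is handled by config management
APP     = 'myapp'
VERSION = ENV.fetch('BUILD_VERSION', '0.0.1')   # stamped by CI; illustrative

desc 'Build Stage: produce a deployable package with no environment config inside'
task :build do
  sh 'bundle install --deployment'
  sh "fpm -s dir -t rpm -n #{APP} -v #{VERSION} -C ./ --prefix /opt/#{APP} ."
end

desc 'Release Stage: push the package to the internal package repository'
task release: :build do
  sh "scp #{APP}-#{VERSION}-1.x86_64.rpm repo.example.com:/srv/yum/incoming/"  # placeholder push
end
```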
- Problem: Processes
Solution:
a. No persistent data related to the application shall be kept along with it. All user-input & computed information required for the service shall live in the 'Backing Services' available to all instances of the application in that environment, helping the application stay stateless.
b. Get the static assets compiled at the 'Build Stage', served via a CDN and cached at the load-balancing server.
c. Session-state data is a good candidate to be stored and retrieved using a backing memory-powered cache service (like memcache or redis), providing full-blown stateless servers where losing/killing one and bringing up another doesn't impact the user experience. A sketch follows.
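As a small sketch of that last point: session state kept in a backing Redis service instead of on the local node. The key scheme, TTL and class name are made up for illustration.

```ruby
# session_store.rb -- hypothetical helper keeping session state off the app node
require 'json'
require 'redis'

class SessionStore
  TTL = 3600  # seconds; illustrative expiry

  def initialize(redis)
    @redis = redis
  end

  def write(session_id, data)
    # Expiring key in the backing cache, so any app instance can serve the next request.
    @redis.setex("session:#{session_id}", TTL, data.to_json)
  end

  def read(session_id)
    raw = @redis.get("session:#{session_id}")
    raw && JSON.parse(raw)
  end
end
```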
- Problem: Port Binding
Solution:
a. Applications shouldn't depend on any run-time injection in order to be utilized by 'Backing Services', but should instead expose their interaction over a RESTful (or similar) protocol.
b. In a standard setup, the data/information store/provider opens up a socket and the retriever contacts that socket with the required data-transaction protocol. Now this data/information provider can be a 'Backing Service' (like a db service) or could be the primary application providing information over to a 'Backing Service' (like an application server or load balancer).
c. Either way, they get configured with the primary application via automated configuration management, with the URL, port and any other service-specific required detail being provided. A port-binding sketch follows.
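A tiny sketch of an app exporting itself over a port that comes from the config-management-rendered file rather than the environment. The path and key are illustrative, and it assumes webrick is available (stdlib on older Rubies, a gem from Ruby 3.0 onward).

```ruby
# server.rb -- hypothetical minimal service exporting itself over a bound port
require 'webrick'
require 'yaml'

# The port comes from the file rendered by configuration management.
PORT = YAML.load_file('/etc/myapp/settings.yml')['port']

server = WEBrick::HTTPServer.new(Port: PORT)
server.mount_proc '/health' do |_req, res|
  res.body = 'ok'
end
trap('INT') { server.shutdown }
server.start
```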
- Problem: Concurrency
Solution:
a. Here concurrency mainly refers to a horizontally scalable (on-requirement) process model, which is almost equivalent to how the libraries in use manage internal concurrent processes.
b. All application & 'Backing Service' processes should be managed such that the process count of one doesn't affect another, as in, say, access via a load balancer over multiple http processes.
c. All the processes have a process-type and a process-count. There should be a process manager to handle continuous execution of that process at that count. Now it could be a ruby rack server run with multiple threads on the same server, or multiple nodes with nginx serving an indecent amount of users via a load balancer. A unicorn-style sketch follows.
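For the rack-server case, the process-type/process-count idea (and the PID file defended earlier) looks roughly like this unicorn configuration; the counts and paths are illustrative.

```ruby
# config/unicorn.rb -- hypothetical process-manager configuration
worker_processes 4                               # process-count for the 'web' process-type
working_directory '/opt/myapp/current'

listen '/var/run/myapp/unicorn.sock', backlog: 64
pid    '/var/run/myapp/unicorn.pid'              # PID file used by monitoring / config management checks

timeout 30
preload_app true
```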
- Problem: Disposability
Solution:
a. Quick code & configuration deployment. The configuration management solution makes sure the latest (or required stage) application code & configuration changes cleanly & quickly replace the old application, exactly as desired.
b. The application (and 'Backing Services') architecture shall be elastic; spawning up new nodes under a load balancer and destroying them when the process-load is low must be smooth.
c. The application's data transactions & task list should be crash-proof. The data & tasks shall be managed so that those processes are re-scheduled in case of an application crash; a rough sketch follows.
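The crash-proof task list in (c) is often implemented with a "reliable queue" pattern on a backing store. A rough Redis-based sketch is below; the queue names are made up and perform stands in for whatever the real task handler would be.

```ruby
# worker.rb -- hypothetical crash-proof task loop using the Redis "reliable queue" pattern
require 'redis'

redis = Redis.new   # connection details would come from the backing-services config file

# Placeholder handler, only for illustration.
def perform(task)
  puts "working on #{task}"
end

loop do
  # Atomically move a task to a 'processing' list before working on it,
  # so a crashed worker leaves the task recoverable instead of lost.
  task = redis.rpoplpush('tasks:pending', 'tasks:processing')
  unless task
    sleep 1
    next
  end

  begin
    perform(task)
    redis.lrem('tasks:processing', 1, task)       # done: drop it from the processing list
  rescue StandardError
    redis.lpush('tasks:pending', task)            # failure: re-schedule the task
    redis.lrem('tasks:processing', 1, task)
  end
end
```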
- Problem: Dev/Prod Parity
Solution:
a. Keep the dev, staging, ..., and production environments as similar as possible. If not in process count and machine-node count, then necessarily similar in the deployment tasks. You could utilize 'vagrant' in coordination with the configuration management solution to get quick production-like environments on any development box.
b. Code manages both the application and the configuration; any developer (with considerable system-level expertise) could get a hang of the configuration management frameworks and manage them. Using 'Backing Services' as mentioned would allow different service providers per environment.
c. Adopting Continuous Delivery would also ensure no new change in code or configuration breaks the deployment. A Vagrantfile sketch follows.
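The vagrant + configuration management combination from point (a) can be as small as this Vagrantfile; the box name and cookbook are placeholders, and the same cookbooks that converge production converge the dev box.

```ruby
# Vagrantfile -- hypothetical production-like dev box provisioned by the same Chef code
Vagrant.configure('2') do |config|
  config.vm.box = 'centos/7'                        # illustrative base box
  config.vm.network 'forwarded_port', guest: 8080, host: 8080

  # Converge the dev box with the same cookbooks used for production nodes.
  config.vm.provision 'chef_solo' do |chef|
    chef.cookbooks_path = 'cookbooks'
    chef.add_recipe 'myapp'
  end
end
```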
- Problem: Logs
Solution:
a. All staging/production environments will have the application and 'Backing Services' promoting their logs to a central log hub (like syslog, syslog-ng, logstash, etc.) for archival, backed up if required. It can be queried there for analyzing trends in application performance over past time.
b. The central log solution is not configured within the applications; the log solution takes care of what to pick and collect. You can even have a look at log routers (fluentd, logplex, rsyslog).
c. Specific log trends can be set to alert everyone affected whenever they are captured again at the Central Log Services (like graylog2, splunk, etc.). A syslog sketch follows.
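On the application side the contract stays tiny: emit to the local syslog (or stdout) and let the externally managed router/collector do the shipping. The program name below is illustrative; Syslog::Logger ships with Ruby.

```ruby
# logging.rb -- hypothetical logger setup; shipping/aggregation lives outside the app
require 'syslog/logger'

LOG = Syslog::Logger.new('myapp')   # writes to the local syslog, which the log router picks up

LOG.info  'checkout completed'
LOG.error 'payment gateway timeout'
```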
- Problem: Admin Processes
Solution:
a. Application-level admin processes (like db-migrations, specific-case tasks, a debug console, etc.) shall also pick up the same code and configuration as the running instances of the application.
b. The admin task scripts related to the application shall also ship with the application code and evolve with it, like the db-management rake tasks in RubyOnRails applications, run using 'bundler' to pick up the required environment-related library versions.
c. Languages with a REPL shell (like python), or providing one via a separate utility (like 'rails console' for rails), give an advantage in easily debugging an environment-specific issue (which might be arising due to the library versions of that environment, data inconsistency, etc.) by directly playing around with the objects seemingly acting as the source of the flaw. A rake-task sketch follows.
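A one-off admin task shipped with the application might look like this Rails-style rake task; the namespace and model are made up, and the :environment prerequisite is what loads the same code & configuration the running app instances use.

```ruby
# lib/tasks/admin.rake -- hypothetical one-off admin task shipped with the app code
namespace :admin do
  desc 'Backfill the normalized email column (illustrative one-off task)'
  task backfill_emails: :environment do        # :environment loads the app's code & config
    User.where(normalized_email: nil).find_each do |user|
      user.update!(normalized_email: user.email.to_s.downcase.strip)
    end
  end
end
```

Running it as `bundle exec rake admin:backfill_emails` keeps it on the same bundler-pinned library versions as the running application.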