Thursday, December 5, 2013

go get pkg ~ easy made easier for project dependency management

For some time now I've been trying out ways to improve my practices around the awesome capabilities of GoLang. One of those has been having a 'bundle install' (for Ruby folks) or 'pip install -r requirements.txt' (for Python folks) style capability... something that just refers to a text file kept as part of the source code and plainly fetches all the dependency paths mentioned in there (for all others).

This, and some other bits, can be referred to here...
https://github.com/abhishekkr/tux-svc-mux/blob/master/shell_profile/a.golang.sh#L31

It's a shell (bash) function that can be added to your Shell/System Profile files and used...

go_get_pkg(){
  if [ $# -eq 0 ]; then
    if [ -f "$PWD/go-get-pkg.txt" ]; then
      PKG_LISTS="$PWD/go-get-pkg.txt"
    else
      touch "$PWD/go-get-pkg.txt"
      echo "Created GoLang Package empty list $PWD/go-get-pkg.txt"
      echo "Start adding package paths as separate lines." && return 0
    fi
  else
    PKG_LISTS=($@)
  fi

  for pkg_list in "${PKG_LISTS[@]}"; do
    cat $pkg_list | while read pkg_path; do
      echo "fetching golang package: go get ${pkg_path}"
      echo $pkg_path | xargs go get
    done
  done
}
---
What does it do?
If run without any parameters, it checks the current working directory for a file called 'go-get-pkg.txt'. If not found, it creates an empty file by that name (to be done at the initialization of a project). If found, it iterates through each line and passes it directly to "go get ${line}". If run with parameters, each parameter is treated as the path to a file similar to 'go-get-pkg.txt', and the action explained previously is performed on each such file.
Sample 'go-get-pkg.txt' file
-tags zmq_3_x github.com/alecthomas/gozmq
github.com/abhishekkr/levigoNS
github.com/abhishekkr/goshare
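
If it helps to see that flow outside of bash, here's a purely illustrative Python sketch of the same loop (the bash function above is the real helper; the file name and behaviour here just mirror it):
```
#!/usr/bin/env python
"""Illustrative only: mimics the go_get_pkg bash function above."""
import os
import subprocess
import sys


def go_get_pkg(pkg_lists=None):
    default_list = os.path.join(os.getcwd(), "go-get-pkg.txt")
    if not pkg_lists:
        if not os.path.isfile(default_list):
            open(default_list, "a").close()  # create an empty list file
            print("Created GoLang Package empty list %s" % default_list)
            print("Start adding package paths as separate lines.")
            return
        pkg_lists = [default_list]
    for pkg_list in pkg_lists:
        with open(pkg_list) as f:
            for pkg_path in f:
                pkg_path = pkg_path.strip()
                if not pkg_path:
                    continue
                print("fetching golang package: go get %s" % pkg_path)
                # splits "-tags zmq_3_x <pkg>" style lines into args, like xargs
                subprocess.call(["go", "get"] + pkg_path.split())


if __name__ == "__main__":
    go_get_pkg(sys.argv[1:])
```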
---

Friday, November 15, 2013

systemd enabled lightweight NameSpace Containers ~ QuickStart Guide

systemd has (for some time now) provided Linux users with a powerful chroot alternative for creating quick and lightweight system containers, using the power of cgroups and socket activation.

There is a lot more to "systemd" than this, but that's for some other post. Until then, you can explore it yourself, starting here.

There is a utility called "systemd-nspawn", provided by systemd, which acts as a container manager. This is what can be used to easily spawn a new Linux container and manage it. It has been updated with (systemd's amazing trademark feature) socket activation.

This enables any container to have the parent/host's systemd instance listen on service ports on its behalf. Only when those service ports receive a connection do these containers get spawned to serve it. Voila: resource utilization and scalability.
More on this can be read in detail at: http://0pointer.de/blog/projects/socket-activated-containers.html
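
To make the socket-activation idea a bit more concrete: when systemd owns the listening socket, it hands it to the activated service as file descriptor 3. Below is a minimal, hypothetical Python sketch of a service adopting such a socket; it shows the generic mechanism only, not the actual nspawn wiring described in the linked post.
```
import os
import socket

SD_LISTEN_FDS_START = 3  # first fd systemd passes to an activated service

def get_activated_socket():
    # systemd sets LISTEN_PID/LISTEN_FDS for the process it activated
    if (os.environ.get("LISTEN_PID") == str(os.getpid())
            and int(os.environ.get("LISTEN_FDS", "0")) >= 1):
        return socket.fromfd(SD_LISTEN_FDS_START, socket.AF_INET, socket.SOCK_STREAM)
    # fallback: no socket activation, bind and listen ourselves
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("0.0.0.0", 8080))
    sock.listen(5)
    return sock

sock = get_activated_socket()
conn, addr = sock.accept()  # real work starts only once a connection arrives
conn.sendall(b"hello from a socket-activated service\n")
conn.close()
```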

Here we'll see a way to quickly start using it via some custom-made commands.
All the script commands used here can be referred to at https://github.com/abhishekkr/tux-svc-mux/blob/master/shell_profile/a.virt.sh as well.

Just download and source the linked script in your shell, and the commands shown here will be available...
And yes, your system also needs to be running systemd already.

Currently this just lets you create Arch Linux containers; it will soon create different containers as the script matures.

In case you don't have any created container already, or wanna create a new one...
$ nspawn-arch
To list names of all created containers...
$ nspawn-ls
To stop a running container...
$ nspawn-stop
To start an already created container...
$ nspawn-start

---

Friday, July 26, 2013

Puppet ~ a beginners concept guide (Part 4) ~ Where is my Data

parts of the "Puppet ~ Beginner's Concept Guide" to read before ~
Part#1: intro to puppet, Part#2: intro to modules, and Part#3: modules much more.

Puppet
beginners concept guide (Part 4)


Where is my Data?

When I started my Puppet-ry, the examples I used to see had all configuration data buried inside the DSL code of the manifests, and people were trying to use inheritance to push down data. Then I got to see a design pattern in Puppet manifests keeping a separate parameters manifest for configuration variables. Then came along external data lookup via CSV files as a Puppet function. Then, with enhancements in Puppet and other modules, came along more.

Below are a few ways, ranging from usable to fine, of utilizing separate data sources within your manifests.


Here, we will see usage styles of data for Puppet Manifests, Extlookup CSV, Hiera, Plug-in Facts and PuppetDB.

params-manifest:


It is the most basic way of separating out data from your functionality code, and the preferred way for data whose value-set will keep growing: it keeps the data separate from the code from the start. Once the requirement reaches a level where values vary based on environment/domain/fqdn/operatingsystem/[any-facter], the data can be extracted out to any of the preferred ways given below and just looked up here. That avoids changing the main (sub)module code.
[ Gist-Set Externalize into Params Manifest: https://gist.github.com/3683955 ]
Say you are providing an httpd::git sub-module for the httpd module, placing a template-generated config file using data placed in params...
```

File: httpd/manifests/git.pp
it includes the params submodule to access the data

File: httpd/templates/mynode.conf.erb

File: httpd/manifests/params.pp
it actually is just another submodule to only handle data

Use it: run_puppet.sh

```
_

extlookup-csv:


Use this if you think your data would suit a (key,value) CSV format, extracted into data files. Puppet needs to be told the location of the CSV files to be looked up for a key, and it fetches the value assigned to that key in those files.
The names given to these CSV files matter to Puppet while looking up values from all the present CSV files. Puppet needs to be given a hierarchy order for these file names in which to look for the key, and the order can involve variable names.

For example, say you have CSV files named after HOSTNAME and ENVIRONMENT plus a common file, with the hierarchy specified in that order too. Puppet will first look for the queried key in the HOSTNAME CSV; if not found, it looks it up in the ENVIRONMENT-named file, and after not finding it there it goes looking into the common file. If it doesn't find the key in any of those files, it returns the default value, if one was specified via 'extlookup(key, default_value)'. If there is no default value either, Puppet raises an exception for having no value to return.
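
For intuition only, the lookup order described above boils down to something like this rough Python sketch (the data directory and file names are assumptions for illustration; the real work is done by Puppet's extlookup):
```
import csv
import os

def extlookup(key, default=None, datadir="/etc/puppet/extdata",
              precedence=("host_node1", "env_testnode", "common")):
    # walk the CSV files in precedence order; the first match wins
    for name in precedence:
        path = os.path.join(datadir, "%s.csv" % name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            for row in csv.reader(f):
                if row and row[0] == key:
                    return row[1] if len(row) > 1 else ""
    if default is not None:
        return default
    raise KeyError("no value found for %s in any CSV" % key)

# e.g. extlookup("httpd_git_url", "git://default.example.com/repo.git")
```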

[ Gist-Set Externalize into Hierarchical CSV Files: https://gist.github.com/3684010 ]

It's the same example as for params, with a flavor of ExtData. Here you'll notice a 'common.csv' external data file providing a default set of values, and an 'env_testnode.csv' file overriding the only value that needs changing. Now, since in the 'site.pp' file the precedence of the 'env_%{environment}' file is higher than 'common', 'httpd::git' will look up all values first from 'env_testnode.csv' and, if not found there, go to 'common.csv'. Hence it ends up overriding the 'httpd_git_url' value from 'env_testnode.csv'.

The extlookup() method used here is available as a Puppet parser function; you can read more in Part#5 (Custom Puppet Plug-Ins) on how to create your own functions.
_


hiera:


Hiera is a pluggable hierarchical data store for Puppet. It was started to provide better external data storage support than the extlookup feature, with data formats other than CSV too.

This brings in the essence of an ENC for data retrieval, without having to write one.

Data look-up happens along a hierarchy provided by the configuration, with its own scope-resolution mechanism.

It enables Puppet to fetch data from varied external data sources using its different backends (like local files, Redis, the http protocol), which can be added to if needed.
The 'http' backend in turn enables any service providing data over HTTP (CouchDB, Riak, a web-app or so) to act as the data store.

File "hiera.yaml" from Gist below is an example of hiera configuration to be placed in puppet's configuration directory. The highlights of this configuration are ":backends:", backend source and ":hierarchy:". Multiple backend can be used at same time, their order of listing mark their order of look-up. Hierarchy configures the order for data look-up by scope.

Then, depending on what backends you have added, you need to add their source/config to look up data at.
Here we can see configuration for using local "yaml" and "json" files, for looking up data from a Redis server (the example will set up datasets for Redis usage) with authentication in place, and for looking up data from any "http" service with the hierarchy as the ":paths:" value.
You can even use GPG-protected data as a backend, but that is a bit messy to use.
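
Conceptually, what Hiera does with such a configuration is consult the backends in their listed order and walk the hierarchy within each one until the key is found. A rough Python sketch of that idea follows (paths and hierarchy levels are assumptions; Hiera itself handles this, plus the Redis/http/GPG backends):
```
import json
import os
import yaml  # PyYAML, third-party

# mirrors :hierarchy: and per-backend :datadir: settings (paths are made up)
HIERARCHY = ["env_%(environment)s", "common"]
BACKENDS = [("yaml", yaml.safe_load, "/etc/puppet/hieradata"),
            ("json", json.load, "/etc/puppet/hieradata")]

def hiera_lookup(key, scope):
    # backends are consulted in their listed order; within a backend,
    # the hierarchy is walked top-down and the first hit wins
    for ext, loader, datadir in BACKENDS:
        for level in HIERARCHY:
            path = os.path.join(datadir, "%s.%s" % (level % scope, ext))
            if not os.path.isfile(path):
                continue
            with open(path) as f:
                data = loader(f) or {}
            if key in data:
                return data[key]
    return None

# e.g. hiera_lookup("httpd_git_url", {"environment": "testnode"})
```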

Placing ".yaml" and ".json" from Gist at intended provider location.
The running "use_hiera.sh" would make you show the magic from this example on hiera.

[Gist-Set Using Hiera with Masterless Puppet set-up: https://gist.github.com/abhishekkr/6133012 ]
_

plugin-facts:


Every system has its own set of information, facts, made available to Puppet by default via Facter (http://projects.puppetlabs.com/projects/facter). Puppet also enables DevOps people to set custom facts to be used in modules.
The power of these computed facts is that they can use the full power of Ruby to pull local/remote, plain/encrypted data over REST/database/API/any available channel.
These require the power of Puppet custom plug-ins (http://docs.puppetlabs.com/guides/custom_facts.html). The Ruby file doing this goes at 'MODULE/lib/facter' and gets loaded with 'pluginsync=true' in action.
The way to set a fact in such Ruby code is just...
my_value = 'all ruby code to compute it'
Facter.add(:mykey) do
  setcode do
    my_value
  end
end
...with all the rest of the code there needed to compute the value to be set, or even the key-set.

[Gist-Set Externalize Data receiving as Facter: https://gist.github.com/3684968 ]

The same 'httpd::git' example is revamped in the Gist above to use a custom fact.
There is also another way to provide a fact to the Puppet catalog: set an environment variable with the capitalized fact name prefixed by 'FACTER_', with the value it's supposed to have.
For E.g. # FACTER_MYKEY=$my_value puppet apply --modulepath=$MODULEPATH -e "include httpd::git"
_

puppetdb:


It's a beautiful addition to the Puppet component set. Something that had been missing for long, and possibly the thing because of which I delayed this post by half a year.
It enables the 'storeconfigs' power without a Master, provides the support of a trusted DB for infrastructure-related data needs, and is thus the best suited of all.

To set up 'puppetdb' on a node, PuppetLabs has nice documentation to follow.
To set up a decent example for master-less Puppet mode, follow the given steps.

Place the two '.conf' files and the one '.yaml' file in Puppet's configuration directory.
The shell script will prepare the node with the PuppetDB service for the masterless Puppet usage scenario.

Setting 'storeconfigs' to 'puppetdb' in the Puppet config enables saving of exported resources to it. The 'reports' setting there will push the 'puppet apply' reports to the database.
The PuppetDB config makes Puppet aware of the host and port to connect to the database at.
The 'facts' setting in routes.yaml enables PuppetDB to be used in a masterless mode.


[Gist-Set Using PuppetDB with Masterless Puppet set-up: https://gist.github.com/abhishekkr/6114760 ]

Now, running anything, say...
puppet apply -e 'package{"vim": }'
...and on top of that, 'exported resources' will work like a charm using PuppetDB.
The accompanying puppet.conf will make reports get dumped to PuppetDB as well.

_


There's a fine article on the same by PuppetLabs...

Friday, May 31, 2013

Testing Chaos with Automated Configuration Management solutions


No noise making.

But let's be real: think of the count of community-contributed (or mysterious closed-and-sold 3rd-party) services, frameworks, libraries and modules put to use for managing your ultra-cool self-healing, self-reliant, scalable infrastructure requirements. Now, with so many cogs collaborating in the infra-machine, a check on their collaboration seems rather mandatory, like any other integration test for your in-house managed service.
After all, that was the key idea behind having automated configuration management itself.

Now, utilities like Puppet/Chef have been out there, accepted and used by dev & ops folks, for quite some time.
But the issue with the amateur testing styles seen initially is that they evolved from the non-matching frame of 'Product'-oriented unit/integration/performance testing. 'Product'-oriented testing focuses more on what happens inside the coded logic and less on how the user gets affected by the product.
Most of the initial tools released for testing logic developed in Chef/Puppet were RSpec/Cucumber-inspired Product-testing pieces. Now, for the major part of installing a package, restarting a service or pushing artifacts, these tests are almost not required, as the main functionality of, say, installing package_abc is already tested inside the framework being used.
So coding to "ask" to install package_abc and then testing whether it has been asked seems futile.

That's the shift. The logic developed for infrastructure acts as a glue to all the other applications, created in-house and 3rd-party. In infrastructure feature development there is more to test in the effect it has on its users (software/hardware) and less in internal changes (dependencies and dynamic content). Now, the stuff in parentheses here means a lot more than it seems... let's get into the detail of it.

Real usability of Testing is based on keeping sanctity of WHAT needs to be tested WHERE.


Software/hardware services that collaborate with the help of automated infrastructure logic need the major focus of testing. These services can vary from the
  • in-house 'Product', that is the central component you are developing
  • 3rd Party services it collaborates with,
  • external services it utilizes for what it doesn't host,
  • operating system that it supports and Ops-knows what not.

Internal changes mainly revolve around
  • Resources/Dependencies getting called in right order and grouped for specific state.
  • It also relates to correct generation/purging of dynamic content; that content can itself range across
    • non-corrupt configuration files generated of a template
    • format of sent configuration data from one Infra-component to another for reflected changes
    • dynamically creating/destroying service instances in case of auto-scalable infrastructure


One can decide HOW on the basis of ease and efficiency.


Unit tests work for the major portion of the 'Internal Changes' mentioned before; libraries like chefspec, rspec-chef and rspec-puppet are good enough here. They can very well test dependency order and grouping management, as well as the effect of different data on non-corrupt configuration generation from templates.


Integration tests in this perspective are of a bit more interesting and evolutionary nature. Here we have to ensure the "glue" functionality we talked about for software/hardware services is working properly. These will confirm that every type of required machine role/state can be achieved flawlessly; call them 'state generation tests'. They also need to confirm the 'reflected changes test' across infra-components, as mentioned under internal changes.
Now, utilities like test-kitchen in collaboration with vagrant, docker, etc. help place them in your Continuous Integration pipeline. This even helps in testing the same service across multiple Linux distros, if that's the plan to support.
The 'ServerSpec' library is also a nifty little piece for writing quick final-state check scripts.
Then the final set of integration testing is implemented in the form of monitoring on all your managed/affecting infrastructure components. This is the final and ever-running integration test.
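
As a toy illustration of the kind of final-state checks tools like ServerSpec script for you, such a test boils down to assertions like these (the package name and port below are hypothetical):
```
import socket
import subprocess

def package_installed(name):
    # Debian-ish check; swap in "rpm -q" on RedHat-ish systems
    try:
        return subprocess.call(["dpkg", "-s", name],
                               stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL) == 0
    except OSError:
        return False

def port_listening(port, host="127.0.0.1"):
    try:
        socket.create_connection((host, port), timeout=2).close()
        return True
    except OSError:
        return False

assert package_installed("nginx"), "nginx package missing"
assert port_listening(80), "nothing listening on :80"
print("state checks passed")
```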


Performance tests: yes, even they are required here. Tools like Chaos Monkey enable you to verify that your infra really is self-healing and auto-scalable. There should also be load tests observing dynamic container counts and behavior, if auto-scalability is a desired functionality too.

Wednesday, April 24, 2013

Beginner's Guide to OpenStack : Basics of Nova [Part 2]

parts of Beginner's Guide to OpenStack to read before this ~

[Part.2 Basics of Nova] Beginner's Guide to OpenStack

# Nova?
It's the main fabric controller for the IaaS-providing cloud computing service by OpenStack. It took its first baby steps at NASA, was contributed to open source, and became the most important component of OpenStack.
It's built of multiple components performing different tasks, turning an end user's API request into a virtual machine service. All these components run in a non-blocking, message-based architecture and can be run from the same or different locations, needing only access to the same message queue service.

---

# Components?

Nova stores states of virtual machines in a central database. It's optimal for small deployments. Nova is moving towards multiple data stores with aggregation for high scale requirements.








  • Nova API: supports the OpenStack Compute API, Amazon's EC2 API and a powerful Admin API (for privileged users). It's used to initiate most orchestration activities and policies (like quotas). It is communicated with over HTTP, converts the requests to commands, and contacts the other components via the Message Broker (and HTTP for the ObjectStore). It's a WSGI application which routes and authenticates requests.
  • Nova Compute: a worker daemon taking orders from its Message Broker and performing virtual machine create/delete tasks using the hypervisor's API. It also updates the status of its tasks in the Database.
  • Nova Scheduler: decides which Nova Compute host to allot for a virtual machine request.
  • Network Manager: a worker daemon picking network-related tasks from its Message Broker and performing them. With the Grizzly release, OpenStack's Quantum can now be opted for instead of nova-network. Tasks like maintaining IP forwarding, network bridges and VLANs get covered here.
  • Volume Manager: handles attach/detach of persistent block storage volumes to virtual machines (similar to Amazon's EBS). This functionality has been extracted into OpenStack's Cinder. It's an iSCSI solution utilizing the Logical Volume Manager. The Network Manager doesn't interfere in Cinder's tasks, but needs to be set up for Cinder to be used.
  • Authorization Manager: interfaces authorized API usage for Users, Projects and Roles. It communicates with OpenStack's Keystone for details.
  • WebUI: OpenStack's Horizon communicates with the Nova API for dashboard interfacing.
  • Message Broker: all components of Nova communicate with each other in a non-blocking, callback-oriented manner using the AMQP protocol, well supported by RabbitMQ and Apache Qpid. There is also emerging support for ZeroMQ integration as the message queue. It's like a central task list shared and updated by all Nova components.
  • ObjectStore: a simple file-based storage (like Amazon's S3) for images. This can be replaced with OpenStack's Glance.
  • Database: used to gather build times and run states of virtual machines. It has details around available instance types, available networks (if nova-network) and projects. Any database supported by SQLAlchemy can be used. It's the central information hub for all Nova components.


---

# API Style
The interface is mostly RESTful. The Routes package (a Python re-implementation of the Rails routes system) maps URIs to action methods on controller classes.
Each HTTP request to Compute requires specific authentication credentials. Multiple authentication schemes can be allowed for a Compute node; the provider determines the one to be used.
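
A tiny, generic example of what such a Routes mapping looks like (using the routes package directly; this is not Nova's actual route table):
```
from routes import Mapper

mapper = Mapper()
# map a URI template to a controller/action pair, Rails style
mapper.connect("/servers/{id}", controller="servers", action="show")

match = mapper.match("/servers/42")
print(match)  # -> {'controller': 'servers', 'action': 'show', 'id': '42'}
```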

---

# Threading Model
Nova uses a green thread implementation by design, via the eventlet and greenlet libraries. This results in a single process thread for the O.S., with its blocking I/O issues. Though a single thread reduces race conditions to a great extent, to eliminate them further in suspicious scenarios use the decorator @lockutils.synchronized('lock_name') over the methods to be protected.
If any action is long-running, its methods should trigger an eventlet context switch at the desired points in their processing. Placing something like the following code piece will switch context to waiting threads, if any, and will continue on the current thread without any delay if there is no other thread waiting.
from eventlet import greenthread
greenthread.sleep(0)
MySQL queries use drivers that block the main process thread. In the Diablo release a thread pool was implemented for this, but it was removed because its advantages didn't outweigh its bugs.
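
To see that cooperative yield in action, here is a small standalone sketch using just eventlet (nothing Nova-specific about it):
```
import eventlet
from eventlet import greenthread

def long_running(name):
    for step in range(3):
        print("%s doing step %d" % (name, step))
        greenthread.sleep(0)   # yield to any other waiting green threads

workers = [eventlet.spawn(long_running, n) for n in ("worker-a", "worker-b")]
for w in workers:
    w.wait()
```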

---

# Filtering Scheduler
In short, it's the mechanism used by 'nova-scheduler' to choose the worthy nova-compute host for a newly requested virtual machine to be spawned upon. It prepares a dictionary of unfiltered hosts, filters them, and weighs their cost for creating the requested virtual machine(s); then it chooses the least costly host.
Hosts are weighted based on the configuration options of the requested virtual machines.
It's better practice for a customer to ask for the whole count of required instances together, as each request computes the weights afresh.
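
A stripped-down sketch of that filter-then-weigh idea (the host data and cost function below are invented for illustration; the real scheduler has pluggable filters and cost functions):
```
# each host advertises capabilities; the request carries the flavor's needs
hosts = {
    "compute-1": {"free_ram_mb": 4096, "running_vms": 3},
    "compute-2": {"free_ram_mb": 6144, "running_vms": 7},
    "compute-3": {"free_ram_mb": 1024, "running_vms": 1},
}

def filter_hosts(hosts, ram_needed):
    return {h: caps for h, caps in hosts.items() if caps["free_ram_mb"] >= ram_needed}

def cost(caps):
    # cheaper == more free RAM and fewer VMs already running
    return caps["running_vms"] - caps["free_ram_mb"] / 1024.0

def schedule(hosts, ram_needed):
    candidates = filter_hosts(hosts, ram_needed)
    return min(candidates, key=lambda h: cost(candidates[h]))

print(schedule(hosts, ram_needed=2048))  # picks the least costly host
```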

---

# Message Queue Usage
Nova components use RPC to communicate with each other via the Message Broker, using publish/subscribe. Nova implements rpc.call (request/response; the API acts as consumer) and rpc.cast (one-way; the API acts as publisher).
The Nova API and Scheduler use the message queue as invokers, whereas Network and Compute act as workers. The invoker pattern sends messages via rpc.call or rpc.cast; the worker pattern receives messages from the queue and responds to rpc.call with an appropriate response.
Nova uses the Kombu library when interfacing with RabbitMQ.
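
This is not Nova's rpc layer, but the broker round trip underneath it looks roughly like this with Kombu (assuming a RabbitMQ broker on localhost):
```
from kombu import Connection

with Connection("amqp://guest:guest@localhost:5672//") as conn:
    queue = conn.SimpleQueue("demo_tasks")
    queue.put({"action": "spawn_vm", "flavor": "m1.tiny"})  # cast-ish: fire and forget
    message = queue.get(block=True, timeout=5)
    print(message.payload)
    message.ack()
    queue.close()
```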

---

# Hooks
Hooks enable developers to extend Nova's capabilities by adding named hooks to Nova code as decorators that lazily load plug-in code matching the hook name (using setuptools entry points as the extension mechanism). The hook's class definition should have pre and post methods.
Don't use hooks when stability is a factor; internal APIs may change.
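
Conceptually, the pattern looks like the following generic sketch of a pre/post hook class wired in by a decorator; Nova's actual hook machinery instead discovers hook classes lazily via setuptools entry points.
```
import functools

_hooks = {}  # hook_name -> list of hook objects

def add_hook(name, hook):
    _hooks.setdefault(name, []).append(hook)

def hooked(name):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for h in _hooks.get(name, []):
                h.pre(*args, **kwargs)      # every hook's pre runs before the call
            result = fn(*args, **kwargs)
            for h in _hooks.get(name, []):
                h.post(result, *args, **kwargs)  # and post runs after it
            return result
        return wrapper
    return decorator

class AuditHook(object):
    def pre(self, *args, **kwargs):
        print("about to build instance", args, kwargs)
    def post(self, result, *args, **kwargs):
        print("built instance ->", result)

add_hook("build_instance", AuditHook())

@hooked("build_instance")
def build_instance(name):
    return "instance-%s" % name

build_instance("demo")
```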

---

# Dev Bootstrap
To get started with contributing... read this (OpenStack Wiki on HowToContribute) in detail.

To get rolling with the Nova wheels, the system will need to have libvirt and one of the hypervisors (Xen/KVM preferred for Linux hosts) present.
$ git clone git://github.com/openstack/nova.git
$ cd nova
$ python ./tools/install_venv.py
This will prepare your copy of the Nova codebase with the required virtualenv; now any command you wanna run in the context of this codebase goes via
$ ./tools/with_venv.sh

---

# Run My Tests
To run the nose tests and the pep8 checker, once you are done with the virtualenv setup (or it will be initiated first here)... inside the 'nova' codebase:
$ ./run_tests.sh

---

# Terminology

  • Server: Virtual machines created inside the Compute system; require Flavor & Image details.
  • Flavor: Represents unique hardware configurations with disk space, memory and CPU time priority
  • Image: System Image File used to create/rebuild a Server
  • Reboot: Soft Server Reboot sends a graceful shutdown signal. Hard Reboot does power reset.
  • Rebuild: Removes all data on Server and replaces it with specified image. Server's IP Address and ID remains same.
  • Resize: Converts an existing server to a different flavor. All resizes need to be explicitly confirmed; only then is the original server removed. After a 24-hour delay, there is an automated confirmation.


Wednesday, April 17, 2013

Beginner's Guide to OpenStack : Basics [Part 1]

# OpenStack?

OpenStack (http://www.openstack.org/) is an open-source cloud computing platform that can be used to build up public and private clouds. It is a weaving of various technological components to provide the capability to build a cloud service supporting any use-case and scale.

Once upon a time, RackSpace came into cloud services. In some parallel beautiful world, a few Pythonistas at NASA started building their own Nova cloud compute to handle their own instances. RackSpace bought SliceHost, which worked 'somewhat' fine. RackSpace came along with their Swift object storage service and weaved in Nova with a few more components around it. Many other companies like HP, RedHat, Canonical etc. came along to contribute to and benefit from the open-source cloud.

It's all as Open as it can be. Open Source. Open Design. Open Development. Open Community.

---

# Quick Hands-On

DevStack (http://devstack.org/) gives you the easiest and fastest way to get all OpenStack components installed, configured and started on any supported O.S. platform.
You can trial-run your app code in an OpenStack environment at TryStack (http://trystack.org/).
RedHat RDO (http://openstack.redhat.com/Main_Page) is also coming in soon, making it super easy to get OpenStack running on RHEL-based distros.
---

# Components?


OpenStack Cloud Platform constitutes of mainly following components:
  • Compute: Nova
    Brings up and maintains operations related to virtual server as per requirement.
    ~like aws ec2
  • Storage: Swift
    Allows you to store, retrieve & remove objects (files).
    ~like aws s3
  • Image Registry/Delivery: Glance
    Processes metadata for disk images, manages read/write/delete for actual image files using  'Swift' or similar scalable file storage service.
    ~like aws ami
  • Network Management: Quantum/Melange
    Provides all the networking mechanisms required by any instance or environment, as a service. Handles network interface cards' plug/un-plug actions and IP allocation procedures, along with the capability enhancements possible for virtual switches.
  • Block Storage: Cinder
    Enables to attach volumes for persistent usage. Detach them, snapshot them.
    ~like aws ebs
  • WebUI: Horizon
    Provides usability improvement for users or projects for managing compute nodes, object storage resources, quota usages and more in a detailed web-app way.
    ~like aws web dashboard
  • Authentication: Keystone
    Identity management system, providing apis to all other OpenStack components to query for authorization.
  • Billing Service: Ceilometer (preview)
    Analyzes quantity, cost-priority and hence billing of all the tasks performed at cloud.
  • Cloud Template: Heat (under construction)
    Build your entire desired cloud setup providing OpenStack a Template for it.
    ~like aws cloudformation
  • OpenStack Common: Oslo (tenure code)
    Supposed to contain all common libraries of shared infrastructure code in OpenStack.

Hypervisors are software/firmware/hardware that enable creating, running and monitoring virtual machines. OpenStack Compute supports multiple hypervisors like KVM, LXC, QEMU, XEN, VMWARE & more.

Message Queue Service is used by most of the OpenStack Compute services to communicate with each other using AMQP (Advanced Message Queue Protocol) supporting async calls and callbacks.

---

# Weaving of Components


asciigram: openstack ~ evolution mode, how varied components are connected

---

More Links:


Sunday, February 3, 2013

MessQ : message queue for quickly trying any idea

For some time, while trying out some message-queue-based set-ups in infrastructure... I needed a quick-to-set-up, localhost-friendly, network-available message queue service to try out ideas.
So here is Mess(age)Q(ueue). Something quickly thrown together; I'll work later to get it more performance-oriented. It's good to go for smaller projects.

@GitHub:       https://github.com/abhishekkr/messQ
@RubyGems: https://rubygems.org/gems/messQ
_________________________

A Quick Tryout

[+] Install
$ gem install messQ --no-ri --no-rdoc
[+] Start Server (starts at 0.0.0.0 on port#5566)
$ messQ --start
[+] Enqueue user-id & home value to the Queue
$ messQ -enq $USER
$ messQ --enqueue $HOME
[+] Dequeue 2 values from Queue
$ messQ -deq
$ messQ --dequeue
[+] Stop Server
$ messQ --stop
_________________________

Via Code

[+] Install
$ gem install messQ --no-ri --no-rdoc
or add the following to your Gemfile
gem 'messQ'
require 'messQ'

[+] Start Server
MessQ.host = '127.0.0.1' # default is 0.0.0.0
MessQ.port = 8888 # default is 5566
MessQ.messQ_server

[+] Enqueue user-id & home value to the Queue
MessQ.host = '127.0.0.1' # default is 0.0.0.0
MessQ.port = 8888 # default is 5566
MessQ::Agent.enqueue(ENV['USER'])
MessQ::Agent.enqueue(ENV['HOME'])

[+] Dequeue 2 values from Queue
MessQ.host = '127.0.0.1' # default is 0.0.0.0
MessQ.port = 8888 # default is 5566
puts MessQ::Agent.dequeue
puts MessQ::Agent.dequeue

[+] Stop Server
MessQ::Server.stop