Thursday, 2 March 2017

Regarding Dependency Injection

A colleague of mine recently posted a link on an internal Slack channel to this blog article: How not to do dependency injection - the static or singleton container.
Reading the article, I became aware of something that a friend of mine had written about: the absolute vitriol that many coding blogs and otherwise good books seem to have for Service Location. For example, I'm reading Adaptive Code via C# at the moment, and in it Gary McLean Hall has a small meltdown about it, which is a shame because it's a pretty good book for people learning about refactoring and the use of basic patterns. However, it also commits the cardinal sin of saying that the 'D' in SOLID stands for Dependency Injection rather than Dependency Inversion, and that betrays the architectural bias of the author.
The upshot being:
  • Anyone who tells you service location is an anti-pattern isn’t fully aware of the problems that it is supposed to be an answer for.
  • Dependency Injection moves away from the original point - separating the configuration of services from their use.
  • Dependency Injection vastly increases the surface area of an object by abusing the constructor, which is not bound by any interface contract. Dependencies end up not being fully abstracted, and constructors become the medium by which relationships between objects are communicated - this is not a virtue (see the sketch after this list).
  • By making assumptions about state, Dependency Injection turns architectural uses-a relationships into has-a relationships - which blocks the use of singletons and a bunch of other architectural patterns where they would be appropriate.
  • Further, it means that although relationships between data entities are modelled, behavioural relationships between objects are not, because everything becomes one great big dependency ball.
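To make the constructor point concrete, here is a tiny Go sketch - my own illustration, not taken from any of the articles above - contrasting constructor injection with a deliberately naive service locator whose configuration is separated from its use:

package main

import "fmt"

// Mailer is the dependency both approaches need.
type Mailer interface {
    Send(to, msg string)
}

type smtpMailer struct{}

func (smtpMailer) Send(to, msg string) { fmt.Println("sending to", to+":", msg) }

// --- Constructor injection: the constructor advertises every collaborator. ---

type Signup struct {
    mailer Mailer // has-a relationship, fixed at construction time
}

func NewSignup(m Mailer) *Signup { return &Signup{mailer: m} }

func (s *Signup) Register(email string) { s.mailer.Send(email, "welcome") }

// --- Service location: configuration of services is separated from their use. ---

var services = map[string]interface{}{} // a deliberately naive locator

func Configure() { services["mailer"] = smtpMailer{} }

func RegisterUser(email string) {
    // uses-a relationship, resolved at the point of use
    services["mailer"].(Mailer).Send(email, "welcome")
}

func main() {
    // Injection: the caller wires the object graph through constructors.
    NewSignup(smtpMailer{}).Register("a@example.com")

    // Location: wiring happens once, elsewhere; call sites just use the service.
    Configure()
    RegisterUser("b@example.com")
}
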
Another in-depth article Guy wrote about Inversion of Control can be found here:

Friday, 27 January 2017

Microscaling follow-up

Hello. Long time no follow-up blog about this.
My original post detailed the problem that I was seeing with using Amazon's EC2 Container Service to manage a cluster of identical containers that needed traffic routed to them individually from a load balancer. At the time, this couldn't be done without extra tooling around the system, and I hinted that I had come up with a solution that would allow the process of microscaling to be achieved on EC2.
I had a part two of that article sitting in draft for well over a year; various things happened - we bought our first house, my partner became pregnant and our daughter was born back in October, and part one even got cited by force12.io on their microscaling.org page in their collection of papers - so, you will perhaps forgive me for being a little distracted and not getting around to finishing that bit off.
Meantime, things moved along in the world of ECS.
The Application Load Balancer feature arrived and, after a couple of months of me studiously ignoring its existence, I can now say that it does exactly what I needed Warden to do in the first place: an ECS Service is no longer limited to static port routing, and a new Service instance with a dynamically allocated external port on the container host can be routed to automatically by the ALB. (Previously, adding a new Service instance meant adding a new container host to the cluster, because the "Classic" Elastic Load Balancer could not route automatically to the dynamic ports.)
However, I will document what I’d come up with that filled the gap.

Warden

Warden (https://github.com/fractos/warden) was my hack for this, which enabled me to run multiple identical containers on an ECS cluster and have them load-balanced. It is the Go version of a set of Perl scripts that I had created to manage the ECS cluster of image servers.
It was impossible to manage the level (number of instances) of an ECS Service if its definition required a static container port and you wanted to re-use the same host for as many container instances as possible. So, instead of defining an ECS Service, Warden's Manager would manage the number of Tasks that were programmatically being run across the cluster, increasing or decreasing them according to a metric (e.g. the number of Elastic Beanstalk instances in an upstream system that calls the image servers I'm working on). ECS would then schedule the new Tasks across the cluster, and the Registrar process would detect a change in the running container instances on a host, updating the host's local Nginx routing setup to match and enrolling or removing the host from an associated Elastic Load Balancer if required.
Each container host in the cluster would have two extra containers running:
  • Redx (https://github.com/CBarraford/docker-redx)
    Which is a modified version of Nginx that takes its routing configuration from a live Redis database.
  • Redis
    To serve as Redx’s configuration database. Lua code in the Nginx config allows the routing configuration to be read dynamically from Redis, so front-ends and back-ends can be added or removed without having to restart the Nginx process.
The cluster would also need a central Redis instance that would serve as the service database and competition platform for the Warden instances. Warden itself consisted of two processes:
  1. The "Registrar", which synchronised the list of currently active containers on a host with a local Nginx configuration held in Redis.
  2. The "Manager", which managed the number of Tasks running across an ECS cluster for a particular Task Definition according to a connected metric.

The Registrar

The Registrar periodically examines the list of Docker containers that are running on the host and, for those it recognises, it inspects the container, pulling out the local IP address that Docker has assigned. The IP addresses and the exposed port numbers are used to update the Nginx configuration held in a local Redis. Each Service has an Nginx front-end which is on a well-known port, unique for that Service. The separate container instances are added as Nginx back-ends associated with the Service front-end. Arriving traffic is therefore load-balanced internally across all the matching containers on the container host. If a Service has any container instances on the host then Warden ensures the EC2 instance is enrolled in the assigned Elastic Load Balancer, and removes it from the ELB if there are zero container instances.
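As a rough illustration of that loop (this is not the actual Warden code), the sketch below shells out to the docker CLI and writes back-ends into the local Redis with redigo. The Redis key layout and port are my own inventions rather than Redx's real schema, and the ELB handling is only indicated in a comment.

package main

import (
    "log"
    "os/exec"
    "strings"
    "time"

    "github.com/garyburd/redigo/redis"
)

// containerIP asks the docker CLI for the internal IP that Docker assigned to a container.
func containerIP(id string) (string, error) {
    out, err := exec.Command("docker", "inspect",
        "--format", "{{ .NetworkSettings.IPAddress }}", id).Output()
    return strings.TrimSpace(string(out)), err
}

// syncBackends rebuilds the list of back-ends for one service front-end.
func syncBackends(conn redis.Conn) {
    out, err := exec.Command("docker", "ps", "-q").Output()
    if err != nil {
        log.Println("docker ps failed:", err)
        return
    }

    // Invented key name; Redx's real configuration schema will differ.
    conn.Do("DEL", "backends:image-server")
    for _, id := range strings.Fields(string(out)) {
        // A real Registrar would only act on the containers it recognises.
        ip, err := containerIP(id)
        if err != nil || ip == "" {
            continue
        }
        conn.Do("SADD", "backends:image-server", ip+":8080")
    }

    // This is also the point where the host would be enrolled in (or removed from)
    // the associated Elastic Load Balancer, depending on whether any back-ends exist.
}

func main() {
    conn, err := redis.Dial("tcp", "127.0.0.1:6379") // the local Redis that Redx reads
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    for {
        syncBackends(conn)
        time.Sleep(30 * time.Second)
    }
}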

The Manager

This process would also run on each container host, but the instances would hold a centralised competition for a leader. Each instance detects if the currently presiding leader’s Availability Zone is listed as currently active and checks for a recent heartbeat message from the leader. If there is a problem, then a competition is held and instances roll random numbers as their entry. The winner is picked from a Redis sorted set and they become the leader. A kill message is lodged for the previous leader to pick up if they return from whatever disaster befell their AZ.
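Sketched in Go, the competition might look something like the following. This is a rough illustration rather than the real Warden code: the Redis key names, timings and instance ID are placeholders, and the kill-message handling is left out.

package main

import (
    "log"
    "math/rand"
    "time"

    "github.com/garyburd/redigo/redis"
)

const (
    competitionKey = "warden:competition"      // placeholder key names
    heartbeatKey   = "warden:leader:heartbeat" // the leader refreshes this with a TTL
)

// compete lodges this instance's random roll and returns the current winner.
func compete(conn redis.Conn, instanceID string) (string, error) {
    if _, err := conn.Do("ZADD", competitionKey, rand.Int63(), instanceID); err != nil {
        return "", err
    }
    // Give the other instances a moment to lodge their own entries.
    time.Sleep(5 * time.Second)
    // Highest roll wins.
    winners, err := redis.Strings(conn.Do("ZREVRANGE", competitionKey, 0, 0))
    if err != nil || len(winners) == 0 {
        return "", err
    }
    return winners[0], nil
}

func main() {
    rand.Seed(time.Now().UnixNano())

    conn, err := redis.Dial("tcp", "central-redis:6379") // the cluster's central Redis
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    instanceID := "i-0123456789abcdef0" // would really come from EC2 instance metadata

    for {
        // A live heartbeat key means the presiding leader is healthy; only compete when it is stale.
        alive, _ := redis.Bool(conn.Do("EXISTS", heartbeatKey))
        if !alive {
            if winner, err := compete(conn, instanceID); err == nil && winner == instanceID {
                log.Println("won the competition, acting as leader")
                conn.Do("SETEX", heartbeatKey, 60, instanceID) // refreshed periodically while leading
            }
        }
        time.Sleep(30 * time.Second)
    }
}
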
Meanwhile, the new leader emits heartbeats and measures how many Tasks for a Service need to be running by using a specified metric. In the Perl version I'd hard-coded this to look at the number of Elastic Beanstalk instances that were running for a particular application, then used a set of files in S3 to map between the metric and the number of Tasks that should run, like a very simple static database. Manager uses the AWS SDK to increase or decrease the running number of Tasks for a particular Task Definition across the ECS cluster before waiting and looping its lifecycle.
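As a minimal sketch (not taken from Warden), increasing the Task count with the AWS SDK for Go looks something like this; the cluster and Task Definition names are placeholders, and decreasing would use StopTask against a chosen Task.

package main

import (
    "log"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/ecs"
)

func main() {
    sess := session.Must(session.NewSession())
    svc := ecs.New(sess)

    // Ask ECS to schedule one more Task for this Task Definition somewhere on the cluster.
    out, err := svc.RunTask(&ecs.RunTaskInput{
        Cluster:        aws.String("image-servers"),   // placeholder cluster name
        TaskDefinition: aws.String("image-server:42"), // placeholder family:revision
        Count:          aws.Int64(1),
    })
    if err != nil {
        log.Fatal(err)
    }
    log.Println("started", len(out.Tasks), "task(s)")
}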

Afterwards

I never got around to writing a clean way to define a Service such that Warden could pick up its configuration dynamically, nor to abstracting away the metrics that Manager would use to decide on Task numbers. I'd like to look into how to produce a plug-in architecture for Go that is friendly to being completely agnostic about both of these factors.
Further, the network topology that the project moved towards made it unnecessary to run Warden's Manager. Originally there was one ECS cluster spread across the three Availability Zones in the eu-west-1 Region, and there was no notion of an ECS Service to keep a desired number of container instances running across the cluster, so Manager filled that gap. Then came changes made in response to realising that Elastic File System, once it eventually emerged from Preview, was more expensive to run than separate NFS volumes and servers per AZ. This meant splitting the system up into verticals - one per AZ - so that each layer of the system would only deal with one NFS volume and one ECS cluster per AZ. Warden's Registrar still ran on each container host to synchronise the load balancing across the image server containers, but the number of containers running was managed by an ECS Service.
It was a good exploration of what was possible with a bit of scripting and glue. Experience of producing tooling in Go that was reactive to system conditions has also been invaluable.
For comparison, this is the original version of Warden, which was written in Perl and is effectively just glue between the Redis, Docker and AWS CLIs: https://github.com/fractos/warden-perl.
Go was a natural choice for making a better, more solid version of Warden, mainly because of its Redis and AWS SDK access via libraries.

Wednesday, 7 December 2016

Configuring a containerised system

(from http://fractos.github.io/containers/2016/12/07/configuring-a-containerised-system.html)

These are some basic methods of passing configuration into a container.

1. Building in configuration files

At build time, copy configuration files into the container image.
  • inflexible - changing a value means rebuilding the image
  • insecure - secrets are baked into every copy of the image

2. Configuration fetched and applied during transient build step

Essentially this means pulling configuration data from somewhere and removing it before the end of the build step.
  • fragile - the build depends on the configuration source being reachable
  • values can still sit in the build cache layers

3. Environment variables

Discrete environment variable values injected at container instantiation time.
  • flexible - can re-configure per instance
  • good integration with Docker, ECS, shell
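For example (the variable name is made up), a value supplied with docker run -e DATABASE_URL=... can be read in Go like this:

package main

import (
    "log"
    "os"
)

func main() {
    // Injected at container instantiation time, e.g. docker run -e DATABASE_URL=...
    databaseURL := os.Getenv("DATABASE_URL")
    if databaseURL == "" {
        log.Fatal("DATABASE_URL is not set")
    }
    log.Println("connecting to", databaseURL)
}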

4. Holding configuration in S3

Pass an S3 address to the container via either an environment variable or a command parameter. Scripts or apps then retrieve dynamic configuration from the S3 bucket. The bucket can be private, with ambient credentials assumed from an associated EC2 role on the container host, or with AWS credentials passed in as environment variables.
  • allows dynamic configuration during container lifetime
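A sketch in Go, using the AWS SDK, of fetching a configuration object from a private bucket using ambient credentials; the environment variable names are made up:

package main

import (
    "io/ioutil"
    "log"
    "os"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func main() {
    // Bucket and key passed in via environment variables (names are just examples).
    bucket := os.Getenv("CONFIG_BUCKET")
    key := os.Getenv("CONFIG_KEY")

    // Credentials come from the environment or the EC2/ECS instance role.
    sess := session.Must(session.NewSession())
    svc := s3.New(sess)

    obj, err := svc.GetObject(&s3.GetObjectInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(key),
    })
    if err != nil {
        log.Fatal(err)
    }
    defer obj.Body.Close()

    config, err := ioutil.ReadAll(obj.Body)
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("loaded %d bytes of configuration", len(config))
}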

5. Attach a configuration volume

A specific volume that holds configuration files can be attached to the container during instantiation.
  • allows dynamic configuration during container lifetime
  • have to start managing volumes across container hosts
  • Kubernetes uses this type of solution
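For example, with a volume mounted along the lines of docker run -v /etc/myapp:/config:ro ..., the application just reads files from the mount point; the paths here are made up:

package main

import (
    "io/ioutil"
    "log"
)

func main() {
    // /config is the mount point of the attached configuration volume (an assumption here).
    data, err := ioutil.ReadFile("/config/app.json")
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("read %d bytes of configuration", len(data))
}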

6. Use a configuration server/service

Pass server details into the container at instantiation time. Scripts or apps will retrieve configuration from a service such as Vault, etcd, Redis, a database etc.

  • client dependencies
  • container contents may have to be adapted to integrate with a configuration server
  • allows dynamic configuration during container lifetime
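A sketch using Redis as the configuration service (the address variable and key name are made up); re-reading the value later picks up live changes:

package main

import (
    "log"
    "os"

    "github.com/garyburd/redigo/redis"
)

func main() {
    // Server details passed in at instantiation time, e.g. docker run -e CONFIG_REDIS=host:6379
    conn, err := redis.Dial("tcp", os.Getenv("CONFIG_REDIS"))
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // Fetch a single configuration value from the service.
    limit, err := redis.String(conn.Do("GET", "myapp:upload-limit"))
    if err != nil {
        log.Fatal(err)
    }
    log.Println("upload limit:", limit)
}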

Friday, 9 September 2016

I can see the sea :)

I can see the sea from this train :)

It's quite far out, with bright orange beached buoys in the middle distance before a platoon of wind power generators that are standing in the eventual water toward the horizon.

Curlews are striding about, or maybe redshanks still in their summer plumage.

Always a thrill to catch sight of the shore and the different kind of life that teems there, all under a grey-blue coastal sky.

Tuesday, 23 August 2016

Canary farm


"Canary farm"

An IT system connected to various other brittle systems that invariably picks up the blame for them being unavailable.

Monday, 8 August 2016

Fractos-cumuli tipping point




Fractos-cumuli tipping point:

When the monthly cost of the Cloud project you are working on exceeds your monthly salary.


Sunday, 3 July 2016

Deploying a .Net website to Amazon Elastic Beanstalk with TeamCity

Note for posterity.

Steps I had to take to get a .Net website deployed to Elastic Beanstalk with TeamCity.

I was originally working from this tutorial, but soon realised that things weren't going to plan at all. This was a pain in the arse.

Pre-requisites:

Install Visual Studio Community edition on the TeamCity server.
Install AWS Deployment Tool awsdeploy according to the instructions in the tutorial.

1. Web.config transform

The version of the website I wanted to deploy needed the transformation for Release build to be performed before it was packaged up.

I edited the .csproj of the website to include BeforeBuild and AfterBuild steps that would transform the Web.config file to its Release form:
<Target Name="BeforeBuild" Condition=" '$(Configuration)' == 'Release' ">
  <Copy SourceFiles="Web.config" DestinationFiles="Web.temp.config" OverwriteReadOnlyFiles="True" />
  <TransformXml Source="Web.temp.config" Transform="Web.$(Configuration).config" Destination="Web.config" />
</Target>
<Target Name="AfterBuild">
  <Copy SourceFiles="Web.temp.config" DestinationFiles="Web.config" OverwriteReadOnlyFiles="True" />
  <Delete Files="Web.temp.config" />
</Target>

2. Working around MSBuild being too clever about indirect references

MSBuild tries to be smart about including references in a deployment package. So smart, in fact, that it doesn't include the dependencies of the references it does include, and you will probably end up with a website that doesn't work.

The indirect dependencies are included as references in the website project but MSBuild still filters them out.

I found some advice somewhere saying you should set the Copy Local property on the references to True. I dutifully did this, but it turns out that Visual Studio (including 2015 Update 2) has a huge bug in this area: it will not update the .csproj file if you set Copy Local to True (presumably because True is the default, so no element gets written). However, if you set it to False and then click Save All, it will create the XML element in the .csproj file. Once you've set it to False and saved, ONLY THEN can you set it to True (and hit Save All again). You can select all the references at once to perform this action in Visual Studio.
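For reference, once Copy Local has been persisted this way, the reference in the .csproj ends up carrying a Private element, something like this (the assembly name and hint path are just an example):

<Reference Include="Newtonsoft.Json">
  <HintPath>..\packages\Newtonsoft.Json.9.0.1\lib\net45\Newtonsoft.Json.dll</HintPath>
  <Private>True</Private>
</Reference>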

3. Release build project in TeamCity must export everything as an artefact

I was finding it impossible to run MSBuild in the deployment chain without having access to the compiled code. So, my Release build project in TeamCity has all the project and library folders marked as being artefacts, including an umbrella folder for any git submodules that are included.

Then, in the deployment build chain, I have both a snapshot dependency on the Release build chain (which means that it will only use successful build artefacts) and an artefact dependency which includes all the project and library folders that were exported as artefacts from the Release build chain above.

4. Deployment build chain copies transformed web.config back into website

The transformed version of the web.config ends up squirrelled away under the website's obj folder. I had to include an initial build step in the deployment chain to copy the transformed file back into the right folder so the packaging step can find it.

It looks something like this (obviously replace <website> with the name of the website's folder):

copy /Y <website>\obj\Release\TransformWebConfig\transformed\Web.config <website>\Web.config

5. Getting the command line parameters right for the packaging build step

This particular website is based at the root of the IIS folder on the remote server (i.e. c:\inetpub\wwwroot), so I had to include an instruction to deploy to the "Default Web Site" bare IIS path in the command line parameters.

The parameters also include the Package instruction, the Configuration type, the SolutionDir definition and the PackageTempRootDir property. SolutionDir is set to the TeamCity checkout directory - I use a particular scheme for cutting down on duplication of nested git submodules that involves telling included projects to use $(SolutionDir) as the base for their references. PackageTempRootDir is deliberately left empty, which tells the packager not to create a deep hierarchy within the zipped output file.

/T:Package /P:Configuration=Release /property:SolutionDir="%system.teamcity.build.checkoutDir%"\ /P:PackageTempRootDir= /P:DeployIISAppPath="Default Web Site"

6. Create a configuration file for awsdeploy for the specific Elastic Beanstalk environment

For reasons that escaped me, I found it impossible to include certain parameters on the command line for the awsdeploy command. I had to work around this by creating a static configuration file for a particular Elastic Beanstalk environment. Quite bare-bones data:

AWSProfileName = default
Region = <your aws region>
Template = ElasticBeanstalk
UploadBucket = <the s3 bucket that elastic beanstalk uses e.g. elasticbeanstalk-eu-west-1-accountnumber>
Application.Name = <elastic beanstalk application name>
Environment.Name = <elastic beanstalk environment name>

I put this file in a well known place on the build server that I knew TeamCity could get at.

7. Getting the command line parameters right for the awsdeploy build step

awsdeploy needs a few command line parameters and it was a bit hit and miss to sort it out, but here's what I ended up with:

/DAWSAccessKey=<api access key> /DAWSSecretKey=<api secret key> /DDeploymentPackage=%teamcity.build.checkoutDir%\<website>\obj\Release\Package\<name of project website>.csproj.zip /v /w /r <configuration file from step 6>

That should do the actual deployment to the Elastic Beanstalk environment. The deployment package is the pathname of the zip that the previous step creates which should be under the obj\Release\Package folder and have the same name as the .csproj that MSBuild was executed against.