Here & Now

Here & Now is a weather app that I made to experiment with an iOS app architecture based on nestable components and RxSwift.

View controllers are deliberately minimal. They wire up their root ViewComponent and add its subview. Nested ViewComponents are used to break the UI code down into smaller logical pieces.

A ViewComponent is defined as:

protocol ViewComponent {
  // Streams used by the component
  associatedtype Inputs
    
  // Streams produced by the component
  associatedtype Outputs
    
  // Root view of the component, add as subview of parent component
  var view: UIView { get }
    
  init(disposedBy: DisposeBag)
    
  // Subscribe to input streams, export streams produced by component
  func start(_ inputs: Inputs) -> Outputs
}

extension ViewComponent {
  // Stop any services started by the component
  func stop() {}
}

Core UI logic is written in component protocol extensions for ease of testing. UI state changes are implemented as pure functions that operate on Rx types like Observable. For example, the map style is a function of the current time at the map location. A light style is used when it’s daytime at the map location, and a dark style when it’s nighttime.

extension MapComponent {

  func mapStyle(forCameraPosition: Observable<GMSCameraPosition>,
                date: Observable<Date>) -> Observable<MapStyle> {
    let location = forCameraPosition.map(toLocation)
    return uiScheme(forLocation: location, date: date)
      .map { $0.style().mapStyle }
  }

  func uiScheme(forLocation: Observable<CLLocation>,
                date: Observable<Date>) -> Observable<UIScheme> {
    return Observable
      .combineLatest(forLocation, date) { (l, d) in
        if let dayTime = isDaytime(date: d, coordinate: l.coordinate) {
          return dayTime ? .light : .dark
        }
        return .light
      }
  }

  // ...
}

The source code for Here & Now is available on GitHub.


Multistage Docker Builds for Scala Applications

The following Dockerfile gives an example of a multistage build that runs sbt in a builder container. This means that users don’t need to install Scala tooling on their machines in order to build the project. To optimise build times, I cache dependencies first by running sbt update in a precursor step to sbt stage.

FROM hseeberger/scala-sbt:8u181_2.12.7_1.2.6 as builder
WORKDIR /build
# Cache dependencies first
COPY project project
COPY build.sbt .
RUN sbt update
# Then build
COPY . .
RUN sbt stage
# Download Geonames file
RUN wget http://download.geonames.org/export/dump/cities500.zip
RUN unzip cities500.zip

FROM openjdk:8u181-jre-slim
WORKDIR /app
COPY --from=builder /build/target/universal/stage/. .
COPY --from=builder /build/cities500.txt .
ENV PLACES_FILE_PATH=/app/cities500.txt
RUN mv bin/$(ls bin | grep -v .bat) bin/start
CMD ["./bin/start"]

The final image only has the JRE, and no build tools.
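
The mv line in the final stage is easy to misread, so here is a rough shell sketch of the same idea as a standalone function. It assumes that the staged bin directory contains exactly one Unix launcher next to its .bat twin. The name rename_launcher is made up for illustration and is not part of the project.

```shell
# rename_launcher <bin-dir>
# Hypothetical helper: rename the single non-.bat file in <bin-dir> to "start".
# Fails without touching anything if there is not exactly one candidate.
rename_launcher() {
  local bin_dir=$1 count launcher
  # Count the entries that do not end in .bat (dot escaped, match anchored).
  count=$(ls "$bin_dir" | grep -vc '\.bat$' || true)
  [ "$count" -eq 1 ] || return 1
  launcher=$(ls "$bin_dir" | grep -v '\.bat$')
  mv "$bin_dir/$launcher" "$bin_dir/start"
}

# Usage inside the image (not executed here):
#   rename_launcher /app/bin
```

Unlike the one-liner in the Dockerfile, this version escapes the dot and anchors the match, so only names that really end in .bat are skipped.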


A Reverse Geocoding gRPC Service Written in Scala

I made a reverse geocoder gRPC server as a demo of how one might structure a backend service in Scala. I structured the application to have a purely functional core, with an imperative shell.

However, there’s a twist: I’m mixing classical OOP with pure FP. I wanted to see what the code looked like if I used a dependency injection framework (Airframe) to wire up the side effects at the outer edges.

The main method is where we build the object graph:

object Main extends App with LazyLogging {

  override def main(args: Array[String]): Unit = {
    val config = loadConfigOrThrow[Config]

    // Wire up dependencies
    newDesign
      .bind[Config].toInstance(config)
      .bind[Clock].toInstance(clock)
      .bind[Healthttpd].toInstance(Healthttpd(config.statusPort))
      .bind[LinesFileReader].toInstance(fileReader)

      // Load places from disk immediately upon startup
      .bind[KDTreeMap[Location, Place]].toEagerSingletonProvider(loadPlacesBlocking)

      // Startup
      .withProductionMode
      .noLifeCycleLogging
      .withSession(_.build[Application].run())

    // Side effects are injected at the edge:

    lazy val fileReader: LinesFileReader = () => {
      logger.info(s"Loading places from ${config.placesFilePath}")
      val reader = new BufferedReader(
        new InputStreamReader(new FileInputStream(config.placesFilePath), "UTF-8")
      )
      Observable.fromLinesReader(reader)
    }

    lazy val loadPlacesBlocking: PlacesLoader => KDTreeMap[Location, Place] = { loader =>
      Await.result(loader.load().runAsync, 1 minute)
    }

    lazy val clock: Clock = {
      Observable
        .interval(1 second)
        .map(_ => Instant.now())
    }
  }
}

Side effects are:

  • Reading from the file system
  • The clock

fileReader gives us a stream of lines from the file, and the clock is a stream of Instants. Both are modelled using the Monix Observable type.

The Application trait is still very much imperative. We set up application status, served via Healthttpd, then start the gRPC server.

trait Application extends LazyLogging {
  private val config = bind[Config]
  private val healthttpd = bind[Healthttpd]
  private val reverseGeocoderService = bind[ReverseGeocoderService]

  def run(): Unit = {
    healthttpd.startAndIndicateNotReady()
    logger.info("Starting gRPC server")

    val grpcServer = NettyServerBuilder
      .forPort(config.grpcPort)
      .addService(ReverseGeocoderGrpcMonix.bindService(reverseGeocoderService, monix.execution.Scheduler.global))
      .build()
      .start()

    sys.ShutdownHookThread {
      grpcServer.shutdown()
      healthttpd.stop()
    }

    healthttpd.indicateReady()
    grpcServer.awaitTermination()
  }
}

The core of the application, concerned with serving requests, is pure, and easily tested. I’m using Task as an IO monad.

class ReverseGeocodeLocationRpc(places: KDTreeMap[Location, Place], clock: Clock) {

  def handle(request: ReverseGeocodeLocationRequest): Task[ReverseGeocodeLocationResponse] = {
    findNearest(request.latitude, request.longitude)(places)
      .map(Task.now(_))
      .map(addSunTimes(_, clock).map(toResponse))
      .getOrElse(emptyTaskResponse)
  }

  private def findNearest(latitude: Latitude, longitude: Longitude)(places: KDTreeMap[Location, Place]): Option[Place] = {
    places
      .findNearest((latitude, longitude), 1)
      .headOption
      .map(_._2)
  }

  private case class Sun(rise: Option[Timestamp], set: Option[Timestamp])

  private def addSunTimes(place: Task[Place], clock: Clock): Task[Place] = {
    Task.zip2(place, clock.firstL).map {
      case (p, t) =>
        val zonedDateTime = t.atZone(ZoneId.of(p.timezone))
        val sun = calculateSun(p.latitude, p.longitude, p.elevationMeters, zonedDateTime)
        p.copy(sunriseToday = sun.rise, sunsetToday = sun.set)
    }
  }

  private def calculateSun(latitude: Latitude,
                           longitude: Longitude,
                           altitudeMeters: Int,
                           zonedDateTime: ZonedDateTime): Sun = {
    val solarTime = SolarTime.ofLocation(latitude, longitude, altitudeMeters, StdSolarCalculator.TIME4J)
    val calendarDate = PlainDate.from(zonedDateTime.toLocalDate)
    def toTimestamp(moment: Moment) = Timestamp(moment.getPosixTime, moment.getNanosecond())
    val rise = solarTime.sunrise().apply(calendarDate).asScala.map(toTimestamp)
    val set = solarTime.sunset().apply(calendarDate).asScala.map(toTimestamp)
    Sun(rise, set)
  }

  private def toResponse(place: Place): ReverseGeocodeLocationResponse = {
    ReverseGeocodeLocationResponse(Some(place))
  }

  private val emptyTaskResponse = Task.now(ReverseGeocodeLocationResponse.defaultInstance)
}

This was a pragmatic approach to putting together a Scala backend application. I picked a toy service to experiment with DI in the context of FP. I think that the result wasn’t too gnarly.

The source code is available on GitHub: reverse-geocoder.


Setting Up a Kubernetes Cluster on Ubuntu 16.04 via kubeadm

I have just redone the software stack on my homelab cluster from scratch. I am still using Ubuntu 16.04 since the Docker versions that are currently available for 18.04 are not yet supported by Kubernetes.

These are the lab notes that I compiled while installing Kubernetes v1.10.3 via kubeadm. I chose Calico for pod networking.

All Nodes

In this section we’ll prepare the master and worker nodes for Kubernetes. We’ll start from a newly minted Ubuntu 16.04 on each node:

Install Docker 17.03

apt-get update
apt-get install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository "deb https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") $(lsb_release -cs) stable"
apt-get update && apt-get install -y docker-ce=$(apt-cache madison docker-ce | grep 17.03 | head -1 | awk '{print $3}')
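
The last command picks the package version out of the apt-cache madison listing. As a sketch of what that pipeline does, here is an equivalent filter written as a function. The name pick_version is hypothetical, and it assumes the usual madison layout of "package | version | source" with the newest version listed first.

```shell
# pick_version <prefix>
# Hypothetical helper: read `apt-cache madison docker-ce` output on stdin and
# print the first version that starts with <prefix>.
pick_version() {
  local prefix=$1
  # Fields are split on whitespace, so $1 is the package, $2 is "|" and
  # $3 is the version. index(...) == 1 means "starts with the prefix".
  awk -v p="$prefix" '$1 == "docker-ce" && index($3, p) == 1 { print $3; exit }'
}

# Usage on a node (not executed here):
#   apt-get install -y docker-ce="$(apt-cache madison docker-ce | pick_version 17.03)"
```

Matching on a prefix rather than with grep 17.03 avoids picking up a version that merely contains those characters somewhere else in the line.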

Install kubeadm, kubelet and kubectl

apt-get update && apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl

Turn swap off:

swapoff -a

Master Node

Configure cgroup driver used by kubelet on Master Node

Make sure that the cgroup driver used by kubelet is the same as the one used by Docker. To check whether the Docker cgroup driver matches the kubelet config:

docker info | grep -i cgroup
cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

If the Docker cgroup driver and the kubelet config don’t match, update the latter. The flag we need to change is --cgroup-driver. If it’s already set, we can update the configuration like so:

sed -i "s/cgroup-driver=systemd/cgroup-driver=cgroupfs/g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

Otherwise, open the systemd file and add the flag to an existing environment line. Then restart the kubelet:

systemctl daemon-reload
systemctl restart kubelet
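
To avoid comparing the two by eye, the check can be scripted. The following is a minimal sketch with two made-up helper functions. It assumes that docker info prints a "Cgroup Driver:" line and that the kubelet drop-in file, if it sets the flag at all, contains --cgroup-driver=<value>.

```shell
# Hypothetical helper: read `docker info` output on stdin and print the
# cgroup driver, for example "cgroupfs".
docker_cgroup_driver() {
  sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: read the kubelet drop-in file on stdin and print the
# value of --cgroup-driver. Prints nothing if the flag is absent.
kubelet_cgroup_driver() {
  sed -n 's/.*--cgroup-driver=\([A-Za-z0-9_-]*\).*/\1/p' | head -n 1
}

# Usage on a node (not executed here):
#   d=$(docker info 2>/dev/null | docker_cgroup_driver)
#   k=$(kubelet_cgroup_driver < /etc/systemd/system/kubelet.service.d/10-kubeadm.conf)
#   if [ "$d" = "$k" ]; then echo "match"; else echo "docker=$d kubelet=$k"; fi
```

An empty kubelet value means the flag is not set in the file, which is the case where it has to be added by hand.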

Initialise Kubernetes Master

Initialise the master node by running kubeadm init. We need to specify the pod network CIDR for network policy to work correctly when we install Calico in a later step.

kubeadm init --pod-network-cidr=10.0.0.0/16

To be able to use kubectl as a non-root user on the master:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Install Calico for networking:

kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

Once Calico has been installed, confirm that it is working by checking that the kube-dns pod is running before joining the worker nodes.

shane@master1:~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                              READY     STATUS    RESTARTS   AGE
kube-system   calico-node-2zrrz                 2/2       Running   0          8m
kube-system   etcd-master1                      1/1       Running   0          11m
kube-system   kube-apiserver-master1            1/1       Running   0          11m
kube-system   kube-controller-manager-master1   1/1       Running   0          11m
kube-system   kube-dns-86f4d74b45-bpsgs         3/3       Running   0          12m
kube-system   kube-proxy-pkfjx                  1/1       Running   0          12m
kube-system   kube-scheduler-master1            1/1       Running   0          11m

Worker Nodes

Next we’ll join the worker nodes to our new Kubernetes cluster. Run the command that was output by kubeadm init on each of the nodes:

kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>

We should see nodes joining the cluster shortly:

shane@master1:~$ kubectl get nodes
NAME      STATUS     ROLES     AGE       VERSION
master1   Ready      master    22m       v1.10.3
minion1   Ready      <none>    6m        v1.10.3
minion2   Ready      <none>    4m        v1.10.3
minion3   NotReady   <none>    4m        v1.10.3
minion4   NotReady   <none>    4m        v1.10.3
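
While waiting for the stragglers, it can be handy to list only the nodes that are not ready yet. Here is a small sketch that filters the kubectl get nodes output. The function name is hypothetical, and it assumes the default column layout shown above, with STATUS in the second column.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS is anything other than exactly "Ready".
# A cordoned node ("Ready,SchedulingDisabled") is therefore listed too.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage (not executed here):
#   kubectl get nodes | not_ready_nodes
```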

Configure Access from Workstation

To control the cluster remotely from our workstation, we grab the contents of /etc/kubernetes/admin.conf from the master node and merge it into our local ~/.kube/config configuration file.
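
One way to do the merge is to let kubectl do it. This sketch relies on kubectl accepting a colon-separated list of files in KUBECONFIG and on kubectl config view --flatten. The function name and the file names in the usage comment are placeholders.

```shell
# merge_kubeconfig <existing> <new> <output>
# Hypothetical helper: merge two kubeconfig files into <output> using kubectl.
# The result is written to a temporary file first, so <output> is only
# replaced if kubectl succeeds.
merge_kubeconfig() {
  local existing=$1 new=$2 output=$3 tmp
  tmp=$(mktemp) || return 1
  if KUBECONFIG="$existing:$new" kubectl config view --flatten > "$tmp"; then
    mv "$tmp" "$output"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Usage from the workstation (not executed here; paths are examples):
#   scp root@master1:/etc/kubernetes/admin.conf ~/.kube/homelab.conf
#   merge_kubeconfig ~/.kube/config ~/.kube/homelab.conf ~/.kube/config
```

When both files define an entry with the same name, the one from the first file in the list wins, so it is worth checking the cluster, user and context names in admin.conf before merging.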


Setting Up a ScaleIO Storage Cluster on Ubuntu 16.04

ScaleIO is a software-defined SAN product from EMC Corporation. It allows you to create a block storage cluster using commodity hardware. ScaleIO is a closed source product. It’s free for non-production use, for an unlimited time, without capacity restrictions.

Today I’m trialling ScaleIO on my homelab cluster to provide persistent storage for application containers. While I’m using Kubernetes to abstract compute, a product like ScaleIO allows me to abstract storage. The end result is that stateful applications can come and go, and it doesn’t matter which Kubernetes node they end up on. They will always be able to get access to their provisioned storage. Kubernetes enables this through Persistent Volumes, and the ScaleIO volume plugin is supported out of the box.

Node Preparation

My cluster nodes are running Ubuntu Xenial on bare metal. First, we’ll enable root login via ssh, so that the ScaleIO Installation Manager can log into each node and do its thing. On each node:

sudo passwd
sudo sed -i 's/prohibit-password/yes/' /etc/ssh/sshd_config
sudo service ssh reload

This dependency is also needed:

sudo apt-get install libaio1

Note that each ScaleIO Data Server (SDS) device needs a minimum of 90GB. I had set aside 75GB on each node for this exercise and ended up having to resize a bunch of partitions. The GParted Live USB came in handy for that.

Gateway and Installation Manager Setup

Next, we’ll install the ScaleIO Gateway and Installation Manager. We only need to set this up on one of the nodes.

The Gateway requires a Java 8 runtime as well as binutils:

sudo apt-get install openjdk-8-jre binutils

The Gateway and Installation Manager are installed using a .deb file that was included in the ScaleIO download:

sudo GATEWAY_ADMIN_PASSWORD=somepass \
    dpkg -i emc-scaleio-gateway_2.0-12000.122_amd64.deb

Cluster Install via the Installation Manager

Once dpkg is done, we can access the Installation Manager UI by pointing our browser to the server on which we have just performed the installation. We need to connect via HTTPS, and accept the certificate when prompted.

From then on it’s just a matter of using the fairly self-explanatory UI.

The ScaleIO Installation Manager UI

Add SDS Devices

After the Installation Manager has finished setting up all the nodes, it’s time to add some storage to the cluster.

Log into the Meta Data Manager (MDM)…

scli --login --username admin --password somepass

And add some devices. In the following example I’m adding an empty partition /dev/sda1 from the 192.168.1.101 node to the storage pool.

scli --add_sds_device \
     --sds_ip 192.168.1.101 \
     --protection_domain_name default \
     --storage_pool_name default \
     --device_path /dev/sda1

The ScaleIO Management GUI

Finally, I installed the ScaleIO Management GUI in a Windows virtual machine, and confirmed that the devices were correctly provisioned.

The ScaleIO Management UI

There was a warning waiting for me when I logged into the management GUI:

Configured spare capacity is smaller than largest fault unit

One of my nodes has a larger SSD than the others. The default spare percentage hadn’t set aside enough spare space to cover the loss of that device. Adjusting the spare percentage fixed that:

scli --modify_spare_policy \
     --protection_domain_name default \
     --storage_pool_name default \
     --spare_percentage 34
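
The warning suggests a simple rule of thumb: the spare capacity should be at least as large as the biggest fault unit. Here is a rough sketch of that arithmetic, assuming each node counts as one fault unit. The function name and the capacities in the usage comment are made up for illustration, and this is not necessarily how ScaleIO itself computes the figure.

```shell
# min_spare_percentage <capacity> [<capacity> ...]
# Hypothetical helper: given per-node capacities in the same unit (GB, say),
# print the smallest whole percentage of the total that covers the largest node.
min_spare_percentage() {
  local total=0 largest=0 c
  for c in "$@"; do
    total=$((total + c))
    if [ "$c" -gt "$largest" ]; then
      largest=$c
    fi
  done
  [ "$total" -gt 0 ] || return 1
  # Integer ceiling of 100 * largest / total.
  echo $(( (100 * largest + total - 1) / total ))
}

# Usage with made-up capacities, three 120GB nodes and one 240GB node:
#   min_spare_percentage 120 120 120 240
```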


First Steps with Ansible

Ansible is an open-source tool that allows you to automate server provisioning, manage configuration and deploy applications.

Where does a tool like Ansible fit in today’s immutable infrastructure world? While containers are better at enforcing immutability, if I’m starting from bare metal, I still need a tool to bootstrap and manage the compute and storage clusters that my containerised workloads will use. That’s where Ansible comes in.

Installation

First, let’s install Ansible on our control machine. In my case that’s my development laptop. On macOS we can use Homebrew:

brew install ansible

Ansible is agentless, so the managed nodes only need SSH access and a Python interpreter. If you also want the Ansible tooling available on an Ubuntu node, it can be installed like this:

sudo apt-get install software-properties-common
sudo apt-add-repository -y ppa:ansible/ansible
sudo apt-get update && sudo apt-get -y install ansible

Initial Configuration

Next we’ll need an inventory that lists the managed nodes. If you installed Ansible via Homebrew, the default location is ~/homebrew/etc/ansible/hosts/hosts. Let’s go ahead and create our inventory:

[masters]
master1 ansible_host=192.168.1.101

[minions]
minion1 ansible_host=192.168.1.102
minion2 ansible_host=192.168.1.103
minion3 ansible_host=192.168.1.104
minion4 ansible_host=192.168.1.105

You can put the Ansible hostfile in a custom location. If you do that, you can tell Ansible about it in ~/.ansible.cfg. For example:

[defaults]
hostfile=~/projects/home-cluster/ansible/hosts

Ensure that you can log into the managed hosts using your SSH key.

First Commands

Let’s take Ansible for a test drive. We can run a command from the control machine and target specific managed nodes:

ansible master1 -a date
ansible minions -a date

Here’s an example of running a command against all the nodes, as root, via sudo:

ansible all -a "apt-get update" -bK

To run an Ansible module on a managed node:

ansible minion2 -m ping

Ansible modules are reusable scripts that can be used via the ansible command and in Ansible Playbooks.

Next Steps - Playbooks

While using Ansible to run ad hoc commands against managed nodes is useful, its real power is unlocked via playbooks. Playbooks are Ansible’s configuration, deployment, and coordination language. They are written using YAML. Here’s an example from Ansible’s documentation website:

---
- hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  remote_user: root
  tasks:
  - name: ensure apache is at the latest version
    yum: name=httpd state=latest
  - name: write the apache config file
    template: src=/srv/httpd.j2 dest=/etc/httpd.conf
    notify:
    - restart apache
  - name: ensure apache is running (and enable it at boot)
    service: name=httpd state=started enabled=yes
  handlers:
    - name: restart apache
      service: name=httpd state=restarted

As you can see, playbooks tend to be pretty self-documenting and more succinct than ad hoc scripts.

To run a playbook, use the ansible-playbook command e.g.:

ansible-playbook bootstrap-kubernetes.yaml

I hope that this quick overview has given you an idea of what Ansible is, when you might want to use it, and how you would use it to manage remote hosts.


Yvaine, born 20/01/2017

Yvaine


5 Node Nano ITX Kubernetes Tower

I’ve just finished the latest addition to the home office. I call it my mini Kubernetes tower, a five node cluster built out of Nano ITX boards.

What Is It for?

Software development. These days I deploy the server side applications that I build to Kubernetes. This will be my home development cluster.

What Is It Made of?

The cluster has a total of 16 physical cores, 40GB of memory and 720GB of SSD storage. Three of the nodes are passively cooled quad-core J1900 Celerons. The other two nodes have dual-core i3 processors.

The bottom layer contains an 8-port gigabit switch. On top of that is the first node: An SSD mounted on a piece of acrylic, a few mm of clearance, then the mainboard. The other nodes follow the same pattern.

The tower is 25cm tall and has a square footprint of 12.5 by 12.5cm.

The Build

I had some time to plan the build while waiting for all the parts to arrive. I decided to use acrylic sheets to mount the network switch and SSDs on, and threaded rods to hold everything together.

The hardest part of the build was figuring out how to drill close to the edges of acrylic sheets without causing breakages.

The trick was to use a very slow drilling speed, and to stop just before the drill bit poked through. I then turned the sheet over and drilled through the other side.

I took a 1m M3 threaded rod and cut it into 25cm pieces. I placed a spacer on each side of where I was going to cut, performed the cut and filed the ends. I then wound the spacers off the rod to rethread the ends.

I started with the top cover.

Then added the motherboard for the first node.

The SSD was next.

And the first node was done.

I had already installed Ubuntu 16.04 on each node beforehand, but I made sure to test each node as the build progressed.

It would have been inconvenient to have to take a faulty part out from the middle of the stack.

I mounted the switch last.

And then crimped a bunch of cables and networked everything together.

I powered all the nodes up and finally, success: We have blinking lights!


Akka Clustering with Kubernetes

klusterd is an example Akka Cluster project that is packaged using Docker and deployed to Kubernetes.

Taking klusterd for a test drive

First, clone the project from GitHub.

git clone https://github.com/vyshane/klusterd.git

We’ll use sbt to package the application as a Docker image.

$ cd klusterd
$ sbt "docker:publishLocal"

This creates a local Docker image for us:

$ docker images | grep klusterd
vyshane/klusterd                      1.0                  43ee995e8adb        1 minute ago      690.7 MB

To deploy our klusterd image to our Kubernetes cluster:

$ cd ../deployment
$ ./up.sh

We can see that a klusterd pod comes up:

$ kubectl get pods
NAME                   READY     STATUS    RESTARTS   AGE
klusterd-6srxl         1/1       Running   0          1m

Let’s tail its log:

$ kubectl logs -f klusterd-6srxl

We can see that klusterd is running at the IP address 172.17.0.3 and listening on port 2551.

INFO  14:58:06.022UTC akka.remote.Remoting - Starting remoting
INFO  14:58:06.235UTC akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://klusterd@172.17.0.3:2551]

Since we have only launched one klusterd node, it is its own cluster seed node.

INFO  14:58:06.446UTC akka.actor.ActorSystemImpl(klusterd) - Configured seed nodes: akka.tcp://klusterd@172.17.0.3:2551

We have a cluster of one.

INFO  14:58:06.454UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.3:2551] - Node [akka.tcp://klusterd@172.17.0.3:2551] is JOINING, roles []
INFO  14:58:06.482UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.3:2551] - Leader is moving node [akka.tcp://klusterd@172.17.0.3:2551] to [Up]
INFO  14:58:06.505UTC akka.tcp://klusterd@172.17.0.3:2551/user/cluster-monitor - Cluster member up: akka.tcp://klusterd@172.17.0.3:2551

Now, let’s scale this cluster up. Let’s ask Kubernetes for 2 more nodes:

$ kubectl scale --replicas=3 rc klusterd

We can see from klusterd-6srxl’s logs that two more nodes join our cluster shortly after.

INFO  15:11:46.525UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.3:2551] - Node [akka.tcp://klusterd@172.17.0.4:2551] is JOINING, roles []
INFO  15:11:46.533UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.3:2551] - Node [akka.tcp://klusterd@172.17.0.5:2551] is JOINING, roles []
INFO  15:11:47.484UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.3:2551] - Leader is moving node [akka.tcp://klusterd@172.17.0.4:2551] to [Up]
INFO  15:11:47.505UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.3:2551] - Leader is moving node [akka.tcp://klusterd@172.17.0.5:2551] to [Up]
INFO  15:11:47.507UTC akka.tcp://klusterd@172.17.0.3:2551/user/cluster-monitor - Cluster member up: akka.tcp://klusterd@172.17.0.4:2551
INFO  15:11:47.507UTC akka.tcp://klusterd@172.17.0.3:2551/user/cluster-monitor - Cluster member up: akka.tcp://klusterd@172.17.0.5:2551

We can scale the cluster down:

$ kubectl scale --replicas=2 rc klusterd

And a node leaves the cluster. In our case, the original pod klusterd-6srxl was killed. Here are the logs from another node showing what happened:

WARN  15:19:31.855UTC akka.tcp://klusterd@172.17.0.4:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fklusterd%40172.17.0.3%3A2551-0 - Association with remote system [akka.tcp://klusterd@172.17.0.3:2551] has failed, address is now gated for [5000] ms. Reason: [Disassociated] 
WARN  15:19:35.376UTC akka.tcp://klusterd@172.17.0.4:2551/user/cluster-monitor - Cluster member unreachable: akka.tcp://klusterd@172.17.0.3:2551
WARN  15:19:35.405UTC akka.tcp://klusterd@172.17.0.4:2551/system/cluster/core/daemon - Cluster Node [akka.tcp://klusterd@172.17.0.4:2551] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://klusterd@172.17.0.3:2551, status = Up)]
INFO  15:19:45.384UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.4:2551] - Leader is auto-downing unreachable node [akka.tcp://klusterd@172.17.0.3:2551]
INFO  15:19:45.387UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.4:2551] - Marking unreachable node [akka.tcp://klusterd@172.17.0.3:2551] as [Down]
INFO  15:19:46.412UTC akka.cluster.Cluster(akka://klusterd) - Cluster Node [akka.tcp://klusterd@172.17.0.4:2551] - Leader is removing unreachable node [akka.tcp://klusterd@172.17.0.3:2551]
INFO  15:19:46.414UTC akka.tcp://klusterd@172.17.0.4:2551/user/cluster-monitor - Cluster member removed: akka.tcp://klusterd@172.17.0.3:2551
WARN  15:19:52.756UTC akka.tcp://klusterd@172.17.0.4:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fklusterd%40172.17.0.3%3A2551-3 - Association with remote system [akka.tcp://klusterd@172.17.0.3:2551] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://klusterd@172.17.0.3:2551]] Caused by: [No response from remote for outbound association. Associate timed out after [15000 ms].]
INFO  15:19:52.758UTC akka.tcp://klusterd@172.17.0.4:2551/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2Fklusterd%40172.17.0.3%3A2551-4 - No response from remote for outbound association. Associate timed out after [15000 ms].

To turn off everything:

./down.sh


Multi-node Cassandra Cluster Made Easy with Kubernetes

Cassandra cluster in terminal window

I’ve just shared an easy way to launch a Cassandra cluster on Kubernetes.

The Kubernetes project has a Cassandra example that uses a custom seed provider for seed discovery. The example makes use of a Cassandra Docker image from gcr.io/google_containers.

However, I wanted a solution based on the official Cassandra Docker image. This is what I came up with.

First, I created a headless Kubernetes service that provides the IP addresses of Cassandra peers via DNS A records. The peer service definition looks like this:

apiVersion: v1
kind: Service
metadata:
  labels:
    name: cassandra-peers
  name: cassandra-peers
spec:
  clusterIP: None
  ports:
    - port: 7000
      name: intra-node-communication
    - port: 7001
      name: tls-intra-node-communication
  selector:
    name: cassandra

Then I extended the official Cassandra image with the addition of dnsutils (for the dig command) and a custom entrypoint that configures seed nodes for the container. The new entrypoint script is pretty straightforward:

my_ip=$(hostname --ip-address)

CASSANDRA_SEEDS=$(dig $PEER_DISCOVERY_DOMAIN +short | 
    grep -v $my_ip | 
    sort | 
    head -2 | xargs | 
    sed -e 's/ /,/g')

export CASSANDRA_SEEDS

/docker-entrypoint.sh "$@"

Whenever a new Cassandra pod is created, it automatically discovers seed nodes through DNS.
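
The seed selection can be tried out without a cluster by feeding it a list of addresses by hand. This sketch wraps the same steps in a function. The name seeds_from_peers is hypothetical, and the one deliberate change is that the pod's own address is excluded with an exact, fixed-string match rather than a pattern.

```shell
# seeds_from_peers <my-ip>
# Hypothetical helper: read peer IP addresses on stdin, one per line (as
# printed by `dig +short`), drop our own address, and print up to two of the
# remaining addresses as a comma-separated list.
seeds_from_peers() {
  local my_ip=$1
  # -x and -F make grep match whole lines literally, so excluding 10.0.0.1
  # does not also exclude 10.0.0.10.
  grep -v -x -F -- "$my_ip" | sort | head -n 2 | paste -s -d, -
}

# Usage in an entrypoint (not executed here):
#   CASSANDRA_SEEDS=$(dig "$PEER_DISCOVERY_DOMAIN" +short | seeds_from_peers "$(hostname --ip-address)")
```

For the first pod the list comes back empty, since there are no other peers yet.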
