Adding notes as something separate from posts, as I start to add IndieWeb things!


Send Webmentions with GitHub Actions

As I start working with webmentions I need a way to publish them as part of the build/release process for this site. I'm currently using GitHub Actions to build the site and upload it to Netlify.

webmention.app came up frequently when I searched around for how to publish webmentions. It supports RSS/Atom feeds, although the docs suggest using IFTTT to trigger webhooks. GitHub Actions can do that too! For whatever reason webmention.app didn't seem to find any links in my feed. While trying to figure out why from the command line, I discovered that the CLI version was able to find the links in my feed!

Adding this as a step post-release:

- name: Send Webmentions
  run: |
    npm install @remy/webmention
    npx webmention ${{ secrets.WEBMENTION_TARGET_URL }} --limit=0 --send

In my pull requests I have a variation of this, removing --send and using the temporary Netlify URL for the PR so I can see what webmentions would be sent.
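
That variation looks roughly like the step below; how the preview URL reaches the step (here a hypothetical DEPLOY_PREVIEW_URL environment variable) depends on your workflow.

- name: Preview Webmentions
  run: |
    npm install @remy/webmention
    # without --send, this only lists the webmentions that would be sent
    npx webmention $DEPLOY_PREVIEW_URL --limit=0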

Since this uses my Atom feed, I now only include the last 10 posts in the feed to avoid sending lots of old webmentions, most of which didn't work anyway because the links are dead.


Starting Webmentions

Is anyone using webmentions? I've added Webmention.io for hosting my webmentions at the moment as this is currently a static site. If you're using them, please try to mention this page and I can hopefully see it in my RSS reader!

Testing things

Webmention test 1

Also trying to send them as part of my site build process. Maybe this will work?

Second part!


Prefetching Docker Images

While running Nomad I've hit a bootstrapping/critical-path problem. I have a Docker Registry in the cluster, and pulling an image requires:

  • The Registry is required to serve the image.

  • Traefik routes requests to the Registry and requests Let's Encrypt certificates.

  • GoCast announces the floating IP for Traefik.

  • Minio stores the images for the Registry.

Problems updating images

Separate from bootstrapping, updating the image for many of these services requires everything to already be running just to pull the next image. There is an open bug to address this in Nomad, but it doesn't seem like it will be resolved anytime soon.

When updating Traefik I run into a situation where GoCast has created the floating IP addr on the host but Traefik isn't running. The floating IP won't work while Traefik is running but not yet serving. GoCast BGP is working correctly in that the floating IP is not announced to the network, but the updating host still can't reach the other hosts' instances of the floating IP. I'm not sure if leaving the addr in place is a feature or a bug.

A way around this would be to run multiple instances of Traefik on each host. As currently set up, though, I'd need to bind multiple instances of Traefik to the same ports, and SO_REUSEPORT isn't supported. With GoCast I could map the floating IP ports to container ports and not require host networking (thus avoiding the port collision), but that may be quite burdensome to manage. I also haven't tried running multiple ports with GoCast NAT'ing.

Solving part of the problem

For the Traefik case of not being able to pull the image there are some workarounds. Manually pulling, or using system batch jobs, could solve this but both are fairly manual.

regclient has a daemon mode that can pull/sync images to registries, but it doesn't support pushing to a Docker Engine.

Docker Prefetch Image

I've started on a tool to prefetch Docker images based on a config file. Keeping the config file in sync with the images used in Nomad jobs is still a problem. The tool uses the Docker Engine API via the Rust docker_api crate to pull the images to the host.

Nomad's Consul Template integration can populate the config file from Consul to avoid manual file updates, which isn't terrible. I'm not sure if there is a nice way to integrate with the Nomad API to watch which images might be needed and pull them in advance of any job using them.

This has solved my case for updating parts of the critical path of Docker image hosting. It doesn't fully solve the bootstrapping case, though, where none of the services are running yet. An idea is to extend the config/API calls to have the "expected" image tag Nomad would look for and a "source". If the "expected" image cannot be pulled, try the "source" and tag it locally as the "expected" tag. This would allow prefetching all images required for bootstrapping the system!
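
As a rough sketch of that idea (the registry host and image names here are made up):

# try the tag Nomad expects, served from the in-cluster Registry...
docker pull registry.example.com/traefik:v2.10 || {
  # ...and if that fails, pull the upstream source and retag it as the expected image
  docker pull traefik:v2.10
  docker tag traefik:v2.10 registry.example.com/traefik:v2.10
}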


What I want for a Queue System

I've been interested in queue systems since first learning about and using RabbitMQ in ... 2009 (woah... it's been a minute). What I've learned through most of this is that:

  • People won't care as much as me
  • You can't make people care
  • If they don't care then it's even easier to make mistakes.

Redis is quite popular as a queue system and I've joined multiple companies/teams where Redis and Python-RQ were used for async tasks. Redis is wonderful and is a great solution to many problems (including async tasks!) but in the cases I have seen, it's been mostly an incomplete, improper solution.

Google Pub/Sub is pretty wonderful generally and the pattern I love most is combining Pub/Sub and Cloud Run for HTTP delivery of events. There are some limitations with this pattern but I love most of all the removal of many problems developers can cause.

Event -> HTTP -> Service makes handling events much easier.

  • It's difficult to run tasks for hours from a single HTTP request
  • Handling of events requires little knowledge of a particular library
  • Much of the complexity doesn't need to be in the app
  • Removing the complexity from the app makes it easier for more apps to use it, without lots of work

I can't run Pub/Sub and Cloud Run at my house though.

What I want from a queue system

  • HTTP and/or GRPC submission of events to the queue system
  • HTTP and/or GRPC push to a service
  • Possible to run in a home environment, but not the-worst-idea in a larger environment
  • Back-pressure. When too many events are in the system the publishing will slow down.
  • Easy to run and not worry about it
  • Small idle footprint in memory/CPU
  • Horizontal scalability. If it's ever used in production somewhere, adding capacity should be easy.

What I don't need from a queue system

  • Super high throughput. 10k events per second is wonderful... but if it can do 100 and scales out, I'm not too worried.
  • Perfect durability. I'll assume that at some point data might be lost and those outliers are OK.
  • Perfect deliverability. I normally add end-to-end checks so that a dropped event will only cause a delay, not a consistency problem.

What to do about it

I haven't found exactly what I'm looking for in other systems. Since I'm mostly scratching a self-hosting itch at the moment I'm looking to throw together a sample system to solve my problem, never expecting it to go beyond that (although maybe it will be useful for someone else?)

As I learn Rust, connecting Axum, Rust channels and Reqwest should get me pretty far.

And the real goal here is to build a system simple enough that it stays similar to what I'd recommend for production use cases with cloud services (and not my homegrown thing).


Nomad Events Logger

As part of learning Rust I built a tool to read events from the Nomad Events API and log them to stdout. This allows an easy, low-resource way to pull Nomad cluster events into your log processing stream.

Low-resource as in ~4MB of memory for the Docker container!

Nomad Events Logger is deployable as a Docker image, and if I get around to it, a native binary as well. At the moment I run ~everything in Docker in my Nomad cluster so let me know if you want other formats.


Paperless-ngx Celery won't consume documents

When running Paperless-ngx I ran into a problem where the Celery process in Docker (as part of supervisord) would start, supervisor would report it running, but the Celery process appeared to do nothing.

The last related lines I would see were:

INFO success: celery entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
[INFO] [paperless.management.consumer] Adding [REDACTED] to the task queue.

I'm not sure what part of Celery does this (maybe it's just Paperless?), but eventually I found a .__celery.lock file in the Paperless data directory. Removing that allowed everything to work again.

This was likely caused by Nomad terminating the process and the lock file not getting cleaned up. I now have my Nomad job remove .*.lock files before starting Paperless.
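
Something along these lines as a pre-start step (the data path is a placeholder for wherever your Paperless data directory is mounted):

# clear stale lock files left behind by an unclean shutdown
rm -f /path/to/paperless/data/.*.lock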


More Declarative Containers with NixOS

In newer versions of NixOS it's possible to use Docker directly in your /etc/nixos/configuration.nix!

Example from that page:

 { config, pkgs, ... }:
 {
   config.docker-containers = {
     hackagecompare = {
       image = "chrissound/hackagecomparestats-webserver:latest";
       ports = ["127.0.0.1:3010:3010"];
       volumes = [
         "/root/hackagecompare/packageStatistics.json:/root/hackagecompare/packageStatistics.json"
       ];
       cmd = [
         "--base-url"
         "\"/hackagecompare\""
       ];
     };
   };
 }

I've moved to this format as it's a bit cleaner and simpler to use for syncing container images than rkt wound up being.


Risks With Git Tag Triggered Deploys

Git workflows come in many flavors. Once the code hits a continuous integration system, your workflow will need to trigger a deploy to production. A common way of handling this is to create a Git tag that triggers the deployment. Using a Git tag this way, though, can add risk to deploying your code safely.

These risks can be countered in multiple ways, but these are patterns I've seen in the deployment process for various services.

Tags can be pushed by anyone with write access

Your process may allow anyone to trigger a deploy to production. In many ways this is a good thing. In GitHub, though, certain branches can be protected in order to enforce a particular workflow, such as requiring that each pull request receive approval from one other person.

Tags in GitHub do not have such protection. Anyone with write access could push a tag, bypassing the GitHub workflow.

Tags do not have an order

Any commit in the repository can be tagged. There is little difference (to Git) between a tag on the latest commit and a tag on a commit from 3 months ago. If your process relies on some semantic meaning for these tags you will have to encode that information and handle it in your deployment automation.
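
For example, nothing stops a tag from landing on a commit from months ago (the tag name and commit SHA here are made up):

# tag an old commit and push it; CI only sees a brand new tag
git tag v9.9.9 abc1234
git push origin v9.9.9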


Declarative Containers with NixOS

I spent some time recently attempting to setup some software on a NixOS system I have at home. It looks like declarative containers were removed in an earlier version of NixOS as they weren't quite ready for use. After some searching I was able to find an example with rkt!

Setting up a container can be as simple as adding this to your /etc/nixos/configuration.nix:

virtualisation.rkt.enable = true;

systemd.services."rkt-nginx" = {
  description = "Nginx (rkt)";
  wantedBy = [ "multi-user.target" ];
  serviceConfig = {
    Slice = "machine.slice";
    ExecStart = ''\
      ${pkgs.rkt}/bin/rkt run --insecure-options=image \
      --net=host \
      docker://nginx
    '';
    KillMode = "mixed";
    Restart = "always";
  };
};

OmniOS CE Networking on OVH

I recently found that my DHCP leasing on OVH was unreliable. The address worked at one point, but after a few months/reboots I found that the instance could no longer obtain a lease. After a few attempts to release/renew, I decided to set a static IP.

The General Administration page has general information about setting this. You'll need the IP for the specific server from your OVH control panel; from that, the routing gateway can be determined.

The gateway is the same as the IP of the server with the last octet replaced with 254. If the IP is 10.2.3.4, the gateway is 10.2.3.254. To set this on the host:

ipadm create-addr -T static -a $SERVER_IP/32 ixgbe0/v4
route -p add default $GATEWAY_IP

Listing All Versions of an IPS Package

Listing all packages (with FMRIs) can be useful to see what you could install. It wasn't immediately obvious to me, and I couldn't easily find how to do it.

pkg list -afv $PACKAGE

-af lists all versions, regardless of installation state

-v includes the FMRI in the output

If you don't see a newer version you think should be there, try a pkg refresh!


Copying IPS Packages Across Repositories

With the release of OmniOS CE I've found myself needing packages from OmniTI's Managed Services repository.

My first attempt was to copy packages with pkgrecv. This, however, caused problems because the IPS server didn't know about the source repository. Adding the repository to the IPS server didn't fix the problem.

This can be fixed by changing the repository FMRI before uploading.
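
I don't have the exact commands handy anymore, but the rough shape of that fix looks like the sketch below; the repository URL, publisher names, and package name are placeholders, so check the pkgrecv(1) and pkgmogrify(1) man pages before relying on it.

# fetch the package in raw form from the source repository
pkgrecv -s https://SOURCE_REPO/ --raw -d ./incoming PACKAGE_NAME

# rewrite the publisher portion of pkg.fmri with a transform such as:
#   <transform set name=pkg.fmri -> edit value pkg://SOURCE_PUBLISHER/ pkg://MY_PUBLISHER/>
pkgmogrify ./incoming/PACKAGE_NAME/*/manifest republish.mog > manifest.republish

# publish under the new FMRI, using the received files as the source directory
pkgsend publish -s PKGSERVER -d ./incoming/PACKAGE_NAME/*/ manifest.republish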


Deploying This With CircleCI

Despite using automated deploys for most things I work on I had put off setting up such a mechanism for this site. Not sure what took so long.

With CircleCI I added a circle.yml file of:

dependencies:
  override:
    - pip install -r requirements.txt

test:
  override:
    - make build

deployment:
  deploy:
    branch: master
    commands:
      - make upload

And then created an IAM user with the right S3 permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1492350849000",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::philipcristiano.com",
                "arn:aws:s3:::philipcristiano.com/*"
            ]
        }
    ]
}

Execute AWS User Data on OmniOS

As I started to use OmniOS on AWS I ran into the problem that it does not, by default, include a way to execute the AWS User Data script when starting an instance. The User Data script provides a wonderful mechanism to bootstrap new instances. Without it, you may be required to use a configuration management tool or manually configure instances after they have booted.

The script is helpfully available through the instance metadata API at 169.254.169.254 with the URL http://169.254.169.254/2016-09-02/user-data. It should be simple to pull that down and execute the script with SMF!
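
The core of that is just fetch-and-execute, something like the sketch here (assuming curl is available on the instance); the actual script described next adds the SMF integration and error handling.

#!/usr/bin/bash
# fetch the user data and run it; exit non-zero if the fetch fails so SMF can retry
curl -sf -o /tmp/user-data http://169.254.169.254/2016-09-02/user-data || exit 1
chmod +x /tmp/user-data
/tmp/user-data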

I've put together a script to do this. It runs under SMF with a default timeout of 15 minutes and will restart if there are errors. There is a handy dandy install script in the repo that will download and install the needed files. At the moment this isn't packaged, as this script is needed before I would set up a package repository.

There is still the problem of how to get this into an AWS AMI. Packer can build the image for us so that the AMI we launch will already have this script. The buildfile for this image is rather simple but the whole process is a powerful one.

To get your own OmniOS AMI with AWSMF-Data installed you can use the above Packer build.

  • Install Packer

  • Clone the repo

$ git clone https://github.com/philipcristiano/packer-omnios.git

  • Execute build.sh after setting a few variables

$ export AWS_ACCESS_KEY_ID=...
$ export AWS_SECRET_ACCESS_KEY=...
$ export VPC_ID=...
$ export SUBNET_ID=...

$ ./build.sh

VPC_ID and SUBNET_ID are only required if you have a need to specify them (like no default VPC in your account), in which case the build.sh can be modified.

From here we can create User Data scripts in AWS and have new EC2 instances run code when they start!


How To Package a Service For OmniOS

A previous post showed how to install files. If you want to run a service from that package, there are a few more steps.

Service Management Facility

The Service Management Facility provides a way to manage services in OmniOS. If you are running a service you installed from a package, this is the way to do it.

Steps to Package and Run a Service

We will need to complete a few steps to package up a service and deploy it with IPS.

  • Create an SMF manifest that instructs SMF how to run our service

  • Deploy the SMF manifest

  • Start the service.

  • Optionally, the service can be modified to read SMF properties so that it can be configured through svccfg, as sketched below.
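
For example, with a hypothetical echo-server service, a property could be set with svccfg and read back in the start method with svcprop:

# create a property group and set a property on the (hypothetical) echo-server service
svccfg -s echo-server addpg config application
svccfg -s echo-server setprop config/port = integer: 8080
svcadm refresh echo-server

# inside the service's start method, read the property back
PORT=$(svcprop -p config/port $SMF_FMRI)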

Tools

Creating an Echo Server

Creating an SMF Manifest

A service manifest is an XML document that contains the information required to run a command as a service. This would normally mean that you have to create a new XML document for each service. Thankfully there is a tool, Manifold, that can create a manifest from answers to the relevant questions.


How to Package Your Software for OmniOS

Packaging for OmniOS goes over how to create a package using the same build system as is used for building OmniOS. The layout of that repository seems designed for building already-written software to be used in OmniOS. If you need to package your own software then this can be more overhead than you are looking for. The tools used by that GitHub repository are included in the default installation of OmniOS, and there is plenty of documentation on Oracle's site about how to use IPS. It turns out you can start making packages for OmniOS with only a few commands.

This post will cover the tools required to create a package, not necessarily best practices in packaging for OmniOS.

I've created an example repository that can build and upload a package to an IPS package depot if you want to skip ahead.

Tools

The packaging commands we will be using are

  • pkgsend - Generates the package manifest and publishes the package

  • pkgmogrify - Transforms the package manifest

  • pkglint - Linter for package manifests

  • pkgfmt - Formatter for package manifests

  • pkgrepo - (optional) Refresh the repository search index after upload

Example Application

We will be packaging a Hello World script stored in hello-world.sh.

#!/usr/bin/bash

echo Hello World!

This file needs an execute bit as well so we will run

chmod +x hello-world.sh

Building the Manifest

pkgsend will generate a manifest for us if we can build a directory that mimics the deployed layout. If we put our script in build/usr/bin (and remove the extension) then run pkgsend generate build we will get a manifest of files and directories to package.
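
Laying out that build directory is just a couple of commands, something like:

mkdir -p build/usr/bin
cp hello-world.sh build/usr/bin/hello-world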

$ /usr/bin/pkgsend generate build
dir group=bin mode=0755 owner=root path=usr
dir group=bin mode=0755 owner=root path=usr/bin
file usr/bin/hello-world group=bin mode=0755 owner=root path=usr/bin/hello-world

Our manifest so far says we need two directories and a file. This would be enough of a manifest to start with but can be problematic if the directories don't line up with the host used to install the package. It would be better to remove the directories and assume that /usr/bin already exists on the system, since it really should already be there.

The command pkgmogrify can take a manifest and a transform file and output a transformed manifest.

A simple transform to do this will be stored in transform.mog

<transform dir path=usr -> drop>

This will drop any directories that include the path usr. If you are building a more complex directory structure then using something like usr/bin$ as the path will only drop the common /usr/bin elements from the manifest.

For this we will write the manifest to a file, then mogrify it to remove the directories.

$ /usr/bin/pkgsend generate build > manifest.pm5.1
$ /usr/bin/pkgmogrify manifest.pm5.1 transform.mog

file usr/bin/hello-world group=bin mode=0755 owner=root path=usr/bin/hello-world

This now has just our script in the manifest. Using pkgmogrify we can easily script changes to manifests instead of relying on manual changes to clean up a generated manifest.

We'll write the updated manifest to a new file

$ /usr/bin/pkgmogrify manifest.pm5.1 transform.mog > manifest.pm5.2

Package Metadata

We have the manifest for what the package should contain but we still need to describe the package with metadata. We will need to include at least a name, version, description, and summary for the package.

The name and version are contained in a Fault Management Resource Identifier, or FMRI.

I recommend reading the link above about proper format and conventions for FMRIs but for now we will write metadata.mog to contain

set name=pkg.fmri value=example/hello-world@0.1.0,0.1.0-0.1.0:20160915T211427Z
set name=pkg.description value="Hello World"
set name=pkg.summary value="Hello World shell script"

We can use pkgmogrify to combine our metadata and current manifest file to make the final manifest used for publishing our package. In this case we use pkgfmt to format the file as well.

$ /usr/bin/pkgmogrify metadata.mog manifest.pm5.2 | pkgfmt > manifest.pm5.final

Linting

The manifest we have now should work for publishing the package. We can verify by running pkglint on the final manifest.

$ /usr/bin/pkglint manifest.pm5.final
Lint engine setup...
Starting lint run...
$ echo $?
0

No errors or warnings, wonderful!

Publishing the Package

We now have a directory structure for the package we would like to create as well as a manifest saying how to install the files. We can publish these components to an IPS package depot with pkgsend

$ pkgsend publish -s PKGSERVER -d build/ manifest.pm5.final
pkg://myrepo.example.com/example/hello-world@0.1.0,0.1.0-0.1.0:20160916T182806Z
PUBLISHED

-s specifies the package server, -d specifies the directory to read, and we pass along the path to our manifest. Our package was then published!

Troubleshooting

If you are using an HTTP depotd server to publish and see the error pkgsend: Publisher 'default' has no repositories that support the 'open/0' operation you will need to disable read-only mode for the server or publish to a filesystem repository.

Refresh the Package Search Index

The HTTP depotd interface doesn't refresh the search index when a package is published. This can be done with the pkgrepo command.

$ pkgrepo refresh -s PKGSERVER

Refreshing pkg.depotd After Package Upload

After uploading a package to an OmniOS package repository I was unable to find the package by searching. The package could be installed and local searching would find it, but the depotd server didn't know how to find the package when searching. Restarting pkg/server would work around the issue but having to do that after each publish would get annoying.

There is a command pkgrepo that will refresh the search index remotely!

Running

pkgrepo refresh -s PKGSRVR

is enough to reload the search index.


Error Publishing to pkg.depotd

When publishing to an IPS depotd server you may see the line

pkgsend: Publisher 'default' has no repositories that support the 'open/0' operation.

If the depotd server will show you a web page but publishing does not work with pkgsend, you may have the server set up in read-only mode. svccfg will allow you to change the property with:

svccfg -s pkg/server setprop pkg/readonly = false

Don't do this to a server on the internet, though; placing an HTTP server in front of depotd will allow you to add authentication. It is otherwise insecure!


Building IPS Packages For OmniOS

I've started trying to package some software for OmniOS for personal use. The OmniOS Packaging page in the wiki goes through how to do it using the tools used to build the OS. This is a bit more than I would want to do when publishing software to GitHub. I would rather not rely on a repository used to build the OS just to package one piece of software.

A few months ago I was trying to package a personal project and got most of the way there! So far there is a make target that will package an Erlang release into an IPS package. I think it only got as far as putting the files on disk. I still need to add the SMF manifest and fix permissions, but it's a much smaller setup for packaging a single piece of software.