Back in the pre-historic era when I was at Uni, I lost a lot of time to Twitch Plays Pokemon.
I was fascinated by the idea of a collective playing a game, and the chaos that ensued. A whole mythos got built up, and it was a lot of fun to follow and occasionally spam in the chat.
Anthropic has started using "How far can Claude get in Pokemon" as one of their benchmarks, but they have also set up an emulator streaming on Twitch.
There is an easter egg in the system prompt, where if you ask "How many R's in Strawberry?" it will generate a little website that counts the number of R's in the word strawberry.
I have pasted the HTML directly below (simultaneously testing how markdown with HTML gets handled by MkDocs)
I've been really happy with my self-hosted RSS reader, FreshRSS.
I've consolidated a lot of the newsletter spam and occasional checking of blogs with a more considered decision to go to the RSS reader.
This means that I have had less desire to check out what's going on on Mastodon.
I just found out that I can get an RSS feed for Mastodon accounts. So, if I wanted to follow @charliermarsh, I can just append .rss to the URL of his profile page. (Note, this has to be his original page.)
So, Charlie's original page is https://hachyderm.io/@charliermarsh and the RSS feed with all his public posts is available at https://hachyderm.io/@charliermarsh.rss.
I have started to follow a few Python people where I'm not aware of any blog to follow instead. Aside from Charlie Marsh, I also now follow:
When automating infrastructure changes through CI/CD pipelines, it can be VERY scary to merge a pull request that changes something in your infrastructure that you are not very familiar with.
Sure, you have tested the change in a test environment. Did you try to make a plan against a dev environment too? Have you tested against all the relevant targets of this change?
Maybe you are reviewing someone else's infrastructure changes. How can you be sure you would catch it if the change is actually destroying and recreating all the databases?
I've dealt with too much of this anxiety! AUTOMATE IT AWAY!
With some inspiration from my friend, Lasse Hels, I created this bash script for Azure DevOps Pipelines.
- It is assumed to be running as part of a pull request validation pipeline.
- Assuming there is a Terraform plan saved in the file `tfplan`, it parses the plan as both plaintext and JSON.
- No matter what, the plaintext plan is posted as a comment in the pull request. The comment is collapsed by default.
- If it finds any destructive changes in the plan, the comment gets a big scary warning and is marked as "Active". This means someone has to look at it and resolve it before the pull request can be merged.
````bash
#!/bin/bash
set -euo pipefail

# Somehow create your plan as tfplan
PLAN_TEXT=$(terraform show tfplan)
PLAN_JSON=$(terraform show -json tfplan)

HAS_DESTRUCTIVE_CHANGES=$(echo "$PLAN_JSON" | jq -r '.resource_changes[] | select(.change.actions[] | contains("delete"))')

# Conditional alert
DANGER_MESSAGE=""
if [ ! -z "$HAS_DESTRUCTIVE_CHANGES" ]; then
  DANGER_MESSAGE="**DANGER! YOU ARE ABOUT TO DESTROY RESOURCES**"
fi

# Actual comment to be posted
CODE_BLOCK_FENCE='```'
COMMENT=$(cat << EOF
${DANGER_MESSAGE}

<details>
<summary>Click to expand</summary>

${CODE_BLOCK_FENCE}
${PLAN_TEXT}
${CODE_BLOCK_FENCE}

</details>
EOF
)

# Set comment status to Active for destructive changes, Resolved otherwise
COMMENT_STATUS=2 # Resolved
if [ ! -z "$HAS_DESTRUCTIVE_CHANGES" ]; then
  COMMENT_STATUS=1 # Active
fi

# Build payload for ADO API
JSON_PAYLOAD=$(jq -n \
  --arg content "$COMMENT" \
  --arg status "$COMMENT_STATUS" \
  '{comments: [{content: $content}], status: ($status|tonumber)}')

# Call ADO API to make the comment
curl -X POST \
  "$(SYSTEM_COLLECTIONURI)/$(SYSTEM_TEAMPROJECT)/_apis/git/repositories/$(BUILD_REPOSITORY_NAME)/pullrequests/$(SYSTEM_PULLREQUEST_PULLREQUESTID)/threads?api-version=6.0" \
  -H "Authorization: Bearer $(SYSTEM_ACCESSTOKEN)" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD"
````
I'm distributing a small Python package at work. A small library with some utilities for doing Machine Learning work.
I'm using uv to manage the dependencies and the build process.
How do I know that my library will work with both pydantic==2.0 and pydantic==2.10 (The current version at time of writing)?
I could just require a much smaller band of possible versions, but I want my library to be useful for as many users as possible.
And they might need to use a different version of pydantic for their own projects.
Similarly, I want to make sure my library actually works with the range of allowed Python versions.
I run my tests with uv run pytest. This will use the locked dependencies in the uv.lock file to create a virtual environment and run the tests in that environment.
But, I can use the --resolution flag to test my library with different versions of the dependencies.
According to the uv documentation, there are three possible values for the --resolution flag:
- `highest`: Resolve the highest compatible version of each package
- `lowest`: Resolve the lowest compatible version of each package
- `lowest-direct`: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies
I have found that `--resolution lowest` is not really that useful, because some transitive dependencies might not specify a version range at all. If a dependency just requires "numpy" without a constraint, I end up testing my library against numpy==0.0.1, or whatever the lowest published version is, which tells me nothing. Instead, I use `--resolution lowest-direct` to test against the lowest versions of my direct dependencies while letting transitive dependencies resolve to their highest compatible versions.
I can also specify the python version to use with the --python flag.
Finally, I can use the --isolated flag to make sure that the tests are run in an isolated virtual environment, not affecting the active venv of my workspace.
Here is the entry in my justfile that runs the tests with different dependency resolutions:
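(Reconstructed from the description above rather than copied verbatim; the Python version is just an example of the oldest version my library supports.)

```just
test-resolutions:
    # Test against the highest compatible versions (roughly what the lockfile gives me)
    uv run --isolated --resolution highest pytest
    # Test against the lowest allowed versions of my direct dependencies
    uv run --isolated --resolution lowest-direct pytest
    # And against the oldest Python version my library claims to support
    uv run --isolated --resolution lowest-direct --python 3.10 pytest
```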
I spent some time tonight configuring some services on my home Kubernetes cluster. See this post for more details on how I set up the cluster.
So far it's been a fun experiment to see if I can avoid anything spontaneously catching fire. At work there is a full team of experts dedicated to keeping our cluster running smoothly. At home, there is... me.
I've been using Cursor as a guide, generating the YAML files and asking questions about how Kubernetes works. I think I am a decent user of Kubernetes clusters, but a rank novice as an operator of one.
I want to try to get away from doomscrolling, and being caught in some algorithmically generated news feed. I'll try FreshRSS for a while at least.
To get started I asked Cursor to generate a deployment, giving it a link to the FreshRSS documentation.
I had to go back and forth a few times to understand how to get a URL to resolve on my home network.
The kubernetes cluster is running on the host tyr, so I can ping that from my home network on tyr.local.
Initially I wanted to host FreshRSS at rss.tyr.local, but I didn't figure out how to do that. Instead I hosted it at tyr.local/rss and then added Middleware to strip the /rss path before sending the traffic to the Service.
Since both services are local and not exposed to the internet, there is probably no harm in committing something like this.
But I had a discussion about this problem with my smart, very Kubernetes-savvy friend, Lasse. So I wanted to find some solution that did not commit any deadly sins, such as committing a secret in plaintext.
Again, I asked Cursor for suggestions, prompting it to not make any changes to my files, just list out options.
Here is the full list of suggestions:
List some ways I can add secrets here without leaking it in git
Using Sealed Secrets:
```bash
# Install sealed-secrets controller
helm install sealed-secrets sealed-secrets/sealed-secrets

# Encrypt your secrets
kubeseal < atuin/secrets.yaml > atuin/sealed-secrets.yaml
```
This lets you safely commit encrypted secrets that only your cluster can decrypt.
Using External Secrets Operator with a secrets manager:
Using SOPS: this encrypts secrets with age, PGP, or cloud KMS keys.
I chose to use Sealed Secrets. It sounded by far the simplest, and I could sort-of understand what was going on:
1. I install the Sealed Secrets Helm chart on my cluster. This can encrypt and decrypt text using some fancy crypto magic. Basically, a private/public key pair is generated and stored as Kubernetes Secrets, and these are used to encrypt my secrets.
2. I install kubeseal on my local machine. It can communicate with the Sealed Secrets controller running in the cluster.
3. I pass a local (uncommitted) Kubernetes Secret to kubeseal, it encrypts it, and I get back a SealedSecret.
4. I can then store, apply, and commit this SealedSecret. It will get unsealed when applied to my cluster (so my services can use it), but the unsealing only happens inside the cluster. My local manifest file stays encrypted.
Let's say I want to encrypt this Secret
secrets.yaml
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: atuin-secrets
  namespace: atuin
type: Opaque
stringData:
  ATUIN_DB_USERNAME: atuin
  ATUIN_DB_PASSWORD: "123"
  ATUIN_DB_URI: "postgres://username:123@postgres/atuin" # Match the password here
```
I can run kubeseal to encrypt:
```bash
kubeseal < secrets.yaml > sealed-secrets.yaml
```
and I get back
sealed-secrets.yaml
```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  creationTimestamp: null
  name: atuin-secrets
  namespace: atuin
spec:
  encryptedData:
    ATUIN_DB_PASSWORD: AgBKfphBarMiNX8CIsvjAXqEtRp/Bq+a4y67k/M6bxMm1w/[TRUNCATED FOR SPACE]
    ATUIN_DB_URI: AgCfm2AisGVBlMrOqPvMWOor0e0UXDruZnWVG3klrfSzbtZfrzYF4x[TRUNCATED FOR SPACE]
    ATUIN_DB_USERNAME: AgAt8yDkKRjmvJtB4ecxOOcuEm1Zcoa8pX1UvtvwAAT4M18PN3JK[TRUNCATED FOR SPACE]
  template:
    metadata:
      creationTimestamp: null
      name: atuin-secrets
      namespace: atuin
    type: Opaque
```
Pretty cool!
I have also backed up the Sealed Secrets private key in my 1Password.
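(The key pair lives as a Secret in the namespace where the controller runs, so exporting it for backup is roughly this; the label selector is the default one used by sealed-secrets, and you should adjust the namespace to wherever you installed the controller.)

```bash
# Export the sealed-secrets key pair so it can be stored somewhere safe
kubectl get secret -n kube-system \
  -l sealedsecrets.bitnami.com/sealed-secrets-key \
  -o yaml > sealed-secrets-key-backup.yaml
```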
```yaml
# config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: atuin-config
  namespace: atuin
data:
  ATUIN_HOST: "0.0.0.0"
  ATUIN_PORT: "8888"
  ATUIN_OPEN_REGISTRATION: "true"
---
# deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: atuin
spec:
  replicas: 1
  strategy:
    type: Recreate # Prevent data corruption by ensuring only one pod runs
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgresql
          image: postgres:14
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: atuin
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: atuin-secrets
                  key: ATUIN_DB_PASSWORD
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: atuin-secrets
                  key: ATUIN_DB_USERNAME
          lifecycle:
            preStop:
              exec:
                command:
                  [
                    "/usr/local/bin/pg_ctl",
                    "stop",
                    "-D",
                    "/var/lib/postgresql/data",
                    "-w",
                    "-t",
                    "60",
                    "-m",
                    "fast",
                  ]
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
            limits:
              cpu: 250m
              memory: 600Mi
          volumeMounts:
            - mountPath: /var/lib/postgresql/data/
              name: database
      volumes:
        - name: database
          persistentVolumeClaim:
            claimName: database
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: atuin
  namespace: atuin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: atuin
  template:
    metadata:
      labels:
        app: atuin
    spec:
      containers:
        - name: atuin
          image: ghcr.io/atuinsh/atuin:v18.4.0 # Using a specific version as recommended
          args:
            - server
            - start
          env:
            - name: ATUIN_DB_URI
              valueFrom:
                secretKeyRef:
                  name: atuin-secrets
                  key: ATUIN_DB_URI
            - name: ATUIN_HOST
              valueFrom:
                configMapKeyRef:
                  name: atuin-config
                  key: ATUIN_HOST
            - name: ATUIN_PORT
              valueFrom:
                configMapKeyRef:
                  name: atuin-config
                  key: ATUIN_PORT
            - name: ATUIN_OPEN_REGISTRATION
              valueFrom:
                configMapKeyRef:
                  name: atuin-config
                  key: ATUIN_OPEN_REGISTRATION
          ports:
            - containerPort: 8888
          resources:
            limits:
              cpu: 250m
              memory: 1Gi
            requests:
              cpu: 250m
              memory: 1Gi
          volumeMounts:
            - mountPath: /config
              name: atuin-config
      volumes:
        - name: atuin-config
          persistentVolumeClaim:
            claimName: atuin-config
---
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: atuin
  namespace: atuin
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
    traefik.ingress.kubernetes.io/router.middlewares: "atuin-strip-prefix@kubernetescrd"
spec:
  rules:
    - host: tyr.local
      http:
        paths:
          - path: /atuin
            pathType: Prefix
            backend:
              service:
                name: atuin
                port:
                  number: 8888
---
# middleware.yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: strip-prefix
  namespace: atuin
spec:
  stripPrefix:
    prefixes:
      - /atuin
    forceSlash: true
---
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: atuin
---
# sealed-secrets.yaml
---
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  creationTimestamp: null
  name: atuin-secrets
  namespace: atuin
spec:
  encryptedData:
    ATUIN_DB_PASSWORD: AgBKfphBarMiNX8CIsvjAXqEtRp/Bq+a4y67k/M6bxMm1w/fJUERNqBKaPWqaABfHR4WEk9ePj4CWcVbHb2xVCviX4zYE4pZ9onMvzRGJa2UUl1qRsJGN/ooMRJux+ztfSXJfRzzZxt1QjBlJOmMxG0XjKu0TdahXnI4BMJ2rrBPPmWx9sr4z8YxG8BU/TL8DiJGiD2DtarQWmqSogueGpsOE/9hdeWvW4E7RNlcd7JJ0Hv/nELlhVIUB9fzGoaioDJO6qodYBWNtt2ckyNp3KwoOKXddwRV5tq1ggPKnZOqlHpDgmTaYAFNPXVGIpMNxzUfs+CU0VdT60hx5e3qMbVD86NrnqmbQ38GYc/A7TDrWImSEPjkweLPSTgK5YuQEHJBGYDy9jNNVTMHwfcXkAZkD8swu8+2Whw6No1D2WO2LwewVdTDOynjVhekGk3UF6B2lqIn9TowkIBbZZ6mYYK4VzXRCRXmo2ZiEqDMQK78ejUHdK5m43cZ9M+BEmE3lKzAmgZt+xons/xcisI63pff31urXWZsFylZvnVUnR/l0cp5jmr8KDnMp1WDPf+UyhSlxVvnfAKRyXIGi6jpMQluXVvx/waX4MdqgJMfyn3cQ6tFH4YiZCX6kdNNWjJp5lYxmhRdqWRznCB1vxuWIfXCc9eUT8Kz0Houmw/S8HR11ApNoxopbalC23wdTa9ZXlJdC4bXElfdC8HHwjTcNezDN9mc+4e+WdaKkbuYZljP
    ATUIN_DB_URI: AgCfm2AisGVBlMrOqPvMWOor0e0UXDruZnWVG3klrfSzbtZfrzYF4x+sY7fVLsfUY3RSRF84m13hIJPBxhiO3pFPAs6e6zm5GH7B+8Iem1ijIXWNVW5oc7h/Kas77k1h+TcJTVyZ4gL52oqzZM3cwAX0UdE/enNrvYWoeTsJ0UMbNw3bKZ9Ll0BPfdirdHT8Ve7jMzaDF+d11difPOhyZ7wgK3ykzOGu9G8LbzJ8IwUYYFK/1DETYU76XC/d79tUOwSYxGwf88/r2zjn9ZFA7rnzzEnV7ECR33fSoRJALZMyHMUOp8cxa1rYGPrBRyHhivdhhUnyRgXqAq/oymQo4+cwBHZFSpmtEqafQ8RpuOr2ymRgrxBGfe4n4eLprzY5EUZpFRhgxonb10YL16vg/oAlWObdYkS17ZayQtsfbHBD2udjljQXrjWNIWlT6fXG8JeJth+kFewr9+2c0Rfh9sQJ+F2otBk5x+dbt5xTKppAsAEHIy9lN8/Gbh+U+woCxgP11x+w/HYX9KXDkGHcOiAteYEI7Cf2Eo1TKD7ICVTVfReETWxAzSpKMabltNuM8fuLj6dHakvkQ6PgS537ShhyGofbLQaWTB8AMpwRCIUZme6EkfZuoO2CBt8gCnL3U6geDhHUB4ZGU4g9wPL/FlIqSPaWhafwbjc+PCyXqpOMNHdXtNc7D7bAsWN1Nri3Gk1D4ae0BDTunG/SgX4rlx6zc8kGgmFtJ/cnX//RO40Om2Yf36bdeb3KgDo4Ia49EZDaH7FlRn1cwUax0Gr3Jz4=
    ATUIN_DB_USERNAME: AgAt8yDkKRjmvJtB4ecxOOcuEm1Zcoa8pX1UvtvwAAT4M18PN3JK+6yOyhHuuTwWtWphlQnAjSWx6Bu8usgIxrw9dhBCRxf4pJIaW2VmszUnn1HOtdEFcU6+40PEZ8vJEqCQz/sQoilhZyH06VYecNZFtUHleFAaEFfSGPtxd73lqpjY62fOI8yoGfd/lmXays5vjSx9kUtUVd71FYEOf7P6x+OWlFWsbQ6FepiHygoCXTiCi9umbherpIHWCMZxELja/mNdVZp2wIO+NytedM47LIy2U0FP3b6quPc1H52OK/9AK9TJf/Ke8vUaRDE6TAqv1K0fT5diD4zwERzpNoHKHhnejKj1FOCm6WVcnPHk17zy9Et+kdB+feKpgbeZlolCSJ+JgNWnM2Y3WaovQI4i4yq3ipqQDI1AgY6hHMj1HGNH8gpFjHRy/+UfPd1f4aDO6hGAbL86O2y18VcqD7gESRJ7XVWikJWpU2hIp2FAEpopoqU1QPWyTGvvC46g+gfTARIphn1EzjKymdc4ICb8Viuy/B1oVuwFaD7y9FnNx3tPP4cSuODiG2u6q0j/UTMkAftGqPZUNu3yfkrJHziKUnGc9kuasgAFJKXL2qJuG4VBxNPwTmp2VnJiBysvUb1JTTYd+2uEu4woGmzVfm/9kjkP1rbRk+hAUj5fyW2Nebds9dgD2gXZ2yGOK/S1G0TXnriSQA==
  template:
    metadata:
      creationTimestamp: null
      name: atuin-secrets
      namespace: atuin
    type: Opaque
---
# services.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: atuin
  namespace: atuin
spec:
  type: ClusterIP
  ports:
    - port: 8888
      targetPort: 8888
  selector:
    app: atuin
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: atuin
spec:
  type: ClusterIP
  ports:
    - port: 5432
      targetPort: 5432
  selector:
    app: postgres
---
# storage.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database
  namespace: atuin
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: atuin-config
  namespace: atuin
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
```
I found an old mini PC (ASUS Mini PC PN30), left in a drawer from when I thought I needed it to run a Plex media server.
With a sudden (unexpected) burst of motivation I decided to run a local kubernetes cluster on it.
(In hindsight, I think I might also have been inspired to try self-hosting an RSS reader by this post. I just got distracted, by deciding to self-host using kubernetes).
I asked ChatGPT and Claude for help on how to set up a simple kubernetes setup at home.
After some back-and-forth I landed on installing Debian with no graphical desktop environment, and then installing k3s.
The choice of k3s was mainly made to limit the resource requirements. The Mini PC is not exactly beefy, with an underpowered CPU and only 4GB RAM (While trying to confirm this number, I found the listing for this on Amazon and it claims that I can upgrade to 8GB RAM. I might do that at some point).
I downloaded an ISO of Debian 12 and made a bootable USB stick.
I connected the Mini PC to a monitor, keyboard and mouse and booted the Debian installer from the USB stick.
I selected graphical installer and followed the guidance.
I did not create a root user, instead letting my own user get sudo privileges.
I did not install a desktop environment.
I gave it the hostname tyr.
I made sure to select SSH, to allow access after I unplug the peripherals.
I tried accessing the Mini PC over ssh from my desktop.
```bash
ssh kasper@tyr.local
```
This did not work, but using the local IP directly works fine.
I really want the hostname.local thing to work. I learned that it is called mDNS, and that I need an mDNS service running.
I installed Avahi, both on my desktop and on the Mini PC
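I didn't keep the exact commands, but it was more or less this (the standard Debian Avahi packages, plus the official k3s install script):

```bash
# On both machines: mDNS, so that tyr.local resolves
sudo apt install avahi-daemon libnss-mdns

# On tyr: install k3s as a single-node server
curl -sfL https://get.k3s.io | sh -
```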
After a minute, the Kubernetes cluster is running and I can query it from tyr:
```console
$ sudo k3s kubectl get node
NAME   STATUS   ROLES                  AGE   VERSION
tyr    Ready    control-plane,master   15h   v1.31.4+k3s1
```
Next, I want to access it from my desktop.
Following the k3s guide, I copy /etc/rancher/k3s/k3s.yaml from tyr to ~/.kube/config on my desktop with scp, and edit the server field to point to the IP of tyr.
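Roughly like this (the IP is a placeholder, and since k3s.yaml is root-only by default you may need to copy it to a readable location with sudo on tyr first):

```bash
# Copy the kubeconfig from tyr to the desktop
scp kasper@tyr.local:/etc/rancher/k3s/k3s.yaml ~/.kube/config

# Then edit ~/.kube/config and change
#   server: https://127.0.0.1:6443
# to the IP of tyr, for example
#   server: https://192.168.1.50:6443
```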
I tried a lot to get tyr.local to resolve instead of the IP, but as far as I can tell, kubectl is not using the mDNS stuff from above.
Here is the last chat message (in a long back-and-forth) from o1 on why .local does not work.
> A statically compiled binary often does not use the system's usual NSS (Name Service Switch) mechanisms—like /etc/nsswitch.conf and libnss-mdns—for hostname resolution. Instead, it typically performs "pure DNS" lookups.
>
> That explains why:
>
> - ping tyr.local succeeds, because it honors nsswitch.conf and uses Avahi/mDNS.
> - kubectl fails on tyr.local, because it bypasses your local mDNS setup and tries querying a DNS server that doesn't know about .local names.
ChatGPT suggested some ways to fix it, but the simplest seemed to be to just plug in the IP.
I made sure to go to my router and reserve the local IP address of tyr, so it does not change after a reboot or something.
And finally, I can run the following from my desktop
```console
$ kubectl get node
NAME   STATUS   ROLES                  AGE   VERSION
tyr    Ready    control-plane,master   44h   v1.31.4+k3s1
```
I have posted (very) few posts to my blog over the years.
Recently, one of the things holding me back from posting is that I can't really build it locally any more.
I'm using GitHub Pages, and although it is probably very easy to use, I use it so rarely that I don't really know what is going on.
My brief guide says to run bundle install, but I don't have bundle installed. I also don't know what it is.
Github tells me to install Jekyll and Ruby. I don't have either of them.
At work I use Python a lot, and I have created a few docs sites with MkDocs, with Material for MkDocs helping out in making everything pretty. I want to use that tool-stack instead.
All the content is markdown anyway, so it should not be too bad.
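The configuration needed for a basic Material for MkDocs site is pleasantly small. A minimal sketch (not my exact config) looks something like this:

```yaml
# mkdocs.yml
site_name: kplauritzen.dk
site_url: https://kplauritzen.dk
theme:
  name: material
```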
EDIT: After publishing I had some problems with my custom domain, kplauritzen.dk. Every time I ran mkdocs gh-deploy it wanted to deploy to kplauritzen.github.io instead.
I think the solution is to create a CNAME file in _posts/ as that will get picked up during the build.
See the docs.
I have a CI/CD pipeline that interacts with a Databricks workspace through the Databricks CLI.
I usually develop the pipeline locally, testing it against a sandbox Databricks workspace, authenticated as myself.
But when I deploy the pipeline to the CI/CD environment, it runs as a service principal, first against a dev workspace, then against a prod workspace.
There can be some issues that only appear when running as a service principal, like permissions errors or workspace configurations. And the feedback loop is too slow: I have to commit, push, wait for the pipeline to run, check the logs, and repeat.
I want to test the pipeline locally, authenticated as a service principal, to catch these issues earlier.
Reading about the one million ways to authenticate to an Azure Databricks workspace is enough to give me a headache (Seriously, there are too many options).
I have previously used environment variables to authenticate as a service principal, keeping the various secrets in a .env file and commenting and un-commenting them as needed.
It is a mess, and I'm guaranteed to forget to switch back to my user account at some point.
Instead, I can use databricks profiles to store the different authentication configurations.
In ~/.databrickscfg, I can create a profile for each service principal, and switch between them with the --profile flag.
Here is an example of a ~/.databrickscfg file with two Service principal profiles:
.databrickscfg
```ini
[DEFAULT]
host = <SOME_HOST>
token = <SOME_TOKEN>

[project-prod-sp]
host =
azure_client_id =
azure_client_secret =
azure_tenant_id =

[project-dev-sp]
<same setup as above>
```
Of course, you should replace the placeholders with the actual values.
To test what workspace and user your profile is using, you can try the following command:
```bash
databricks auth describe --profile project-prod-sp
```
This will also show you where the authentication is coming from (because, as I mentioned above, there are too many ways to authenticate).
Finally, you can run your pipeline locally, using the --profile flag to specify that you want to use the service principal profile:
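For example, if the pipeline is mostly Databricks CLI calls, something like this (the bundle commands here are just stand-ins for whatever your pipeline actually runs):

```bash
# Run against dev, authenticated as the dev service principal
databricks bundle validate --profile project-dev-sp
databricks bundle deploy --profile project-dev-sp

# Or set the profile once via the environment instead of repeating the flag
export DATABRICKS_CONFIG_PROFILE=project-prod-sp
databricks current-user me
```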
At work, we need to consume messages from, and produce messages to, some queue. One of the tools already available is Kafka.
Before integrating with the existing Kafka cluster, I want to test my client code. I want to ensure that it can consume and produce messages correctly.
I have an existing BaseQueueService class like this:
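(The real class at work has a bit more to it, but the part that matters here is roughly this interface:)

```python
from abc import ABC, abstractmethod


class BaseQueueService(ABC):
    """Interface that every queue implementation must satisfy."""

    @abstractmethod
    def publish(self, message: str) -> None:
        """Publish a single message to the queue."""

    @abstractmethod
    def consume(self) -> str | None:
        """Consume a single message, or return None if nothing is available."""
```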
I can use pytest-docker to create a test fixture that starts a Kafka container. This way, I can test my KafkaQueueService class without needing a Kafka cluster.
This is how I did it:
A docker-compose.yml file to start a Kafka container:
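(I'm not reproducing my exact file, but a single-broker KRaft setup along these lines works. The image tag and listener layout are assumptions, and the one-shot init service pre-creates the topics that the fixture below waits for.)

```yaml
services:
  kafka:
    image: apache/kafka:3.8.0
    ports:
      - "9092:9092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: "broker,controller"
      KAFKA_CONTROLLER_QUORUM_VOTERS: "1@kafka:29093"
      # Split listeners so the host (localhost:9092) and other containers (kafka:29092) can both connect
      KAFKA_LISTENERS: "PLAINTEXT://:29092,CONTROLLER://:29093,PLAINTEXT_HOST://:9092"
      KAFKA_ADVERTISED_LISTENERS: "PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092"
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: "CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT"
      KAFKA_CONTROLLER_LISTENER_NAMES: "CONTROLLER"
      KAFKA_INTER_BROKER_LISTENER_NAME: "PLAINTEXT"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1

  # One-shot helper that creates the test topics the conftest fixture waits for
  kafka-init:
    image: apache/kafka:3.8.0
    depends_on:
      - kafka
    entrypoint: ["/bin/sh", "-c"]
    command:
      - >
        until /opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka:29092 --list; do sleep 1; done;
        for topic in testtopic input_test_topic output_test_topic; do
        /opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka:29092
        --create --if-not-exists --topic "$$topic" --partitions 1 --replication-factor 1;
        done
```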
A conftest.py file to create a test fixture that starts the Kafka container:
```python
import pytest


def check_kafka_ready(required_topics, host="localhost", port=9092):
    from confluent_kafka import KafkaException
    from confluent_kafka.admin import AdminClient

    try:
        admin = AdminClient({"bootstrap.servers": f"{host}:{port}"})
        topics = admin.list_topics(timeout=5)
        # Check if all required topics are present
        if all(topic in topics.topics for topic in required_topics):
            return True
        else:
            return False
    except KafkaException:
        return False


@pytest.fixture(scope="session")
def kafka_url(docker_services):
    """Start kafka service and return the url."""
    port = docker_services.port_for("kafka", 9092)
    required_topics = ["testtopic", "input_test_topic", "output_test_topic"]
    docker_services.wait_until_responsive(
        check=lambda: check_kafka_ready(port=port, required_topics=required_topics),
        timeout=30.0,
        pause=0.1,
    )
    return f"localhost:{port}"
```
And finally, a test file to test the KafkaQueueService class:
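The core of it is a round-trip test along these lines (the import path is hypothetical; adjust it to wherever KafkaQueueService lives in your package):

```python
from my_package.queues import KafkaQueueService  # hypothetical import path


def test_publish_and_consume_roundtrip(kafka_url):
    queue = KafkaQueueService(broker=kafka_url, topic="testtopic", group_id="test-group")

    queue.publish("hello, queue")

    # The message we just published comes back out...
    assert queue.consume() == "hello, queue"
    # ...and a second poll hits the end of the partition and returns None
    assert queue.consume() is None
```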
This passes the test above (and a few other tests I wrote):
```python
import logging

from confluent_kafka import Consumer, KafkaError, Producer

# BaseQueueService is defined elsewhere in the package (see the sketch above)
logger = logging.getLogger(__name__)


class KafkaQueueService(BaseQueueService):
    def __init__(self, broker: str, topic: str, group_id: str):
        # Configuration for the producer and consumer
        self.topic = topic
        self.producer: Producer = Producer({"bootstrap.servers": broker})
        self.consumer: Consumer = Consumer(
            {
                "bootstrap.servers": broker,
                "group.id": group_id,
                "auto.offset.reset": "earliest",
                "enable.partition.eof": "true",
            }
        )
        self.consumer.subscribe([self.topic])

    def publish(self, message: str) -> None:
        """Publish a message to the Kafka topic."""
        logger.debug(f"Publishing message to topic {self.topic}: {message}")
        self.producer.produce(self.topic, message.encode("utf-8"))
        self.producer.flush()

    def consume(self) -> str | None:
        """Consume a single message from the Kafka topic."""
        logger.debug(f"Consuming message from topic {self.topic}")
        # Get the next message
        message = self.consumer.poll(timeout=20)
        if message is None:
            logger.debug("Consumer poll timeout")
            return None  # No new message
        if message.error() is not None and message.error().code() == KafkaError._PARTITION_EOF:
            logger.debug("No new messages in topic")
            return None
        # Check for errors
        if message.error() is not None:
            raise Exception(f"Consumer error: {message.error()}")
        self.consumer.commit(message, asynchronous=False)
        return message.value().decode("utf-8")

    def __repr__(self) -> str:
        return f"{self.__class__.__name__}(topic={self.topic})"
```
At work, I'm using an M3 Macbook. It's a great machine, but it's not perfect.
One issue is that I can't always build Docker images targeting linux/amd64 on it.
Recently, I had an issue where I needed to package a Python application in Docker, and one of the dependencies was pytorch.
I suspect that is where my issue was coming from.
Building the image on Mac works fine when running it on the same machine, but when I try to run it on a Linux machine, it fails with the following error:
```
exec /app/.venv/bin/python: exec format error
```
This indicated that the Python binary was built for the wrong architecture. Luckily, you can specify the target architecture using the --platform flag when building the image.
```bash
docker buildx build --platform linux/amd64 -t my-image .
```
Unfortunately, this didn't work for me. I suspect that the pytorch dependency was causing the issue. I got the following error:
To solve this issue, I decided to build the image on a remote x86_64 Linux VM. This way, I can ensure that the image is built for the correct architecture.
I used an Azure Virtual Machine with an Ubuntu 24.04 image. I enabled "Auto-shutdown" at midnight every day to save costs.
After ssh-ing into the VM, I installed docker and ensured the user was added to the docker group.
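Roughly the setup commands on the VM (Docker's convenience script is fine for a throwaway build box):

```bash
# Install Docker and let my user run it without sudo
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER

# Log out and back in for the group change to take effect, then build natively for amd64
docker buildx build --platform linux/amd64 -t my-image .
```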