
A Tale of Two Umasks


The Problem

A few days ago, my blog suddenly started returning 403 errors after I deployed from my Android phone. Let’s look at why.

This blog is statically generated by hugo from Markdown files, and deployed to a Linode server using rsync. I mostly write[1] and publish from my GPD Pocket or my Ubuntu phone, but I also have all necessary tools on my Android phone, thanks to the fantastic Termux[2]. I often write and edit on Android, but this was the first time I deployed from it.

The local previews (generated by hugo server) had looked fine, so I suspected some issue with the deploy. My deploy script is one line in bash:

hugo && rsync -avz --delete public/ matthias@simplicissimax.com:/usr/local/www/simplicissimax.com/

hugo generates the public HTML directory, rsync uploads it to my server. The option flags -avz tell rsync to run in archive mode (recurse into directories and preserve metadata such as permissions and timestamps), give verbose output, and compress the data in transit. Pretty short and straightforward, not much opportunity for failure. So, what happened?

Let’s Debug

The first rule of troubleshooting is: know the steps needed to reproduce the issue. If you can consistently reproduce a problem, you can analyze it, and you can observe the effect of attempted fixes. In my case, I could reproduce the 403 error every time I ran the deploy script from my Android phone.

The next step in troubleshooting is to eliminate potential causes. This can simply mean changing environment factors one at a time and observing the results, but ideally should be guided by technical expertise and a working hypothesis. In this case, I knew that the deploy had succeeded as recently as a few days ago on a different device (running a regular Linux environment instead of Termux). So it made sense to try a deploy from another system and compare the results. I pulled out my Ubuntu phone, git pulled the updated repository and deployed - successfully. The same script was producing different results in different environments.

What Does HTTP Status Code 403 Mean?

At this point, it helps to remind ourselves what a 403 status means:

403 Forbidden: The client does not have access rights to the content, i.e. they are unauthorized, so server is rejecting to give proper response. Unlike 401, the client’s identity is known to the server.

I didn’t have permission to view the web pages of my blog, which are just files on a server. That server runs Linux, where every file and directory has read, write and execute permissions for (1) the owner, (2) the group assigned to the file, and (3) everybody else. Each permission is a bit: on or off. For example, if a file’s permissions read rwxr-xr--, that means it can be read, modified and executed by the owner (first three bits), read and executed by its group (next three bits), and only read by everybody else (last three bits). For a directory, read means listing its contents and execute means entering it.
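Each group of three bits can also be written as a single octal digit (r = 4, w = 2, x = 1), which is the notation that shows up in commands like chmod and umask. The following is only a small bash sketch to make the mapping concrete; perm_to_octal is a hypothetical helper name, not a standard tool, and it ignores special bits such as setuid or sticky.

```shell
# Convert a nine-character permission string (as printed by ls -l,
# without the leading file-type character) into three octal digits.
perm_to_octal() {
    local perms="$1" out="" i triplet digit
    for i in 0 3 6; do
        triplet="${perms:i:3}"   # owner, then group, then others
        digit=0
        if [ "${triplet:0:1}" = "r" ]; then digit=$((digit + 4)); fi
        if [ "${triplet:1:1}" = "w" ]; then digit=$((digit + 2)); fi
        if [ "${triplet:2:1}" = "x" ]; then digit=$((digit + 1)); fi
        out="$out$digit"
    done
    printf '%s\n' "$out"
}

perm_to_octal rwxr-xr--   # prints 754
perm_to_octal rw-------   # prints 600
```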

Let’s Verify The Data

I sshed into my server and checked the files at the nginx web root directory for this site:

$ ls -l simplicissimax.com/
total 196
-rw-------  1 matthias matthias   3902 May 29 09:24 404.html
-rw-------  1 matthias matthias   7993 Jun  4  2017 apple-touch-icon.png
drwx------  4 matthias matthias   4096 May 29 07:53 categories
drwx------  3 matthias matthias   4096 Nov 23  2017 css
-rw-------  1 matthias matthias  15086 Jun  4  2017 favicon.ico
drwx------  2 matthias matthias   4096 Jun  4  2017 font
drwx------  2 matthias matthias   4096 May 27 21:10 images
drwx------  2 matthias matthias   4096 Jun  4  2017 img
-rw-------  1 matthias matthias  11594 May 29 09:24 index.html
-rw-------  1 matthias matthias 115411 May 29 09:24 index.xml
drwx------  3 matthias matthias   4096 Jun  4  2017 js
drwx------  4 matthias matthias   4096 May 29 07:53 page
drwx------ 14 matthias matthias   4096 May 29 07:53 post
-rw-------  1 matthias matthias   3978 May 29 09:24 sitemap.xml
drwx------ 11 matthias matthias   4096 May 29 07:53 tags

Sure enough, all files and directories are forbidden to anybody other than the owner. That’s definitely not what this should look like, and it explains the 403 response: nginx presumably runs as a different user than matthias and so cannot read any of these files. Let’s compare the results of a known-good deploy:

$ ls -l simplicissimax.com/
total 196
-rw-rw-r--  1 matthias matthias   3902 May 29 09:51 404.html
-rw-r--r--  1 matthias matthias   7993 Jan  7 07:54 apple-touch-icon.png
drwxrwxr-x  4 matthias matthias   4096 May 29 08:14 categories
drwxr-xr-x  3 matthias matthias   4096 Jan  7 07:54 css
-rw-r--r--  1 matthias matthias  15086 Jan  7 07:54 favicon.ico
drwxr-xr-x  2 matthias matthias   4096 Jan  7 07:54 font
drwxr-xr-x  2 matthias matthias   4096 May 27 15:33 images
drwxr-xr-x  2 matthias matthias   4096 Jan  7 07:54 img
-rw-rw-r--  1 matthias matthias  11594 May 29 09:51 index.html
-rw-rw-r--  1 matthias matthias 115411 May 29 09:51 index.xml
drwxr-xr-x  3 matthias matthias   4096 Jan  7 07:54 js
drwxrwxr-x  4 matthias matthias   4096 May 29 08:14 page
drwxrwxr-x 14 matthias matthias   4096 May 29 08:14 post
-rw-rw-r--  1 matthias matthias   3978 May 29 09:51 sitemap.xml
drwxrwxr-x 11 matthias matthias   4096 May 29 08:14 tags

But Why Were The Permissions Broken?

At first I suspected that the version of rsync in the Termux apt repositories might be at fault. Termux has its own repositories, and maybe its version of rsync was slightly non-standard. The -a flag is shorthand for several other flags; what if it was unsupported and I needed to specify the individual flags? However, man rsync showed that this flag was indeed supported. I didn’t want to assume a bug in one of the most well-known Unix utilities, so I kept looking elsewhere.

The only other command in my deploy script is hugo. Unlike rsync, this was the same binary on both devices, downloaded straight from the developer’s release page, so it should behave consistently.

I tried running only the hugo command on both devices - and saw that it did produce exactly the permissions we saw above - correct on Ubuntu, broken on Termux. What’s going on? Surely no static site generator would break file permissions. Something in the environment must be at fault.

What’s Different About Termux?

This is where I remembered one crucial detail about Termux. Unlike traditional Linux distros, Termux is a single user environment! I feel I’m on to something here. I wonder what’s my umask in both systems (type umask at your shell prompt, without any arguments)? On Termux, I saw a umask of 0077. On my Ubuntu phone, desktop and server I get 0002.

Wait, what’s a umask?

When you tell your system to create a new file, how does it know what default permissions you want? It looks at your umask. For details, check man umask, but basically it works like this: each set of read/write/execute permissions can be expressed as a number from 0 to 7 (each set is three bits, written as one octal digit, and each individual permission is one bit). By default, most programs request all permissions (0777) for directories, and all but execute (0666) for regular files. Your umask is used to modify (“mask”) this value: every bit set in the umask is cleared from the requested permissions, which usually restricts what users other than the owner may do.

Termux has no need to set permissions for anybody but the owner since there is nobody else, so its default umask of 0077 masks out all permissions (=7) for both group and others, and every file hugo creates is subject to this mask.
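The masking itself is plain bit arithmetic and can be checked in bash. This is a minimal sketch: apply_umask is a hypothetical helper name, and the requested modes 0666 and 0777 are the usual defaults for files and directories rather than something every program is guaranteed to use.

```shell
# Compute the permissions a new file or directory ends up with:
# every bit that is set in the umask is cleared from the requested mode.
apply_umask() {
    local requested="$1" mask="$2"
    # 8#... tells bash to read the digits as octal.
    printf '%04o\n' $(( 8#$requested & ~8#$mask ))
}

apply_umask 0666 0077   # file on Termux:      prints 0600 (rw-------)
apply_umask 0777 0077   # directory on Termux: prints 0700 (rwx------)
apply_umask 0666 0002   # file on Ubuntu:      prints 0664 (rw-rw-r--)
apply_umask 0777 0002   # directory on Ubuntu: prints 0775 (rwxrwxr-x)
```

These results line up with the freshly generated entries in the two ls -l listings: owner-only after the Termux deploy, group-writable and world-readable after the Ubuntu one.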

Ok, How Do We Fix This?

We can change the umask value easily:

umask 0002

But this will not persist beyond the current shell session. If we want 0002 to be the new default, we can put the same command into a script umask.sh under <termux install directory>/files/usr/etc/profile.d. This way, 0002 will be the default umask in any Termux session. Please note that changing your default umask could lead to unexpected behaviour. In this case, since we are making it less restrictive than before, I believe there should be no issues, but I will update this post if I find anything. If it does turn out to cause any problems, we could instead change the umask only in deploy.sh, inside a subshell, so that the rest of the session keeps the stricter default.
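The subshell variant could look roughly like this. It is a sketch, not my actual deploy script: with_umask is a hypothetical helper name, and the commented usage line uses placeholders for the server details.

```shell
# Run a command with a temporary umask. The parentheses start a subshell,
# so the umask of the calling shell is left untouched.
with_umask() {
    local mask="$1"
    shift
    ( umask "$mask" && "$@" )
}

# Hypothetical use in deploy.sh: only the build step needs the relaxed
# umask, because rsync -a copies whatever permissions it finds locally.
# with_umask 0002 hugo && rsync -avz --delete public/ <user>@<host>:<web root>/
```

Note that this only affects files created by the wrapped command; files that already exist keep the permissions they have.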

I thought this would be the end of it, but even after changing the umask, my site kept returning 403 after deploying from Termux. Why?

Ok, how do we completely fix this?

Turns out the entries generated from Markdown by hugo were fine after changing the umask, but hugo also copied static assets from the blog’s repository (images, CSS and JavaScript files). The old umask had been in effect when git originally cloned the repo from my server, and hugo preserved the current permissions on copy.

Simply rm -rfing the whole repo and re-running git clone fixed this, and now I can deploy from Android as well :-)
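To catch this kind of problem before uploading, a small check over the generated directory might help. This is only a sketch under the assumption that everything in the output directory should be readable by "others" (and directories enterable by them); find_unreadable is a hypothetical helper name.

```shell
# Print every entry under a directory that "others" cannot read,
# plus every directory that "others" cannot enter.
# No output means nothing suspicious was found.
find_unreadable() {
    local dir="$1"
    find "$dir" \( ! -perm -004 -o -type d ! -perm -001 \) -print
}

# Hypothetical use before the rsync step:
# [ -z "$(find_unreadable public)" ] || echo "public/ contains files nginx may not be able to read"
```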

  1. I mostly write in vim, in case you were wondering. It’s the only editor available across all of my devices.

  2. I have all the same tools on my Macbook and occasionally use it to write or publish, but I’m trying to move more of my daily computing to Linux. I also own an iPhone and iPad, but the lack of any local command line environment makes iOS very cumbersome for this workflow. As far as I am aware, no hugo app exists for iOS, so I cannot build (or preview) pages locally. I could use Working Copy to write and sync to my git server, then ssh into my server to build or run a preview, but that’s ludicrous (an internet connection needed just to preview a post?).