Thu, 08 Jan 2015

One way to use a simple usb foot pedal to control VLC Media Player for audio transcription, featuring NodeJS for Unix socket manipulation

(being an extended note-to-self so I don't have to scour the Internet for my sources after another three months).

I need to do some audio transcription, but foot pedals meant for this purpose are expensive. So is transcription software. Both of these ease and speed up the transcription process, which is very welcome since transcribing is slow work. For me, fifteen minutes of interview recording takes an hour to transcribe near-verbatim. A foot pedal allows you to pause and resume the recording without taking your fingers off the keyboard or having to switch windows or anything. Specialized software might provide useful hotkeys to drop in timestamps or automatically enter conversation partners' names on a new line when you press Enter.

Actually, for this project there was potentially some budget for expensive equipment (unlike my last transcription project, my zero-budget master's thesis. Back then, f4 was still free). But as I was pondering all this, the Internet was quick to point out to me that all of this could be done on a shoestring budget. Take a cheap usb foot pedal, said the Internet, in the range of 10-30 Euros (instead of 60 to 160 or more for 'transcription' pedals), and augment its powers with a few keymaps and quick scripts in your language of choice.

Me being someone in whose house a 30 Euro router is currently performing the functions of a 150 Euro router thanks to OpenWrt, this obviously appealed to me.

This was all several months ago, and I got it working fairly quickly. Now that I actually need to use it, I went digging for the scripts I had found on the Internet or pieced together from pieces found there. But of course, I couldn't get away with such a cheap setup without paying a replacement price: I won't mention exactly how many hours of my time went into getting things working again. Finding all the sources I had used was one thing, but on top of that my OS has not been standing still in the meantime, and I found I had a different mechanism to deal with to capture and convert the signal coming from the foot pedal. All the code I had found or written was clear enough, but it took a good deal of time to figure out how the keypress needed to be registered with udev.

What follows now seems fishily roundabout, but it worked, so once I had finally cleared up the other bother I left it as it was. The solution I had figured out was to use Node's net module to create a Unix socket to listen on, toggling a counter on or off at each signal from the foot pedal and either pausing or resuming VLC accordingly. I remember wanting to try out Node since I was getting interested in it at the time, and I was impressed with the modules available for operating-system interaction.

What basically happens, on my Debian system, is this:

The USB pedal sends a scancode, which the system translates to a keycode, which higher-level processes can then convert into keysyms (the symbols then rendered by fonts on screen). Unhelpfully, by default pressing the pedal prints '1'. So we need to intercept the scancode and map it to some other keycode that we find more interesting. First, we find out what the scancode is with evtest:

$ sudo evtest

My device is no. 21 in the list, so I select that, then press the foot pedal, and evtest tells me (among many other things) what its scancode is: 0x7001e.

Next, we need to map that scancode to a keycode of our choice. To do this, we register the input device in udev's hardware database by identifying the device with its vendor and product id, and specifying which scancodes we want to map to which keycodes. In /lib/udev/hwdb.d/90-custom-keyboard.hwdb I put:

# identify hardware by vendor id and product id
keyboard:usb:v0426p3011*
# scancode (0x)7001e should map to f12
 KEYBOARD_KEY_7001e=f12

Then rebuild the hardware database and draw udev's attention to the change:

sudo udevadm hwdb --update
sudo udevadm trigger

Now when we press the foot pedal, it produces F12. Next, we need to do something with this key. I wrote a little script in Bash to send a signal to a socket, and I used xbindkeys to run this script whenever F12 is pressed. (You could also just use the bind command, but that's less sustainable: it requires that terminal window to be open, you have to mess with the escape sequences for the key in question, and it's not persistent.) The script looks like this:

#!/usr/bin/env bash

echo 'signal' | nc.openbsd -U /home/hans/keypress;
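
To wire F12 to that script, the xbindkeys side is just a short entry in ~/.xbindkeysrc. This is a sketch: the script path below is a placeholder, not my actual filename.

```
# ~/.xbindkeysrc
# Run the one-line netcat script above on every F12 press.
# The path is hypothetical; substitute wherever you saved the script.
"$HOME/bin/pedal-signal.sh"
    F12
```

After editing the file, (re)start xbindkeys so it picks up the binding.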

We're getting there. Here comes the JavaScript. The following script, called listen.js, creates the socket to which the above script sends the signal, and calls the final script as a result.

#!/usr/bin/env node

//a script to listen for a certain keypress and do something with it.
//intended to be used with a footpedal for speeding up transcription.

var net = require('net');
var fs = require('fs');
var sys = require('sys');
var exec = require('child_process').exec;

var stat = 0;

//callback for exec to log result to stdout
function puts(error, stdout, stderr) { sys.puts(stdout) }

//define server and callback
var unixServer = net.createServer(function(client){

    if (stat == 0){
        stat = 1;
        exec("/home/hans/ pause", puts);

    }else if(stat == 1){
        stat = 0;
        exec("/home/hans/ jogbackward && /home/hans/ pause", puts);
    }
});

//recover server if already in use etc.
unixServer.on('error', function (e) {
    if (e.code == 'EADDRINUSE') {
        var clientSocket = new net.Socket();
        clientSocket.on('error', function(e) { // handle error trying to talk to server
            if (e.code == 'ECONNREFUSED') {  // no other server listening: remove the stale socket file and take over
                fs.unlinkSync('/home/hans/keypress');
                unixServer.listen('/home/hans/keypress', function() { //'listening' listener
                    console.log('server recovered');
                });
            }
        });
        clientSocket.connect({path: '/home/hans/keypress'}, function() {
            console.log('Server running, giving up...');
            process.exit();
        });
    }
});

//listen to server
unixServer.listen('/home/hans/keypress', function() {
    console.log('listening for keypresses');
});
The first block pulls in the modules we're going to use. The second block is just an if/else that calls the control script with pause the first time the key is pressed, and with jogbackward and (un)pause the next time. Pretty basic. The next block is copied pretty much straight from Stack Overflow and does some error recovery if the filehandle for the socket already exists. Finally we start listening on the socket, so as long as this script is running in a terminal window, every press of the foot pedal will get picked up and handled.

This bit of NodeJS glue code then calls the real stuff controlling VLC. I got it from VLC's wiki in this guide, which also explains how to configure some more sockets which this script uses to talk to VLC: send it commands and read status like the current time elapsed in the file you're playing. I modified it by encapsulating the timestamp stuff to its own function, because I also wanted to add a function to put interview participants' names in the transcript with a shortcut key, and occasionally get a timestamp to go along with those. I used xbindkeys again to map those shortcut keys, but for this I can directly call speakerswitch 1 (or 0 or 2 for other pre-defined speakers).
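Since I'm not reproducing that script here, a minimal sketch of the idea follows. This is my own illustration, not the wiki's code: the socket path, function names, and reply parsing are assumptions, and it presumes VLC was started with its rc interface bound to a Unix socket (something like vlc --extraintf oldrc --rc-unix /tmp/vlc.sock).

```python
# A sketch of driving VLC over its rc interface on a Unix socket.
# The socket path is hypothetical; VLC must have been started with
# its rc interface listening there.
import socket

VLC_SOCKET = '/tmp/vlc.sock'  # assumed path

def rc_command(cmd):
    """Format a single rc-interface command line for VLC."""
    return (cmd.strip() + '\n').encode('ascii')

def send(cmd, sock_path=VLC_SOCKET):
    """Send one command to VLC and return whatever it replies."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(sock_path)
        s.sendall(rc_command(cmd))
        s.settimeout(0.5)
        try:
            return s.recv(4096).decode('utf-8', 'replace')
        except socket.timeout:
            return ''
    finally:
        s.close()

def pause():
    """'pause' toggles pause/play in the rc interface."""
    send('pause')

def jog_backward(seconds=3):
    """Skip back a few seconds using get_time plus seek."""
    reply = send('get_time')  # elapsed time in seconds
    digits = ''.join(ch for ch in reply if ch.isdigit())
    elapsed = int(digits) if digits else 0
    send('seek %d' % max(0, elapsed - seconds))
```

listen.js then just shells out to the real script with arguments like pause and jogbackward, which map onto calls of this shape.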

As I said before, there is suspiciously much indirection going on here with all these socket connections and glue scripts. The Python script makes yet more calls to xdotool to send text to the text editor (that's the final satisfaction: instead of a specialized program combining a rich-text editor with an audio player, I'm just transcribing in plain text in Geany or Gedit or such). But I guess this collection of small independent tools is pretty Unixy, and it works, so once again we suppress the faint nagging desire to build a grand monolithic edifice doing all of this in one bundle, and declare it an elegant solution under the circumstances. Plus, I've wasted enough time figuring this out, so if it works, that's good enough.

PS: the original resource on this topic for me was this one, and the one which finally got me on the right track concerning the changes to udev since that was written is here. You can follow the links from there ...

Wed, 29 Oct 2014

Showing one post per page

Ooh, looks like I haven't been very busy at my newest blog. But here's a nice tweak to the theme over there: showing one post per page and browsing posts one at a time.

In ode_config, we set setting number 12 to show one post per page.

The previous and next post links now say next 1, previous 1.

We use an Ode theme to power the archive (a theme being simply a set of html templates plus css), which is linked to in each post's footer. This theme shows all posts (ten at a time, as hard-coded in the URL), showing only the date and the title.

The result: less of a 'blog' feel, more of an 'independent articles' feel, as you always have one post per page. This is in line with the simple, calm intention of the theme, while keeping easy access to a list of all posts in the archive.

Two things missing:

  1. remove the '1' from the next/previous string. I think I've done this before, but can't remember how at the moment.
  2. on a post page (if you click on a title from the archive page), you should really still have the next/previous links.

Wed, 13 Aug 2014

I've been busy, and there's a new blog

It's been a few days since I've written about my new commitment to consistently do a bit of work every day towards my goals. I've been too busy working to write. In fact, I have been working every day; but not all the projects I'm working on are ones I necessarily want to talk about on the Internet at the moment.

However, one project I do want to work on consistently and in public is my new website design. Over the last few days I've been planning and fiddling, and today I finished step one: setting up a blog on which I can document my further steps.

This is it; going forward I'll be writing mainly on the new blog. I thought I would put it right in the thick of the development, so I set up a subdomain pointing to an Ode installation. It has a minimal theme (including an archive feature implemented as a separate theme) and will hopefully be the site of many discoveries and reports to come.

Fri, 08 Aug 2014

change, days one and two: two steps forward and one step back with SASS

What have I done so far?

A project I'm applying myself to is working on a new website design. The first day, I got set up with ftp and all. I have often felt a barrier to publishing to my blog, and so my goal was to take that barrier away to the greatest extent possible. I reviewed the ftp method for accessing my hosting account, and I also put a little script in my local bin folder to hit the reindex command on this website. Now I can just write in text files on my local file system, and publishing to my blog is literally two steps: click to upload the file from my ftp client, then run the script that hits the URL to reindex the site.

The project which this generates is to explore how to get writing from other computers into this pipeline.

That was day one -- some satisfying infrastructure setup. Day two was more setup -- and here the fun begins.

Read the rest of this post

Doing something every day: what got me onto this idea and why I'm writing about it

A blog post I read recently -- before I travelled for a few weeks, then got back into my daily life here -- inspired me to action. So, after I did get back into my daily routine, I thought it was time to make a public commitment and go for it. The idea of doing a little bit every day towards your goals is certainly not new to me -- we've discussed it on this blog several times, in fact. But this blog post was so relatable to me, and the results which the author reported achieving when he applied this daily discipline looked so successful, that this really motivated me. So go over and read John Resig's account of coding every day.

The mechanism which he describes, of building up a week's worth of expectations that can't possibly be fulfilled in a day of work, really resonated. In fact, I gave the post to my wife to read as a well-worded explanation for my sometimes weekly cycle of exhaustion, which I recently realized was coming from this mechanism. Now the strategy which he applied to overcome this, of doing a little each day instead of trying to do a lot once a week, is well-known. But the evidence which he presented for its positive effect in his life was riveting to me: both the quantitative evidence (the GitHub activity chart) and the qualitative (his list of changing experiences). For me this functions as a strong motivator, I guess because it's such a concretely told story of the impact of this behavioural change.

If doing a little every day is one keystone strategy to learning, habit change, and effective work, perhaps the other most important one I've read about is public commitment. If you want to change a habit, you need to engineer your environment to make it harder to keep doing the old habit than the new. And one of the most powerful forces you have at your disposal is your web of social relationships. So I've been told. And I suspect it works similarly not only for habit change but for any kind of 'getting stuff done'. The strongest motivator is the social relationships in which your daily activities are embedded. If you want to do some hard work, you'll be much more likely to stick with it if you have people waiting for you: either to work on it together or at least to be interested in your progress.

So that's why I want to make the change of doing a little bit every day, and why I'm making myself blog about it. I will continue with further definition of what it is I want to work on in upcoming posts. Also, published simultaneously with this is a report of what I did yesterday and the day before (the first two days of my new pattern).

If you didn't click through above, here's the link again to John Resig's post. Also, if you have the extra time, click through to Jennifer DeWalt's project which he links. Really cool!

Wed, 06 Aug 2014

I don't necessarily have anything important to say. I just want to practice saying things.

Getting anything written and published (I'm talking blog context here) is actually pretty hard. So is building complex systems out of software. The ability to learn and maintain complex activities depends on your whole schedule. How you feel, what you have time to think about, WHEN during the day you have what time available.

I have so many things I want to learn, change, and work on. But they say you can only change one thing at a time. Well I'm hacking that -- I'm going to change only the practice of DOING SOMETHING every day towards one of my goals.

There are enough things I'm aware of about my environment and habits that aren't in line with what I want to be moving towards. I have ample supply of projects to choose from each day. And I know (from experience) that the hardest thing is choosing what from that long list to do when you abruptly find yourself with some free time. But I have to start somewhere, and so the start for me is a public commitment. That's one of the most important things, they say, to do when you want to change something.

Still to come: refining and making more concrete what exactly I'm committing to, and what counts as things to do.

Warning: All change is hard. If you follow this project, you might be exposed to discomfort, including but not limited to:

  • verboseness
  • complaining
  • pointless comments
  • bad humour
  • mushy, embarrassing soul-searching
  • recalcitrance, reluctance expressed in tone of writing
  • bad writing
  • relapses (gasp).

I hope you'll stay, though, I need you! Do you want to change anything? Maybe we can help each other stay on track.

Ok. This is just a test.

Tue, 26 Nov 2013

Computer Literature Queue

The Ode Community Book Club has just started up its second edition: we're reading Dive into HTML5 and Beginning HTML and CSS. After this we'll probably read something on Git, so I've got my reading lined up for the next few months, but nonetheless I've also been adding some more to the list.

Read the rest of this post

Tue, 19 Nov 2013

A really simple server ... that really makes me happy

I just completed my first source code edit and custom recompile. In C, no less.

Granted, the program I modified is all of 200 lines long. And its author told me exactly how to customize it for my purposes. But still. It makes me inordinately happy that I succeeded.

A while ago I came across nweb, a program by Nigel Griffiths at IBM. The article summary from that link:

Have you ever wondered how a Web server actually works? Experiment with nweb -- a simple Web server with only 200 lines of C source code. In this article, Nigel Griffiths provides a copy of this Web server and includes the source code as well. You can see exactly what it can and can't do.

So, that's great. Nweb basically just serves static files and cannot run any server-side scripts. It's basically meant to show how the very fundamentals of a web server work: receiving requests, handling them, keeping the connection open, the very basics. In the README, Griffiths writes that he originally wrote it in 100 lines of code and "[that] worked fine too but then [I] added comments, file type checks, security checks, sensible directory checks and logging". You get the idea: 200 lines of code written just to make the bare minimum steps of an HTTP request work, do a few checks to make sure nothing dangerous happens, and no more. Oh, and log what happens, so that people looking at the code to learn how this web thing works get some information about what's happening.

In the article, in describing nweb's features, Griffiths lists the filetypes which it can serve:

nweb only transmits the following types of files to the browser :

  • Static Web pages with extensions .html or .htm
  • Graphical images such as .gif, .png, .jpg, or .jpeg
  • Compressed binary files and archives such as .zip, .gz, and .tar

And he adds:

If your favorite static file type is not in this list, you can simply add it in the source code and recompile to allow it.

Well, I was musing and thought that nweb was maybe just a bit too simplistic: I think css at least is a fundamental part of a real website. But then I remembered that I have poked around in C files before, and I thought I might give it a try. So here's a really brief not-quite-tutorial run-through of how I served real web pages with nweb.

Compile nweb

The instructions in the README.txt file were pretty complete. After downloading and extracting the source code, I opened up the C file and sure enough, found a 'struct' as follows:

struct {
    char *ext;
    char *filetype;
} extensions [] = {
    {"gif", "image/gif" },  
    {"jpg", "image/jpg" }, 
    {"png", "image/png" },  
    {"ico", "image/ico" },  
    {"zip", "image/zip" },  
    {"gz",  "image/gz"  },  
    {"tar", "image/tar" },  
    {"htm", "text/html" },  
    {"html","text/html" },  
    {0,0} };

This looked pretty self-explanatory. My code edit was simply a matter of adding a line:

    {"css", "text/css" },

before the last line of this block, to associate the file extension .css with the content type text/css. With this, nweb would know that requests for URLs ending in '.css' should be accepted and handled, and it should pass the content type 'text/css' back to the browser.

Then it was a matter of compiling nweb, following the instructions in the README. A compiled binary was provided in the download (in fact, several for several different architectures were included), but I replaced it with the command:

cc nweb23.c -o nweb

This compiles the source code written in the C language (which I edited above) into a machine-code executable -- written in ones and zeroes that the computer can understand.

Starting the web server

All of this is happening on a local server I have. Now that we have the modified executable, we just need to start it up and see if it works. I won't go into the details but here is my setup:

  • nweb runs as a normal user out of the user's ~/bin directory
  • I'll run it on port 8181 so it doesn't conflict with the server's Apache web server.

We start nweb:

/home/user/bin/nweb 8181 /home/user/web

This starts nweb and tells it to listen on port 8181 and serve files from /home/user/web.

The test website

Here's the html of the file I want to view:

<!DOCTYPE html>
<html>
    <head>
        <title>A simple site</title>
        <link rel="stylesheet" type="text/css" href="simple.css">
    </head>
    <body>
        <p>Hello, world! This was served with nweb!</p>
        <p class="styled">And I compiled nweb with css support. It wasn't very hard.</p>
    </body>
</html>

There are two paragraphs, one with a class that I will target with a css rule from the linked file simple.css. Here's the only line in that file:

.styled{font-size: 20px}

So, if my modified nweb executable works, I should not only see the html file, but the second paragraph should be bigger than the first.

And sure enough!

Here is a screenshot of the test website opened in my browser. I've opened up the developer console to show the structure of the html, including the one-line content of the css file (click to open a larger image).


Making sure I'm not kidding myself

For completeness' sake, I thought I would demonstrate that the changes I made to the source code are actually effective. So I undid the changes and compiled the source again into a second file named nweb2. I stopped the nweb server and started it again by running this second version of nweb. This is what the output looks like now:


Again, click the image to view a larger copy. You can see that instead of the contents of the css file, we've been given a quite informative error message from 'this simple static file webserver'.

And that is how you modify source code to suit your needs ;)

Oh, and I'm still smiling at the sheer simplicity of this little web server. I look forward to reading its documentation to learn more about how this and all other web servers work.

Fri, 18 Oct 2013

When context is key ...

Here's an amusing case of how not having a certain piece of knowledge (if knowledge comes in pieces) can make you totally miss the boat.

I was trying to get the web server Nginx running with cgi (Ode is a cgi script; Nginx doesn't handle cgi's the same way as the Apache server but requires that they be launched via a separate cgi process handler). Lots of things are different from my previous experience with Apache and a different Linux distribution: the default document root, the configuration file syntax, etc. But I was calmly troubleshooting one step at a time, trying to think of and remember all the usual gotchas (like file permissions). I knew it would take a bit of work, and I wanted to understand how Nginx worked, so I took it slowly.

I was using the 'fcgiwrap' program, installed via my distribution's package manager, to run Ode as a cgi and make it available to Nginx. This, too, was new, but I seemed to be making headway: what I learned from the first time I installed Ode was that error messages are a good thing, because at least you know there's something running to send an error back, and I saw that Nginx was getting an error from the cgi wrapper program. So, I stayed calm.

But it was starting to take a bit long ... until I finally realized what my problem was. A complete misunderstanding of how these fastcgi programs work.

Let's see if I can avoid the very pitfall I will describe in a minute when I get to the moral of this story. Essentially: where I thought I needed to launch the cgi script (ode.cgi), I actually needed to launch the cgi wrapper, and tell Nginx to tell the wrapper which cgi script to run. Does that make sense? Probably not (I'm thinking of people with little to no previous knowledge of these kinds of things).

Well, I'm tired out from an hour of troubleshooting, so I won't explain in more detail. But how did I finally get this? By reading the documentation of a different program than the one I thought I needed to understand, where I saw a configuration example which clued me in. The whole time I was thinking 'fastcgi' worked one way, when really it worked differently, and thus I was reversing two pieces of the puzzle.

The moral of the story? Is not that I, as the learner, need to learn to learn better. What else can I do? Eventually I noticed what the issue was, but I don't think I had much control over how long that took. The moral of the story is, rather, that documentation can be wonderful and complete, but still always assumes a certain level of background information and it can be really difficult to imagine the framework from which someone else will read your documentation.

(Incidentally, I was talking about this, the level of detail to put into instructions and tutorials, with someone the other day. I tend to try to back way up and give lots and lots of context. I think that can be helpful, but it sure takes a lot of energy and time -- so much that I often end up aborting the documentation effort).