The release of the OpenAI GPT-2 sample language model from the paper Language Models are Unsupervised Multitask Learners (also see Better Language Models and Their Implications) shows great promise of what is to come. The paper describes how training data was collected by following outbound links from Reddit. This got me thinking about what types of content it has seen. I have experimented with triggering recipe generation from the model by using “recipe” and similar conditioning texts.
Not that long ago, almost all political advertising was clearly visible in the physical world. Posters, newspaper ads and pamphlets were widely circulated, collected and preserved. You may well be able to walk into an antiquarian book shop and find examples of political propaganda hundreds of years old. Other copies end up in attics, basements and shoeboxes, and are occasionally donated to archives or sold on eBay. It was easier to see who said what and to whom. This enables modern-day researchers to understand how campaigns worked and what issues people were engaging with. Such research contributes new knowledge about our society and helps us understand how we got where we are.
After the previous experiment with a character-based recurrent neural network (RNN) for romantic novel titles, I wanted to learn more about word-level RNNs. I was happy to find that Sung Kim has made it easy to explore word-level RNNs with TensorFlow in this repository. The training text is 50/50 Christmas carols and death metal lyrics.
Output examples below (I added line breaks and bolded lines that seemed like reasonable song names :-). The resulting texts seem choppier than those from the character-level RNN I tried previously. But still fun!
The Stupid Hackathon is a one-day event where participants conceptualize and create projects that have no value whatsoever. Memeleader and I made Chat Roulette Kastanjett, a service for all castanet players who want to talk to other castanet players over castanet Morse code. We made a small chat website (with the help of socket.io and Heroku) and wired up castanets to a micro Arduino controller running Morse code debugging code. And we won a prize!
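To give a feel for the encoding side of castanet Morse code, here is a minimal Python sketch that turns text into dots and dashes. It is not the code we ran at the hackathon; the function name `to_morse` and the choice to silently skip characters outside A-Z are assumptions made for illustration.

```python
# International Morse code for the letters A-Z.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".",
    "F": "..-.", "G": "--.", "H": "....", "I": "..", "J": ".---",
    "K": "-.-", "L": ".-..", "M": "--", "N": "-.", "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.", "S": "...", "T": "-",
    "U": "..-", "V": "...-", "W": ".--", "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def to_morse(message):
    """Encode a message: letters separated by spaces, words by ' / '.

    Characters without an entry in MORSE are skipped (an assumption
    made to keep the sketch short).
    """
    words = message.upper().split()
    encoded_words = []
    for word in words:
        letters = [MORSE[ch] for ch in word if ch in MORSE]
        encoded_words.append(" ".join(letters))
    return " / ".join(encoded_words)


if __name__ == "__main__":
    print(to_morse("hi ho"))
```

Each dot or dash would then correspond to a short or long castanet click; how that timing is chosen is left open here.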
Torch is a scientific computing framework with wide support for machine learning algorithms. Andrej Karpathy has an excellent blog post explaining recurrent neural networks (RNNs) and character-level models. With his sample code for Torch it is very easy to get started creating your own RNN by using text from a specific domain. The model learns to predict the next character in a sequence, and it can then be used to generate text, character by character, that looks similar to the original training data.
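The idea of predicting the next character and then generating text one character at a time can be shown with something much simpler than an RNN. The Python sketch below is a count-based character bigram model, not Karpathy's Torch code: it only looks at the single previous character, while an RNN carries a longer memory. The function names and the sample text are placeholders.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every character, which characters follow it in the text."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, seed, length, rng=None):
    """Generate text one character at a time.

    Each new character is sampled in proportion to how often it
    followed the previous character in the training text.
    """
    if rng is None:
        rng = random.Random()
    out = [seed]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # this character was never followed by anything
            break
        chars = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


if __name__ == "__main__":
    # Placeholder training text; a real experiment would use a large
    # domain-specific corpus.
    corpus = "the cat sat on the mat and the rat sat on the hat"
    model = train_bigram(corpus)
    print(generate(model, "t", 40))
```

The output is usually gibberish with a familiar rhythm, which is roughly what an undertrained character-level RNN produces as well.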
A small hack to simplify the application for temporary parental benefit for care of children from the Swedish Social Insurance Agency (Försäkringskassan). Typically this involves multiple steps in a web browser. Now, you move a doll to a square and the computer sends the relevant information to the kindergarten and the Social Insurance Agency.
(Cunningly illustrated using patent application art.) A small Python script monitors Twitter for the phrase “I hate lego”. Upon finding a matching tweet, it sends a signal to an Arduino board that turns a servo, pushing a small Lego figurine over the edge of a table. Live example here (at approx. 7:30). A hack for the Make All event in Stockholm.
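The core loop of such a script can be sketched as follows. This is not the original script: the Twitter connection is replaced by a plain list of strings, the serial link to the Arduino is replaced by any object with a `write` method, and the one-byte command `b"P"` is a made-up protocol for illustration.

```python
import io

TRIGGER_PHRASE = "i hate lego"
PUSH_COMMAND = b"P"  # hypothetical one-byte command; the real protocol is not given


def matches(tweet_text):
    """Return True if the tweet contains the trigger phrase, ignoring case."""
    return TRIGGER_PHRASE in tweet_text.lower()


def process_tweets(tweets, port):
    """Write one push command to the port for every matching tweet.

    `tweets` is any iterable of strings; `port` is any object with a
    write(bytes) method, such as an open serial connection.
    Returns the number of commands sent.
    """
    pushes = 0
    for text in tweets:
        if matches(text):
            port.write(PUSH_COMMAND)
            pushes += 1
    return pushes


if __name__ == "__main__":
    fake_port = io.BytesIO()  # stands in for the serial link to the Arduino
    sample = ["I HATE LEGO so much", "I love bricks"]  # made-up tweets
    print(process_tweets(sample, fake_port), fake_port.getvalue())
```

On the Arduino side, a sketch would read the byte from the serial port and sweep the servo; that part is not shown here.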
A small crab attacked my GoPro camera.
Data about public sector spending is often hard to understand and compare. Statistics about government are often presented in tables. In this hack I have used open data from Eurostat (the EU statistics agency) about government expenditure for a set of countries and rendered sets of boxes in a Minecraft world. This makes it possible to explore the data by walking (or flying) around the world.
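One way to turn expenditure figures into boxes is to scale each value to a cube whose volume is proportional to the amount. The Python sketch below shows that idea only; the cube shape, the `max_side` and `gap` parameters, and the function names are assumptions, and it stops short of placing blocks in a Minecraft world.

```python
def cube_side(value, max_value, max_side=20):
    """Side length in blocks so that cube volume is proportional to value.

    The largest value gets a cube of max_side blocks. Every cube is at
    least one block so that small values stay visible.
    """
    if value <= 0 or max_value <= 0:
        return 1
    side = max_side * (value / max_value) ** (1.0 / 3.0)
    return max(1, round(side))


def layout_cubes(spending, max_side=20, gap=2):
    """Place one cube per category in a row along the x axis.

    `spending` maps a label to a number. Returns a list of
    (label, x_start, side) tuples, in the order the labels were given.
    """
    max_value = max(spending.values())
    placements = []
    x = 0
    for label, value in spending.items():
        side = cube_side(value, max_value, max_side)
        placements.append((label, x, side))
        x += side + gap
    return placements


if __name__ == "__main__":
    # Placeholder numbers, not actual Eurostat figures.
    print(layout_cubes({"category A": 8.0, "category B": 1.0}))
```

Scaling by volume rather than by side length keeps large categories from dwarfing everything else, at the cost of making differences look smaller than they are.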
The queue ticket machine in Uppsala, Sweden.