Sunday, August 28, 2011

Installing hmatrix-glpk under Windows

Today I had some trouble trying to install hmatrix-glpk on my Windows machine, but I finally found a way. Here's how:

Step 1: Installing hmatrix

The package hmatrix-glpk relies on hmatrix, so it's best to install and test that first. Instructions on installing hmatrix on Windows have been provided by the author of the package. When following these instructions, it is important to pay attention to line 32:
"It may be necessary to put the dlls in the search path."
It is certainly necessary to add the DLLs to the search path. If you extracted the files to, e.g., "c:\lib\gsl", then make sure you add that directory to your PATH environment variable.

Step 2: Getting glpk

Since hmatrix-glpk is a binding for glpk, we need to find the right binaries and header files.
  1. First, create a directory $GLPK (e.g., "c:\lib\glpk") on your computer.
  2. Download glpk-4.34-lib.zip from the GnuWin SourceForge project. From this archive, extract "include/glpk.h" and copy this header file to the $GLPK directory created in the first step.
  3. Download winglpk-4.46.zip from the GLPK for Windows SourceForge project. From this archive, extract "glpk-4.46/w32/glpk_4_46.dll" and copy this file to "$GLPK/glpk.dll".
  4. Double-check: Your $GLPK directory should now contain two files, glpk.dll and glpk.h.
  5. Add $GLPK to your PATH environment variable.

Note: The GnuWin project also has glpk binaries, but on my machine, GHC couldn't load them.


Step 3: Installing hmatrix-glpk

Installing hmatrix-glpk is now easy. Just open an MSYS shell and enter:

$ cabal install hmatrix-glpk --extra-lib-dirs=$GLPK \
                             --extra-include-dirs=$GLPK

This should fetch the package from Hackage and install it.


Step 4: A simple test

Fire up ghci and try solving a simple problem:

ghci> import Numeric.LinearProgramming

ghci> let prob = Maximize [4, -3, 2]
ghci> let constr = Sparse [[2#1, 1#2] :<=: 10, [1#2, 5#3] :<=: 20]

ghci> simplex prob constr []
Optimal (28.0,[5.0,0.0,4.0])
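If you want to convince yourself that the answer is sensible, the arithmetic can be checked without GLPK. The sketch below restates the objective and the two constraints from the test in plain Haskell. The names objective and feasible are made up for this check and are not part of hmatrix-glpk. It also assumes the default bound x_i >= 0 that simplex applies when the bounds list is empty.

```haskell
-- Plain-Haskell check of the ghci result above; it does not call GLPK.
-- The names below are illustrative and not part of hmatrix-glpk.

-- Objective from the test: maximize 4*x1 - 3*x2 + 2*x3.
objective :: [Double] -> Double
objective xs = sum (zipWith (*) [4, -3, 2] xs)

-- The two constraints from the test, plus the assumed default
-- bound x_i >= 0 for every variable.
feasible :: [Double] -> Bool
feasible [x1, x2, x3] =
  2 * x1 + x2 <= 10 && x2 + 5 * x3 <= 20 && all (>= 0) [x1, x2, x3]
feasible _ = False
```

Loading this into ghci, objective [5, 0, 4] evaluates to 28.0 and feasible [5, 0, 4] is True. Both constraints are tight at this point (2*5 + 0 = 10 and 0 + 5*4 = 20).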



UPDATE 2011-09-01: Notes for compiling

I found out today that when you compile a Haskell program, the executable will actually look for glpk_4_46.dll instead of glpk.dll. It's not a pretty solution, but you can easily solve this issue by making an additional copy of the DLL with the correct name. If I find a better solution, I'll update this blog post again.

Friday, August 26, 2011

History of Tribler

With the upcoming release of Tribler 5.4, it's time to look back at the history of Tribler.

Background and versions prior to 5.0

Tribler is a peer-to-peer (P2P) file-sharing application that's being developed at the Delft University of Technology and the Vrije Universiteit Amsterdam. The universities' researchers started the Tribler project in 2005 by forking ABC 3.x (Yet Another BitTorrent Client) and adding social features to it. By introducing notions of friends and "taste buddies" (people with similar tastes), download speeds could be improved and recommendations could be given to the users. Visually, Tribler 3.x did not differ much from ABC as can be seen below.

Tribler 3.x. More screens available at Web Archive.

Tribler's GUI was given a facelift and contained a bunch of new features with the introduction of the 4.x series in 2007. People could now search for files in the Tribler network (instead of browsing in 3.7) and play them in a streaming fashion instead of downloading the full file first. The 4.x series also allowed users to view content from other video sources like YouTube, but this feature was short-lived and was scrapped starting with Tribler 5.0.

Tribler 4.x.

The 5.x series

With a new major version number, a completely new GUI was introduced yet again in 2009. The old GUI of the 4.x series was deemed to be clunky, so for Tribler 5.0, a minimalistic look was chosen. This also meant that a lot of old features like cooperative downloading were no longer visible or accessible.

Users who started Tribler 5.0 were greeted with a screen like the one below. It only contained a search box, your current sharing reputation and two links to change your settings and view your downloaded files.

Main screen of Tribler 5.0.

In 4.x, when you searched for files, the results were presented in a grid with thumbnails. This approach, however, had problems like most results not having a thumbnail at all, so it was dropped in 5.x and instead the search results were displayed in a list.

Search results in Tribler 5.0.

Version 5.1.x introduced some minor changes based on community feedback. Most of the visual changes can be found by comparing the two screenshots depicted below.

 
Tribler 5.1 (right) introduced minor changes.

New features were introduced again starting from version 5.2. Tribler 5.2 introduced the concept of channels similar to YouTube. Channels allowed users to publish BitTorrent files and to subscribe to channels. Channels help with reducing the flow of spam by favoring content from popular channels. Unfortunately, users did not receive direct benefit by subscribing to channels. For example, they were not notified when new content was available in their subscribed channels.

The introduction of channels caused the search box to change. A drop-down menu allowed users to switch between searching for files and channels. There was also a minor change in how search results were displayed. Instead of showing a file's "popularity", the numbers of BitTorrent seeders and leechers were shown. This change, however, was temporary, as Tribler 5.3 reintroduced the notion of popularity.

Channels appearing for the first time in Tribler 5.2.

Tribler 5.2 also shipped with some code I had written. Tribler was given a "RePEX" mode that was enabled for completed, but inactive downloads. In short, the RePEX mode periodically keeps in touch with previously seen peers in a download swarm. This mode was developed as an alternative way of doing distributed tracking, but so far it is not being used to its full extent. While Tribler peers are currently tracking swarms using RePEX, the information gathered through this process cannot be queried yet by others.

The GUI in Tribler 5.3 was made to look more native.

The GUI was changed again in Tribler 5.3, although not as radically as with the transitions to 4.0 and 5.0. Most custom GUI widgets were replaced with native controls. Further changes included the pagination of the search results being replaced by a single scrollable, sortable and filterable list, the settings screen being replaced by a dialog window, and the search box's behaviour being changed to perform both file and channel search simultaneously.

The formerly empty main screen of Tribler was also changed. We found that many new users were not able to formulate successful queries. To address this issue in Tribler 5.3, users were now greeted with an animated term cloud (code-named "NetworkBuzz" and developed by yours truly, amongst others), which showed what's hot in the Tribler network. Clicking on any of the terms would initiate a search. To allow people to go back to the main screen, a "Home" button was included in Tribler's navigation row.

"NetworkBuzz" making its appearance in Tribler 5.3's main screen.


Beyond Tribler 5.3

For now, this concludes the history of almost 6 years of Tribler. The Tribler project is still alive and kicking, and thus its history is being written as we speak. What will the future bring us? Who knows, but for the time being it will be Tribler 5.4.


Sneak peek of Tribler 5.4.



Monday, August 8, 2011

Knight-Mozilla Learning Lab – Software product proposal

Introducing LikeLines




LikeLines unlocks user-sourced video and serves as a building block for rich news story-telling. Interesting bits of a video emerge naturally through community interaction with the video using an intelligent video player, enabling video navigation, browsing, retrieval and linking at the fragment level.

Design and prototype

LikeLines transforms existing video players into intelligent ones by adding heatmap navigation below the player. The heatmap shows which parts of a video are found to be interesting by the viewer community and allows viewers to jump to these interesting bits right away. A prototype showcasing the heatmap can be viewed below.

Click here to try the prototype in action!


The hotspots in the heatmap are generated through interaction with the video. When a user explicitly expresses "liking" at a particular time point during playback of the video, this act of expression together with the current playback position of the video is stored in the system. The system aggregates over this feedback to derive the "hottest" points in a video. In addition to explicit feedback, implicit feedback in the form of playback and seeking behavior is also used. Information on users playing, pausing, re-playing and seeking in the video can be used to refine existing or discover new hot points.
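To make the aggregation step concrete, here is a minimal sketch of how explicit likes could be turned into a heatmap. It is not the LikeLines implementation. It assumes a like is simply a playback position in seconds and that the video is divided into fixed-width bins. The names heatmap and hottestBin are hypothetical.

```haskell
import Data.List (foldl')

-- Assumption: a like is represented only by its playback position.
type Seconds = Double

-- Count likes per fixed-width bin. The bin layout is an assumption
-- made for this sketch. Positions outside the video are clamped to
-- the first or last bin.
heatmap :: Seconds -> Seconds -> [Seconds] -> [Int]
heatmap binWidth duration likes = foldl' bump empty likes
  where
    nBins = max 1 (ceiling (duration / binWidth))
    empty = replicate nBins 0
    binOf t = min (nBins - 1) (max 0 (floor (t / binWidth)))
    -- Increment the counter of the bin that position t falls into.
    bump counts t =
      let i = binOf t
      in [ if j == i then c + 1 else c | (j, c) <- zip [0 ..] counts ]

-- Index of the hottest bin (the first one on ties, 0 for an empty list).
hottestBin :: [Int] -> Int
hottestBin counts =
  case [ i | (i, c) <- zip [0 ..] counts, c == maximum counts ] of
    (i : _) -> i
    []      -> 0
```

For a 60-second video with 10-second bins, heatmap 10 60 [12, 14, 15, 41, 58] gives [0,3,0,0,1,1], and hottestBin picks bin 1 (seconds 10 to 20). Implicit feedback such as replays could be folded in the same way by adding smaller weights instead of whole counts.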


The LikeLines system consists of two components: a client-side script that extends existing video players and a LikeLines repository server that is responsible for aggregating user feedback and deriving the hotspots in the video. The LikeLines API allows the web application developer to pick any source of videos and any LikeLines repository. The LikeLines repository stores and allows applications to retrieve heatmap metadata (i.e., time-code specific popularity information about specific videos). The key, innovative contribution that LikeLines makes to unlocking video is the collection and management of heatmap metadata, which can be applied in a wide variety of use cases.


Integration into existing newsroom infrastructure

There are many examples where LikeLines can be used:
Tips bin: If news organizations open up their tips bin such that visitors can see submitted videos, visitors can already begin interacting with these videos (liking/seeking) and thereby annotating them. This eases the task of the news staff of sorting through the submitted videos as they can focus on the highlights.
Archive: When LikeLines is deployed in archives, users can find the most popular past segments, which will fuel ideas for new story subjects. It can be used in both private archives and public archives (e.g., Dutch Footage). Related videos can be linked at the fragment level, allowing discovery of new and interesting patterns.
Web monitoring tools: When LikeLines is adopted externally, e.g., on YouTube, monitoring of these video sites can be improved. Instead of indiscriminately showing everything, snippets based on the hottest parts of a video can be generated and displayed instead.
When building LikeLines and these tools, it is important to work closely with both journalists and end-users. End-users need to be able to understand and use the LikeLines interface if we want to generate heatmaps effectively. On the other hand, the metadata coming from LikeLines needs to be sufficiently suitable for the purposes of journalists.
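The snippet generation mentioned for web monitoring tools could be sketched as picking the hottest run of consecutive heatmap bins. The sketch below assumes the heatmap is a list of per-bin counts and that a snippet covers a fixed number of bins. The function name hottestWindow is hypothetical.

```haskell
import Data.List (tails)

-- Start index and total heat of the hottest run of `len` consecutive bins.
-- Returns Nothing when `len` is not positive or the heatmap is too short.
hottestWindow :: Int -> [Int] -> Maybe (Int, Int)
hottestWindow len heat
  | len <= 0 || null windows = Nothing
  | otherwise = Just (foldr1 better windows)
  where
    -- Every full-length window, paired with its start index and total heat.
    windows =
      [ (start, sum w)
      | (start, rest) <- zip [0 ..] (tails heat)
      , let w = take len rest
      , length w == len
      ]
    -- Prefer the higher total; on ties keep the earlier window.
    better a b = if snd a >= snd b then a else b
```

For example, hottestWindow 2 [0, 3, 2, 0, 1, 1] returns Just (1,5): the two-bin snippet starting at bin 1 collects the most heat.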

Collaborative power

LikeLines combines eyewitnesses, Internet viewers and news reporters into a strong collaborative workforce. Eyewitnesses can capture news on the street using their cellphones and upload their raw videos onto the web. No editing is needed. Instead, viewers watching these videos annotate which parts are hot through their interaction with the LikeLines player. News reporters can then process these enriched videos by extracting the interesting bits and weaving a story out of them.



Challenges and unknowns

  1. How to interpret user clicks on the like button and their implicit playback behavior? How to amalgamate and denoise user input?
    When a user clicks the like button, it is not certain if the "like" should apply to this position or a position several seconds earlier. A user study involving an early working prototype is needed to address this aspect of the concept and also refine the user interface and determine the optimal algorithm for aggregation of the heatmaps of multiple users.
  2. How to deal with the cold start problem, i.e., unwatched videos?
    For new videos, the user-feedback process can be jump-started by generating an initial heatmap, for example using multimedia content analysis (MCA). We need to address the issue of finding platforms with sufficient computational capacity for MCA and motivating them to make the necessary investment to generate initial heatmaps. Further, platforms using LikeLines need to make sure fresh content is highlighted so that the process of aggregating user feedback starts as soon as possible. Attention should be devoted to the development of mechanisms for incentivizing users (e.g., via awards such as access to premium content) to contribute user feedback for fresh video.
  3. How to ensure a large user-base?
    The success of LikeLines requires that the system be used by a critical mass of viewers in order to generate useful heatmaps. To ensure a sufficiently large user-base, LikeLines is designed as an open and versatile building block such that it can easily be integrated in existing web applications.
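The first two challenges can be illustrated with a small sketch. It is only one possible approach under stated assumptions, not a result of the proposed user study. spreadBack assumes a like may refer to a moment up to a few bins earlier and spreads it uniformly backwards. blend assumes the initial heatmap and the user heatmap are on the same scale and mixes them with a made-up tuning constant k, which must be greater than 0. Both function names are hypothetical.

```haskell
-- Spread each bin's likes evenly over that bin and the `lag` bins before
-- it, since the viewer may be reacting to something slightly earlier.
-- Uniform spreading is an assumption. Shares that would fall before the
-- start of the video are dropped.
spreadBack :: Int -> [Double] -> [Double]
spreadBack lag heat =
  [ sum [ heat !! j / w | j <- [i .. min (n - 1) (i + lag')] ]
  | i <- [0 .. n - 1]
  ]
  where
    lag' = max 0 lag
    n = length heat
    w = fromIntegral (lag' + 1)

-- Blend an initial heatmap (e.g. from content analysis) with user feedback.
-- The prior's weight is 1 with no feedback and shrinks as likes arrive.
-- Assumes k > 0; k is a hypothetical tuning constant.
blend :: Double -> [Double] -> [Double] -> [Double]
blend k prior user = zipWith mix prior user
  where
    total = sum user
    alpha = k / (k + total)
    mix p u = alpha * p + (1 - alpha) * u
```

With one bin of lag, spreadBack 1 [0, 0, 4, 0] gives [0.0,2.0,2.0,0.0]. For an unwatched video, blend 4 [1, 0, 0] [0, 0, 0] returns the prior unchanged, while blend 4 [1, 0, 0] [0, 4, 0] gives [0.5,2.0,0.0].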

Executive summary

  • Gets to the core of news quickly and effectively — as stories are breaking.
  • Supports creation of news stories attuned to current viewer concerns by exploiting the compelling story-telling power of first-hand accounts and user-sourced video.
  • Solves the problem of time-consuming sifting through user-sourced video, which can be critical under deadline pressure.
  • Competes effectively in the current user-sourced footage landscape, where coverage is low because individuals must filter raw footage.
  • Makes it affordable for news organizations to be present along more steps of the user-driven production-through-consumption chain.

Related projects

Juan Gonzalez is working on a dashboard that helps users to quickly scan a large stream of videos. In his Tribal Mix dashboard system, airtime reflects popularity votes for the entire video. Most popular videos are summarized as animated thumbnails. LikeLines could serve as a building block for the dashboard's visual summarization back-end by supplying the underlying timeline-specific popularity weights.