Thursday, December 1, 2011

A rant about Scala vs. Haskell

After two weeks of non-stop Scala programming I now feel comfortable working with implicits, abstract member types, path-dependent types, variance annotations, existential types and everything else. We even used the CPS transformation plugin as a Cont-Monad replacement, already.

Scala's type system is powerful but also complex right from the start. With Haskell you can use most basic libraries without enabling GADTs, RankNTypes, OverlappingInstances and TypeFamilies or having to learn about them. And you can take your time to learn all these concepts. The containers API is really simple compared to Scala's collections API. On the other hand in Haskell you have to learn about monads quite early. (Oh, I miss this warm fuzzy feeling in Scala!)

We're using Scala for integration with existing Java code and libraries and that's working really well. As a "stand alone" language I still prefer Haskell and I doubt that's going to change with more Scala experience. For me as a functional programmer subtyping doesn't add much more than Java compatibilty. Maybe I have to change but IMHO it makes API design and the type system too complex. I'm still overwhelmed by the "default" design possibilities: type parameters or abstract member types? Sealed case classes or open subtyping? Implicit parameters and conversions (aka type classes) offer even more design choices! Maybe I'm just more experienced with Haskell but mostly I just decide between "data" and "class" and with higher-order functions that's enough 95% of the time. For the missing 5% we have extensions (and FlexibleX + Rank2Types + TypeFamilies + ExistentialQuantification is enough for me).

Often people complain about GHC's error messages but they're really great compared to scalac's. I also miss STM. Actors are nice, but using react doesn't feel natural and we can't just use "receive" with thousands of threads "without thinking" just like in Haskell. (We're now using react with the CPS transform plugin to hide the fact that internally everything is implemented using continuations.)

I don't miss lazyness, yet. I've used call-by-name and lazy vals on some occasions and I think I prefer strict by default with optional lazyness annotations. I just can't think "the lazy way" - even after 18 months of full-time Haskell coding.

I do miss control over side effects. When writing a higher order function I always get scared when I realize that I can only expect the caller to provide a pure function but the function could be doing all sorts of things behind my back!

One thing I'd like as a GHC extensions is explicit dictionary passing. It's nice that implicit parameters can be used implicitly but also explicitly. Of course it's risky, too, because nobody can stop you from using a different Ord instance every time you're inserting into a Set. "unsafeCoerce" seemed so evil - until erasure forced me to use foo.asInstanceOf[...] (Of course an exception is not as bad as a segfault.) Sometimes I know what I want and I'm willing to take the risk. As long as all those features are not enabled by default I think hard enough before enabling {-# LANGUAGE IncoherentInstances #-}.

Scala might be the best way to live between worlds. But you have to understand both worlds and you're still in between. It's not twice as good. If I have to live in OO world I prefer having a functional world in addition so I'd chose Scala over Java at any time. But if I can choose freely I still prefer the pure Haskell world.

Sunday, October 30, 2011

WifiAlive Android App released

WifiAlive is a very simple and low resource application aimed at maintenance of a Wifi connection. A connection is periodically (every 5 seconds) checked by trying to contact the gateway, as well as IP addresses or domain names which you can enter via the UI. WifiAlive tests each of these addresses by icmp and http head. If one of these tests succeeds, Wifi is considered up. If not, Wifi is reset (that is deactivated and activated). After Wifi (re-)activation Android on its own tries to establish a connection to an access point in reach. On a failed connection check, WifiAlive will only wait 20 seconds for a reset, scan and successful connection until the connection is reset again. Considering the time needed for a Wifi reset and a scan to complete, it can be said, that, as soon as a known network is available, there is a worst-case of roughly 30 seconds for the connection to be established. WifiAlive is thus much more simple then many of its competitors. The popular WifiFixer for example conveys about 6000 lines of code whereas WifiAlive has only around 500.

The code of WifiFixer is publicly available on github. I have checked those parts of the code which seemed important in respect of a problem I believed to have with WifiFixer 0.8.0.6. I was rather stunned by the complexity of the program, since I had in mind the solution I have just proposed. WifiFixer almost replaces the Wifi Settings Tab in Android and will continue this approach in upcoming versions. In some cases, settings from the Android Wifi Settings Tab can be overridden, it does its own management of known networks and behind the scenes does scans for networks in reach, matches them to known networks and harbours a small collection of hacks for devices that have to be dealt with in a special way - for example the Google Nexus One. WifiFixer is a great application. It helps many people with their Wifi troubles. But after I had seen its complexity and, in addition, found a bug in its matching of known to found networks on a scan, I decided that this was not the solution to my problem, as, additionally, a lot of code seemed to do things that Android normally does on its own. No offence David, I've learned a lot from looking at your code.

There are other applications, like Advanced Wifi Lock. But I couldn't get access to their code and very often the free (limited) versions did not seem to cover all aspects of Wifi maintenance that I wanted to ensure. Moreover I am already beyond trusting a random application from the market without knowing its code. The Advanced Wifi Lock application (a bare Wifi lock, if it does what the name suggests) does not solve my problems, as it sometimes takes Android a bit too much of time to re-establish a connection when re-entering the area of an access point or on walking from one access point to another. In my experience this could take up to two minutes, which was just too long for my purpose. These experiences may be better on other devices, still it feels very good to know, that as soon as a known network is in reach, there is a rough 30 second worst-case until you have a connection.

On some devices there also seem to be problems with the supplied Wifi drivers. These drivers may enter an undetermined state or cause wpa supplicant to run in endless loops. I have not delved into these problems. But it seemed to me, that, if at all, they can be resolved by doing a Wifi reset.

To sum up, I think that some applications aimed at taking care of your Wifi connection may be too complicated and question is, why such an application should do things like scanning for access points, as Android usually does a pretty good at this. Others may be doing the things I had in mind, yet there was no way for me to be sure.

That's, why I wrote WifiAlive. I hope you like it.

Feel Free to browse the code at http://darcs.factisresearch.com/pub/WifiAlive or just install the app from the android market.

Sunday, October 9, 2011

stm-stats: Retry statistics for STM transaction

Two days ago, Stefan Wehr talked at the Haskell in Leipzig workshop about the experiences that factis research had with Haskell when creating real-world applications. When he mentioned that we have code that keeps count of retries of STM transaction, someone from the audience asked us to publish the code. So, after some refactoring to make the interface generally usable, here it is. The stm-stats package provides a few functions (in Control.Concurrent.STM.Stats) that can replace atomically: trackSTM and variants that allow you to name the transaction or have more control about the warnings that will be emitted when the number of retries exceeds a certain value. The following code demonstrates how the module works:
import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad

import Control.Concurrent.STM.Stats

main = do
var <- trackSTM $ newTVar 0
forkIO $ forM_ [1..23] $ \i -> do
    threadDelay (100*1000)
    trackNamedSTM "writer" $ writeTVar var i
putStrLn "Starting reader..."
trackNamedSTM "reader" $ do
    i <- readTVar var
    when (i < 23) retry
putStrLn "Reader finished."
dumpSTMStats
When run, you will see this output:
Starting reader...
STM transaction reader finished after 23 retries
Reader finished.
STM transaction statistics (2011-10-09 16:26:27.226675 UTC):
Transaction     Commits    Retries      Ratio
_anonymous_           1          0       0.00
reader                1         23      23.00
writer               23          0       0.00
PS: As I usually post on my own blog, I should explain why I from now on will also post on the factis research company blog. Factis research relies heavily on Free Software and wants to contribute back to the community where possible. But in the course of every-day work, this sometimes falls by the wayside. Therefore, I was hired to help out as the community interface: My task is identifying, packaging and uploading components of internal code that are of wider interest, following up on user requests and bug reports, and talk about it. This module is the first brought to you by this new strategy, but expect more to come.

Saturday, October 8, 2011

Talk about developing commercial software in Haskell

Yesterday, I gave a talk at the Haskell in Leizpig workshop about our experience in developing commercial software in Haskell. You can find the slides (in german) here. I was really surprised by the large number of attendees (about 50), given that the workshop's language was german (except the very interesting talk by Kevin Hammond).

Saturday, October 1, 2011

New Version of HTF with Diffs, Colors, and Pretty-printing

I've just uploaded version 0.8.1.0 of HTF to hackage. HTF (Haskell Test Framework) allows you to define unit tests, QuickCheck properties, and black box tests in an easy and convenient way. We use HTF at work all the time, where it has proven to be quite valuable for organizing our test suite. An earlier blog post describes HTF in more detail.

The new version comes with some cool new features:

  • Support for diffs and pretty-printing. If an equality assertions fails, you now get a proper diff of the two values involved. If possible, the values are pretty-printed (thanks to Edward Yang and his groom package for inspiration). Here's an example:
    [TEST] Main:diff (TestHTF.hs:68)
    assertEqual failed at TestHTF.hs:68
    * expected:
    PlusExpr
      (PlusExpr
         (MultExpr
            (PlusExpr (Variable "foo")
               (MultExpr (Literal 42) (Variable "bar")))
            (PlusExpr (Literal 1) (Literal 2)))
         (Literal 581))
      (Variable "egg")
    * but got:
    PlusExpr
      (PlusExpr
         (MultExpr
            (PlusExpr (Variable "foo")
               (MultExpr (Literal 42) (Variable "bar")))
            (PlusExpr (Literal 2) (Literal 2)))
         (Literal 581))
      (Variable "egg")
    * diff:
    6c6
    <         (PlusExpr (Literal 1) (Literal 2)))
    ---
    >         (PlusExpr (Literal 2) (Literal 2)))
    *** Failed!
        
  • As you can see from the example above, HTF now supports colored output.
  • There's a new commandline option --quiet which causes HTF to produce output only if absolutely necessary (e.g. for failed test cases).

Just get HTF from hackage, it now also works with GHC 7.0.* and 7.2.1!

Wednesday, May 18, 2011

xmlgen: a feature-rich and high-performance XML generation library

I’ve released xmlgen to hackage just a few days ago. Xmlgen is a pure Haskell library with a convenient API for generating XML documents. It provides support for all functionality defined by the XML information set and offers good performance and low memory usage. In our company, we developed xmlgen because we wanted the readability of XML literals (as for example provided by the hsp library) without the drawbacks of a custom preprocessor (wrong line numbers in error messages, non-compositionality).

In this blog article, I’ll show you how to use the combinators provided by xmlgen to generate the following XML document:

<?xml version="1.0"?>
<people>
  <person age="32">Stefan</person>
  <person age="4">Judith</person>
</people>

First, we import some modules:

> import Data.Monoid
> import qualified Data.ByteString.Lazy as BSL
> import Text.XML.Generator -- the main xmlgen module

Then we continue by generating the person element.

> genPersonElem :: (String, String) -> Xml Elem
> genPersonElem (name, age) =
>     xelem "person" $ xattr "age" age <#> xtext name

The xelem combinator constructs an XML element from an element name and from the children of the element. Xmlgen provides overloaded variants of xelem to support a uniform syntax for the construction of elements with qualified and unqualified names and with different kinds of children. The <#> combinator separates the element’s attributes from the other children (sub-elements and text nodes). The combinators xattr and xtext construct XML attributes and text nodes, respectively.

The result of an application of xelem has type Xml Elem, whereas xattr has result type Xml Attr. This distinction is important so that attributes and elements can not be confused. The result type of the xtext combinator is Xml Elem; we decided against an extra type for text nodes because for xmlgen’s purpose text nodes and elements are almost interchangeble.

The types Xml Elem and Xml Attr are both instances of the Monoid type class. Constructing a list of elements from a list of persons and their ages is thus quite easy:

> genPersonElems :: [(String, String)] -> Xml Elem
> genPersonElems = foldr mappend mempty . map genPersonElem

The pattern shown above is quite common, so xmlgen allows the following shorthand notation using the xelems combinator.

> genPersonElems' :: [(String, String)] -> Xml Elem
> genPersonElems' = xelems . map genPersonElem

We are now ready to construct the final XML document:

> genXml :: Xml Doc
> genXml = let people = [("Stefan", "32"), ("Judith", "4")]
>          in doc defaultDocInfo $ xelem "people" (genPersonElems people)

For convenience, here is a standalone variant of the genXml function:

> genXml' :: Xml Doc
> genXml' =
>   let people = [("Stefan", "32"), ("Judith", "4")]
>   in doc defaultDocInfo $
>        xelem "people" $
>          xelems $ map (\(name, age) -> xelem "person" (xattr "age" age <#> xtext name)) people

Xmlgen supports various output formats through the overloaded xrender function. Here we render to resulting XML document as a lazy byte string:

> outputXml :: IO ()
> outputXml = BSL.putStrLn (xrender genXml')

Loading the current file into ghci and evaluating outputXml produces the following result:

*Main> outputXml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<people
><person age="32"
>Stefan</person
><person age="4"
>Judith</person
></people
>

This blog post introduced most but not all features of the xmlgen API. Check out the documentation.

Happy hacking and have fun!

Author: Stefan Wehr

Friday, April 29, 2011

Empty GHC profile files and signal handling

We recently installed signal handlers in our Haskell application to clean up some stuff on program termination. Later we realised that we couldn’t do profiling anymore because the generated .prof files were empty. We were catching SIGTERM and SIGINT (using System.Posix.Signals) to run our cleanup signal handler. As signal handlers are run in their own thread, we had to use “exitImmediately” (from System.Posix.Process) instead of “exitWith” to terminate the application. Unfortunately that call really exits immediately and doesn’t give the RTS the opportunity to write the profile.

The GHC Commentay  http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Signals explains that the default interrupt handler calls ”interruptStgRts” to exit the program. We now import this C-function via FFI and call it in our signal handler.

Update! We lose the ability to specify an exit code when using “interruptStgRts” so we now call “shutdownHaskellAndExit” instead.

Here’s an example program that demonstrates the use of System.Posix.Process and the FFI call to “shutdownHaskellAndExit”:

{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent
import Foreign.C( CInt )
import System.IO
import System.Posix.Process
import System.Exit
import System.Posix.Signals

foreign import ccall "shutdownHaskellAndExit" shutdownHaskellAndExit :: CInt -> IO ()

firstSigINT :: IO ()
firstSigINT =
    do hPutStrLn stderr "caught interrupt (shutdown takes 2s)"
       hFlush stderr
       installHandler keyboardSignal (Catch secondSigINT) (Just emptySignalSet)
       threadDelay (2*1000*1000)
       shutdownHaskellAndExit 13

secondSigINT :: IO ()
secondSigINT =
    do hPutStrLn stderr "forcing immediate exit"
       hFlush stderr
       exitImmediately (ExitFailure 13)

main =
    do installHandler keyboardSignal (Catch firstSigINT) (Just emptySignalSet)
       let busyloop i = busyloop (i+1)
       busyloop 0

Author: David Leuschner

Thursday, April 14, 2011

Re-Signing von iOs-Apps

iOs-Apps die als Distribution mit einem AdHoc-Profil verbreitet werden, müssen entweder neu gebaut oder neu signiert werden, wenn das Profil ausläuft und erneuert wird.

Mit folgenden, einfachen Schritten ist ein Signieren einer bestehenden ipa-Datei möglich:

  1. in ein Temp-Verzeichnis wechseln:
    cd tmp

  2. ipa-Datei auspacken:
    unzip .ipa
    Ergebnis: Unterordner "Payload", der einen Ordner für die App mit dem Name "Appname.app" enthält

  3. AdHoc-Profil ersetzen:
    cp .mobileprovision Payload/.app/embedded.mobileprovision

  4. Neu signieren:
    codesign -f -vv -s "iPhone Distribution" Payload/.app

  5. Packen:
    zip -r .ipa Payload

Autor: Dirk Spöri

Thursday, February 17, 2011

Microbenchmark of Haskell MD5 implementations

We're just cleaning up our code including our long list of library dependencies. We have a dependency to libcrypto.so from openssl introduced by the usage of nano-md5. There are (at least) two other packages that provide md5 implementations: pureMD5 and cryptohash. The following microbenchmark compares them using criterion:

import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BSL

import qualified Data.Digest.OpenSSL.MD5 as NanoMD5
import qualified Data.Digest.Pure.MD5 as PureMD5
import qualified Crypto.Hash.MD5 as ChMD5

import Numeric (showHex)
import Criterion.Main (defaultMain, bench, nf)
import System.Environment (getArgs)

main =
do x <- BSL.readFile "/lib/libc.so.6"
BSL.length x `seq`
defaultMain [ bench "cryptohash" $ nf ch x
, bench "nano" $ nf nano x
, bench "pure" $ nf pure x
]

go :: BS.ByteString -> Int -> [String] -> String
go bs n acc
| n `seq` bs `seq` False = undefined
| n >= 16 = concat (reverse acc)
| otherwise = go bs (n+1) (draw (BS.index bs n) : acc)

draw w =
case showHex w [] of
[x] -> ['0', x]
x -> x

nano :: BSL.ByteString -> String
nano lazy =
let strict = BS.concat $ BSL.toChunks lazy
in NanoMD5.md5sum strict

pure :: BSL.ByteString -> String
pure = show . PureMD5.md5

ch :: BSL.ByteString -> String
ch bs = go (ChMD5.hashlazy bs) 0 []
On my Intel i7 860 the results are:

benchmarking cryptohash
mean: 2.886409 ms, lb 2.885261 ms, ub 2.887946 ms, ci 0.950
std dev: 6.718781 us, lb 5.388123 us, ub 8.687483 us, ci 0.950

benchmarking nano
mean: 2.704862 ms, lb 2.704301 ms, ub 2.706016 ms, ci 0.950
std dev: 3.970617 us, lb 2.260051 us, ub 7.420531 us, ci 0.950

benchmarking pure
mean: 10.27061 ms, lb 10.26878 ms, ub 10.27485 ms, ci 0.950
std dev: 13.54268 us, lb 7.179537 us, ub 27.38285 us, ci 0.950

Author: David Leuschner