Learn Haskell Fast and Hard

用 Haskell 开阔你的视野

Magritte pleasure principle

我坚信所有开发者的都该来学 Haskell. 我不认为所有人应该成为 Haskell 的超级忍者, 但至少, 他们能发现 Haskell 的存在提供些什么. 学习 Haskell 打开你的思维.

主流语言共通的基础:

  • 变量
  • 循环
  • 指针1
  • 数据类型, 对象和类(对于大多数)

Haskell 非常特别. 这门语言应用了很多我从未听说的概念. 其中很多会助你成为更好的程序员.

然而, 学习 Haskell 可能很难. 对我就这样. 在这篇文章, 我尝试提供我学习过程中所认为缺少的.

这篇文章的进度会很难被跟上. 这是有意为之的. 世上没有学习 Haskell 的捷径. 只有难题和挑战. 但我相信这是一件好事. 因为 Haskell 很难, 所以她会很有趣.

学 Haskell 的常规办法是的看两本书. 第一本 “Learn You a Haskell” 接着是 “Real World Haskell”. 我也相信这是正确的途径/ 但是, 为了学会所有 Haskell 索要表达的, 你需要阅读全部的细节.

同时, 本文对 Haskell 各个方面一个简短和浓缩的概览. 我还加了我学习 Haskell 时我所缺少的资源.

本文包括五个部分:

  • 介绍: 一个简短的例子来表明 Haskell 的友好性.
  • 基础的 Haskell: Haskell 语法, 和一些必需的记法.
  • 难点:
    • 函数风格; 一个渐进的例子, 从命令式到函数式风格
    • 类型; 类型和标准 binary tree 的例子
    • 无穷类型; 操作一个 infinite binary tree!
  • 超难的地方
    • 处理 IO; 一个很迷你的例子
    • IO 手法解析; 我理解 IO 所缺失的隐藏细节
    • Monads; 难以置信我们能怎样生成
  • 附录:
    • 更多关于 infinite tree; 关于 infinite trees 更数学化的讨论

注: 每次你都会看到一个文件后缀为 .lhs 的分隔, 你可以点击文件名下载这些文件. 如果你把文件存为filename.lhs, 你可以运行之

runhaskell filename.lhs

某些大概不能运行, 但绝大多数该可以. 你应该能看到这下面有个链接.


01_basic/10_Introduction/00_hello_world.lhs

介绍

安装

Haskell logo

工具:

  • ghc: 编译器, 类似于对 C 而言的 gcc.
  • ghci: 交互式的 Haskell (REPL)
  • runhaskell: 不编译而直接运行 Haskell 程序. 方便, 但对比经过编译的程序显得很慢.

别担心

The Scream

很多 Haskell 的书籍文章一开始的介绍用的是那些深奥的程序(快速排序, Fibonacci, 等等...). 我想做的恰恰相反. 首先我不会给你看任何 Haskell 的超能力. 我会从 Haskell 与其他编程语言之间的相似点开始. 让我们开始看惯例的 "Hello World".

main = putStrLn "Hello World!"

要运行的话, 可以把代码存为 hello.hs 然后执行:

~ runhaskell ./hello.hs Hello World!

你也可以下载这份文本的 Haskell 代码. 你应该能看到标题"介绍"上方有个链接. 把文件下载保存为 00_hello_world.lhs 然后执行:

~ runhaskell 00_hello_world.lhs
Hello World!

01_basic/10_Introduction/00_hello_world.lhs


01_basic/10_Introduction/10_hello_you.lhs

现在, 一个询问你姓名并根据你输入的姓名回答"Hello"的程序:

main = do
    print "What is your name?"
    name <- getLine
    print ("Hello " ++ name ++ "!")

首先, 让我们跟一些命令式语言写的类似程序对比:

 # Python
print "What is your name?"
name = raw_input()
print "Hello %s!" % name
 # Ruby
puts "What is your name?"
name = gets.chomp
puts "Hello #{name}!"
// In C
 #include <stdio.h>
int main (int argc, char **argv) {
    char name[666]; // <- An Evil Number!
    // What if my name is more than 665 character long?
    printf("What is your name?\n");
    scanf("%s", name);
    printf("Hello %s!\n", name);
    return 0;
}

结构是一样的, 但有些语法差异. 这分教程主体部分会专门用来解释原因.

Haskell 里有个main 函数, 然后每个对象有它的类型. main 的类型是 IO (). 意味着, main 将引起副作用.

反正记住 Haskell 可以更像主流命令式语言.

01_basic/10_Introduction/10_hello_you.lhs


01_basic/10_Introduction/20_very_basic.lhs

基础的 Haskell

Picasso minimal owl

继续文章之前你需要注意 Haskell 一些本质的属性.

函数式

Haskell 是一门函数式语言. 如果你有命令式语言背景, 那你会有很多需要学的. 希望其中很多心概念甚至能在命令式编程里帮到你.

自动静态类型

不像C, C++ or Java里那个样子, 类型系统这里使用来帮助你的.

一般来说你的函数不会修改任何外部的内容. 就是用, 她无法修改变量的值, 不能获取用户输入, 不能读取屏幕字符, 不能发射火箭. 另一面, 并行计算是很容易实现的. Haskell 理清了哪里副作用将出现以及哪里你是纯的. 同时, 难懂你的程序将会容易非常多. 纯的部分程序很多 bug 都被预防掉了.

而且纯函数遵循着 Haskell 的基本定律.

用相同的参数调用相同的函数往往有相同的返回值.

惰性

默认的惰性是种很不寻常的语言设计. 默认情况下, Haskell 只有在必要是才进行求值. 一种结果是, 她提供给常优雅的方式来处理无限结构.

最后一个关于你该怎样阅读 Haskell 代码的注意点. 对我, 那就像阅读科学文献. 一些部分很清晰, 但等你看到个公式, 就要集中精神慢慢读. 同时, 学习 Haskell, 你不懂语法的细节真的不很重要. 如果你遇到 >>=, <$>, <- 或者其他古怪的符号, 干脆跳过去看代码的流程.

函数声明

你大概习惯这样声明函数

In C:

int f(int x, int y) {
    return x*x + y*y;
}

In Javascript:

function f(x,y) {
    return x*x + y*y;
}

in Python:

def f(x,y):
    return x*x + y*y

in Ruby:

def f(x,y)
    x*x + y*y
end

In Scheme:

(define (f x y)
    (+ (* x x) (* y y)))

最后, Haskell 的方式:

f x y = x*x + y*y

很干净. 没括号, 没 def.

记住, Haskell 经常用到函数和类型. 就这样定义它们非常简单. 语法为了这些对象专门考虑的.

类型的例子

通常的办法是声明你函数的类型. 这不是非有不可的. 编译器足够聪明去替你给出类型.

Let’s play a little.

我们写写看.

-- We declare the type using ::
f :: Int -> Int -> Int
f x y = x*x + y*y

main = print (f 2 3)
~ runhaskell 20_very_basic.lhs
13

01_basic/10_Introduction/20_very_basic.lhs


01_basic/10_Introduction/21_very_basic.lhs

再试试

f :: Int -> Int -> Int
f x y = x*x + y*y

main = print (f 2.3 4.2)

你看到这个报错:

21_very_basic.lhs:6:23:
    No instance for (Fractional Int)
      arising from the literal `4.2'
    Possible fix: add an instance declaration for (Fractional Int)
    In the second argument of `f', namely `4.2'
    In the first argument of `print', namely `(f 2.3 4.2)'
    In the expression: print (f 2.3 4.2)

问题: 4.2 不属于 Int.

01_basic/10_Introduction/21_very_basic.lhs


01_basic/10_Introduction/22_very_basic.lhs

解决方案, 别声明 f 的类型. Haskell 会为我们推断出尽可能通用的类型:

f x y = x*x + y*y

main = print (f 2.3 4.2)

运行成功. 好, 我们不需要声明新函数的每个类型. 比如, 用 C, 你得声明函数为int, 为 float, 为 long, 为 double, 等等...

但是, 我们该怎么去声明类型? 要看 Haskell 推出来的类型, 得运行 ghci:


% ghci
GHCi, version 7.0.4: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
Prelude> let f x y = x*x + y*y
Prelude> :type f
f :: Num a => a -> a -> a

呃? 这个古怪的类型什么意思?

Num a => a -> a -> a

首先, 仔细看右边的部分 a -> a -> a. 为了理解, 来看一列步进的例子:

类型的写法 意义
Int Int类型
Int -> Int 类型由 IntInt 的函数
Float -> Int 类型由 FloatInt 的函数
a -> Int 任意类型到 Int 的函数
a -> a 类型由某个 a 到同个 a 的函数
a -> a -> a 接收两个参数类型 a 返回 a 类型的函数

在类型 a -> a -> a 里, 字母 a 是个 类型变量. 意思是f 是有两个参数的函数而且两个参数和返回值是相同的类型. 类型变量 a 能表示多种类型. 比如 Int, Integer, Float...

所以不像强制类型的 C 声明函数为 int, long, float, double, 等等... 我们定义函数就像是动态语言.

通常 a 能是任意类型. 比如 String, Int, 也有更复杂的类型, 像 Trees, 其他的函数, 等等... 不过这里我们的类型加了前缀 Num a => .

Num 是个 类型类. 类型类可以理解为一些类型的集合. Num 仅包含表现为数字的几个类型. 更简单说, Num 类中的类型能运行一套特定的函数, 尤其是 (+)(*).

类型类是很强大的语言构建. 这能被用来做一些特别强大的事. 后面会再细说.

最后, Num a => a -> a -> a 是说:

a 表示一个属于 Num 类型类的类型. 这是个类型由 a 到 (a -> a) 的函数.

是的, 怪. 事实上 Haskell 没一个函数真有两个参数. 反而所有函数都只有一个参数. 但我们会发现接收两个参数等同于接收一个参数再返回一个来接收第二个参数的函数.

更直白地 f 3 4 等于是 (f 3) 4. 注意 f 3 是个函数:

f :: Num a :: a -> a -> a

g :: Num a :: a -> a
g = f 3

g y3*3 + y*y

函数有另一个记法. lambda 记法让我们能创建函数而不带函数名. 称为匿名函数. 像这样写:

g = \y -> 3*3 + y*y

\ 是因为它像 λ 而且属于 ASCII.

如果你没习惯函数式编程, 你的脑子该开始热了. 可以去做真实应用了.

01_basic/10_Introduction/22_very_basic.lhs


01_basic/10_Introduction/23_very_basic.lhs

但那之前, 我们要确认下类型系统能像预期那样运行:

f :: Num a => a -> a -> a
f x y = x*x + y*y

main = print (f 3 2.4)

运行正常, 因为, 3 同时是 Float 和 Integer 合法的分数写法. 因为 2.4 是分数, 3 也就作为分数解释.

01_basic/10_Introduction/23_very_basic.lhs


01_basic/10_Introduction/24_very_basic.lhs

如果强制让函数接收两个不同的类型, 运行会出错:

f :: Num a => a -> a -> a
f x y = x*x + y*y

x :: Int
x = 3
y :: Float
y = 2.4
main = print (f x y) -- won't work because type x ≠ type y

编译器报错. 两个参数必须类型一致.

如果你坚持这个想法不好, 那么就让编译器帮你从一个类型转到另一个, 你很应该看这个不错(也有趣)的视频: WAT

01_basic/10_Introduction/24_very_basic.lhs

基础的 Haskell

Kandinsky Gugg

我建议你这部分只是浏览. 当作一份手册. Haskell 有大量特性. 很多信息这里没有给出. 看起费解的记法了可以回来看.

我用 符号标示两个表达式等价. 这是个元符号, 不属于 Haskell 语法. 我也会用 标示 Haskell 表达式的返回值.

记法

算术
3 + 2 * 6 / 3 ⇔ 3 + ((2*6)/3)
逻辑
True || FalseTrue
True && FalseFalse
True == FalseFalse
True /= FalseTrue  (/=) is the operator for different
Powers
x^n     for n an integral (understand Int or Integer)
x**y    for y any kind of number (Float for example)

Integer 没有上限, 而你的设备会到极限:

4^103
102844034832575377634685573909834406561420991602098741459288064

对! 还有分数 FTW! 但你需要 import 一个模块 Data.Ratio:

$ ghci
....
Prelude> :m Data.Ratio
Data.Ratio> (11 % 15) * (5 % 3)
11 % 9
Lists
[]empty list
[1,2,3]List of integral
["foo","bar","baz"]List of String
1:[2,3][1,2,3], (:) prepend one element
1:2:[][1,2]
[1,2] ++ [3,4][1,2,3,4], (++) concatenate
[1,2,3] ++ ["foo"]ERROR StringIntegral
[1..4][1,2,3,4]
[1,3..10][1,3,5,7,9]
[2,3,5,7,11..100]ERROR! I am not so smart!
[10,9..1][10,9,8,7,6,5,4,3,2,1]
Strings

Haskell 里 strings 是 Char 组成的 list.

'a' :: Char
"a" :: [Char]
""  ⇔ []
"ab" ⇔ ['a','b'] ⇔  'a':"b"'a':['b'] ⇔ 'a':'b':[]
"abc""ab"++"c"

注意: 实际编写代码是你不应该用 char 的 list 来表示文本. 大多数情况你应该用 Data.Text. 如果你想像是 ASCII char 的文本流, 你该用 Data.ByteString.

Tuples

两个元素的 tuple 类型为 (a,b). tuple 允许元素属于不同类型.

-- All these tuple are valid
(2,"foo")
(3,'a',[2,3])
((2,"a"),"c",3)

fst (x,y)       ⇒  x
snd (x,y)       ⇒  y

fst (x,y,z)     ⇒  ERROR: fst :: (a,b) -> a
snd (x,y,z)     ⇒  ERROR: snd :: (a,b) -> b
处理圆括号

可以用两个函数还下一部分括号: ($)(.).

-- By default:
f g h x         ⇔  (((f g) h) x)

-- the $ replace parenthesis from the $
-- to the end of the expression
f g $ h x       ⇔  f g (h x) ⇔ (f g) (h x)
f $ g h x       ⇔  f (g h x) ⇔ f ((g h) x)
f $ g $ h x     ⇔  f (g (h x))

-- (.) the composition function
(f . g) x       ⇔  f (g x)
(f . g . h) x   ⇔  f (g (h x))

01_basic/20_Essential_Haskell/10a_Functions.lhs

常用函数记法

留意下:

x :: Int            ⇔ x is of type Int
x :: a              ⇔ x can be of any type
x :: Num a => a     ⇔ x can be any type a
                      such that a belongs to Num type class
f :: a -> b         ⇔ f is a function from a to b
f :: a -> b -> cf is a function from a to (b→c)
f :: (a -> b) -> c  ⇔ f is a function from (a→b) to c

在函数定义前声明其类型不是强制性的. Haskell 会自行推断出最宽泛的类型. 但声明类型是一个好的实践.

中缀表示法

square :: Num a => a -> a
square x = x^2

注意 ^ 用了中缀表示法. 每个中缀表示法有对应的前缀表示法. 只要加上括号.

square' x = (^) x 2

square'' x = (^2) x

两边的 x 可以约掉. 这叫做 η-reduction.

square''' = (^2)

注意我们可以定义含有 ' 的函数名. Here:

squaresquare'square''square '''

判别

absolute 函数的一个实现.

absolute :: (Ord a, Num a) => a -> a
absolute x = if x >= 0 then x else -x

注意: Haskell 表达式 if .. then .. else 很像 C 语言操作符 ¤?¤:¤. 不能遗漏后面的 else.

另一个等价的版本:

absolute' x
    | x >= 0 = x
    | otherwise = -x

关于表示法的警告: Haskell 里缩进很重要. 好比 Python, 出错的缩进会破坏代码.

main = do
      print $ square 10
      print $ square' 10
      print $ square'' 10
      print $ square''' 10
      print $ absolute 10
      print $ absolute (-10)
      print $ absolute' 10
      print $ absolute' (-10)

01_basic/20_Essential_Haskell/10a_Functions.lhs

难点

现在开始讲难点.

函数风格

Biomechanical Landscape by H.R. Giger

这一章, 我会给个关于 Haskell 令人印象深刻的重构能力的小例子. 我们会选个问题然后用标准的命令式办法去解决掉. 然后我会让代码演进. 最终结果会更优雅更容易接受.

来解决以下问题:

给定 integer 的 list, 返回 list 中偶数的和.

例如: [1,2,3,4,5] ⇒ 2 + 4 ⇒ 6

为展示函数式和命令式方案的不同, 我开始会提供一个命令式的方案 (JavaScript):

function evenSum(list) {
    var result = 0;
    for (var i=0; i< list.length ; i++) {
        if (list[i] % 2 ==0) {
            result += list[i];
        }
    }
    return result;
}

但 Haskell 里没有变量, 也没循环. 一个不用循环而完成相同结果的解法是用递归.

注意:
在命令式语言里递归通常被认为缓慢. 但在函数式编程里通常不是那种状况. 大多数情况下 Haskell 会把递归函数处理得很高效

这是个 C 版本的递归函数. 注意简单起见, 我设定第一个 0 值表示 int list 的结尾.

int evenSum(int *list) {
    return accumSum(0,list);
}

int accumSum(int n, int *list) {
    int x;
    int *xs;
    if (*list == 0) { // if the list is empty
        return n;
    } else {
        x = list[0]; // let x be the first element of the list
        xs = list+1; // let xs be the list without x
        if ( 0 == (x%2) ) { // if x is even
            return accumSum(n+x, xs);
        } else {
            return accumSum(n, xs);
        }
    }
}

考虑下这段代码. 我们会把它转成 Haskell. 但这之前, 我要介绍下我们会用到的三个简单却使用的函数:

even :: Integral a => a -> Bool
head :: [a] -> a
tail :: [a] -> [a]

even 验证数字是否偶数.

even :: Integral a => a -> Bool
even 3False
even 2True

head 返回 list 第一个元素:

head :: [a] -> a
head [1,2,3] ⇒ 1
head []      ⇒ ERROR

tail 返回第一个元素以外所有元素:

tail :: [a] -> [a]
tail [1,2,3] ⇒ [2,3]
tail [3]     ⇒ []
tail []      ⇒ ERROR

记下, 对任一非空 list l, l ⇔ (head l):(tail l)


02_Hard_Part/11_Functions.lhs

Haskell 第一个解法. 函数 evenSum 返回列表里所有偶数的和:

-- Version 1
evenSum :: [Integer] -> Integer

evenSum l = accumSum 0 l

accumSum n l = if l == []
                  then n
                  else let x = head l
                           xs = tail l
                       in if even x
                              then accumSum (n+x) xs
                              else accumSum n xs

对函数的测试可以用 ghci:

% ghci
GHCi, version 7.0.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> :load 11_Functions.lhs
[1 of 1] Compiling Main             ( 11_Functions.lhs, interpreted )
Ok, modules loaded: Main.
*Main> evenSum [1..5]
6

一个执行的例子2:

*Main> evenSum [1..5]
accumSum 0 [1,2,3,4,5]
1 is odd
accumSum 0 [2,3,4,5]
2 is even
accumSum (0+2) [3,4,5]
3 is odd
accumSum (0+2) [4,5]
4 is even
accumSum (0+2+4) [5]
5 is odd
accumSum (0+2+4) []
l == []
0+2+4
0+6
6

从命令式编程译过来看起来应该都没问题. 其实有很多能改进的. 首先, 可以将类型一般化.

evenSum :: Integral a => [a] -> a
main = do print $ evenSum [1..10]

02_Hard_Part/11_Functions.lhs


02_Hard_Part/12_Functions.lhs

接着, 可以用 sub function 用 wherelet. 借此避免 accumSum 函数污染全局变量.

-- Version 2
evenSum :: Integral a => [a] -> a

evenSum l = accumSum 0 l
    where accumSum n l =
            if l == []
                then n
                else let x = head l
                         xs = tail l
                     in if even x
                            then accumSum (n+x) xs
                            else accumSum n xs
main = print $ evenSum [1..10]

02_Hard_Part/12_Functions.lhs


02_Hard_Part/13_Functions.lhs

接下来, 用类型匹配.

-- Version 3
evenSum l = accumSum 0 l
    where
        accumSum n [] = n
        accumSum n (x:xs) =
             if even x
                then accumSum (n+x) xs
                else accumSum n xs

什么是模式匹配? 用值替代普通函数名3.

除了去说: foo l = if l == [] then <x> else <y> 简单地表示:

foo [] =  <x>
foo l  =  <y>

不过模式匹配想得更远.. 而且可以探测复杂数值内部的数据. 可以替换

foo l =  let x  = head l
             xs = tail l
         in if even x
             then foo (n+x) xs
             else foo n xs

foo (x:xs) = if even x
                 then foo (n+x) xs
                 else foo n xs

这是个很有用的功能. 这让代码读起来更简洁更轻松

main = print $ evenSum [1..10]

02_Hard_Part/13_Functions.lhs


02_Hard_Part/14_Functions.lhs

Haskell 里可以用 η-reducing 简化函数定义. 比如, 不必写:

f x = (some expresion) x

简单地写

f = some expression

用这种办法可以移除 l:

-- Version 4
evenSum :: Integral a => [a] -> a

evenSum = accumSum 0
    where
        accumSum n [] = n
        accumSum n (x:xs) =
             if even x
                then accumSum (n+x) xs
                else accumSum n xs
main = print $ evenSum [1..10]

02_Hard_Part/14_Functions.lhs


02_Hard_Part/15_Functions.lhs

高阶函数

Escher

为这能更好需要用到高阶函数. 那又是什么呢? 高阶函数是以函数作为参数的函数.

看写例子:

filter :: (a -> Bool) -> [a] -> [a]
map :: (a -> b) -> [a] -> [b]
foldl :: (a -> b -> a) -> a -> [b] -> a

用小一点的步子开始执行.

-- Version 5
evenSum l = mysum 0 (filter even l)
    where
      mysum n [] = n
      mysum n (x:xs) = mysum (n+x) xs

其中

filter even [1..10] ⇔  [2,4,6,8,10]

函数 filter 接收一个类型为 (a -> Bool) 的函数和一个类型为 [a] 的 list. 返回一个只含能让函数返回 true 的元素的 list.

下一步是用一别的办法模拟一个 loop. 我发将用 foldl 函数来累积一个值. 函数 foldl 涵盖了一种广泛的编程模式:

myfunc list = foo initialValue list
    foo accumulated []     = accumulated
    foo tmpValue    (x:xs) = foo (bar tmpValue x) xs

可以替换为:

myfunc list = foldl bar initialValue list

如果你很想了解魔术是怎么实现的. 这里有个 foldl 的函数定义.

foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
foldl f z [x1,...xn]
⇔  f (... (f (f z x1) x2) ...) xn

但由于 Haskell 的惰性, 她不会把代码 (f z x) 执行和推进栈里. 因此会广泛用 foldl' 代替 foldl; foldl'foldlstrict 版本. 如果不懂惰性和strict 的意思, 别担心, 继续看代码就好了就当 foldlfoldl' 是一致的.

那么 evenSum 新的版本有了:

-- Version 6
-- foldl' isn't accessible by default
-- we need to import it from the module Data.List
import Data.List
evenSum l = foldl' mysum 0 (filter even l)
  where mysum acc value = acc + value

这个版本可以直接用 lambda 表达式来简化. 这样就不用借助临时的 mysum 了.

-- Version 7
-- Generally it is considered a good practice
-- to import only the necessary function(s)
import Data.List (foldl')
evenSum l = foldl' (\x y -> x+y) 0 (filter even l)

而且当然, 注意到

(\x y -> x+y) ⇔ (+)
main = print $ evenSum [1..10]

02_Hard_Part/15_Functions.lhs


02_Hard_Part/16_Functions.lhs

最终

-- Version 8
import Data.List (foldl')
evenSum :: Integral a => [a] -> a
evenSum l = foldl' (+) 0 (filter even l)

foldl' 直观上并不是最简单的函数. 如果你对它不太习惯, 那该琢磨一下.

为了更容易明白这里发生了什么, 一步一步地求值:

  evenSum [1,2,3,4]
⇒ foldl' (+) 0 (filter even [1,2,3,4])
⇒ foldl' (+) 0 [2,4]foldl' (+) (0+2) [4]foldl' (+) 2 [4]foldl' (+) (2+4) []foldl' (+) 6 []6

另一个有用的高阶函数是 (.). 函数 (.) 相当于数学上的函数复合.

(f . g . h) x ⇔  f ( g (h x))

可以借助这个运算符对函数进行 η-reduce :

-- Version 9
import Data.List (foldl')
evenSum :: Integral a => [a] -> a
evenSum = (foldl' (+) 0) . (filter even)

而且, 一些重命名之后更清爽了:

-- Version 10 
import Data.List (foldl')
sum' :: (Num a) => [a] -> a
sum' = foldl' (+) 0
evenSum :: Integral a => [a] -> a
evenSum = sum' . (filter even)

现在该讨论下了. 高阶函数能给我们带来什么?

首先, 可以说它简练. 但其实, 好好想想还有很多. 假设要对我们的函数做轻微的修改, 想获取 list 中所有奇数的平方和.

[1,2,3,4][1,4,9,16][4,16] ▷ 20

修改版本 10 超简单:

squareEvenSum = sum' . (filter even) . (map (^2))
squareEvenSum' = evenSum . (map (^2))
squareEvenSum'' = sum' . (map (^2)) . (filter even)

只需要添加另一个“变换函数”4.

map (^2) [1,2,3,4][1,4,9,16]

函数 map 直白地把函数施加到 list 里每个元素.

不需要修改函数定义 内部 任何代码. 看起来更模块化. 不过同时你对于函数的认识会更数学化. 你能像用别的函数一样使用这个函数. 可以符合, map, fold, filter 这个函数.

留给读者一个习题, 去修改版本 1. ☺.

如果你认为已经一般化到了几点, 那实际上你非常得错误. 比如有一条是把函数作用于任何递归类型, 而不只用在 list 里. 如果你想了解更多, 建议看下这篇有趣的文章: Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire by Meijer, Fokkinga and Paterson.

例子应该已经想你展示纯函数式编程能有多棒了. 不幸的是, 纯函数编程并不能很好地适用于所有情形. 或者至少还没有那样一门语言.

Haskell 一项强大之处是用于创建 DSL (Domain Specific Language) 使之更易于改变编程范式.

实际上, Haskell 在命令式编程时也很棒. 我当初学 Haskell 很男想明白这一点. 做了大量的努力解释了函数式的解法佛么优越, 然后去用 Haskell 写命令式, 这真很令人费解.

不过讲 Haskell 的能力前要讲下另一个基本的部分: 类型.

main = print $ evenSum [1..10]

02_Hard_Part/16_Functions.lhs

类型

Dali, the madonna of port Lligat

tl;dr:

  • type Name = AnotherType 这只是个名称,对编译器来说 NameAnotherType 没啥不同.
  • data Name = NameConstructor AnotherType 不大一样.
  • data 可以构建可递归的类型.
  • deriving 的魔法用来为你创建函数.

Haskell 里, 是强类型, 是静态类型.

为什么这很重要? 因为这可以 很大程度上 帮你规避错误. Haskell 大多 bug 可以在崇虚编译阶段捕获到. 其主要原因就是编译期间的类型推断. 比方你用了错误的参数在错误的位置就很容易检测了.

类型推断

要实现快速执行, 静态类型普遍很重要. 但大多静态语言不善于推广概念. Haskell 完美解决了通过类型推断.

下面是例子. Haskell 里的 square 函数:

square x = x * x

函数能 square 所有类型的数值. 用 square 能算 Int, 算 Integer, 算 FloatFractional 甚至 Complex. 例子证明下:

% ghci
GHCi, version 7.0.4:
...
Prelude> let square x = x*x
Prelude> square 2
4
Prelude> square 2.1
4.41
Prelude> -- load the Data.Complex module
Prelude> :m Data.Complex
Prelude Data.Complex> square (2 :+ 1)
3.0 :+ 4.0

x :+ y 是复数 (x + ib) 的表示法.

再和 C 所需的代码量对比下:

int     int_square(int x) { return x*x; }

float   float_square(float x) {return x*x; }

complex complex_square (complex z) {
    complex tmp; 
    tmp.real = z.real * z.real - z.img * z.img;
    tmp.img = 2 * z.img * z.real;
}

complex x,y;
y = complex_square(x);

每个类型你都要重新写函数. 剩下的解法只有去用一些元编程技巧. 比如用 pre-processor. C++ 里办法好点, C++ templates:

#include <iostream>
#include <complex>
using namespace std;

template<typename T>
T square(T x)
{
    return x*x;
}

int main() {
    // int
    int sqr_of_five = square(5);
    cout << sqr_of_five << endl;
    // double
    cout << (double)square(5.3) << endl;
    // complex
    cout << square( complex<double>(5,3) )
         << endl;
    return 0;
}

C++ 比 C 干得要好很多. 更复杂的函数里语法可以很难习惯: 看 这篇文章 的例子.

C++ 里必须声明函数可以运行不同的类型. Haskell 里相反. 函数默认会尽可能一般化.

类型推断让 Haskell 有那种动态语言才有的自由感. 只是跟动态语言不同的是, 大多错误在代码执行钱就能捕获. 通常对于 Haskell:

“只要编译成功了, 她就该是你想的那样运行.”


02_Hard_Part/21_Types.lhs

类型构造

你可以自己构建类型. 首先你可以创建别名或同义词类型.

type Name   = String
type Color  = String

showInfos :: Name ->  Color -> String
showInfos name color =  "Name: " ++ name
                        ++ ", Color: " ++ color
name :: Name
name = "Robin"
color :: Color
color = "Blue"
main = putStrLn $ showInfos name color

02_Hard_Part/21_Types.lhs


02_Hard_Part/22_Types.lhs

但这并不把你维护得多好. 尝试下调换 showInfos 两个参数的顺序再运行程序:

    putStrLn $ showInfos color name

编译和运行照常. 实际上你可以替换掉任何地方的 Name, Color 以及 String. 编译器看待她们完全相同.

另一个创建类型的方法是用 data 关键字.

data Name   = NameConstr String
data Color  = ColorConstr String

showInfos :: Name ->  Color -> String
showInfos (NameConstr name) (ColorConstr color) =
      "Name: " ++ name ++ ", Color: " ++ color

name  = NameConstr "Robin"
color = ColorConstr "Blue"
main = putStrLn $ showInfos name color

现在如果调换 showInfos 两个参数的顺序, 编译器会报错! 那个可能出现的错误你也不会再犯. 代价是代码因此冗长.

同时要注意 contructor 是函数:

NameConstr  :: String -> Name
ColorConstr :: String -> Color

data 语法的主体是:

data TypeName =   ConstructorName  [types]
                | ConstructorName2 [types]
                | ...

一般的用法是在 DataTypeName 和 DataTypeConstructor 使用一致的名称.

例如:

data Complex = Num a => Complex a a

可以使用记录的(record?)语法:

data DataTypeName = DataConstructor {
                      field1 :: [type of field1]
                    , field2 :: [type of field2]
                    ...
                    , fieldn :: [type of fieldn] }

并且有多种存取的方式. 而且赋值时可以使用其他的顺序.

例如:

data Complex = Num a => Complex { real :: a, img :: a}
c = Complex 1.0 2.0
z = Complex { real = 3, img = 4 }
real c ⇒ 1.0
img z ⇒ 4

02_Hard_Part/22_Types.lhs


02_Hard_Part/23_Types.lhs

递归类型

你已经见过一种递归类型: list. list 类型可以重新创建, 需要用到冗长的语法:

data List a = Empty | Cons a (List a)

要想用更简单的语法你需要对 constructor 使用中缀的名称.

infixr 5 :::
data List a = Nil | a ::: (List a)

infixr 后的数字表示优先级.

为了让定义的数据结构能进行 print (Show), read (Read), 检测 equality (Eq) and compare (Ord) 可以告诉 Haskell 替你派生出适当的函数.

infixr 5 :::
data List a = Nil | a ::: (List a) 
              deriving (Show,Read,Eq,Ord)

在类型声明加上 deriving (Show) 之后, Haskell 会为你创建一个 show 函数. 接下来很快可以看到怎么去用你的 show 函数.

convertList [] = Nil
convertList (x:xs) = x ::: convertList xs
main = do
      print (0 ::: 1 ::: Nil)
      print (convertList [0,1])

这会输出:

0 ::: (1 ::: Nil)
0 ::: (1 ::: Nil)

02_Hard_Part/23_Types.lhs


02_Hard_Part/30_Trees.lhs

Trees

Magritte, l'Arbre

下面会直接给出标准案例: binary trees.

import Data.List

data BinTree a = Empty 
                 | Node a (BinTree a) (BinTree a)
                              deriving (Show)

下面创建一个函数用来把一个 list 转化为一个有序的 binary tree.

treeFromList :: (Ord a) => [a] -> BinTree a
treeFromList [] = Empty
treeFromList (x:xs) = Node x (treeFromList (filter (<x) xs))
                             (treeFromList (filter (>x) xs))

看看这个函数多优雅. 用白话说:

  • 空的 list 会被转化成空的 tree.
  • list (x:xs) 会被转化为 tree , 满足:
    • root 是 x
    • 左子树是 list xs 中严格小于 x 的成员生成的 tree
    • 右子树是 list xs 中严格大于 x 的成员生成的 tree.
main = print $ treeFromList [7,2,4,8]

得到结果如下:

Node 7 (Node 2 Empty (Node 4 Empty Empty)) (Node 8 Empty Empty)

这是 tree 一种可读的但不漂亮的一种表示.

02_Hard_Part/30_Trees.lhs


02_Hard_Part/31_Trees.lhs

为了更有趣, 给 tree 编写一个更漂亮的表示形式. 我简单地就是很喜欢给 tree 做通用而且漂亮的函数来显示. 如果这部分对你来说太难, 跳过这部分没有问题.

我们要做一些改变. 从 BinTree 类型的声明中移除 deriving (Show). 让 BinTree 类型成为 (EqOrd) 的实例也会比较有用. 我们将可以比较和测试两棵树.

data BinTree a = Empty 
                 | Node a (BinTree a) (BinTree a)
                  deriving (Eq,Ord)

没有了 deriving (Show), Haskell 不会为我们创建 show 方法. 我们将自己创建一个 show. 为了将它实现, 必须么声明我们新创建的类型 BinTree a 是类型类 Show 的一个实例. 一般的语法是:

instance Show (BinTree a) where
   show t = ... -- You declare your function here

下面是我写的展示二叉树的代码. 别担心表面看去的复杂性. 为显示甚至古怪些的对象, 我做了不少的优化.

-- declare BinTree a to be an instance of Show
instance (Show a) => Show (BinTree a) where
  -- will start by a '<' before the root
  -- and put a : a begining of line
  show t = "< " ++ replace '\n' "\n: " (treeshow "" t)
    where
    -- treeshow pref Tree 
    --   shows a tree and starts each line with pref
    -- We don't display the Empty tree
    treeshow pref Empty = ""
    -- Leaf
    treeshow pref (Node x Empty Empty) =
                  (pshow pref x)

    -- Right branch is empty
    treeshow pref (Node x left Empty) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " left)

    -- Left branch is empty
    treeshow pref (Node x Empty right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " right)

    -- Tree with left and right children non empty
    treeshow pref (Node x left right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "|--" "|  " left) ++ "\n" ++
                  (showSon pref "`--" "   " right)

    -- shows a tree using some prefixes to make it nice
    showSon pref before next t =
                  pref ++ before ++ treeshow (pref ++ next) t

    -- pshow replaces "\n" by "\n"++pref
    pshow pref x = replace '\n' ("\n"++pref) (show x)

    -- replaces one char by another string
    replace c new string =
      concatMap (change c new) string
      where
          change c new x
              | x == c = new
              | otherwise = x:[] -- "x"

其中 treeFromList 方法仍然一致.

treeFromList :: (Ord a) => [a] -> BinTree a
treeFromList [] = Empty
treeFromList (x:xs) = Node x (treeFromList (filter (<x) xs))
                             (treeFromList (filter (>x) xs))

现在我们来演示:

main = do
  putStrLn "Int binary tree:"
  print $ treeFromList [7,2,4,8,1,3,6,21,12,23]
Int binary tree:
< 7
: |--2
: |  |--1
: |  `--4
: |     |--3
: |     `--6
: `--8
:    `--21
:       |--12
:       `--23

现在好多了. 根部一行用 < 符号开头. 接下来的没一行以 : 开头. 另外也可以用在另一种类型.

  putStrLn "\nString binary tree:"
  print $ treeFromList ["foo","bar","baz","gor","yog"]
String binary tree:
< "foo"
: |--"bar"
: |  `--"baz"
: `--"gor"
:    `--"yog"

因为能检测 tree 是否相等和顺序, 我们还可以从树创建树!

  putStrLn "\nBinary tree of Char binary trees:"
  print ( treeFromList
           (map treeFromList ["baz","zara","bar"]))
Binary tree of Char binary trees:
< < 'b'
: : |--'a'
: : `--'z'
: |--< 'b'
: |  : |--'a'
: |  : `--'r'
: `--< 'z'
:    : `--'a'
:    :    `--'r'

这就是我让没每一行以 : 开头的原因(根部以外).

Yo Dawg Tree

  putStrLn "\nTree of Binary trees of Char binary trees:"
  print $ (treeFromList . map (treeFromList . map treeFromList))
             [ ["YO","DAWG"]
             , ["I","HEARD"]
             , ["I","HEARD"]
             , ["YOU","LIKE","TREES"] ]

上面的等价于

print ( treeFromList (
          map treeFromList
             [ map treeFromList ["YO","DAWG"]
             , map treeFromList ["I","HEARD"]
             , map treeFromList ["I","HEARD"]
             , map treeFromList ["YOU","LIKE","TREES"] ]))

输出:

Binary tree of Binary trees of Char binary trees:
< < < 'Y'
: : : `--'O'
: : `--< 'D'
: :    : |--'A'
: :    : `--'W'
: :    :    `--'G'
: |--< < 'I'
: |  : `--< 'H'
: |  :    : |--'E'
: |  :    : |  `--'A'
: |  :    : |     `--'D'
: |  :    : `--'R'
: `--< < 'Y'
:    : : `--'O'
:    : :    `--'U'
:    : `--< 'L'
:    :    : `--'I'
:    :    :    |--'E'
:    :    :    `--'K'
:    :    `--< 'T'
:    :       : `--'R'
:    :       :    |--'E'
:    :       :    `--'S'

注意重复的 tree 为什么没有被插入; 上面只有一棵树对应 "I","HEARD". 我们很轻松地将其实现了, 因为我们已经声明了 Tree 是 Eq 类型的实例.

看看这个结构多漂亮. 我们可以创建的 tree 不仅包含整数, 字符串和字符, 还有其他的 tree. 而且我们海南生成包含含有其他 tree 的 tree 的 tree.

02_Hard_Part/31_Trees.lhs


02_Hard_Part/40_Infinites_Structures.lhs

无限结构

Escher

Haskell 经常被说是 惰性.

事实上, 如果你迂腐点, 你会说 Haskell 是 non-strict. 惰性是 non-strict 语言一种通常的实现方式.

那么 not-strict 什么意思? 根据 Haskell wiki:

递归 (求值的数学术语) 从外部开始执行.

因此给定 (a+(b*c)) 那么 + 会首先约简, 然后内部的 (b*c) 再被约简

比如 Haskell 里你可以这么做:

-- numbers = [1,2,..]
numbers :: [Integer]
numbers = 0:map (1+) numbers

take' n [] = []
take' 0 l = []
take' n (x:xs) = x:take' (n-1) xs

main = print $ take' 10 numbers

而且它可以停下来.

怎么做到?

不是去尝试对 numbers 整个地求值, 只有被需要的元素才会被求值.

而且, 注意 Haskell 里有无限列表的表示法

[1..][1,2,3,4...]
[1,3..][1,3,5,7,9,11...]

绝大多数函数都可以正常处理无限列表. 然后, 有个内置函数 take 和我们的 take' 是等价的.

02_Hard_Part/40_Infinites_Structures.lhs


02_Hard_Part/41_Infinites_Structures.lhs

This code is mostly the same as the previous one.
import Debug.Trace (trace)
import Data.List
data BinTree a = Empty 
                 | Node a (BinTree a) (BinTree a)
                  deriving (Eq,Ord)
-- declare BinTree a to be an instance of Show
instance (Show a) => Show (BinTree a) where
  -- will start by a '<' before the root
  -- and put a : a begining of line
  show t = "< " ++ replace '\n' "\n: " (treeshow "" t)
    where
    treeshow pref Empty = ""
    treeshow pref (Node x Empty Empty) =
                  (pshow pref x)

    treeshow pref (Node x left Empty) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " left)

    treeshow pref (Node x Empty right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " right)

    treeshow pref (Node x left right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "|--" "|  " left) ++ "\n" ++
                  (showSon pref "`--" "   " right)

    -- show a tree using some prefixes to make it nice
    showSon pref before next t =
                  pref ++ before ++ treeshow (pref ++ next) t

    -- pshow replace "\n" by "\n"++pref
    pshow pref x = replace '\n' ("\n"++pref) (" " ++ show x)

    -- replace on char by another string
    replace c new string =
      concatMap (change c new) string
      where
          change c new x
              | x == c = new
              | otherwise = x:[] -- "x"

假设我们不介意用到有序的二叉树. 这里有棵无限的二叉树:

nullTree = Node 0 nullTree nullTree

一棵完整的二叉树, 其中每个节点均为 0. 现在我将证明你能用下面的函数操作这个对象:

-- take all element of a BinTree 
-- up to some depth
treeTakeDepth _ Empty = Empty
treeTakeDepth 0 _     = Empty
treeTakeDepth n (Node x left right) = let
          nl = treeTakeDepth (n-1) left
          nr = treeTakeDepth (n-1) right
          in
              Node x nl nr

看看这段程序得到了什么:

main = print $ treeTakeDepth 4 nullTree

这段代码编译, 执行, 停止, 给出了下面的结果:

<  0
: |-- 0
: |  |-- 0
: |  |  |-- 0
: |  |  `-- 0
: |  `-- 0
: |     |-- 0
: |     `-- 0
: `-- 0
:    |-- 0
:    |  |-- 0
:    |  `-- 0
:    `-- 0
:       |-- 0
:       `-- 0

打起精神, 我们再写个更有趣点的 tree:

iTree = Node 0 (dec iTree) (inc iTree)
        where
           dec (Node x l r) = Node (x-1) (dec l) (dec r)
           inc (Node x l r) = Node (x+1) (inc l) (inc r)

另一种这样产生树的方法是用高阶函数. 函数会相似于 map, 但会作用于 BinTree 而不是 list. 下面是这样一个函数:

-- apply a function to each node of Tree
treeMap :: (a -> b) -> BinTree a -> BinTree b
treeMap f Empty = Empty
treeMap f (Node x left right) = Node (f x)
                                     (treeMap f left)
                                     (treeMap f right)

提示: 关于这些我不会说太多. 如果你对生成其他数据结构的 map 感兴趣, 可以搜索 functor 和 fmap.

我们现在定义是:

infTreeTwo :: BinTree Int
infTreeTwo = Node 0 (treeMap (\x -> x-1) infTreeTwo)
                    (treeMap (\x -> x+1) infTreeTwo)

看下它的结果

main = print $ treeTakeDepth 4 infTreeTwo
<  0
: |-- -1
: |  |-- -2
: |  |  |-- -3
: |  |  `-- -1
: |  `-- 0
: |     |-- -1
: |     `-- 1
: `-- 1
:    |-- 0
:    |  |-- -1
:    |  `-- 1
:    `-- 2
:       |-- 1
:       `-- 3
main = do
  print $ treeTakeDepth 4 nullTree
  print $ treeTakeDepth 4 infTreeTwo

02_Hard_Part/41_Infinites_Structures.lhs

超难的地方

祝贺你看了这么远! 现在一些真的难啃的东西可以开始了

如果你像我这样, 你应该已经懂函数式的风格了. 你应该也多懂了一点默认采用惰性计算的优势. 但你也还没弄明白怎么完成一个真正的程序. 其中特别是:

  • 你怎么处理副作用?
  • 为什么还有一套命令式的写法的来处理 IO?

准备一下, 答案可能比较复杂. 但是答案能带来很多的收获.


03_Hell/01_IO/01_progressive_io_example.lhs

处理 IO

Magritte, Carte blanche

tl;dr:

一个典型的处理 IO 的函数看起来很像是一个命令式的程序:

f :: IO a
f = do
  x <- action1
  action2 x
  y <- action3
  action4 x y
  • 我们用 <- 把值设置到对象.
  • 每一行的类型是 IO *; 这个例子里:
    • action1 :: IO b
    • action2 x :: IO ()
    • action3 :: IO c
    • action4 x y :: IO a
    • x :: b, y :: c
  • 很少的对象的类型是 IO a, 这可以帮你做选择. 特别地, 这里不能直接用纯函数代码. 比如对应纯函数你需要写 action2 (purefunction x).

这个章节里, 我将解释怎样使用 IO, 但不是 IO 怎样工作. 你可以看到 Haskell 是怎样把纯函数和不纯的程序分离开的.

别停下, 下面你会理解语法的细节. 答案会在下面的章节.

获取什么?

让一个用户输入数字的列表. 打印所有数字的和

toList :: String -> [Integer]
toList input = read ("[" ++ input ++ "]")

main = do
  putStrLn "Enter a list of numbers (separated by comma):"
  input <- getLine
  print $ sum (toList input)

这个程序里理解其行为应该比较直接. 让我们从细节理解分析下类型.

putStrLn :: String -> IO ()
getLine  :: IO String
print    :: Show a => a -> IO ()

更有趣的是, 可以注意到每个 do 表达式有着 IO a 的类型.

main = do
  putStrLn "Enter ... " :: IO ()
  getLine               :: IO String
  print Something       :: IO ()

我们应该注意一下 <- 符号的作用.

do
 x <- something

如果有 something :: IO a 那么有 x :: a.

使用 IO 还有个重要的注意点. do block 里每一行必须是下面两种形式之一:

action1             :: IO a
                    -- in this case, generally a = ()

或者

value <- action2    -- where
                    -- bar z t :: IO b
                    -- value   :: b

这两种写法将对应两种将操作串行的方法. 这一点将在下一个章节结束前会更加清晰.

03_Hell/01_IO/01_progressive_io_example.lhs


03_Hell/01_IO/02_progressive_io_example.lhs

现在来看看这个程序怎么跑. 比如, 用户遇到奇怪的事情会怎么样? 试一下:

    % runghc 02_progressive_io_example.lhs
    Enter a list of numbers (separated by comma):
    foo
    Prelude.read: no parse

哎呀! 一个不幸的错误, 程序运行出错! 第一个可以改进的地方是用一条友好的信息来提示.

为了给出提示, 首先要探测到错误. 有这样的办法去做. 使用 Maybe 类型. 这是 Haskell 里很普通的类型.

import Data.Maybe

这是什么东西? Maybe 是一个类型, 可以接收一个类型. 它的定义是:

data Maybe a = Nothing | Just a

This is a nice way to tell there was an error while trying to create/compute a value. The maybeRead function is a great example of this. This is a function similar to the function read5, but if something goes wrong the returned value is Nothing. If the value is right, it returns Just <the value>. Don’t try to understand too much of this function. I use a lower level function than read; reads.

maybeRead :: Read a => String -> Maybe a
maybeRead s = case reads s of
                  [(x,"")]    -> Just x
                  _           -> Nothing

Now to be a bit more readable, we define a function which goes like this: If the string has the wrong format, it will return Nothing. Otherwise, for example for “1,2,3”, it will return Just [1,2,3].

getListFromString :: String -> Maybe [Integer]
getListFromString str = maybeRead $ "[" ++ str ++ "]"

We simply have to test the value in our main function.

main :: IO ()
main = do
  putStrLn "Enter a list of numbers (separated by comma):"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> print (sum l)
          Nothing -> error "Bad format. Good Bye."

In case of error, we display a nice error message.

Note that the type of each expression in the main’s do block remains of the form IO a. The only strange construction is error. I’ll say error msg will simply take the needed type (here IO ()).

One very important thing to note is the type of all the functions defined so far. There is only one function which contains IO in its type: main. This means main is impure. But main uses getListFromString which is pure. It is then clear just by looking at declared types which functions are pure and which are impure.

Why does purity matter? I certainly forget many advantages, but the three main reasons are:

  • It is far easier to think about pure code than impure one.
  • Purity protects you from all the hard to reproduce bugs due to side effects.
  • You can evaluate pure functions in any order or in parallel without risk.

This is why you should generally put as most code as possible inside pure functions.

03_Hell/01_IO/02_progressive_io_example.lhs


03_Hell/01_IO/03_progressive_io_example.lhs

Our next evolution will be to prompt the user again and again until she enters a valid answer.

We keep the first part:

import Data.Maybe

maybeRead :: Read a => String -> Maybe a
maybeRead s = case reads s of
                  [(x,"")]    -> Just x
                  _           -> Nothing
getListFromString :: String -> Maybe [Integer]
getListFromString str = maybeRead $ "[" ++ str ++ "]"

Now, we create a function which will ask the user for an list of integers until the input is right.

askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers (separated by comma):"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> return l
          Nothing -> askUser

This function is of type IO [Integer]. Such a type means that we retrieved a value of type [Integer] through some IO actions. Some people might explain while waving their hands:

«This is an [Integer] inside an IO»

If you want to understand the details behind all of this, you’ll have to read the next section. But sincerely, if you just want to use IO. Just practice a little and remember to think about the type.

Finally our main function is quite simpler:

main :: IO ()
main = do
  list <- askUser
  print $ sum list

We have finished with our introduction to IO. This was quite fast. Here are the main things to remember:

  • in the do bloc, each expression must have the type IO a. You are then limited in the number of expressions available. For example, getLine, print, putStrLn, etc…
  • Try to externalize the pure functions as much as possible.
  • the IO a type means: an IO action which returns an element of type a. IO represents actions; under the hood, IO a is the type of a function. Read the next section if you are curious.

If you practice a bit, you should be able to use IO.

Exercises:

  • Make a program that sums all of its arguments. Hint: use the function getArgs.

03_Hell/01_IO/03_progressive_io_example.lhs

IO 手法解析

Magritte, ceci n'est pas une pipe

Here is a tl;dr: for this section.

To separate pure and impure parts, main is defined as a function which modifies the state of the world

main :: World -> World

A function is guaranteed to have side effects only if it has this type. But look at a typical main function:

main w0 =
    let (v1,w1) = action1 w0 in
    let (v2,w2) = action2 v1 w1 in
    let (v3,w3) = action3 v2 w2 in
    action4 v3 w3

We have a lot of temporary elements (here w1, w2 and w3) which must be passed on to the next action.

We create a function bind or (>>=). With bind we don’t need temporary names anymore.

main =
  action1 >>= action2 >>= action3 >>= action4

Bonus: Haskell has syntactical sugar for us:

main = do
  v1 <- action1
  v2 <- action2 v1
  v3 <- action3 v2
  action4 v3

Why did we use this strange syntax, and what exactly is this IO type? It looks a bit like magic.

For now let’s just forget all about the pure parts of our program, and focus on the impure parts:

askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers (separated by commas):"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> return l
          Nothing -> askUser

main :: IO ()
main = do
  list <- askUser
  print $ sum list

First remark; it looks like an imperative structure. Haskell is powerful enough to make impure code look imperative. For example, if you wish you could create a while in Haskell. In fact, for dealing with IO, imperative style is generally more appropriate.

But you should had noticed the notation is a bit unusual. Here is why, in detail.

In an impure language, the state of the world can be seen as a huge hidden global variable. This hidden variable is accessible by all functions of your language. For example, you can read and write a file in any function. The fact that a file exists or not can be seen as different states of the world.

For Haskell this state is not hidden. It is explicitly said main is a function that potentially changes the state of the world. Its type is then something like:

main :: World -> World

Not all functions may have access to this variable. Those which have access to this variable are impure. Functions to which the world variable isn’t provided are pure6.

Haskell considers the state of the world as an input variable to main. But the real type of main is closer to this one7:

main :: World -> ((),World)

The () type is the null type. Nothing to see here.

Now let’s rewrite our main function with this in mind:

main w0 =
    let (list,w1) = askUser w0 in
    let (x,w2) = print (sum list,w1) in
    x

First, we note that all functions which have side effects must have the type:

World -> (a,World)

Where a is the type of the result. For example, a getChar function should have the type World -> (Char,World).

Another thing to note is the trick to fix the order of evaluation. In Haskell, in order to evaluate f a b, you have many choices:

  • first eval a then b then f a b
  • first eval b then a then f a b.
  • eval a and b in parallel then f a b

This is true, because we should work in a pure language.

Now, if you look at the main function, it is clear you must eval the first line before the second one since, to evaluate the second line you have to get a parameter given by the evaluation of the first line.

Such trick works nicely. The compiler will at each step provide a pointer to a new real world id. Under the hood, print will evaluate as:

  • print something on the screen
  • modify the id of the world
  • evaluate as ((),new world id).

Now, if you look at the style of the main function, it is clearly awkward. Let’s try to do the same to the askUser function:

askUser :: World -> ([Integer],World)

Before:

askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers:"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> return l
          Nothing -> askUser

After:

askUser w0 =
    let (_,w1)     = putStrLn "Enter a list of numbers:" in
    let (input,w2) = getLine w1 in
    let (l,w3)     = case getListFromString input of
                      Just l   -> (l,w2)
                      Nothing  -> askUser w2
    in
        (l,w3)

This is similar, but awkward. Look at all these temporary w? names.

The lesson, is, naive IO implementation in Pure functional languages is awkward!

Fortunately, there is a better way to handle this problem. We see a pattern. Each line is of the form:

let (y,w') = action x w in

Even if for some line the first x argument isn’t needed. The output type is a couple, (answer, newWorldValue). Each function f must have a type similar to:

f :: World -> (a,World)

Not only this, but we can also note that we always follow the same usage pattern:

let (y,w1) = action1 w0 in
let (z,w2) = action2 w1 in
let (t,w3) = action3 w2 in
...

Each action can take from 0 to n parameters. And in particular, each action can take a parameter from the result of a line above.

For example, we could also have:

let (_,w1) = action1 x w0   in
let (z,w2) = action2 w1     in
let (_,w3) = action3 x z w2 in
...

And of course actionN w :: (World) -> (a,World).

IMPORTANT, there are only two important patterns to consider:

let (x,w1) = action1 w0 in
let (y,w2) = action2 x w1 in

and

let (_,w1) = action1 w0 in
let (y,w2) = action2 w1 in

Jocker pencil trick

Now, we will do a magic trick. We will make the temporary world symbol “disappear”. We will bind the two lines. Let’s define the bind function. Its type is quite intimidating at first:

bind :: (World -> (a,World))
        -> (a -> (World -> (b,World)))
        -> (World -> (b,World))

But remember that (World -> (a,World)) is the type for an IO action. Now let’s rename it for clarity:

type IO a = World -> (a, World)

Some example of functions:

getLine :: IO String
print :: Show a => a -> IO ()

getLine is an IO action which takes a world as parameter and returns a couple (String,World). Which can be summarized as: getLine is of type IO String. Which we also see as, an IO action which will return a String “embeded inside an IO”.

The function print is also interesting. It takes one argument which can be shown. In fact it takes two arguments. The first is the value to print and the other is the state of world. It then returns a couple of type ((),World). This means it changes the state of the world, but doesn’t yield anymore data.

This type helps us simplify the type of bind:

bind :: IO a
        -> (a -> IO b)
        -> IO b

It says that bind takes two IO actions as parameter and return another IO action.

Now, remember the important patterns. The first was:

let (x,w1) = action1 w0 in
let (y,w2) = action2 x w1 in
(y,w2)

Look at the types:

action1  :: IO a
action2  :: a -> IO b
(y,w2)   :: IO b

Doesn’t it seem familiar?

(bind action1 action2) w0 =
    let (x, w1) = action1 w0
        (y, w2) = action2 x w1
    in  (y, w2)

The idea is to hide the World argument with this function. Let’s go: As an example imagine if we wanted to simulate:

let (line1,w1) = getLine w0 in
let ((),w2) = print line1 in
((),w2)

Now, using the bind function:

(res,w2) = (bind getLine (\l -> print l)) w0

As print is of type (World → ((),World)), we know res = () (null type). If you didn’t see what was magic here, let’s try with three lines this time.

let (line1,w1) = getLine w0 in
let (line2,w2) = getLine w1 in
let ((),w3) = print (line1 ++ line2) in
((),w3)

Which is equivalent to:

(res,w3) = bind getLine (\line1 ->
             bind getLine (\line2 ->
               print (line1 ++ line2)))

Didn’t you notice something? Yes, no temporary World variables are used anywhere! This is MA. GIC.

We can use a better notation. Let’s use (>>=) instead of bind. (>>=) is an infix function like (+); reminder 3 + 4 ⇔ (+) 3 4

(res,w3) = getLine >>=
           \line1 -> getLine >>=
           \line2 -> print (line1 ++ line2)

Ho Ho Ho! Happy Christmas Everyone! Haskell has made syntactical sugar for us:

do
  x <- action1
  y <- action2
  z <- action3
  ...

Is replaced by:

action1 >>= \x ->
action2 >>= \y ->
action3 >>= \z ->
...

Note you can use x in action2 and x and y in action3.

But what about the lines not using the <-? Easy, another function blindBind:

blindBind :: IO a -> IO b -> IO b
blindBind action1 action2 w0 =
    bind action (\_ -> action2) w0

I didn’t simplify this definition for clarity purpose. Of course we can use a better notation, we’ll use the (>>) operator.

And

do
    action1
    action2
    action3

Is transformed into

action1 >>
action2 >>
action3

Also, another function is quite useful.

putInIO :: a -> IO a
putInIO x = IO (\w -> (x,w))

This is the general way to put pure values inside the “IO context”. The general name for putInIO is return. This is quite a bad name when you learn Haskell. return is very different from what you might be used to.


03_Hell/01_IO/21_Detailled_IO.lhs

To finish, let’s translate our example:


askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers (separated by commas):"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> return l
          Nothing -> askUser

main :: IO ()
main = do
  list <- askUser
  print $ sum list

Is translated into:

import Data.Maybe

maybeRead :: Read a => String -> Maybe a
maybeRead s = case reads s of
                  [(x,"")]    -> Just x
                  _           -> Nothing
getListFromString :: String -> Maybe [Integer]
getListFromString str = maybeRead $ "[" ++ str ++ "]"
askUser :: IO [Integer]
askUser =
    putStrLn "Enter a list of numbers (sep. by commas):" >>
    getLine >>= \input ->
    let maybeList = getListFromString input in
      case maybeList of
        Just l -> return l
        Nothing -> askUser

main :: IO ()
main = askUser >>=
  \list -> print $ sum list

You can compile this code to verify it keeps working.

Imagine what it would look like without the (>>) and (>>=).

03_Hell/01_IO/21_Detailled_IO.lhs


03_Hell/02_Monads/10_Monads.lhs

Monads

Dali, reve. It represents a weapon out of the mouth of a tiger, itself out of the mouth of another tiger, itself out of the mouth of a fish itself out of a grenade. I could have choosen a picture of the Human centipede as it is a very good representation of what a monad really is. But just to thing about it, I find this disgusting and that wasn't the purpose of this document.

Now the secret can be revealed: IO is a monad. Being a monad means you have access to some syntactical sugar with the do notation. But mainly, you have access to a coding pattern which will ease the flow of your code.

Important remarks:

  • Monad are not necessarily about effects! There are a lot of pure monads.
  • Monad are more about sequencing

For the Haskell language Monad is a type class. To be an instance of this type class, you must provide the functions (>>=) and return. The function (>>) will be derived from (>>=). Here is how the type class Monad is declared (mostly):

class Monad m  where
  (>>=) :: m a -> (a -> m b) -> m b
  return :: a -> m a

  (>>) :: m a -> m b -> m b
  f >> g = f >>= \_ -> g

  -- You should generally safely ignore this function
  -- which I believe exists for historical reason
  fail :: String -> m a
  fail = error

Remarks:

  • the keyword class is not your friend. A Haskell class is not a class like in object model. A Haskell class has a lot of similarities with Java interfaces. A better word should have been typeclass. That means a set of types. For a type to belong to a class, all functions of the class must be provided for this type.
  • In this particular example of type class, the type m must be a type that takes an argument. for example IO a, but also Maybe a, [a], etc…
  • To be a useful monad, your function must obey some rules. If your construction does not obey these rules strange things might happens:

    return a >>= k  ==  k a
    m >>= return  ==  m
    m >>= (\x -> k x >>= h)  ==  (m >>= k) >>= h
    

Maybe 是 monad

There are a lot of different types that are instance of Monad. One of the easiest to describe is Maybe. If you have a sequence of Maybe values, you can use monads to manipulate them. It is particularly useful to remove very deep if..then..else.. constructions.

Imagine a complex bank operation. You are eligible to gain about 700€ only if you can afford to follow a list of operations without being negative.

deposit  value account = account + value
withdraw value account = account - value

eligible :: (Num a,Ord a) => a -> Bool
eligible account =
  let account1 = deposit 100 account in
    if (account1 < 0)
    then False
    else
      let account2 = withdraw 200 account1 in
      if (account2 < 0)
      then False
      else
        let account3 = deposit 100 account2 in
        if (account3 < 0)
        then False
        else
          let account4 = withdraw 300 account3 in
          if (account4 < 0)
          then False
          else
            let account5 = deposit 1000 account4 in
            if (account5 < 0)
            then False
            else
              True

main = do
  print $ eligible 300 -- True
  print $ eligible 299 -- False

03_Hell/02_Monads/10_Monads.lhs


03_Hell/02_Monads/11_Monads.lhs

Now, let’s make it better using Maybe and the fact that it is a Monad

deposit :: (Num a) => a -> a -> Maybe a
deposit value account = Just (account + value)

withdraw :: (Num a,Ord a) => a -> a -> Maybe a
withdraw value account = if (account < value)
                         then Nothing
                         else Just (account - value)

eligible :: (Num a, Ord a) => a -> Maybe Bool
eligible account = do
  account1 <- deposit 100 account
  account2 <- withdraw 200 account1
  account3 <- deposit 100 account2
  account4 <- withdraw 300 account3
  account5 <- deposit 1000 account4
  Just True

main = do
  print $ eligible 300 -- Just True
  print $ eligible 299 -- Nothing

03_Hell/02_Monads/11_Monads.lhs


03_Hell/02_Monads/12_Monads.lhs

Not bad, but we can make it even better:

deposit :: (Num a) => a -> a -> Maybe a
deposit value account = Just (account + value)

withdraw :: (Num a,Ord a) => a -> a -> Maybe a
withdraw value account = if (account < value)
                         then Nothing
                         else Just (account - value)

eligible :: (Num a, Ord a) => a -> Maybe Bool
eligible account =
  deposit 100 account >>=
  withdraw 200 >>=
  deposit 100  >>=
  withdraw 300 >>=
  deposit 1000 >>
  return True

main = do
  print $ eligible 300 -- Just True
  print $ eligible 299 -- Nothing

We have proven that Monads are a good way to make our code more elegant. Note this idea of code organization, in particular for Maybe can be used in most imperative language. In fact, this is the kind of construction we make naturally.

An important remark:

The first element in the sequence being evaluated to Nothing will stop the complete evaluation. This means you don’t execute all lines. You have this for free, thanks to laziness.

The Maybe monad proved to be useful while being a very simple example. We saw the utility of the IO monad. But now a cooler example, lists.

03_Hell/02_Monads/12_Monads.lhs


03_Hell/02_Monads/13_Monads.lhs

List monad

Golconde de Magritte

The list monad helps us to simulate non deterministic computations. Here we go:

import Control.Monad (guard)

allCases = [1..10]

resolve :: [(Int,Int,Int)]
resolve = do
              x <- allCases
              y <- allCases
              z <- allCases
              guard $ 4*x + 2*y < z
              return (x,y,z)

main = do
  print resolve

MA. GIC. :

[(1,1,7),(1,1,8),(1,1,9),(1,1,10),(1,2,9),(1,2,10)]

For the list monad, there is also a syntactical sugar:

  print $ [ (x,y,z) | x <- allCases,
                      y <- allCases,
                      z <- allCases,
                      4*x + 2*y < z ]

I won’t list all the monads, but there are many monads. Using monads simplifies the manipulation of several notions in pure languages. In particular, monad are very useful for:

  • IO,
  • non deterministic computation,
  • generating pseudo random numbers,
  • keeping configuration state,
  • writing state,

If you have followed me until here, then you’ve done it! You know monads8!

03_Hell/02_Monads/13_Monads.lhs

附录

This section is not so much about learning Haskell. It is just here to discuss some details further.


04_Appendice/01_More_on_infinite_trees/10_Infinite_Trees.lhs

更多关于 Infinite Tree

In the section Infinite Structures we saw some simple constructions. Unfortunately we removed two properties from our tree:

  1. no duplicate node value
  2. well ordered tree

In this section we will try to keep the first property. Concerning the second one, we must relax it but we’ll discuss how to keep it as much as possible.

This code is mostly the same as the one in the [tree section](#trees).
import Data.List
data BinTree a = Empty 
                 | Node a (BinTree a) (BinTree a)
                  deriving (Eq,Ord)

-- declare BinTree a to be an instance of Show
instance (Show a) => Show (BinTree a) where
  -- will start by a '<' before the root
  -- and put a : a begining of line
  show t = "< " ++ replace '\n' "\n: " (treeshow "" t)
    where
    treeshow pref Empty = ""
    treeshow pref (Node x Empty Empty) =
                  (pshow pref x)

    treeshow pref (Node x left Empty) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " left)

    treeshow pref (Node x Empty right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " right)

    treeshow pref (Node x left right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "|--" "|  " left) ++ "\n" ++
                  (showSon pref "`--" "   " right)

    -- show a tree using some prefixes to make it nice
    showSon pref before next t =
                  pref ++ before ++ treeshow (pref ++ next) t

    -- pshow replace "\n" by "\n"++pref
    pshow pref x = replace '\n' ("\n"++pref) (show x)

    -- replace on char by another string
    replace c new string =
      concatMap (change c new) string
      where
          change c new x
              | x == c = new
              | otherwise = x:[] -- "x"

Our first step is to create some pseudo-random number list:

shuffle = map (\x -> (x*3123) `mod` 4331) [1..]

Just as a reminder, here is the definition of treeFromList

treeFromList :: (Ord a) => [a] -> BinTree a
treeFromList []    = Empty
treeFromList (x:xs) = Node x (treeFromList (filter (<x) xs))
                             (treeFromList (filter (>x) xs))

and treeTakeDepth:

treeTakeDepth _ Empty = Empty
treeTakeDepth 0 _     = Empty
treeTakeDepth n (Node x left right) = let
          nl = treeTakeDepth (n-1) left
          nr = treeTakeDepth (n-1) right
          in
              Node x nl nr

See the result of:

main = do
      putStrLn "take 10 shuffle"
      print $ take 10 shuffle
      putStrLn "\ntreeTakeDepth 4 (treeFromList shuffle)"
      print $ treeTakeDepth 4 (treeFromList shuffle)
% runghc 02_Hard_Part/41_Infinites_Structures.lhs
take 10 shuffle
[3123,1915,707,3830,2622,1414,206,3329,2121,913]
treeTakeDepth 4 (treeFromList shuffle)

< 3123
: |--1915
: |  |--707
: |  |  |--206
: |  |  `--1414
: |  `--2622
: |     |--2121
: |     `--2828
: `--3830
:    |--3329
:    |  |--3240
:    |  `--3535
:    `--4036
:       |--3947
:       `--4242

Yay! It ends! Beware though, it will only work if you always have something to put into a branch.

For example

treeTakeDepth 4 (treeFromList [1..])

will loop forever. Simply because it will try to access the head of filter (<1) [2..]. But filter is not smart enought to understand that the result is the empty list.

Nonetheless, it is still a very cool example of what non strict programs have to offer.

Left as an exercise to the reader:

  • Prove the existence of a number n so that treeTakeDepth n (treeFromList shuffle) will enter an infinite loop.
  • Find an upper bound for n.
  • Prove there is no shuffle list so that, for any depth, the program ends.

04_Appendice/01_More_on_infinite_trees/10_Infinite_Trees.lhs


04_Appendice/01_More_on_infinite_trees/11_Infinite_Trees.lhs

This code is mostly the same as the preceding one.
import Debug.Trace (trace)
import Data.List
data BinTree a = Empty 
                 | Node a (BinTree a) (BinTree a)
                  deriving (Eq,Ord)
-- declare BinTree a to be an instance of Show
instance (Show a) => Show (BinTree a) where
  -- will start by a '<' before the root
  -- and put a : a begining of line
  show t = "< " ++ replace '\n' "\n: " (treeshow "" t)
    where
    treeshow pref Empty = ""
    treeshow pref (Node x Empty Empty) =
                  (pshow pref x)

    treeshow pref (Node x left Empty) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " left)

    treeshow pref (Node x Empty right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " right)

    treeshow pref (Node x left right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "|--" "|  " left) ++ "\n" ++
                  (showSon pref "`--" "   " right)

    -- show a tree using some prefixes to make it nice
    showSon pref before next t =
                  pref ++ before ++ treeshow (pref ++ next) t

    -- pshow replace "\n" by "\n"++pref
    pshow pref x = replace '\n' ("\n"++pref) (" " ++ show x)

    -- replace on char by another string
    replace c new string =
      concatMap (change c new) string
      where
          change c new x
              | x == c = new
              | otherwise = x:[] -- "x"

treeTakeDepth _ Empty = Empty
treeTakeDepth 0 _     = Empty
treeTakeDepth n (Node x left right) = let
          nl = treeTakeDepth (n-1) left
          nr = treeTakeDepth (n-1) right
          in
              Node x nl nr

In order to resolve these problem we will modify slightly our treeFromList and shuffle function.

A first problem, is the lack of infinite different number in our implementation of shuffle. We generated only 4331 different numbers. To resolve this we make a slightly better shuffle function.

shuffle = map rand [1..]
          where
              rand x = ((p x) `mod` (x+c)) - ((x+c) `div` 2)
              p x = m*x^2 + n*x + o -- some polynome
              m = 3123
              n = 31
              o = 7641
              c = 1237

This shuffle function has the property (hopefully) not to have an upper nor lower bound. But having a better shuffle list isn’t enough not to enter an infinite loop.

Generally, we cannot decide whether filter (<x) xs is empty. Then to resolve this problem, I’ll authorize some error in the creation of our binary tree. This new version of code can create binary tree which don’t have the following property for some of its nodes:

Any element of the left (resp. right) branch must all be strictly inferior (resp. superior) to the label of the root.

Remark it will remains mostly an ordered binary tree. Furthermore, by construction, each node value is unique in the tree.

Here is our new version of treeFromList. We simply have replaced filter by safefilter.

treeFromList :: (Ord a, Show a) => [a] -> BinTree a
treeFromList []    = Empty
treeFromList (x:xs) = Node x left right
          where
              left = treeFromList $ safefilter (<x) xs
              right = treeFromList $ safefilter (>x) xs

This new function safefilter is almost equivalent to filter but don’t enter infinite loop if the result is a finite list. If it cannot find an element for which the test is true after 10000 consecutive steps, then it considers to be the end of the search.

safefilter :: (a -> Bool) -> [a] -> [a]
safefilter f l = safefilter' f l nbTry
  where
      nbTry = 10000
      safefilter' _ _ 0 = []
      safefilter' _ [] _ = []
      safefilter' f (x:xs) n =
                  if f x
                     then x : safefilter' f xs nbTry
                     else safefilter' f xs (n-1)

Now run the program and be happy:

main = do
      putStrLn "take 10 shuffle"
      print $ take 10 shuffle
      putStrLn "\ntreeTakeDepth 8 (treeFromList shuffle)"
      print $ treeTakeDepth 8 (treeFromList $ shuffle)

You should realize the time to print each value is different. This is because Haskell compute each value when it needs it. And in this case, this is when asked to print it on the screen.

Impressively enough, try to replace the depth from 8 to 100. It will work without killing your RAM! The flow and the memory management is done naturally by Haskell.

Left as an exercise to the reader:

  • Even with large constant value for deep and nbTry, it seems to work nicely. But in the worst case, it can be exponential. Create a worst case list to give as parameter to treeFromList.
    hint: think about ([0,-1,-1,....,-1,1,-1,...,-1,1,...]).
  • I first tried to implement safefilter as follow:
    safefilter' f l = if filter f (take 10000 l) == []
                      then []
                      else filter f l
    

    Explain why it doesn’t work and can enter into an infinite loop.

  • Suppose that shuffle is real random list with growing bounds. If you study a bit this structure, you’ll discover that with probability 1, this structure is finite. Using the following code (suppose we could use safefilter' directly as if was not in the where of safefilter) find a definition of f such that with probability 1, treeFromList’ shuffle is infinite. And prove it. Disclaimer, this is only a conjecture.
treeFromList' []  n = Empty
treeFromList' (x:xs) n = Node x left right
    where
        left = treeFromList' (safefilter' (<x) xs (f n)
        right = treeFromList' (safefilter' (>x) xs (f n)
        f = ???

04_Appendice/01_More_on_infinite_trees/11_Infinite_Trees.lhs

Thanks

Thanks to /r/haskell and /r/programming. Your comment were most than welcome.

Particularly, I want to thank Emm a thousand times for the time he spent on correcting my English. Thank you man.


  1. Even if most recent languages try to hide them, they are present.

  2. I know I’m cheating. But I will talk about non-strict later.

  3. For the brave, a more complete explanation of pattern matching can be found here.

  4. You should remark squareEvenSum'' is more efficient that the two other versions. The order of (.) is important.

  5. Which itself is very similar to the javascript eval on a string containing JSON).

  6. There are some unsafe exceptions to this rule. But you shouldn’t see such use on a real application except maybe for debugging purpose.

  7. For the curious the real type is data IO a = IO {unIO :: State# RealWorld -> (# State# RealWorld, a #)}. All the # as to do with optimisation and I swapped the fields in my example. But mostly, the idea is exactly the same.

  8. Well, you’ll certainly need to practice a bit to get used to them and to understand when you can use them and create your own. But you already made a big step in this direction.

Follow @yogsototh
Copyright ©, Yann Esposito
Created: 02/08/2012 Modified: 06/05/2012
Entirely done with Vim and nanoc