#+title: Learn Haskell Now!
#+subtitle: A dense Haskell learning material for the brave
#+date: [2019-12-15 Sun]
#+author: Yann Esposito
#+EMAIL: yann@esposito.host
#+keywords: Haskell, programming, functional, tutorial
#+DESCRIPTION: A very dense introduction and Haskell tutorial. Brace yourself.
#+OPTIONS: auto-id:t toc:t
#+begin_notes
A very short and intense introduction to Haskell.
This is an update of my old (2012) article.
A lot of things have changed since then.
And I took the time to read it again.
#+end_notes
#+begin_quote
*Prelude*
In 2012, I really believed that every developer should learn Haskell.
This is why I wrote my old article.
This is the end of 2019 and I still strongly believe that, yes, you must at
least be able to understand enough Haskell to write a simple tool.
But a few things have changed in the Haskell world.
1. Project building now has a few working solutions. When I wrote this
   article I had a few web applications that I can no longer build today.
   I mean, if I really wanted to invest some time, I'm sure I could make
   those projects build again. But it is not worth the hassle.
   Now we have =stack=, =nix=, =cabal new-build= and I'm sure some other
   solutions.
2. GHC is able to do a lot more magic than it could then.
   This is beyond the scope of introductory material, in my opinion.
   But while the learning curve is as steep as before, the highest point of
   learning jumps higher with each new GHC release.
3. There is still no real consensus about how to work with, learn, and use
   Haskell.
   In my opinion there are three different perspectives on Haskell that
   can definitely change how you make decisions about different aspects
   of Haskell programming. I believe the main groups of ideologies are
   application developers, library developers and even language (mostly
   GHC) developers.
   I find those tensions to be proof of a healthy environment.
   There are different solutions to the same problems and that is perfectly
   fine.
   This is quite different from other language ecosystems
   where decisions are more controlled or enforced.
   I feel fine with both approaches.
   But you must understand that there is not really any central mindset
   among Haskellers, unlike what I can find in some other programming
   language communities.
4. Haskell has become a much more serious programming language.
   There are now many big projects written in Haskell, not just toy projects.
Also I myself have certainly matured in my take on Haskell.
I have been paid to work in Clojure since 2013, and most of my personal side
projects are written either in Haskell or in Purescript (a Haskell-inspired
language mostly focused on frontend development).
As such I can follow the growth and evolution of two functional programming
communities.
I am fairly confident that my understanding of Haskell is a lot better than
before.
But I still think there is no end to the new Haskell subjects one can learn.
Someday I want to write an article about my team's philosophy of
programming.
Mostly, our rule is to use as few features of a programming language as
possible to achieve your goal.
This is a kind of merge between minimalism and pragmatism that in the end
provides a tremendous amount of benefits.
This is why, even if I like to try the latest trend/hype in Haskell,
I generally program without those latest nice features because, with just a
very small number of Haskell features, you will already be in an environment
with a *lot* of benefits compared to other programming language ecosystems.
So enough talk, here is my old article new again, with just a few changes
and cleanup.
Also, I will try to go a bit further than before.
By the end of this article you should be autonomous if you want to create a
new product in Haskell.
Be it a simple command line tool or a web application.
If you are going toward GUI programming, this is a whole subject on its own
and I do not really mention it.
My .02 for "Single Page Application" is to use Purescript with the halogen
framework.
Purescript is really awesome as well as halogen.
#+end_quote
I really believe that every developer should learn Haskell.
I don't think every dev needs to be a super Haskell ninja, but they should
at least discover what Haskell has to offer.
Learning Haskell opens your mind.
Mainstream languages share the same foundations:
- variables
- loops
- pointers[fn:1]
- data structures, objects and classes (for most)
Haskell is very different.
The language uses a lot of concepts I had never heard about before.
Many of those concepts will help you become a better programmer.
But learning Haskell can be (and will certainly be) hard.
It was for me.
In this article I try to provide as much help as possible to accelerate
your learning.
This article will certainly be hard to follow.
This is on purpose.
There is no shortcut to learning Haskell.
It is hard and challenging.
But I believe this is a good thing.
It is because it is hard that Haskell is interesting and rewarding.
Today, I could not really provide a conventional path to learn Haskell.
So I think the best I can do is point you to the [[https://www.haskell.org/documentation/][haskell.org]] documentation
website.
And you will see that most paths involve quite a long learning process.
By that, I mean that you should read a long book and invest a lot of hours
and certainly days before having a good idea about what Haskell is all about.
In contrast, this article is a very brief and dense overview of all
major aspects of Haskell.
I also added some information I lacked while I learned Haskell.
The article contains five parts:
- Introduction: a short example to show Haskell can be friendly.
- Basic Haskell: Haskell syntax, and some essential notions.
- Normal Difficulty Part:
  - Functional style; a progressive example, from imperative to
    functional style
  - Types; types and a standard binary tree example
  - Infinite Structure; manipulate an infinite binary tree!
- Nightmare Difficulty Part:
  - Deal with IO; A very minimal example
  - IO trick explained; the hidden detail I lacked to understand IO
  - Monads; incredible how we can generalize
- Hell Difficulty Part:
  - Write a real world command line application
  - Write a real world full featured REST API
- Appendix:
  - More on infinite tree; a more math oriented discussion about
    infinite trees
* Introduction
:PROPERTIES:
:CUSTOM_ID: introduction
:END:
** Install
:PROPERTIES:
:CUSTOM_ID: install
:END:
#+CAPTION: Haskell logo
[[./Haskell-logo.png]]
There are multiple ways to install Haskell and I don't think there is a full
consensus among developers about which method is best.
For this tutorial, I expect you to have installed either the [[https://nixos.org/nix][nix]] package manager
or [[https://haskellstack.org][=stack=]].
With either method I can give you a shebang prefix to create
self-executing scripts that use the Haskell compiler I expect, so hopefully
all the code examples will still work for a _very_ long time.
There are other ways to install Haskell on your system;
you can learn more about them by visiting [[https://haskell.org][haskell.org]].
The environment in which you will learn Haskell will be quite different
from an environment suited to using Haskell seriously for a new project.
This is because there are too many choices for the latter.
Mainly, you can start by writing your code in a file and executing it by
putting one of the following at the top of your file:
If you chose Nix: https://nixos.org/nix/
#+BEGIN_EXAMPLE
#! /usr/bin/env nix-shell
#! nix-shell -i runghc
#! nix-shell -p "ghc.withPackages (ps: [ ps.protolude ])"
#! nix-shell -I nixpkgs="https://github.com/NixOS/nixpkgs/archive/19.09.tar.gz"
#+END_EXAMPLE
If you chose Stack: https://haskellstack.org
#+BEGIN_EXAMPLE haskell
#!/usr/bin/env stack
{- stack script
--resolver lts-14.16
--install-ghc
--package protolude
-}
#+END_EXAMPLE
*** code :noexport:
:PROPERTIES:
:CUSTOM_ID: code
:END:
#+name: nixb
#+begin_src elisp :exports none
(defun nixb ()
  "#! /usr/bin/env nix-shell\n#! nix-shell -i runghc\n#! nix-shell -p \"ghc.withPackages (ps: [ ps.protolude ])\"\n#! nix-shell -I nixpkgs=\"https://github.com/NixOS/nixpkgs/archive/19.09.tar.gz\"")
#+end_src
#+RESULTS:
: nixb
** Don't be afraid
:PROPERTIES:
:CUSTOM_ID: don't-be-afraid
:END:
#+CAPTION: The Scream
[[./munch_TheScream.jpg]]
Many books/articles about Haskell start by introducing some esoteric
formula (quick sort, Fibonacci, etc...). I will do the exact opposite.
At first I won't show you any Haskell super power. I will start with
similarities between Haskell and other programming languages. Let's jump
to the mandatory "Hello World".
[[./hello.hs]]
#+NAME: hello.hs
#+BEGIN_SRC haskell :tangle hello.hs :shebang (nixb)
main = putStrLn "Hello World!"
#+END_SRC
#+BEGIN_EXAMPLE
> chmod +x hello.hs
> ./hello.hs
Hello World!
#+END_EXAMPLE
#+BEGIN_EXAMPLE
> stack ghc -- hello.hs
> ./hello
Hello World!
#+END_EXAMPLE
Now, a program asking your name and replying "Hello" using the name you
entered:
#+NAME: name.hs
#+BEGIN_SRC haskell :tangle name.hs :shebang (nixb)
main = do
  print "What is your name?"
  name <- getLine
  print ("Hello " ++ name ++ "!")
#+END_SRC
First, let us compare this with similar programs in a few imperative
languages:
#+BEGIN_SRC python
# Python
print("What is your name?")
name = input()
print("Hello %s!" % name)
#+END_SRC
#+BEGIN_SRC ruby
# Ruby
puts "What is your name?"
name = gets.chomp
puts "Hello #{name}!"
#+END_SRC
#+BEGIN_SRC C
// In C
#include <stdio.h>
int main (int argc, char **argv) {
    char name[666]; // <- An Evil Number!
    // What if my name is more than 665 characters long?
    printf("What is your name?\n");
    scanf("%s", name);
    printf("Hello %s!\n", name);
    return 0;
}
#+END_SRC
The structure is the same, but there are some syntax differences. The
main part of this tutorial will be dedicated to explaining why.
In Haskell there is a =main= function and every object has a type. The
type of =main= is =IO ()=. This means =main= will cause side effects.
Just remember that Haskell can look a lot like mainstream imperative
languages.
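To make the remark about types concrete, here is a minimal sketch of the same greeting program with explicit type signatures. The helper name =greet= is my own, hypothetical choice: it is a pure function, while only =main= has the type =IO ()=. I also use =putStrLn= here because, unlike =print=, it writes a string without the surrounding quotes.

#+BEGIN_SRC haskell
-- greet is a hypothetical helper, introduced only for this illustration.
-- It is pure: it builds a String and does nothing else.
greet :: String -> String
greet name = "Hello " ++ name ++ "!"

-- main has type IO (): this is where the side effects happen.
main :: IO ()
main = do
  putStrLn "What is your name?"
  name <- getLine
  putStrLn (greet name)
#+END_SRC

Separating the pure part (=greet=) from the effectful part (=main=) is a habit that pays off quickly, because the pure part can be tested without any input or output.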
-----
** Very basic Haskell
:PROPERTIES:
:CUSTOM_ID: very-basic-haskell
:END:
#+CAPTION: Picasso minimal owl
[[./picasso_owl.jpg]]
Before continuing you need to be warned about some essential properties
of Haskell.
/Functional/
Haskell is a functional language. If you have an imperative language
background, you'll have to learn a lot of new things. Hopefully many of
these new concepts will help you to program even in imperative
languages.
/Smart Static Typing/
Instead of being in your way like in =C=, =C++= or =Java=, the type
system is here to help you.
/Purity/
Generally your functions won't modify anything in the outside world.
This means they can't modify the value of a variable, can't get user
input, can't write on the screen, can't launch a missile. On the other
hand, parallelism will be very easy to achieve. Haskell makes it clear
where effects occur and where your code is pure. Also, it will be far
easier to reason about your program. Most bugs will be prevented in the
pure parts of your program.
Furthermore, pure functions follow a fundamental law in Haskell:
#+BEGIN_QUOTE
Applying a function with the same parameters always returns the same value.
#+END_QUOTE
/Laziness/
Laziness by default is a very uncommon language design. By default,
Haskell evaluates something only when it is needed. In consequence, it
provides a very elegant way to manipulate infinite structures, for
example.
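As a small sketch of what laziness allows, the following defines an infinite list and only ever looks at the beginning of it. The name =evens= is mine, chosen for the illustration.

#+BEGIN_SRC haskell
-- An infinite list of even numbers; nothing is computed at this point.
evens :: [Integer]
evens = map (*2) [1..]

main :: IO ()
main = do
  -- Only the first five elements are ever evaluated.
  print (take 5 evens)
  -- takeWhile stops as soon as the predicate fails.
  print (takeWhile (< 20) evens)
#+END_SRC

In a strict language, building =evens= would never terminate. Here the program finishes because each element is computed only when =print= asks for it.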
A last warning about how you should read Haskell code. For me, it is
like reading scientific papers. Some parts are very clear, but when you
see a formula, just focus and read slower. Also, while learning Haskell,
it /really/ doesn't matter much if you don't understand syntax details.
If you meet a =>>==, =<$>=, =<-= or any other weird symbol, just ignore
it and follow the flow of the code.
*** Function declaration
:PROPERTIES:
:CUSTOM_ID: function-declaration
:END:
You might be used to declaring functions like this:
In =C=:
#+BEGIN_SRC C
int f(int x, int y) {
    return x*x + y*y;
}
#+END_SRC
In JavaScript:
#+BEGIN_SRC javascript
function f(x,y) {
    return x*x + y*y;
}
#+END_SRC
In Python:
#+BEGIN_SRC python
def f(x,y):
    return x*x + y*y
#+END_SRC
In Ruby:
#+BEGIN_SRC ruby
def f(x,y)
  x*x + y*y
end
#+END_SRC
In Scheme:
#+BEGIN_SRC scheme
(define (f x y)
  (+ (* x x) (* y y)))
#+END_SRC
Finally, the Haskell way is:
#+BEGIN_SRC haskell
f x y = x*x + y*y
#+END_SRC
Very clean. No parentheses, no =def=.
Don't forget, Haskell uses functions and types a lot. It is thus very
easy to define them. The syntax was particularly well thought out for
these objects.
*** A Type Example
:PROPERTIES:
:CUSTOM_ID: a-type-example
:END:
Although it is not mandatory, type information for functions is usually
made explicit. It's not mandatory because the compiler is smart enough
to discover it for you. It's a good idea because it indicates intent and
understanding.
Let's play a little. We declare the type using =::=
#+BEGIN_SRC haskell
f :: Int -> Int -> Int
f x y = x*x + y*y
main = print (f 2 3)
#+END_SRC
#+BEGIN_EXAMPLE
~ runhaskell 20_very_basic.lhs
13
#+END_EXAMPLE
-----
Now try
#+BEGIN_SRC haskell
f :: Int -> Int -> Int
f x y = x*x + y*y
main = print (f 2.3 4.2)
#+END_SRC
You should get this error:
#+BEGIN_EXAMPLE
21_very_basic.lhs:6:23:
No instance for (Fractional Int)
arising from the literal `4.2'
Possible fix: add an instance declaration for (Fractional Int)
In the second argument of `f', namely `4.2'
In the first argument of `print', namely `(f 2.3 4.2)'
In the expression: print (f 2.3 4.2)
#+END_EXAMPLE
The problem: =4.2= isn't an Int.
-----
The solution: don't declare a type for =f= for the moment and let
Haskell infer the most general type for us:
#+BEGIN_SRC haskell
f x y = x*x + y*y
main = print (f 2.3 4.2)
#+END_SRC
It works! Luckily, we don't have to declare a new function for every
single type. For example, in =C=, you'll have to declare a function for
=int=, for =float=, for =long=, for =double=, etc...
But, what type should we declare? To discover the type Haskell has found
for us, just launch ghci:
#+BEGIN_SRC
% ghci
GHCi, version 7.0.4: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
Prelude> let f x y = x*x + y*y
Prelude> :type f
f :: Num a => a -> a -> a
#+END_SRC
Uh? What is this strange type?
#+BEGIN_EXAMPLE
Num a => a -> a -> a
#+END_EXAMPLE
First, let's focus on the right part =a -> a -> a=. To understand it,
just look at a list of progressive examples:
| The written type | Its meaning |
|--------------------+---------------------------------------------------------------------------|
| =Int= | the type =Int= |
| =Int -> Int= | the type function from =Int= to =Int= |
| =Float -> Int= | the type function from =Float= to =Int= |
| =a -> Int= | the type function from any type to =Int= |
| =a -> a= | the type function from any type =a= to the same type =a= |
| =a -> a -> a= | the type function of two arguments of any type =a= to the same type =a= |
In the type =a -> a -> a=, the letter =a= is a /type variable/. It means
=f= is a function with two arguments and both arguments and the result
have the same type. The type variable =a= could take many different type
values. For example =Int=, =Integer=, =Float=...
So instead of having a forced type like in =C= and having to declare a
function for =int=, =long=, =float=, =double=, etc., we declare only one
function like in a dynamically typed language.
This is sometimes called parametric polymorphism. It's also called
having your cake and eating it too.
Generally =a= can be any type, for example a =String= or an =Int=, but
also more complex types, like =Trees=, other functions, etc. But here
our type is prefixed with =Num a =>=.
=Num= is a /type class/. A type class can be understood as a set of
types. =Num= contains only types which behave like numbers. More
precisely, =Num= is a class containing types which implement a specific
list of functions, and in particular =(+)= and =(*)=.
Type classes are a very powerful language construct. We can do some
incredibly powerful stuff with this. More on this later.
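Here is a short sketch showing the single definition of =f= used at several numeric types. The type annotations on the literals are only there to pick a concrete instance of =Num=.

#+BEGIN_SRC haskell
f :: Num a => a -> a -> a
f x y = x*x + y*y

main :: IO ()
main = do
  print (f (2 :: Int) 3)        -- a is Int
  print (f (2 :: Integer) 3)    -- a is Integer
  print (f (2.5 :: Double) 1)   -- a is Double
  -- f "a" "b" would be rejected at compile time:
  -- String is not an instance of Num.
#+END_SRC

The constraint =Num a= is what allows the body of =f= to use =(*)= and =(+)= without knowing the concrete type.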
Finally, =Num a => a -> a -> a= means:
Let =a= be a type belonging to the =Num= type class. This is a function
from type =a= to (=a -> a=).
Yes, strange. In fact, in Haskell no function really has two arguments.
Instead all functions have only one argument. But we will note that
taking two arguments is equivalent to taking one argument and returning
a function taking the second argument as a parameter.
More precisely =f 3 4= is equivalent to =(f 3) 4=. Note =f 3= is a
function:
#+BEGIN_SRC haskell
f :: Num a => a -> a -> a
g :: Num a => a -> a
g = f 3
g y ⇔ 3*3 + y*y
#+END_SRC
Another notation exists for functions. The lambda notation allows us to
create functions without assigning them a name. We call them anonymous
functions. We could also have written:
#+BEGIN_SRC haskell
g = \y -> 3*3 + y*y
#+END_SRC
The =\= is used because it looks like =λ= and is ASCII.
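As a runnable sketch of partial application and of the lambda notation, the two definitions below should behave identically. The name =g'= is my own; the =⇔= symbol is only meta notation, so it is replaced here by actual calls.

#+BEGIN_SRC haskell
f :: Num a => a -> a -> a
f x y = x*x + y*y

-- Partial application: g is f with its first argument fixed to 3.
g :: Num a => a -> a
g = f 3

-- The same function written as an anonymous function.
g' :: Num a => a -> a
g' = \y -> 3*3 + y*y

main :: IO ()
main = do
  print (g 4)
  print (g' 4)
  -- A partially applied function can be passed around like any value.
  print (map g [1, 2, 3])
#+END_SRC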
If you are not used to functional programming your brain should be
starting to heat up. It is time to make a real application.
-----
But just before that, we should verify the type system works as
expected:
#+BEGIN_SRC haskell
f :: Num a => a -> a -> a
f x y = x*x + y*y
main = print (f 3 2.4)
#+END_SRC
It works because =3= is a valid representation both for fractional
numbers like =Float= and for integral numbers like =Integer=. As =2.4= is
a fractional number, =3= is then also interpreted as a fractional number.
-----
If we force our function to work with different types, it will fail:
#+BEGIN_SRC haskell
f :: Num a => a -> a -> a
f x y = x*x + y*y
x :: Int
x = 3
y :: Float
y = 2.4
-- won't work because type x ≠ type y
main = print (f x y)
#+END_SRC
The compiler complains. The two parameters must have the same type.
If you believe that this is a bad idea, and that the compiler should
make the transformation from one type to another for you, you should
really watch this great (and funny) video:
[[https://www.destroyallsoftware.com/talks/wat][WAT]]
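When you really do need to combine an =Int= and a =Float=, the conversion has to be spelled out. Here is a minimal sketch using =fromIntegral= from the Prelude. The value =2.5= is my own choice, picked because it is exactly representable as a =Float=.

#+BEGIN_SRC haskell
f :: Num a => a -> a -> a
f x y = x*x + y*y

x :: Int
x = 3

y :: Float
y = 2.5  -- illustrative value, exactly representable as a Float

main :: IO ()
main =
  -- fromIntegral converts the Int to whatever numeric type is expected,
  -- here Float, so both arguments of f have the same type.
  print (f (fromIntegral x) y)
#+END_SRC

The conversion is visible in the code, so nobody has to guess which implicit coercion rule the compiler applied.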
* Essential Haskell
:PROPERTIES:
:CUSTOM_ID: essential-haskell
:END:
#+CAPTION: Kandinsky Gugg
[[./kandinsky_gugg.jpg]]
I suggest that you skim this part. Think of it as a reference. Haskell
has a lot of features. A lot of information is missing here. Come back
here if the notation feels strange.
I use the =⇔= symbol to state that two expressions are equivalent. It is
a meta notation, =⇔= does not exist in Haskell. I will also use =⇒= to
show what the return value of an expression is.
** Notations
:PROPERTIES:
:CUSTOM_ID: notations
:END:
**** Arithmetic
:PROPERTIES:
:CUSTOM_ID: arithmetic
:END:
#+BEGIN_SRC
3 + 2 * 6 / 3 ⇔ 3 + ((2*6)/3)
#+END_SRC
**** Logic
:PROPERTIES:
:CUSTOM_ID: logic
:END:
#+BEGIN_SRC
True || False ⇒ True
True && False ⇒ False
True == False ⇒ False
True /= False ⇒ True (/=) is the operator for different
#+END_SRC
**** Powers
:PROPERTIES:
:CUSTOM_ID: powers
:END:
#+BEGIN_SRC
x^n for n an integral number (understand Int or Integer)
x**y for y any kind of number (Float for example)
#+END_SRC
=Integer= has no limit except the capacity of your machine:
#+BEGIN_EXAMPLE
4^103
102844034832575377634685573909834406561420991602098741459288064
#+END_EXAMPLE
Yeah! And also rational numbers FTW! But you need to import the module
=Data.Ratio=:
#+BEGIN_EXAMPLE
$ ghci
....
Prelude> :m Data.Ratio
Data.Ratio> (11 % 15) * (5 % 3)
11 % 9
#+END_EXAMPLE
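The same arithmetic can be tried in a small program instead of =ghci=. This is only a sketch; the type annotations are there to avoid relying on defaulting rules.

#+BEGIN_SRC haskell
import Data.Ratio ((%))

main :: IO ()
main = do
  -- (^) takes an integral exponent.
  print (2 ^ (10 :: Int) :: Integer)
  -- (**) takes a floating point exponent.
  print (2 ** 0.5 :: Double)
  -- Rational arithmetic is exact: the result is reduced automatically.
  print ((11 % 15) * (5 % 3) :: Rational)
#+END_SRC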
**** Lists
:PROPERTIES:
:CUSTOM_ID: lists
:END:
#+BEGIN_EXAMPLE
[] ⇔ empty list
[1,2,3] ⇔ List of integral
["foo","bar","baz"] ⇔ List of String
1:[2,3] ⇔ [1,2,3], (:) prepend one element
1:2:[] ⇔ [1,2]
[1,2] ++ [3,4] ⇔ [1,2,3,4], (++) concatenate
[1,2,3] ++ ["foo"] ⇔ ERROR String ≠ Integral
[1..4] ⇔ [1,2,3,4]
[1,3..10] ⇔ [1,3,5,7,9]
[2,3,5,7,11..100] ⇔ ERROR! I am not so smart!
[10,9..1] ⇔ [10,9,8,7,6,5,4,3,2,1]
#+END_EXAMPLE
**** Strings
:PROPERTIES:
:CUSTOM_ID: strings
:END:
In Haskell strings are lists of =Char=.
#+BEGIN_EXAMPLE
'a' :: Char
"a" :: [Char]
"" ⇔ []
"ab" ⇔ ['a','b'] ⇔ 'a':"b" ⇔ 'a':['b'] ⇔ 'a':'b':[]
"abc" ⇔ "ab"++"c"
#+END_EXAMPLE
#+BEGIN_QUOTE
/Remark/: In real code you shouldn't use a list of =Char= to represent
text. You should mostly use =Data.Text= instead. If you want to
represent a stream of ASCII characters, you should use =Data.ByteString=.
#+END_QUOTE
**** Tuples
:PROPERTIES:
:CUSTOM_ID: tuples
:END:
The type of a couple (a pair) is =(a,b)=. Elements in a tuple can have different
types.
#+BEGIN_EXAMPLE
-- All these tuples are valid
(2,"foo")
(3,'a',[2,3])
((2,"a"),"c",3)
fst (x,y) ⇒ x
snd (x,y) ⇒ y
fst (x,y,z) ⇒ ERROR: fst :: (a,b) -> a
snd (x,y,z) ⇒ ERROR: snd :: (a,b) -> b
#+END_EXAMPLE
**** Deal with parentheses
:PROPERTIES:
:CUSTOM_ID: deal-with-parentheses
:END:
To remove some parentheses you can use two functions: =($)= and =(.)=.
#+BEGIN_EXAMPLE
-- By default:
f g h x ⇔ (((f g) h) x)
-- the $ replaces the parentheses that would go from the $
-- to the end of the expression
f g $ h x ⇔ f g (h x) ⇔ (f g) (h x)
f $ g h x ⇔ f (g h x) ⇔ f ((g h) x)
f $ g $ h x ⇔ f (g (h x))
-- (.) the composition function
(f . g) x ⇔ f (g x)
(f . g . h) x ⇔ f (g (h x))
#+END_EXAMPLE
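To check these equivalences on real values, here is a small sketch. The functions =addOne= and =double= are hypothetical helpers introduced only for this example.

#+BEGIN_SRC haskell
-- Two hypothetical helpers, used only to have something to compose.
addOne :: Int -> Int
addOne = (+ 1)

double :: Int -> Int
double = (* 2)

main :: IO ()
main = do
  print (addOne (double 5))     -- explicit parentheses
  print $ addOne $ double 5     -- the same with ($)
  print $ (addOne . double) 5   -- the same with (.)
  print $ (double . addOne) 5   -- composition order matters
#+END_SRC

The first three lines compute the same value. The last one applies =addOne= first, so it gives a different result.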
-----
** Useful notations for functions
:PROPERTIES:
:CUSTOM_ID: useful-notations-for-functions
:END:
Just a reminder:
#+BEGIN_EXAMPLE
x :: Int ⇔ x is of type Int
x :: a ⇔ x can be of any type
x :: Num a => a ⇔ x can be any type a
such that a belongs to Num type class
f :: a -> b ⇔ f is a function from a to b
f :: a -> b -> c ⇔ f is a function from a to (b→c)
f :: (a -> b) -> c ⇔ f is a function from (a→b) to c
#+END_EXAMPLE
Remember that defining the type of a function before its declaration
isn't mandatory. Haskell infers the most general type for you. But it is
considered a good practice to do so.
/Infix notation/
#+BEGIN_SRC haskell
square :: Num a => a -> a
square x = x^2
#+END_SRC
Note =^= uses infix notation. For each infix operator there is an
associated prefix notation. You just have to put it inside parentheses.
#+BEGIN_SRC haskell
square' x = (^) x 2
square'' x = (^2) x
#+END_SRC
We can remove =x= on the left and right sides! It's called η-reduction.
#+BEGIN_SRC haskell
square''' = (^2)
#+END_SRC
Note we can declare functions with ='= in their name. Here:
#+BEGIN_QUOTE
=square= ⇔ =square'= ⇔ =square''= ⇔ =square'''=
#+END_QUOTE
/Tests/
An implementation of the absolute function.
#+BEGIN_SRC haskell
absolute :: (Ord a, Num a) => a -> a
absolute x = if x >= 0 then x else -x
#+END_SRC
Note: the =if .. then .. else= Haskell notation is more like the =¤?¤:¤=
C operator. You cannot forget the =else=.
Another equivalent version:
#+BEGIN_SRC haskell
absolute' x
    | x >= 0 = x
    | otherwise = -x
#+END_SRC
#+BEGIN_QUOTE
Notation warning: indentation is /important/ in Haskell. Like in
Python, bad indentation can break your code!
#+END_QUOTE
#+BEGIN_SRC haskell
main = do
  print $ square 10
  print $ square' 10
  print $ square'' 10
  print $ square''' 10
  print $ absolute 10
  print $ absolute (-10)
  print $ absolute' 10
  print $ absolute' (-10)
#+END_SRC
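Guards are not limited to two cases. As a sketch, here is a hypothetical =sign= function (not part of the original examples) with three guards. The guards are tried from top to bottom and the first one that holds wins.

#+BEGIN_SRC haskell
-- sign is a hypothetical example function.
-- The guards must be indented further than the function name.
sign :: (Ord a, Num a) => a -> Int
sign x
  | x > 0     = 1
  | x < 0     = -1
  | otherwise = 0

main :: IO ()
main = print (map sign [-5, 0, 7 :: Int])
#+END_SRC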
* Difficulty: Normal
:PROPERTIES:
:CUSTOM_ID: hard-part
:END:
The hard part can now begin.
** Functional style
:PROPERTIES:
:CUSTOM_ID: functional-style
:END:
#+CAPTION: Biomechanical Landscape by H.R. Giger
[[./hr_giger_biomechanicallandscape_500.jpg]]
In this section, I will give a short example of the impressive
refactoring ability provided by Haskell. We will select a problem and
solve it in a standard imperative way. Then I will make the code evolve.
The end result will be both more elegant and easier to adapt.
Let's solve the following problem:
#+BEGIN_QUOTE
Given a list of integers, return the sum of the even numbers in the
list.
example: =[1,2,3,4,5] ⇒ 2 + 4 ⇒ 6=
#+END_QUOTE
To show differences between functional and imperative approaches, I'll
start by providing an imperative solution (in JavaScript):
#+BEGIN_SRC javascript
function evenSum(list) {
    var result = 0;
    for (var i=0; i< list.length ; i++) {
        if (list[i] % 2 ==0) {
            result += list[i];
        }
    }
    return result;
}
#+END_SRC
In Haskell, by contrast, we don't have variables or a for loop. One
solution to achieve the same result without loops is to use recursion.
#+BEGIN_QUOTE
/Remark/: Recursion is generally perceived as slow in imperative
languages. But this is generally not the case in functional
programming. Most of the time Haskell will handle recursive functions
efficiently.
#+END_QUOTE
Here is a =C= version of the recursive function. Note that for
simplicity I assume the int list ends with the first =0= value.
#+BEGIN_SRC C
// forward declaration, so evenSum can call accumSum
int accumSum(int n, int *list);

int evenSum(int *list) {
    return accumSum(0,list);
}

int accumSum(int n, int *list) {
    int x;
    int *xs;
    if (*list == 0) { // if the list is empty
        return n;
    } else {
        x = list[0]; // let x be the first element of the list
        xs = list+1; // let xs be the list without x
        if ( 0 == (x%2) ) { // if x is even
            return accumSum(n+x, xs);
        } else {
            return accumSum(n, xs);
        }
    }
}
#+END_SRC
Keep this code in mind. We will translate it into Haskell. First,
however, I need to introduce three simple but useful functions we will
use:
#+BEGIN_SRC haskell
even :: Integral a => a -> Bool
head :: [a] -> a
tail :: [a] -> [a]
#+END_SRC
=even= verifies if a number is even.
#+BEGIN_SRC haskell
even :: Integral a => a -> Bool
even 3 ⇒ False
even 2 ⇒ True
#+END_SRC
=head= returns the first element of a list:
#+BEGIN_SRC haskell
head :: [a] -> a
head [1,2,3] ⇒ 1
head [] ⇒ ERROR
#+END_SRC
=tail= returns all elements of a list, except the first:
#+BEGIN_SRC haskell
tail :: [a] -> [a]
tail [1,2,3] ⇒ [2,3]
tail [3] ⇒ []
tail [] ⇒ ERROR
#+END_SRC
Note that for any non empty list =l=, =l ⇔ (head l):(tail l)=
-----
The first Haskell solution. The function =evenSum= returns the sum of
all even numbers in a list:
#+BEGIN_SRC haskell
-- Version 1
evenSum :: [Integer] -> Integer
evenSum l = accumSum 0 l
accumSum n l = if l == []
                  then n
                  else let x = head l
                           xs = tail l
                       in if even x
                              then accumSum (n+x) xs
                              else accumSum n xs
#+END_SRC
To test a function you can use =ghci=:
#+BEGIN_HTML
% ghci
GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> :load 11_Functions.lhs
[1 of 1] Compiling Main ( 11_Functions.lhs, interpreted )
Ok, modules loaded: Main.
*Main> evenSum [1..5]
6
#+END_HTML
Here is an example of execution[fn:2]:
#+BEGIN_HTML
*Main> evenSum [1..5]
accumSum 0 [1,2,3,4,5]
1 is odd
accumSum 0 [2,3,4,5]
2 is even
accumSum (0+2) [3,4,5]
3 is odd
accumSum (0+2) [4,5]
4 is even
accumSum (0+2+4) [5]
5 is odd
accumSum (0+2+4) []
l == []
0+2+4
0+6
6
#+END_HTML
Coming from an imperative language, all of this should seem right. In fact, many
things can be improved here. First, we can generalize the type.
#+BEGIN_SRC haskell
evenSum :: Integral a => [a] -> a
#+END_SRC
#+BEGIN_SRC haskell
main = do print $ evenSum [1..10]
#+END_SRC
-----
Next, we can use sub functions using =where= or =let=.
This way our =accumSum= function won't pollute the namespace of our module.
#+BEGIN_SRC haskell
-- Version 2
evenSum :: Integral a => [a] -> a
evenSum l = accumSum 0 l
    where accumSum n l =
            if l == []
                then n
                else let x = head l
                         xs = tail l
                     in if even x
                            then accumSum (n+x) xs
                            else accumSum n xs
#+END_SRC
#+BEGIN_SRC haskell
main = print $ evenSum [1..10]
#+END_SRC
-----
Next, we can use pattern matching.
#+BEGIN_SRC haskell
-- Version 3
evenSum l = accumSum 0 l
    where
        accumSum n [] = n
        accumSum n (x:xs) =
             if even x
                then accumSum (n+x) xs
                else accumSum n xs
#+END_SRC
What is pattern matching? It lets you use values instead of general
parameter names[fn:3].
Instead of saying: =foo l = if l == [] then <x> else <y>= you simply
state:
#+BEGIN_SRC haskell
foo [] = <x>
foo l = <y>
#+END_SRC
But pattern matching goes even further. It is also able to inspect the
inner data of a complex value. We can replace
#+BEGIN_SRC haskell
foo n l = let x = head l
              xs = tail l
          in if even x
              then foo (n+x) xs
              else foo n xs
#+END_SRC
with
#+BEGIN_SRC haskell
foo n (x:xs) = if even x
                   then foo (n+x) xs
                   else foo n xs
#+END_SRC
This is a very useful feature.
It makes our code both terser and easier to read.
#+BEGIN_SRC haskell
main = print $ evenSum [1..10]
#+END_SRC
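Pattern matching is not limited to splitting a list into head and tail. As a sketch, the hypothetical functions below (their names are mine) match on the length of a list and on pairs nested inside a list.

#+BEGIN_SRC haskell
-- Hypothetical helpers, used only for illustration.

-- Match on the shape of the list: empty, one element, or more.
describe :: [a] -> String
describe []      = "empty"
describe [_]     = "one element"
describe (_:_:_) = "two or more elements"

-- Patterns can be nested: here each element of the list is a pair.
sumFirsts :: [(Integer, b)] -> Integer
sumFirsts []         = 0
sumFirsts ((x,_):xs) = x + sumFirsts xs

main :: IO ()
main = do
  putStrLn (describe "")
  putStrLn (describe [True])
  print (sumFirsts [(1,'a'), (2,'b'), (3,'c')])
#+END_SRC

The underscore =_= matches anything and signals that the value is not used.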
-----
In Haskell you can simplify function definitions by η-reducing them.
For example, instead of writing:
#+BEGIN_SRC haskell
f x = (some expression) x
#+END_SRC
you can simply write
#+BEGIN_SRC haskell
f = some expression
#+END_SRC
We use this method to remove the =l=:
#+BEGIN_SRC haskell
-- Version 4
evenSum :: Integral a => [a] -> a
evenSum = accumSum 0
    where
        accumSum n [] = n
        accumSum n (x:xs) =
             if even x
                then accumSum (n+x) xs
                else accumSum n xs
#+END_SRC
#+BEGIN_SRC haskell
main = print $ evenSum [1..10]
#+END_SRC
-----
*** Higher Order Functions
:PROPERTIES:
:CUSTOM_ID: higher-order-functions
:END:
#+CAPTION: Escher
[[./escher_polygon.png]]
To make things even better we should use higher order functions. What
are these beasts? Higher order functions are functions taking functions
as parameters.
Here are some examples:
#+BEGIN_SRC haskell
filter :: (a -> Bool) -> [a] -> [a]
map :: (a -> b) -> [a] -> [b]
foldl :: (a -> b -> a) -> a -> [b] -> a
#+END_SRC
Let's proceed by small steps.
#+BEGIN_SRC haskell
-- Version 5
evenSum l = mysum 0 (filter even l)
    where
      mysum n [] = n
      mysum n (x:xs) = mysum (n+x) xs
#+END_SRC
where
#+BEGIN_SRC haskell
filter even [1..10] ⇔ [2,4,6,8,10]
#+END_SRC
The function =filter= takes a function of type (=a -> Bool=) and a list of
type =[a]=.
It returns a list containing only the elements for which the function returned
=True=.
Our next step is to use another technique to accomplish the same thing as a
loop.
We will use the =foldl= function to accumulate a value as we pass through
the list.
The function =foldl= captures a general coding pattern:
#+BEGIN_SRC haskell
myfunc list = foo initialValue list
foo accumulated [] = accumulated
foo tmpValue (x:xs) = foo (bar tmpValue x) xs
#+END_SRC
Which can be replaced by:
#+BEGIN_SRC haskell
myfunc list = foldl bar initialValue list
#+END_SRC
If you really want to know how the magic works, here is the definition of
=foldl=:
#+BEGIN_SRC haskell
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
#+END_SRC
#+BEGIN_SRC haskell
foldl f z [x1,...xn]
⇔ f (... (f (f z x1) x2) ...) xn
#+END_SRC
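The left nesting is easier to see with an operation where the order matters. This is a small sketch using subtraction and list construction instead of addition.

#+BEGIN_SRC haskell
main :: IO ()
main = do
  -- ((10 - 1) - 2) - 3
  print (foldl (-) 10 [1, 2, 3])
  -- Prepending each element to the accumulator reverses the list,
  -- which shows that elements are consumed from left to right.
  print (foldl (flip (:)) [] "abc")
#+END_SRC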
But as Haskell is lazy, it doesn't evaluate =(f z x)= right away. It builds
a chain of unevaluated expressions (thunks) that can use a lot of memory on
long lists.
This is why we generally use =foldl'= instead of =foldl=; =foldl'= is a
/strict/ version of =foldl=.
If you don't understand what lazy and strict mean, don't worry, just
follow the code as if =foldl= and =foldl'= were identical.
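As a minimal sketch of the strict variant, the program below sums a long list with =foldl'=. The list size is an arbitrary value I picked for the illustration.

#+BEGIN_SRC haskell
import Data.List (foldl')

main :: IO ()
main =
  -- Same result as foldl, but the accumulator is forced at each step,
  -- so no long chain of pending additions is built.
  print (foldl' (+) 0 [1 .. 1000000 :: Integer])
#+END_SRC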
Now our new version of =evenSum= becomes:
#+BEGIN_SRC haskell
-- Version 6
-- foldl' isn't accessible by default
-- we need to import it from the module Data.List
import Data.List
evenSum l = foldl' mysum 0 (filter even l)
  where mysum acc value = acc + value
#+END_SRC
We can also simplify this by using a lambda directly.
This way we don't have to create the temporary name =mysum=.
#+BEGIN_SRC haskell
-- Version 7
-- Generally it is considered a good practice
-- to import only the necessary function(s)
import Data.List (foldl')
evenSum l = foldl' (\x y -> x+y) 0 (filter even l)
#+END_SRC
And of course, we note that
#+BEGIN_SRC haskell
(\x y -> x+y) ⇔ (+)
#+END_SRC
#+BEGIN_SRC haskell
main = print $ evenSum [1..10]
#+END_SRC
-----
Finally
#+BEGIN_SRC haskell
-- Version 8
import Data.List (foldl')
evenSum :: Integral a => [a] -> a
evenSum l = foldl' (+) 0 (filter even l)
#+END_SRC
=foldl'= isn't the easiest function to grasp.
If you are not used to it, you should study it a bit.
To help you understand what's going on here, let's look at a step by step
evaluation:
#+BEGIN_SRC haskell
evenSum [1,2,3,4]
⇒ foldl' (+) 0 (filter even [1,2,3,4])
⇒ foldl' (+) 0 [2,4]
⇒ foldl' (+) (0+2) [4]
⇒ foldl' (+) 2 [4]
⇒ foldl' (+) (2+4) []
⇒ foldl' (+) 6 []
⇒ 6
#+END_SRC
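If you want to watch the accumulator move without unfolding the evaluation by hand, =scanl= from the Prelude returns every intermediate value of a left fold. This is only a small sketch to inspect the steps above:

#+BEGIN_SRC haskell
main :: IO ()
main =
  -- scanl keeps the initial value and each intermediate accumulator
  print (scanl (+) 0 (filter even [1,2,3,4 :: Int])) -- prints [0,2,6]
#+END_SRC

The last element of the list is the value =foldl'= returns.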
Another useful higher order function is =(.)=.
The =(.)= function corresponds to mathematical composition.
#+BEGIN_SRC haskell
(f . g . h) x ⇔ f ( g (h x))
#+END_SRC
We can take advantage of this operator to η-reduce our function:
#+BEGIN_SRC haskell
-- Version 9
import Data.List (foldl')
evenSum :: Integral a => [a] -> a
evenSum = (foldl' (+) 0) . (filter even)
#+END_SRC
Also, we could rename some parts to make it clearer:
#+BEGIN_SRC haskell
-- Version 10
import Data.List (foldl')
sum' :: (Num a) => [a] -> a
sum' = foldl' (+) 0
evenSum :: Integral a => [a] -> a
evenSum = sum' . (filter even)
#+END_SRC
It is time to discuss the direction our code has moved as we introduced
more functional idioms.
What did we gain by using higher order functions?
At first, you might think the main difference is terseness.
But in fact, it has more to do with better thinking. Suppose we want to
modify our function slightly, for example, to get the sum of all even
squares of elements of the list.
#+BEGIN_EXAMPLE
[1,2,3,4] ▷ [1,4,9,16] ▷ [4,16] ▷ 20
#+END_EXAMPLE
Updating version 10 is extremely easy:
#+BEGIN_SRC haskell
squareEvenSum = sum' . (filter even) . (map (^2))
squareEvenSum' = evenSum . (map (^2))
#+END_SRC
We just had to add another "transformation function"[^0216].
#+BEGIN_EXAMPLE
map (^2) [1,2,3,4] ⇔ [1,4,9,16]
#+END_EXAMPLE
The =map= function simply applies a function to all the elements of a
list.
We didn't have to modify anything /inside/ the function definition.
This makes the code more modular.
But in addition you can think more mathematically about your function.
You can also use your function interchangeably with others, as needed.
That is, you can compose, map, fold, filter using your new function.
Modifying version 1 is left as an exercise to the reader ☺.
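For reference, here is version 10 and its variation assembled into one small runnable file. Nothing new is introduced; it is just a sketch to check the pipeline =[1,2,3,4] ▷ [1,4,9,16] ▷ [4,16] ▷ 20= for yourself.

#+BEGIN_SRC haskell
import Data.List (foldl')

sum' :: Num a => [a] -> a
sum' = foldl' (+) 0

evenSum :: Integral a => [a] -> a
evenSum = sum' . filter even

-- one more transformation composed in front of evenSum
squareEvenSum :: Integral a => [a] -> a
squareEvenSum = evenSum . map (^2)

main :: IO ()
main = do
  print (map (^2) [1,2,3,4 :: Int])      -- prints [1,4,9,16]
  print (squareEvenSum [1,2,3,4 :: Int]) -- prints 20
#+END_SRC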
If you believe we have reached the end of generalization, you are
very wrong.
For example, there is a way to use this function not only on lists but on
any recursive type.
If you want to know how, I suggest you read this quite fun article:
[[http://eprints.eemcs.utwente.nl/7281/01/db-utwente-40501F46.pdf][Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire by Meijer, Fokkinga and Paterson]].
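As a much simpler taste of that generalization (this is /not/ the technique of the article, only a sketch using the standard =Foldable= class), the same sum can be written once for any container that knows how to be folded. The small =BinTree= type and its instance below are assumptions made for the example.

#+BEGIN_SRC haskell
import Data.Foldable (foldl')

-- A small tree type, defined here only for the example.
data BinTree a = Empty | Node a (BinTree a) (BinTree a)

-- Minimal Foldable instance: a right fold visiting left, root, right.
instance Foldable BinTree where
  foldr _ z Empty        = z
  foldr f z (Node x l r) = foldr f (f x (foldr f z r)) l

-- Works on lists, Maybe, our BinTree, and any other Foldable.
evenSum :: (Foldable t, Integral a) => t a -> a
evenSum = foldl' (\acc x -> if even x then acc + x else acc) 0

main :: IO ()
main = do
  print (evenSum [1..10 :: Int])                                   -- prints 30
  print (evenSum (Node 4 (Node 2 Empty Empty) (Node 7 Empty Empty)
                    :: BinTree Int))                               -- prints 6
  print (evenSum (Just (3 :: Int)))                                -- prints 0
#+END_SRC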
This example should show you how great pure functional programming is.
Unfortunately, pure functional programming isn't well suited to
every use case. Or at least such a language hasn't been found yet.
One of the great powers of Haskell is the ability to create DSLs
(Domain-Specific Languages), making it easy to change the programming paradigm.
In fact, Haskell is also great when you want to program in an imperative
style. This was really hard for me to grasp when
first learning Haskell. A lot of effort tends to go into explaining the
superiority of the functional approach. Then when you start using an
imperative style with Haskell, it can be hard to understand when and how
to use it.
But before talking about this Haskell super-power, we must talk about
another essential aspect of Haskell: /Types/.
#+BEGIN_SRC haskell
main = print $ evenSum [1..10]
#+END_SRC
** Types
:PROPERTIES:
:CUSTOM_ID: types
:END:
#+CAPTION: Dali, the madonna of port Lligat
[[./salvador-dali-the-madonna-of-port-lligat.jpg]]
#+BEGIN_QUOTE
*tl;dr*:
- =type Name = AnotherType= is just an alias and the compiler doesn't
  mark any difference between =Name= and =AnotherType=.
- =data Name = NameConstructor AnotherType= does mark a difference.
- =data= can construct structures which can be recursive.
- =deriving= is magic and creates functions for you.
#+END_QUOTE
In Haskell, types are strong and static.
Why is this important? It will help you /greatly/ to avoid mistakes. In
Haskell, most bugs are caught during the compilation of your program.
The main reason is the type inference performed during compilation.
Type inference makes it easy to detect where you used the wrong
parameter at the wrong place, for example.
*** Type inference
:PROPERTIES:
:CUSTOM_ID: type-inference
:END:
Static typing is generally essential for fast execution. But most
statically typed languages are bad at generalizing concepts. Haskell's
saving grace is that it can /infer/ types.
Here is a simple example, the =square= function in Haskell:
#+BEGIN_SRC haskell
square x = x * x
#+END_SRC
This function can =square= any numeric type, that is, any instance of
=Num=. You can provide =square= with an =Int=, an =Integer=, a =Float=,
any =Fractional= type and even a =Complex=. Proof by example:
#+BEGIN_EXAMPLE
% ghci
GHCi, version 7.0.4:
...
Prelude> let square x = x*x
Prelude> square 2
4
Prelude> square 2.1
4.41
Prelude> -- load the Data.Complex module
Prelude> :m Data.Complex
Prelude Data.Complex> square (2 :+ 1)
3.0 :+ 4.0
#+END_EXAMPLE
=x :+ y= is the notation for the complex (x + iy).
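The same experiment as a compiled program, as a small sketch. The type signature is optional; it is written out only to show what GHC infers for =square=.

#+BEGIN_SRC haskell
import Data.Complex (Complex((:+)))

-- Optional signature: this is the type GHC infers on its own.
square :: Num a => a -> a
square x = x * x

main :: IO ()
main = do
  print (square (2 :: Int))                 -- prints 4
  print (square (2.5 :: Double))            -- prints 6.25
  print (square (2 :+ 1 :: Complex Double)) -- prints 3.0 :+ 4.0
#+END_SRC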
Now compare with the amount of code necessary in C:
#+BEGIN_SRC C
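int   int_square(int x) { return x*x; }
float float_square(float x) { return x*x; }

/* a minimal complex type, assumed for this example */
typedef struct { float real; float img; } complex;

complex complex_square (complex z) {
    complex tmp;
    tmp.real = z.real * z.real - z.img * z.img;
    tmp.img = 2 * z.img * z.real;
    return tmp; /* the result must be returned explicitly */
}

/* usage, inside some function: */
complex x,y;
y = complex_square(x);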
#+END_SRC
For each type, you need to write a new function. The only way to work
around this problem is to use some meta-programming trick, for example
using the pre-processor. In C++ there is a better way, C++ templates:
#+BEGIN_SRC c++
#include <iostream>
#include <complex>
using namespace std;

template<typename T>
T square(T x)
{
    return x*x;
}

int main() {
    // int
    int sqr_of_five = square(5);
    cout << sqr_of_five << endl;
    // double
    cout << (double)square(5.3) << endl;
    // complex
    cout << square( complex<double>(5,3) )
         << endl;
    return 0;
}
#+END_SRC
C++ does a far better job than C in this regard. But for more complex
functions the syntax can be hard to follow: see [[http://bartoszmilewski.com/2009/10/21/what-does-haskell-have-to-do-with-c/][this article]] for example.
In C++ you must declare that a function can work with different types.
In Haskell, the opposite is the case. The function will be as general as
possible by default.
Type inference gives Haskell the feeling of freedom that dynamically
typed languages provide. But unlike dynamically typed languages, most
errors are caught before run time. Generally, in Haskell:
#+BEGIN_QUOTE
"if it compiles it certainly does what you intended"
#+END_QUOTE
-----
*** Type construction
:PROPERTIES:
:CUSTOM_ID: type-construction
:END:
You can construct your own types. First, you can use aliases or type
synonyms.
#+BEGIN_SRC haskell
type Name = String
type Color = String
showInfos :: Name -> Color -> String
showInfos name color = "Name: " ++ name
                       ++ ", Color: " ++ color
name :: Name
name = "Robin"
color :: Color
color = "Blue"
main = putStrLn $ showInfos name color
#+END_SRC
-----
But it doesn't protect you much. Try to swap the two parameters of
=showInfos= and run the program:
#+BEGIN_SRC haskell
putStrLn $ showInfos color name
#+END_SRC
It will compile and execute.
In fact, you can use =Name=, =Color= and =String= interchangeably
everywhere. The compiler will treat them as completely identical.
Another method is to create your own types using the keyword =data=.
#+BEGIN_SRC haskell
data Name = NameConstr String
data Color = ColorConstr String
showInfos :: Name -> Color -> String
showInfos (NameConstr name) (ColorConstr color) =
    "Name: " ++ name ++ ", Color: " ++ color
name = NameConstr "Robin"
color = ColorConstr "Blue"
main = putStrLn $ showInfos name color
#+END_SRC
Now if you switch parameters of =showInfos=, the compiler complains! So
this is a potential mistake you will never make again and the only price
is to be more verbose.
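For wrappers like these, with exactly one constructor holding exactly one field, Haskell also offers the =newtype= keyword, which gives the same protection. The sketch below reuses the names from the example and follows the common habit of giving the constructor the same name as the type.

#+BEGIN_SRC haskell
newtype Name  = Name String
newtype Color = Color String

showInfos :: Name -> Color -> String
showInfos (Name n) (Color c) = "Name: " ++ n ++ ", Color: " ++ c

main :: IO ()
main = putStrLn (showInfos (Name "Robin") (Color "Blue"))
-- prints: Name: Robin, Color: Blue
-- showInfos (Color "Blue") (Name "Robin") is rejected by the type checker
#+END_SRC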
Also notice that constructors are functions:
#+BEGIN_SRC haskell
NameConstr :: String -> Name
ColorConstr :: String -> Color
#+END_SRC
The syntax of =data= is mainly:
#+BEGIN_SRC haskell
data TypeName = ConstructorName [types]
| ConstructorName2 [types]
| ...
#+END_SRC
By convention, when a type has a single constructor, the constructor
usually gets the same name as the type.
Example:
#+BEGIN_SRC haskell
-- a context on a constructor needs the ExistentialQuantification (or GADTs) extension
data Complex a = Num a => Complex a a
#+END_SRC
Also you can use the record syntax:
#+BEGIN_SRC haskell
data DataTypeName = DataConstructor {
field1 :: [type of field1]
, field2 :: [type of field2]
...
, fieldn :: [type of fieldn] }
#+END_SRC
Accessor functions are generated for you. Furthermore, you can set the
fields in any order.
Example:
#+BEGIN_SRC haskell
-- as above, the =Num a= context needs ExistentialQuantification (or GADTs)
data Complex a = Num a => Complex { real :: a, img :: a}
c = Complex 1.0 2.0
z = Complex { real = 3, img = 4 }
real c ⇒ 1.0
img z ⇒ 4
#+END_SRC
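Here is a small runnable sketch of record syntax, using a hypothetical =Point= type (not part of the examples above) so that no extension is needed. It also shows record update, which builds a modified copy of a value.

#+BEGIN_SRC haskell
-- Hypothetical record type, for illustration only.
data Point = Point { px :: Double, py :: Double }
  deriving (Show)

main :: IO ()
main = do
  print (px (Point { py = 2, px = 1 }))   -- fields given in any order, prints 1.0
  print ((Point 1 2) { px = 10 })         -- record update, prints Point {px = 10.0, py = 2.0}
#+END_SRC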
-----
*** Recursive type
:PROPERTIES:
:CUSTOM_ID: recursive-type
:END:
You already encountered a recursive type: lists. You can re-create
lists, but with a more verbose syntax:
#+BEGIN_SRC haskell
data List a = Empty | Cons a (List a)
#+END_SRC
If you really want to use an easier syntax you can use an infix name for
constructors.
#+BEGIN_SRC haskell
infixr 5 :::
data List a = Nil | a ::: (List a)
#+END_SRC
The number after =infixr= gives the precedence.
If you want to be able to print (=Show=), read (=Read=), test equality
(=Eq=) and compare (=Ord=) your new data structure you can tell Haskell
to derive the appropriate functions for you.
#+BEGIN_SRC haskell
infixr 5 :::
data List a = Nil | a ::: (List a)
deriving (Show,Read,Eq,Ord)
#+END_SRC
When you add =deriving (Show)= to your data declaration, Haskell creates
a =show= function for you. We'll see soon how you can use your own
=show= function.
#+BEGIN_SRC haskell
convertList [] = Nil
convertList (x:xs) = x ::: convertList xs
#+END_SRC
#+BEGIN_SRC haskell
main = do
  print (0 ::: 1 ::: Nil)
  print (convertList [0,1])
#+END_SRC
This prints:
#+BEGIN_EXAMPLE
0 ::: (1 ::: Nil)
0 ::: (1 ::: Nil)
#+END_EXAMPLE
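The other derived classes can be tried the same way. A minimal sketch of what =Eq= and =Ord= give you on this type:

#+BEGIN_SRC haskell
infixr 5 :::
data List a = Nil | a ::: (List a)
  deriving (Show, Read, Eq, Ord)

main :: IO ()
main = do
  let xs = 0 ::: 1 ::: Nil :: List Int
  print (xs == 0 ::: 1 ::: Nil) -- derived Eq, prints True
  print (compare Nil xs)        -- constructors are ordered as declared, prints LT
  print (xs < 0 ::: 2 ::: Nil)  -- then fields are compared left to right, prints True
#+END_SRC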
-----
*** Trees
:PROPERTIES:
:CUSTOM_ID: trees
:END:
#+CAPTION: Magritte, l'Arbre
[[./magritte-l-arbre.jpg]]
We'll just give another standard example: binary trees.
#+BEGIN_SRC haskell
import Data.List
data BinTree a = Empty
               | Node a (BinTree a) (BinTree a)
               deriving (Show)
#+END_SRC
We will also create a function which turns a list into an ordered binary
tree.
#+BEGIN_SRC haskell
treeFromList :: (Ord a) => [a] -> BinTree a
treeFromList [] = Empty
treeFromList (x:xs) = Node x (treeFromList (filter (<x) xs))
                             (treeFromList (filter (>x) xs))
#+END_SRC
Look at how elegant this function is. In plain English:
- an empty list will be converted to an empty tree.
- a list =(x:xs)= will be converted to a tree where:
  - the root is =x=,
  - the left subtree is the tree created from the members of the list =xs=
    which are strictly less than =x=, and
  - the right subtree is the tree created from the members of the list =xs=
    which are strictly greater than =x=.
#+BEGIN_SRC haskell
main = print $ treeFromList [7,2,4,8]
#+END_SRC
You should obtain the following:
#+BEGIN_EXAMPLE
Node 7 (Node 2 Empty (Node 4 Empty Empty)) (Node 8 Empty Empty)
#+END_EXAMPLE
This is an informative but quite unpleasant representation of our tree.
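One way to convince yourself that the tree really is ordered is to flatten it back into a list. =treeToList= is a hypothetical helper written for this sketch; it visits the left subtree, then the root, then the right subtree.

#+BEGIN_SRC haskell
data BinTree a = Empty
               | Node a (BinTree a) (BinTree a)
               deriving (Show)

treeFromList :: (Ord a) => [a] -> BinTree a
treeFromList [] = Empty
treeFromList (x:xs) = Node x (treeFromList (filter (<x) xs))
                             (treeFromList (filter (>x) xs))

-- Hypothetical helper: in-order traversal (left, root, right).
treeToList :: BinTree a -> [a]
treeToList Empty        = []
treeToList (Node x l r) = treeToList l ++ [x] ++ treeToList r

main :: IO ()
main = print (treeToList (treeFromList [7,2,4,8,2 :: Int]))
-- prints [2,4,7,8]: sorted, and the duplicate 2 is gone
#+END_SRC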
-----
Just for fun, let's code a better display for our trees. I simply had
fun making a nice function to display trees in a general way. You can
safely skip this part if you find it too difficult to follow.
We have a few changes to make. We remove the =deriving (Show)= from the
declaration of our =BinTree= type. And it might also be useful to make
our BinTree an instance of (=Eq= and =Ord=) so we will be able to test
equality and compare trees.
#+BEGIN_SRC haskell
data BinTree a = Empty
               | Node a (BinTree a) (BinTree a)
               deriving (Eq,Ord)
#+END_SRC
Without the =deriving (Show)=, Haskell doesn't create a =show= method
for us. We will create our own version of =show=. To achieve this, we
must declare that our newly created type =BinTree a= is an instance of
the type class =Show=. The general syntax is:
#+BEGIN_SRC haskell
instance Show (BinTree a) where
  show t = ... -- You declare your function here
#+END_SRC
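As a smaller warm-up, here is a sketch of a hand-written =Show= instance for a hypothetical three-value type (=Light= is not used anywhere else in this tutorial):

#+BEGIN_SRC haskell
-- Hypothetical type, for illustration only.
data Light = Red | Amber | Green

instance Show Light where
  show Red   = "red"
  show Amber = "amber"
  show Green = "green"

main :: IO ()
main = print [Red, Green] -- prints [red,green]
#+END_SRC

Note that =print= on a list reuses our =show= for each element.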
Here is my version of how to show a binary tree. Don't worry about the
apparent complexity. I made a lot of improvements in order to display
even stranger objects.
#+BEGIN_SRC haskell
-- declare BinTree a to be an instance of Show
instance (Show a) => Show (BinTree a) where
  -- will start by a '<' before the root
  -- and put a ':' at the beginning of each following line
  show t = "< " ++ replace '\n' "\n: " (treeshow "" t)
    where
    -- treeshow pref Tree
    --   shows a tree and starts each line with pref
    -- We don't display the Empty tree
    treeshow pref Empty = ""
    -- Leaf
    treeshow pref (Node x Empty Empty) =
                  (pshow pref x)
    -- Right branch is empty
    treeshow pref (Node x left Empty) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " left)
    -- Left branch is empty
    treeshow pref (Node x Empty right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " right)
    -- Tree with left and right children non empty
    treeshow pref (Node x left right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "|--" "|  " left) ++ "\n" ++
                  (showSon pref "`--" "   " right)
    -- shows a tree using some prefixes to make it nice
    showSon pref before next t =
                  pref ++ before ++ treeshow (pref ++ next) t
    -- pshow replaces "\n" by "\n"++pref
    pshow pref x = replace '\n' ("\n"++pref) (show x)
    -- replaces one char by another string
    replace c new string =
      concatMap (change c new) string
      where
          change c new x
              | x == c = new
              | otherwise = x:[] -- "x"
#+END_SRC
The =treeFromList= method remains identical.
#+BEGIN_SRC haskell
treeFromList :: (Ord a) => [a] -> BinTree a
treeFromList [] = Empty
treeFromList (x:xs) = Node x (treeFromList (filter (<x) xs))
                             (treeFromList (filter (>x) xs))
#+END_SRC
And now, we can play:
#+BEGIN_SRC haskell
main = do
  putStrLn "Int binary tree:"
  print $ treeFromList [7,2,4,8,1,3,6,21,12,23]
#+END_SRC
#+BEGIN_EXAMPLE
Int binary tree:
< 7
: |--2
: |  |--1
: |  `--4
: |     |--3
: |     `--6
: `--8
:    `--21
:       |--12
:       `--23
#+END_EXAMPLE
Now it is far better! The root is shown by starting the line with the
=<= character. And each following line starts with a =:=. But we could
also use another type.
#+BEGIN_SRC haskell
putStrLn "\nString binary tree:"
print $ treeFromList ["foo","bar","baz","gor","yog"]
#+END_SRC
#+BEGIN_EXAMPLE
String binary tree:
< "foo"
: |--"bar"
: |  `--"baz"
: `--"gor"
:    `--"yog"
#+END_EXAMPLE
As we can test equality and order trees, we can make a tree of trees!
#+BEGIN_SRC haskell
putStrLn "\nBinary tree of Char binary trees:"
print ( treeFromList
(map treeFromList ["baz","zara","bar"]))
#+END_SRC
#+BEGIN_EXAMPLE
Binary tree of Char binary trees:
< < 'b'
: : |--'a'
: : `--'z'
: |--< 'b'
: |  : |--'a'
: |  : `--'r'
: `--< 'z'
:    : `--'a'
:    :    `--'r'
#+END_EXAMPLE
This is why I chose to prefix each line of tree display by =:= (except
for the root).
#+CAPTION: Yo Dawg Tree
[[./yo_dawg_tree.jpg]]
#+BEGIN_SRC haskell
putStrLn "\nBinary tree of Binary trees of Char binary trees:"
print $ (treeFromList . map (treeFromList . map treeFromList))
           [ ["YO","DAWG"]
           , ["I","HEARD"]
           , ["I","HEARD"]
           , ["YOU","LIKE","TREES"] ]
#+END_SRC
Which is equivalent to
#+BEGIN_SRC haskell
print ( treeFromList (
          map treeFromList
             [ map treeFromList ["YO","DAWG"]
             , map treeFromList ["I","HEARD"]
             , map treeFromList ["I","HEARD"]
             , map treeFromList ["YOU","LIKE","TREES"] ]))
#+END_SRC
and gives:
#+BEGIN_EXAMPLE
Binary tree of Binary trees of Char binary trees:
< < < 'Y'
: : : `--'O'
: : `--< 'D'
: :    : |--'A'
: :    : `--'W'
: :    :    `--'G'
: |--< < 'I'
: |  : `--< 'H'
: |  :    : |--'E'
: |  :    : |  `--'A'
: |  :    : |     `--'D'
: |  :    : `--'R'
: `--< < 'Y'
:    : : `--'O'
:    : :    `--'U'
:    : `--< 'L'
:    :    : `--'I'
:    :    :    |--'E'
:    :    :    `--'K'
:    :    `--< 'T'
:    :       : `--'R'
:    :       :    |--'E'
:    :       :    `--'S'
#+END_EXAMPLE
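Notice how duplicate trees aren't inserted; there is only one tree
corresponding to ="I","HEARD"=. We have this for (almost) free, because
=treeFromList= keeps only the elements strictly less than or strictly
greater than the root, and we have declared =BinTree= to be an instance of
=Eq= and =Ord=.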
See how awesome this structure is: We can make trees containing not only
integers, strings and chars, but also other trees. And we can even make
a tree containing a tree of trees!
** Infinite Structures
:PROPERTIES:
:CUSTOM_ID: infinite-structures
:END:
#+CAPTION: Escher
[[./escher_infinite_lizards.jpg]]
It is often said that Haskell is /lazy/.
In fact, if you are a bit pedantic, you should say that
[[http://www.haskell.org/haskellwiki/Lazy_vs._non-strict][Haskell is /non-strict/]].
Laziness is just a common implementation for non-strict languages.
Then what does "non-strict" mean? From the Haskell wiki:
#+BEGIN_QUOTE
Reduction (the mathematical term for evaluation) proceeds from the
outside in.
so if you have =(a+(b*c))= then you first reduce =+=, then you
reduce the inner =(b*c)=
#+END_QUOTE
For example in Haskell you can do:
#+BEGIN_SRC haskell
-- numbers = [0,1,2,..]
numbers :: [Integer]
numbers = 0:map (1+) numbers
take' n [] = []
take' 0 l = []
take' n (x:xs) = x:take' (n-1) xs
main = print $ take' 10 numbers
#+END_SRC
And it stops.
How?
Instead of trying to evaluate =numbers= entirely, it evaluates elements
only when needed.
Also, note in Haskell there is a notation for infinite lists
#+BEGIN_EXAMPLE
[1..] ⇔ [1,2,3,4...]
[1,3..] ⇔ [1,3,5,7,9,11...]
#+END_EXAMPLE
and most functions will work with them. Also, there is a built-in
function =take= which is equivalent to our =take'=.
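A few standard functions applied to infinite lists, as a small sketch. Each line terminates because only the needed prefix of the list is ever evaluated.

#+BEGIN_SRC haskell
main :: IO ()
main = do
  print (take 5 [1,3..])                    -- prints [1,3,5,7,9]
  print (takeWhile (< 30) (map (^2) [1..])) -- prints [1,4,9,16,25]
  print (zip "abc" [0..])                   -- prints [('a',0),('b',1),('c',2)]
#+END_SRC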
-----
This code is mostly the same as the previous one.
#+BEGIN_SRC haskell
import Debug.Trace (trace)
import Data.List
data BinTree a = Empty
               | Node a (BinTree a) (BinTree a)
               deriving (Eq,Ord)
#+END_SRC
#+BEGIN_SRC haskell
-- declare BinTree a to be an instance of Show
instance (Show a) => Show (BinTree a) where
  -- will start by a '<' before the root
  -- and put a ':' at the beginning of each following line
  show t = "< " ++ replace '\n' "\n: " (treeshow "" t)
    where
    treeshow pref Empty = ""
    treeshow pref (Node x Empty Empty) =
                  (pshow pref x)
    treeshow pref (Node x left Empty) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " left)
    treeshow pref (Node x Empty right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "`--" "   " right)
    treeshow pref (Node x left right) =
                  (pshow pref x) ++ "\n" ++
                  (showSon pref "|--" "|  " left) ++ "\n" ++
                  (showSon pref "`--" "   " right)
    -- shows a tree using some prefixes to make it nice
    showSon pref before next t =
                  pref ++ before ++ treeshow (pref ++ next) t
    -- pshow replaces "\n" by "\n"++pref
    pshow pref x = replace '\n' ("\n"++pref) (" " ++ show x)
    -- replaces one char by another string
    replace c new string =
      concatMap (change c new) string
      where
          change c new x
              | x == c = new
              | otherwise = x:[] -- "x"
#+END_SRC
Suppose we don't care whether our binary tree is ordered. Here is an
infinite binary tree:
#+BEGIN_SRC haskell
nullTree = Node 0 nullTree nullTree
#+END_SRC
A complete binary tree where each node is equal to 0. Now I will prove
you can manipulate this object using the following function:
#+BEGIN_SRC haskell
-- take all element of a BinTree
-- up to some depth
treeTakeDepth _ Empty = Empty
treeTakeDepth 0 _     = Empty
treeTakeDepth n (Node x left right) = let
          nl = treeTakeDepth (n-1) left
          nr = treeTakeDepth (n-1) right
          in
              Node x nl nr
#+END_SRC
See what occurs for this program:
#+BEGIN_SRC haskell
main = print $ treeTakeDepth 4 nullTree
#+END_SRC
This code compiles, runs and stops giving the following result:
#+BEGIN_EXAMPLE
<  0
: |-- 0
: |  |-- 0
: |  |  |-- 0
: |  |  `-- 0
: |  `-- 0
: |     |-- 0
: |     `-- 0
: `-- 0
:    |-- 0
:    |  |-- 0
:    |  `-- 0
:    `-- 0
:       |-- 0
:       `-- 0
#+END_EXAMPLE
Just to heat up your neurones a bit more, let's make a slightly more
interesting tree:
#+BEGIN_SRC haskell
iTree = Node 0 (dec iTree) (inc iTree)
        where
           dec (Node x l r) = Node (x-1) (dec l) (dec r)
           inc (Node x l r) = Node (x+1) (inc l) (inc r)
#+END_SRC
Another way to create this tree is to use a higher order function. This
function should be similar to =map=, but should work on =BinTree=
instead of list. Here is such a function:
#+BEGIN_SRC haskell
-- apply a function to each node of Tree
treeMap :: (a -> b) -> BinTree a -> BinTree b
treeMap f Empty = Empty
treeMap f (Node x left right) = Node (f x)
                                     (treeMap f left)
                                     (treeMap f right)
#+END_SRC
/Hint/: I won't talk more about this here. If you are interested in the
generalization of =map= to other data structures, search for functor and
=fmap=.
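For the curious, here is a minimal sketch of what that generalization looks like: the body of =treeMap= becomes the =fmap= of a =Functor= instance. The type is redefined here with =deriving (Show)= only to keep the example self-contained.

#+BEGIN_SRC haskell
data BinTree a = Empty
               | Node a (BinTree a) (BinTree a)
               deriving (Show)

-- Same recursion as treeMap, packaged as a Functor instance.
instance Functor BinTree where
  fmap _ Empty               = Empty
  fmap f (Node x left right) = Node (f x) (fmap f left) (fmap f right)

main :: IO ()
main = print (fmap (*2) (Node 1 (Node 0 Empty Empty) Empty :: BinTree Int))
-- prints Node 2 (Node 0 Empty Empty) Empty
#+END_SRC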
Our definition is now:
#+BEGIN_SRC haskell
infTreeTwo :: BinTree Int
infTreeTwo = Node 0 (treeMap (\x -> x-1) infTreeTwo)
                    (treeMap (\x -> x+1) infTreeTwo)
#+END_SRC
Look at the result for
#+BEGIN_SRC haskell
main = print $ treeTakeDepth 4 infTreeTwo
#+END_SRC
#+BEGIN_EXAMPLE
<  0
: |-- -1
: |  |-- -2
: |  |  |-- -3
: |  |  `-- -1
: |  `-- 0
: |     |-- -1
: |     `-- 1
: `-- 1
:    |-- 0
:    |  |-- -1
:    |  `-- 1
:    `-- 2
:       |-- 1
:       `-- 3
#+END_EXAMPLE
#+BEGIN_SRC haskell
main = do
  print $ treeTakeDepth 4 nullTree
  print $ treeTakeDepth 4 infTreeTwo
#+END_SRC
* Difficulty: Nightmare
:PROPERTIES:
:CUSTOM_ID: hell-difficulty-part
:END:
Congratulations on getting this far! Now, some of the really hardcore
stuff can start.
If you are like me, you should now get the functional style. You should
also understand a bit more about the advantages of laziness by default. But
you still don't really understand where to start in order to make a real
program. And in particular:
- How do you deal with effects?
- Why is there a strange imperative-like notation for dealing with IO?
Be prepared, the answers might be complex. But they are all very
rewarding.
** Deal With IO
:PROPERTIES:
:CUSTOM_ID: deal-with-io
:END:
#+CAPTION: Magritte, Carte blanche
[[./magritte_carte_blanche.jpg]]
#+BEGIN_QUOTE
*tl;dr*:
A typical function doing =IO= looks a lot like an imperative program:
#+BEGIN_SRC haskell
f :: IO a
f = do
  x <- action1
  action2 x
  y <- action3
  action4 x y
#+END_SRC
- To bind the result of an action to a name, we use =<-=.
- The type of each line is =IO *=; in this example:
  - =action1 :: IO b=
  - =action2 x :: IO ()=
  - =action3 :: IO c=
  - =action4 x y :: IO a=
  - =x :: b=, =y :: c=
- Few objects have the type =IO a=, this should help you choose. In
  particular you cannot use pure functions directly here. To use pure
  functions you could do =action2 (purefunction x)= for example.
#+END_QUOTE
In this section, I will explain how to use IO, not how it works. You'll
see how Haskell separates the pure from the impure parts of the program.
Don't stop because you're trying to understand the details of the
syntax. Answers will come in the next section.
What do we want to achieve?
#+BEGIN_QUOTE
Ask a user to enter a list of numbers. Print the sum of the numbers
#+END_QUOTE
#+BEGIN_SRC haskell
toList :: String -> [Integer]
toList input = read ("[" ++ input ++ "]")
main = do
  putStrLn "Enter a list of numbers (separated by comma):"
  input <- getLine
  print $ sum (toList input)
#+END_SRC
It should be straightforward to understand the behavior of this program.
Let's analyze the types in more detail.
#+BEGIN_SRC haskell
putStrLn :: String -> IO ()
getLine :: IO String
print :: Show a => a -> IO ()
#+END_SRC
Or more interestingly, we note that each expression in the =do= block
has a type of =IO a=.
#+BEGIN_SRC haskell
main = do
  putStrLn "Enter ... " :: IO ()
  getLine               :: IO String
  print Something       :: IO ()
#+END_SRC
We should also pay attention to the effect of the =<-= symbol.
#+BEGIN_SRC haskell
do
  x <- something
#+END_SRC
If =something :: IO a= then =x :: a=.
Another important note about using =IO=: All lines in a do block must be
of one of the two forms:
#+BEGIN_SRC haskell
action1 :: IO a
-- in this case, generally a = ()
#+END_SRC
or
#+BEGIN_SRC haskell
value <- action2 -- where
-- action2 :: IO b
-- value :: b
#+END_SRC
These two kinds of line will correspond to two different ways of
sequencing actions. The meaning of this sentence should be clearer by
the end of the next section.
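Both kinds of line in one tiny program, as a sketch. =return= (used again later for =askUser=) turns a pure value into an action, which lets us show the =<-= form without reading any input.

#+BEGIN_SRC haskell
main :: IO ()
main = do
  putStrLn "Computing 6 * 7..."   -- first form: an action of type IO ()
  answer <- return (6 * 7 :: Int) -- second form: the action has type IO Int, so answer :: Int
  print (answer + 1)              -- pure code used as the argument of an action, prints 43
#+END_SRC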
-----
Now let's see how this program behaves. For example, what happens if the
user enters something strange? Let's try:
#+BEGIN_EXAMPLE
% runghc 02_progressive_io_example.lhs
Enter a list of numbers (separated by comma):
foo
Prelude.read: no parse
#+END_EXAMPLE
Argh! An evil error message and a crash! Our first improvement will
simply be to answer with a more friendly message.
In order to do this, we must detect that something went wrong. Here is
one way to do this: use the type =Maybe=. This is a very common type in
Haskell.
#+BEGIN_SRC haskell
import Data.Maybe
#+END_SRC
What is this thing? =Maybe= is a type which takes one parameter. Its
definition is:
#+BEGIN_SRC haskell
data Maybe a = Nothing | Just a
#+END_SRC
This is a nice way to tell there was an error while trying to
create/compute a value. The =maybeRead= function is a great example of
this. This is a function similar to the function =read=[fn:4], but if
something goes wrong the returned value is =Nothing=. If the value is
right, it returns =Just <the value>=. Don't try to understand too much
of this function. I use a lower level function than =read=: =reads=.
#+BEGIN_SRC haskell
maybeRead :: Read a => String -> Maybe a
maybeRead s = case reads s of
                  [(x,"")]    -> Just x
                  _           -> Nothing
#+END_SRC
Now to be a bit more readable, we define a function which goes like
this: If the string has the wrong format, it will return =Nothing=.
Otherwise, for example for "1,2,3", it will return =Just [1,2,3]=.
#+BEGIN_SRC haskell
getListFromString :: String -> Maybe [Integer]
getListFromString str = maybeRead $ "[" ++ str ++ "]"
#+END_SRC
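Recent versions of the =base= library ship a very similar function, =readMaybe=, in the module =Text.Read=. The sketch below uses it in place of our hand-written =maybeRead=; one small difference is that =readMaybe= also tolerates surrounding whitespace.

#+BEGIN_SRC haskell
import Text.Read (readMaybe)

getListFromString :: String -> Maybe [Integer]
getListFromString str = readMaybe ("[" ++ str ++ "]")

main :: IO ()
main = do
  print (getListFromString "1,2,3") -- prints Just [1,2,3]
  print (getListFromString "foo")   -- prints Nothing
#+END_SRC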
We simply have to test the value in our main function.
#+BEGIN_SRC haskell
main :: IO ()
main = do
  putStrLn "Enter a list of numbers (separated by comma):"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> print (sum l)
          Nothing -> error "Bad format. Good Bye."
#+END_SRC
In case of error, we display a nice error message.
Note that the type of each expression in the main's =do= block remains
of the form =IO a=. The only strange construction is =error=. I'll just
say here that =error msg= takes the needed type (here =IO ()=).
One very important thing to note is the type of all the functions
defined so far. There is only one function which contains =IO= in its
type: =main=. This means main is impure. But main uses
=getListFromString= which is pure. So it's clear just by looking at
declared types which functions are pure and which are impure.
Why does purity matter? Among the many advantages, here are three:
- It is far easier to think about pure code than impure code.
- Purity protects you from all the hard-to-reproduce bugs that are due
to side effects.
- You can evaluate pure functions in any order or in parallel without
risk.
This is why you should generally put as much code as possible inside
pure functions.
-----
Our next iteration will be to prompt the user again and again until she
enters a valid answer.
We keep the first part:
#+BEGIN_SRC haskell
import Data.Maybe
maybeRead :: Read a => String -> Maybe a
maybeRead s = case reads s of
                  [(x,"")]    -> Just x
                  _           -> Nothing
getListFromString :: String -> Maybe [Integer]
getListFromString str = maybeRead $ "[" ++ str ++ "]"
#+END_SRC
Now we create a function which will ask the user for a list of integers
until the input is right.
#+BEGIN_SRC haskell
askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers (separated by comma):"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> return l
          Nothing -> askUser
#+END_SRC
This function is of type =IO [Integer]=. Such a type means that we
retrieved a value of type =[Integer]= through some IO actions. Some
people might explain while waving their hands:
#+BEGIN_QUOTE
«This is an =[Integer]= inside an =IO=»
#+END_QUOTE
If you want to understand the details behind all of this, you'll have to
read the next section. But really, if you just want to /use/ IO just
practice a little and remember to think about the type.
Finally our main function is much simpler:
#+BEGIN_SRC haskell
main :: IO ()
main = do
  list <- askUser
  print $ sum list
#+END_SRC
We have finished with our introduction to =IO=. This was quite fast.
Here are the main things to remember:
- in the =do= block, each expression must have the type =IO a=. You are
then limited with regard to the range of expressions available. For
example, =getLine=, =print=, =putStrLn=, etc...
- Try to externalize the pure functions as much as possible.
- the =IO a= type means: an IO /action/ which returns an element of type
=a=. =IO= represents actions; under the hood, =IO a= is the type of a
function. Read the next section if you are curious.
If you practice a bit, you should be able to /use/ =IO=.
#+BEGIN_QUOTE
/Exercises/:
- Make a program that sums all of its arguments. Hint: use the
function =getArgs=.
#+END_QUOTE
** IO trick explained
:PROPERTIES:
:CUSTOM_ID: io-trick-explained
:END:
#+CAPTION: Magritte, ceci n'est pas une pipe
[[./magritte_pipe.jpg]]
#+BEGIN_QUOTE
Here is a tl;dr for this section.
To separate pure and impure parts, =main= is defined as a function
which modifies the state of the world.
#+BEGIN_EXAMPLE
main :: World -> World
#+END_EXAMPLE
A function can have side effects only if it has this
type. But look at a typical main function:
#+BEGIN_EXAMPLE
main w0 =
    let (v1,w1) = action1 w0 in
    let (v2,w2) = action2 v1 w1 in
    let (v3,w3) = action3 v2 w2 in
    action4 v3 w3
#+END_EXAMPLE
We have a lot of temporary elements (here =w1=, =w2= and =w3=) which
must be passed on to the next action.
We create a function =bind= or =(>>=)=. With =bind= we don't need
temporary names anymore.
#+BEGIN_EXAMPLE
main =
  action1 >>= action2 >>= action3 >>= action4
#+END_EXAMPLE
Bonus: Haskell has syntactical sugar for us:
#+BEGIN_EXAMPLE
main = do
  v1 <- action1
  v2 <- action2 v1
  v3 <- action3 v2
  action4 v3
#+END_EXAMPLE
#+END_QUOTE
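The last two forms can be compared directly with real =IO= actions. In this sketch, =viaDo= and =viaBind= are hypothetical names for the same program written with and without the syntactic sugar:

#+BEGIN_SRC haskell
viaDo :: IO ()
viaDo = do
  x <- return (20 :: Int)
  y <- return (x + 1)
  print (x + y)

-- The same program, written with (>>=) and lambdas.
viaBind :: IO ()
viaBind =
  return (20 :: Int) >>= \x ->
  return (x + 1)     >>= \y ->
  print (x + y)

main :: IO ()
main = viaDo >> viaBind -- prints 41 twice
#+END_SRC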
Why did we use this strange syntax, and what exactly is this =IO= type?
It looks a bit like magic.
For now let's just forget all about the pure parts of our program, and
focus on the impure parts:
#+BEGIN_SRC haskell
askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers (separated by commas):"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> return l
          Nothing -> askUser

main :: IO ()
main = do
  list <- askUser
  print $ sum list
#+END_SRC
First remark: this looks imperative. Haskell is powerful enough to make
impure code look imperative. For example, if you wish you could create a
=while= in Haskell. In fact, for dealing with =IO=, an imperative style
is generally more appropriate.
But you should have noticed that the notation is a bit unusual. Here is
why, in detail.
In an impure language, the state of the world can be seen as a huge
hidden global variable. This hidden variable is accessible by all
functions of your language. For example, you can read and write a file
in any function. Whether a file exists or not is a difference in the
possible states that the world can take.
In Haskell the current state of the world is not hidden. Rather, it is
/explicitly/ said that =main= is a function that /potentially/ changes
the state of the world. Its type is then something like:
#+BEGIN_SRC haskell
main :: World -> World
#+END_SRC
Not all functions may access this variable. Those which have access to
this variable are impure. Functions to which the world variable isn't
provided are pure[fn:5].
Haskell considers the state of the world as an input variable to =main=.
But the real type of main is closer to this one[fn:6]:
#+BEGIN_SRC haskell
main :: World -> ((),World)
#+END_SRC
The =()= type is the unit type. Nothing to see here.
Now let's rewrite our main function with this in mind:
#+BEGIN_SRC haskell
main w0 =
    let (list,w1) = askUser w0 in
    let (x,w2) = print (sum list) w1 in
    (x,w2)
#+END_SRC
First, we note that all functions which have side effects must have the
type:
#+BEGIN_SRC haskell
World -> (a,World)
#+END_SRC
where =a= is the type of the result. For example, a =getChar= function
should have the type =World -> (Char, World)=.
Another thing to note is the trick to fix the order of evaluation. In
Haskell, in order to evaluate =f a b=, you have many choices:
- first evaluate =a= then =b= then =f a b=
- first evaluate =b= then =a= then =f a b=
- evaluate =a= and =b= in parallel then =f a b=
This is true because we're working in a pure part of the language.
Now, if you look at the main function, it is clear you must eval the
first line before the second one since to evaluate the second line you
have to get a parameter given by the evaluation of the first line.
This trick works like a charm. The compiler will at each step provide a
pointer to a new real world id. Under the hood, =print= will evaluate
as:
- print something on the screen
- modify the id of the world
- evaluate as =((),new world id)=.
Now, if you look at the style of the main function, it is clearly
awkward. Let's try to do the same to the =askUser= function:
#+BEGIN_SRC haskell
askUser :: World -> ([Integer],World)
#+END_SRC
Before:
#+BEGIN_SRC haskell
askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers:"
  input <- getLine
  let maybeList = getListFromString input in
    case maybeList of
      Just l  -> return l
      Nothing -> askUser
#+END_SRC
After:
#+BEGIN_SRC haskell
askUser w0 =
    let (_,w1)     = putStrLn "Enter a list of numbers:" w0 in
    let (input,w2) = getLine w1 in
    let (l,w3)     = case getListFromString input of
                      Just l   -> (l,w2)
                      Nothing  -> askUser w2
    in
        (l,w3)
#+END_SRC
This is similar, but awkward. Look at all these temporary =w?= names.
The lesson is: a naive IO implementation in pure functional languages is
awkward!
Fortunately, there is a better way to handle this problem. We see a
pattern. Each line is of the form:
#+BEGIN_SRC haskell
let (y,w') = action x w in
#+END_SRC
This holds even when a line doesn't need the first argument =x=. The output
type is a pair, =(answer, newWorldValue)=. Each function =f= must have
a type similar to:
#+BEGIN_SRC haskell
f :: World -> (a,World)
#+END_SRC
Not only this, but we can also note that we always follow the same usage
pattern:
#+BEGIN_SRC haskell
let (y,w1) = action1 w0 in
let (z,w2) = action2 w1 in
let (t,w3) = action3 w2 in
...
#+END_SRC
Each action can take from 0 to n parameters. And in particular, each
action can take a parameter from the result of a line above.
For example, we could also have:
#+BEGIN_SRC haskell
let (_,w1) = action1 x w0 in
let (z,w2) = action2 w1 in
let (_,w3) = action3 z w2 in
...
#+END_SRC
With, of course, =actionN :: World -> (a,World)= once its own parameters, if any, have been applied.
#+BEGIN_QUOTE
IMPORTANT: there are only two important patterns to consider:
#+BEGIN_SRC haskell
let (x,w1) = action1 w0 in
let (y,w2) = action2 x w1 in
#+END_SRC
and
#+BEGIN_SRC haskell
let (_,w1) = action1 w0 in
let (y,w2) = action2 w1 in
#+END_SRC
#+END_QUOTE
#+CAPTION: Joker pencil trick
[[./jocker_pencil_trick.jpg]]
Now, we will do a magic trick. We will make the temporary world symbols
"disappear". We will =bind= the two lines. Let's define the =bind=
function. Its type is quite intimidating at first:
#+BEGIN_SRC haskell
bind :: (World -> (a,World))
        -> (a -> (World -> (b,World)))
        -> (World -> (b,World))
#+END_SRC
But remember that =(World -> (a,World))= is the type for an IO action.
Now let's rename it for clarity:
#+BEGIN_SRC haskell
type IO a = World -> (a, World)
#+END_SRC
Some examples of functions:
#+BEGIN_SRC haskell
getLine :: IO String
print :: Show a => a -> IO ()
#+END_SRC
=getLine= is an IO action which takes the world as a parameter and returns a
pair =(String, World)=. This can be summarized as: =getLine= is of
type =IO String=, which we can also see as an IO action which will return a
String "embedded inside an IO".
The function =print= is also interesting. It takes one argument which
can be shown. In fact it takes two arguments. The first is the value to
print and the other is the state of the world. It then returns a pair of
type =((), World)=. This means that it changes the state of the world,
but doesn't yield any more data.
This new =IO a= type helps us simplify the type of =bind=:
#+BEGIN_SRC haskell
bind :: IO a
        -> (a -> IO b)
        -> IO b
#+END_SRC
It says that =bind= takes an IO action and a function which builds a second
IO action from the result of the first, and returns another IO action.
Now, remember the /important/ patterns. The first was:
#+BEGIN_SRC haskell
pattern1 w0 =
  let (x,w1) = action1 w0 in
  let (y,w2) = action2 x w1 in
  (y,w2)
#+END_SRC
Look at the types:
#+BEGIN_SRC haskell
action1 :: IO a
action2 :: a -> IO b
pattern1 :: IO b
#+END_SRC
Doesn't it seem familiar?
#+BEGIN_SRC haskell
(bind action1 action2) w0 =
    let (x, w1) = action1 w0
        (y, w2) = action2 x w1
    in  (y, w2)
#+END_SRC
The idea is to hide the World argument with this function. As an
example, imagine we wanted to simulate:
#+BEGIN_SRC haskell
let (line1, w1) = getLine w0 in
let ((), w2) = print line1 w1 in
((), w2)
#+END_SRC
Now, using the =bind= function:
#+BEGIN_SRC haskell
(res, w2) = (bind getLine print) w0
#+END_SRC
As print is of type =Show a => a -> (World -> ((), World))=, we know
=res = ()= (=unit= type). If you didn't see what was magic here, let's
try with three lines this time.
#+BEGIN_SRC haskell
let (line1,w1) = getLine w0 in
let (line2,w2) = getLine w1 in
let ((),w3) = print (line1 ++ line2) w2 in
((),w3)
#+END_SRC
Which is equivalent to:
#+BEGIN_SRC haskell
(res,w3) = (bind getLine (\line1 ->
             (bind getLine (\line2 ->
               print (line1 ++ line2))))) w0
#+END_SRC
Didn't you notice something? Yes, no temporary World variables are used
anywhere! This is /MA/. /GIC/.
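If you want to play with this idea, here is a toy model you can actually load in GHCi. Everything in it is an assumption made for illustration: the =World= is just a record of pending input lines and printed lines, and the primed names (=IO'=, =getLine'=, =putStrLn'=) are mine, chosen so they don't clash with the real ones. The real =IO= is of course not implemented this way.
#+BEGIN_SRC haskell
-- A toy, pure World: the lines still to be read and the lines printed so far.
data World = World { inputs :: [String], outputs :: [String] }
  deriving (Eq, Show)

type IO' a = World -> (a, World)

-- Consume one pending input line (an empty string if none is left).
getLine' :: IO' String
getLine' (World (l:ls) out) = (l, World ls out)
getLine' (World []     out) = ("", World [] out)

-- Append one line to the output of the world.
putStrLn' :: String -> IO' ()
putStrLn' s (World ins out) = ((), World ins (out ++ [s]))

-- The bind function of the text, unchanged.
bind :: IO' a -> (a -> IO' b) -> IO' b
bind action1 action2 w0 =
  let (x, w1) = action1 w0
      (y, w2) = action2 x w1
  in  (y, w2)

-- Read two lines and print their concatenation; no w0, w1, w2 in sight.
concatTwoLines :: IO' ()
concatTwoLines =
  bind getLine' (\line1 ->
  bind getLine' (\line2 ->
  putStrLn' (line1 ++ line2)))
#+END_SRC
Evaluating =concatTwoLines (World ["foo","bar"] [])= threads the world through both reads and the write, and the resulting world should hold the single output line ="foobar"=.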
We can use a better notation. Let's use =(>>=)= instead of =bind=.
=(>>=)= is an infix function like =(+)=; reminder =3 + 4 ⇔ (+) 3 4=
#+BEGIN_SRC haskell
(res,w3) = (getLine >>=
           (\line1 -> getLine >>=
           (\line2 -> print (line1 ++ line2)))) w0
#+END_SRC
Ho Ho Ho! Merry Christmas Everyone! Haskell has made syntactical sugar for us:
#+BEGIN_SRC haskell
do
  x <- action1
  y <- action2
  z <- action3
  ...
#+END_SRC
Is replaced by:
#+BEGIN_SRC haskell
action1 >>= (\x ->
  action2 >>= (\y ->
    action3 >>= (\z ->
      ...
    )))
#+END_SRC
Note that you can use =x= in =action2= and =x= and =y= in =action3=.
But what about the lines not using the =<-=? Easy, another function
=blindBind=:
#+BEGIN_SRC haskell
blindBind :: IO a -> IO b -> IO b
blindBind action1 action2 w0 =
    bind action1 (\_ -> action2) w0
#+END_SRC
I didn't simplify this definition for the purposes of clarity. Of
course, we can use a better notation: we'll use the =(>>)= operator.
And
#+BEGIN_SRC haskell
do
  action1
  action2
  action3
#+END_SRC
Is transformed into
#+BEGIN_SRC haskell
action1 >>
action2 >>
action3
#+END_SRC
Also, another function is quite useful.
#+BEGIN_SRC haskell
putInIO :: a -> IO a
putInIO x = \w -> (x,w)
#+END_SRC
This is the general way to put pure values inside the "IO context". The
general name for =putInIO= is =return=. This is quite a bad name when
you learn Haskell. =return= is very different from what you might be
used to.
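A tiny sketch of what "very different" means: =return= only wraps a value, it does not leave the enclosing block. The name =notAnEarlyExit= is mine.
#+BEGIN_SRC haskell
notAnEarlyExit :: IO Int
notAnEarlyExit = do
  _ <- return 1   -- wraps 1; the result is discarded and execution continues
  return 2        -- the last action gives the value of the whole block
#+END_SRC
Running it yields =2=, where a C or Java programmer would expect the function to stop at the first =return= with =1=.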
-----
To finish, let's translate our example:
#+BEGIN_SRC haskell
askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers (separated by commas):"
  input <- getLine
  let maybeList = getListFromString input in
    case maybeList of
      Just l  -> return l
      Nothing -> askUser

main :: IO ()
main = do
  list <- askUser
  print $ sum list
#+END_SRC
Is translated into:
#+BEGIN_SRC haskell
import Data.Maybe

maybeRead :: Read a => String -> Maybe a
maybeRead s = case reads s of
                  [(x,"")]    -> Just x
                  _           -> Nothing

getListFromString :: String -> Maybe [Integer]
getListFromString str = maybeRead $ "[" ++ str ++ "]"

askUser :: IO [Integer]
askUser =
    putStrLn "Enter a list of numbers (sep. by commas):" >>
    getLine >>= \input ->
    let maybeList = getListFromString input in
      case maybeList of
        Just l -> return l
        Nothing -> askUser

main :: IO ()
main = askUser >>=
  \list -> print $ sum list
#+END_SRC
You can compile this code to verify that it works.
Imagine what it would look like without the =(>>)= and =(>>=)=.
** Monads
:PROPERTIES:
:CUSTOM_ID: monads
:END:
#+begin_comment
#+CAPTION: Dali, reve. It represents a weapon out of the
mouth of a tiger, itself out of the mouth of another tiger, itself out
of the mouth of a fish itself out of a grenade. I could have choosen a
picture of the Human centipede as it is a very good representation of
what a monad really is. But just to think about it, I find this
disgusting and that wasn't the purpose of this document.
[[./dali_reve.jpg]]
#+end_comment
Now the secret can be revealed: =IO= is a /monad/. Being a monad means
you have access to some syntactical sugar with the =do= notation. But
mainly, you have access to a coding pattern which will ease the flow of
your code.
#+BEGIN_QUOTE
*Important remarks*:
- Monads are not necessarily about effects! There are a lot of /pure/
monads.
- Monads are more about sequencing
#+END_QUOTE
In Haskell, =Monad= is a type class. To be an instance of this type
class, you must provide the functions =(>>=)= and =return=. The function
=(>>)= is derived from =(>>=)=. Here is how the type class =Monad= is
declared (basically):
#+BEGIN_SRC haskell
class Monad m where
  (>>=) :: m a -> (a -> m b) -> m b
  return :: a -> m a

  (>>) :: m a -> m b -> m b
  f >> g = f >>= \_ -> g

  -- You should generally safely ignore this function
  -- which I believe exists for historical reasons
  -- (since GHC 8.8 it lives in the separate MonadFail class)
  fail :: String -> m a
  fail = error
#+END_SRC
#+BEGIN_QUOTE
Remarks:
- the keyword =class= is not your friend. A Haskell class is /not/ a
class of the kind you will find in object-oriented programming. A
Haskell class has a lot of similarities with Java interfaces. A
better word would have been =typeclass=, since that means a set of
types. For a type to belong to a class, all functions of the class
must be provided for this type.
- In this particular example of type class, the type =m= must be a
type that takes an argument. For example =IO a=, but also =Maybe a=,
=[a]=, etc...
- To be a useful monad, your functions must obey some rules. If your
construction does not obey these rules strange things might happen:
#+END_QUOTE
#+BEGIN_SRC haskell
return a >>= k == k a
m >>= return == m
m >>= (\x -> k x >>= h) == (m >>= k) >>= h
#+END_SRC
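You can spot-check these three rules on a monad you already know. The sketch below does it for =Maybe= on a handful of integers. The functions =half= and =decrement= are arbitrary choices of mine standing in for =k= and =h=, and a few test values are of course not a proof.
#+BEGIN_SRC haskell
-- Two arbitrary functions of type a -> m b, only here for illustration.
half :: Int -> Maybe Int
half n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

decrement :: Int -> Maybe Int
decrement n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

leftIdentity, rightIdentity, associativity :: Int -> Bool
-- return a >>= k  ==  k a
leftIdentity  a = (return a >>= half) == half a
-- m >>= return  ==  m
rightIdentity a = (Just a >>= return) == Just a
-- m >>= (\x -> k x >>= h)  ==  (m >>= k) >>= h
associativity a =
  (Just a >>= (\x -> half x >>= decrement))
    == ((Just a >>= half) >>= decrement)

-- Check the three rules on every integer from -10 to 10.
lawsHold :: Bool
lawsHold =
  all (\a -> leftIdentity a && rightIdentity a && associativity a) [-10 .. 10]
#+END_SRC
=lawsHold= should evaluate to =True=. If you ever write your own instance, this kind of quick check is a cheap way to catch a broken =(>>=)=.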
*** Maybe is a monad
:PROPERTIES:
:CUSTOM_ID: maybe-is-a-monad
:END:
There are a lot of different types that are instances of =Monad=. One of
the easiest to describe is =Maybe=. If you have a sequence of =Maybe=
values, you can use monads to manipulate them. It is particularly useful
to remove very deep =if..then..else..= constructions.
Imagine a complex bank operation. You are eligible to gain about 700€
only if you can afford to follow a list of operations without your
balance dipping below zero.
#+BEGIN_SRC haskell
deposit  value account = account + value
withdraw value account = account - value

eligible :: (Num a,Ord a) => a -> Bool
eligible account =
  let account1 = deposit 100 account in
    if (account1 < 0)
    then False
    else
      let account2 = withdraw 200 account1 in
      if (account2 < 0)
      then False
      else
        let account3 = deposit 100 account2 in
        if (account3 < 0)
        then False
        else
          let account4 = withdraw 300 account3 in
          if (account4 < 0)
          then False
          else
            let account5 = deposit 1000 account4 in
            if (account5 < 0)
            then False
            else
              True

main = do
  print $ eligible 300 -- True
  print $ eligible 299 -- False
#+END_SRC
-----
Now, let's make it better using Maybe and the fact that it is a Monad
#+BEGIN_SRC haskell
deposit :: (Num a) => a -> a -> Maybe a
deposit value account = Just (account + value)

withdraw :: (Num a,Ord a) => a -> a -> Maybe a
withdraw value account = if (account < value)
                         then Nothing
                         else Just (account - value)

eligible :: (Num a, Ord a) => a -> Maybe Bool
eligible account = do
  account1 <- deposit 100 account
  account2 <- withdraw 200 account1
  account3 <- deposit 100 account2
  account4 <- withdraw 300 account3
  account5 <- deposit 1000 account4
  Just True

main = do
  print $ eligible 300 -- Just True
  print $ eligible 299 -- Nothing
#+END_SRC
-----
Not bad, but we can make it even better:
#+BEGIN_SRC haskell
deposit :: (Num a) => a -> a -> Maybe a
deposit value account = Just (account + value)

withdraw :: (Num a,Ord a) => a -> a -> Maybe a
withdraw value account = if (account < value)
                         then Nothing
                         else Just (account - value)

eligible :: (Num a, Ord a) => a -> Maybe Bool
eligible account =
  deposit 100 account >>=
  withdraw 200 >>=
  deposit 100  >>=
  withdraw 300 >>=
  deposit 1000 >>
  return True

main = do
  print $ eligible 300 -- Just True
  print $ eligible 299 -- Nothing
#+END_SRC
We have proven that Monads are a good way to make our code more elegant.
Note this idea of code organization, in particular for =Maybe= can be
used in most imperative languages. In fact, this is the kind of
construction we make naturally.
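Since the bank example really is "a list of operations", one possible further step is to put the operations in an actual list and let =foldM= from =Control.Monad= do the chaining. This is only a sketch; the helper =applyAll= is a name I made up.
#+BEGIN_SRC haskell
import Control.Monad (foldM)

deposit :: (Num a) => a -> a -> Maybe a
deposit value account = Just (account + value)

withdraw :: (Num a, Ord a) => a -> a -> Maybe a
withdraw value account
  | account < value = Nothing
  | otherwise       = Just (account - value)

-- | Apply each operation in turn, stopping at the first Nothing.
applyAll :: [a -> Maybe a] -> a -> Maybe a
applyAll operations account =
  foldM (\current operation -> operation current) account operations

eligible :: (Num a, Ord a) => a -> Maybe Bool
eligible account =
  applyAll operations account >> return True
  where
    operations =
      [deposit 100, withdraw 200, deposit 100, withdraw 300, deposit 1000]
#+END_SRC
As before, =eligible 300= should be =Just True= and =eligible 299= should be =Nothing=. The gain is that the sequence of operations is now plain data that could come from a file or from the user.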
#+BEGIN_QUOTE
An important remark:
The first element in the sequence being evaluated to =Nothing= will
stop the complete evaluation. This means you don't execute all lines.
You get this for free, thanks to the way =(>>=)= is defined for =Maybe= and to laziness.
#+END_QUOTE
You could also replay these examples with the definition of =(>>=)= for
=Maybe= in mind:
#+BEGIN_SRC haskell
instance Monad Maybe where
    (>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
    Nothing  >>= _  = Nothing
    (Just x) >>= f  = f x

    return x = Just x
#+END_SRC
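A small experiment makes the first equation of this definition visible. In the sketch below (the names =shortCircuit= and =continues= are mine), the function on the right of =(>>=)= would crash if it were ever called.
#+BEGIN_SRC haskell
-- The continuation is never called when the left-hand side is Nothing,
-- so even a function that would crash is harmless here.
shortCircuit :: Maybe Int
shortCircuit = Nothing >>= \_ -> error "never evaluated"

-- With a Just, the continuation runs on the wrapped value.
continues :: Maybe Int
continues = Just 20 >>= \x -> Just (x + 1)
#+END_SRC
=shortCircuit= should simply be =Nothing=, with no error raised, while =continues= should be =Just 21=.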
The =Maybe= monad proved to be useful while being a very simple example.
We saw the utility of the =IO= monad. But now for a cooler example,
lists.
*** The list monad
:PROPERTIES:
:CUSTOM_ID: the-list-monad
:END:
#+CAPTION: Golconde de Magritte
[[./golconde.jpg]]
The list monad helps us to simulate non-deterministic computations. Here
we go:
#+BEGIN_SRC haskell
import Control.Monad (guard)

allCases = [1..10]

resolve :: [(Int,Int,Int)]
resolve = do
              x <- allCases
              y <- allCases
              z <- allCases
              guard $ 4*x + 2*y < z
              return (x,y,z)

main = do
  print resolve
#+END_SRC
MA. GIC. :
#+BEGIN_EXAMPLE
[(1,1,7),(1,1,8),(1,1,9),(1,1,10),(1,2,9),(1,2,10)]
#+END_EXAMPLE
For the list monad, there is also this syntactic sugar:
#+BEGIN_SRC haskell
  print $ [ (x,y,z) | x <- allCases,
                      y <- allCases,
                      z <- allCases,
                      4*x + 2*y < z ]
#+END_SRC
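To see that nothing magic happens, here is the same search written with explicit binds. For lists, =xs >>= f= behaves like =concatMap f xs=, and I replaced =guard= by a plain =if= returning either no result or exactly one; =resolveDesugared= is my name for it.
#+BEGIN_SRC haskell
allCases :: [Int]
allCases = [1..10]

-- Each bind tries every element of the list on its left;
-- the final if keeps a triple (one-element list) or drops it (empty list).
resolveDesugared :: [(Int,Int,Int)]
resolveDesugared =
  allCases >>= \x ->
  allCases >>= \y ->
  allCases >>= \z ->
  if 4*x + 2*y < z then [(x,y,z)] else []
#+END_SRC
It should produce exactly the list printed above.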
I won't list all the monads, since there are many of them. Using monads
simplifies the manipulation of several notions in pure languages. In
particular, monads are very useful for:
- IO,
- non-deterministic computation,
- generating pseudo random numbers,
- keeping configuration state,
- writing state,
- ...
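To give a taste of the pseudo random number case, here is a hand-rolled sketch of a state monad. Everything in it (=State=, =runState=, =nextRandom=, the generator constants) is written for this illustration; libraries such as =transformers= and =mtl= ship a real, more complete =State=.
#+BEGIN_SRC haskell
-- A computation that reads a state s, returns a value a and a new state.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in  (h a, s'')

instance Monad (State s) where
  State g >>= f = State $ \s ->
    let (a, s') = g s
    in  runState (f a) s'

-- | One step of a small (and poor) linear congruential generator.
-- The new seed is both the result and the next state.
nextRandom :: State Int Int
nextRandom = State $ \seed ->
  let seed' = (seed * 3123 + 31) `mod` 4331
  in  (seed', seed')

-- The seed is threaded from line to line without ever being named.
threeRandoms :: State Int (Int, Int, Int)
threeRandoms = do
  a <- nextRandom
  b <- nextRandom
  c <- nextRandom
  return (a, b, c)
#+END_SRC
Try =runState threeRandoms 1= in GHCi. Compare the =(>>=)= of this instance with the =bind= we wrote for =IO=: it is the same trick, with the seed playing the role of the world.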
If you have followed me until here, then you've done it! You know
monads[fn:7]!
* Difficulty: Hell
:PROPERTIES:
:CUSTOM_ID: difficulty--hell
:END:
I did warn you that the learning curve is steep.
If you have come this far, you can really congratulate yourself.
This is already what I would personally call a tremendous achievement.
But now, be prepared, it will be a *lot* harder.
So brace yourself, be ready for the big jump.
I am pretty sure this part is so hard that you will have a hard time
understanding it without looking at other resources.
This is intended.
Do not hesitate to read previous sections again, to read external
resources, and to ask questions on any Haskell community platform.
Sorry to make it this way, but I really don't think I can make a dense
Haskell introduction without making it ultra hard.
Do not feel discouraged though: most Haskellers I know had to dig into
Haskell at least two or three times before it really clicked for them.
** Command line application
:PROPERTIES:
:CUSTOM_ID: command-line-application
:END:
** Web Application
:PROPERTIES:
:CUSTOM_ID: web-application
:END:
* Appendix
:PROPERTIES:
:CUSTOM_ID: appendix
:END:
This section is not so much about learning Haskell.
It is just here to discuss some details further.
** More on Infinite Tree
:PROPERTIES:
:CUSTOM_ID: more-on-infinite-tree
:END:
In the section Infinite Structures we saw some
simple constructions. Unfortunately we removed two properties from our
tree:
1. no duplicate node value
2. well ordered tree
In this section we will try to keep the first property. Concerning the
second one, we must relax it but we'll discuss how to keep it as much as
possible.
This code is mostly the same as the one in the tree section.
#+BEGIN_SRC haskell
import Data.List

data BinTree a = Empty
               | Node a (BinTree a) (BinTree a)
               deriving (Eq,Ord)

-- declare BinTree a to be an instance of Show
instance (Show a) => Show (BinTree a) where
  -- will start by a '<' before the root
  -- and put a ':' at the beginning of each following line
  show t = "< " ++ replace '\n' "\n: " (treeshow "" t)
    where
      treeshow pref Empty = ""
      treeshow pref (Node x Empty Empty) =
                    (pshow pref x)

      treeshow pref (Node x left Empty) =
                    (pshow pref x) ++ "\n" ++
                    (showSon pref "`--" "   " left)

      treeshow pref (Node x Empty right) =
                    (pshow pref x) ++ "\n" ++
                    (showSon pref "`--" "   " right)

      treeshow pref (Node x left right) =
                    (pshow pref x) ++ "\n" ++
                    (showSon pref "|--" "|  " left) ++ "\n" ++
                    (showSon pref "`--" "   " right)

      -- show a tree using some prefixes to make it nice
      showSon pref before next t =
                    pref ++ before ++ treeshow (pref ++ next) t

      -- pshow replaces "\n" by "\n"++pref
      pshow pref x = replace '\n' ("\n"++pref) (show x)

      -- replace one char by another string
      replace c new string =
        concatMap (change c new) string
        where
            change c new x
                | x == c = new
                | otherwise = x:[] -- "x"
#+END_SRC
Our first step is to create some pseudo-random number list:
#+BEGIN_SRC haskell
shuffle = map (\x -> (x*3123) `mod` 4331) [1..]
#+END_SRC
Just as a reminder, here is the definition of =treeFromList=
#+BEGIN_SRC haskell
treeFromList :: (Ord a) => [a] -> BinTree a
treeFromList []    = Empty
treeFromList (x:xs) = Node x (treeFromList (filter (<x) xs))
                             (treeFromList (filter (>x) xs))
#+END_SRC
and =treeTakeDepth=:
#+BEGIN_SRC haskell
treeTakeDepth _ Empty = Empty
treeTakeDepth 0 _     = Empty
treeTakeDepth n (Node x left right) =
  let nl = treeTakeDepth (n-1) left
      nr = treeTakeDepth (n-1) right
  in
      Node x nl nr
#+END_SRC
See the result of:
#+BEGIN_SRC haskell
main = do
  putStrLn "take 10 shuffle"
  print $ take 10 shuffle
  putStrLn "\ntreeTakeDepth 4 (treeFromList shuffle)"
  print $ treeTakeDepth 4 (treeFromList shuffle)
#+END_SRC
#+BEGIN_EXAMPLE
% runghc 02_Hard_Part/41_Infinites_Structures.lhs
take 10 shuffle
[3123,1915,707,3830,2622,1414,206,3329,2121,913]
treeTakeDepth 4 (treeFromList shuffle)
< 3123
: |--1915
: |  |--707
: |  |  |--206
: |  |  `--1414
: |  `--2622
: |     |--2121
: |     `--2828
: `--3830
:    |--3329
:    |  |--3240
:    |  `--3535
:    `--4036
:       |--3947
:       `--4242
#+END_EXAMPLE
Yay! It ends! Beware though, it will only work if you always have
something to put into a branch.
For example
#+BEGIN_SRC haskell
treeTakeDepth 4 (treeFromList [1..])
#+END_SRC
will loop forever, simply because it will try to access the head of
=filter (<1) [2..]=. But =filter= is not smart enough to understand
that the result is the empty list.
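The difference between a lazy search that ends and one that cannot is easy to try out. In this sketch (names are mine), the first two definitions terminate, and the third is left commented out because evaluating it would never return.
#+BEGIN_SRC haskell
-- Terminates: laziness stops the search as soon as the first match is found.
firstAbove :: Integer
firstAbove = head (filter (> 1000) [1..])

-- Terminates: takeWhile gives up at the first element failing the test.
smallOnes :: [Integer]
smallOnes = takeWhile (< 5) [1..]

-- Would never terminate: filter has to inspect every element of [2..]
-- before it could conclude that none of them is smaller than 1.
-- noneFound :: Bool
-- noneFound = null (filter (< 1) [2..])
#+END_SRC
=firstAbove= should be =1001= and =smallOnes= should be =[1,2,3,4]=. Only a human (or a smarter function) knows that =[2..]= is increasing.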
Nonetheless, it is still a very cool example of what non strict programs
have to offer.
Left as an exercise to the reader:
- Prove the existence of a number =n= so that
=treeTakeDepth n (treeFromList shuffle)= will enter an infinite loop.
- Find an upper bound for =n=.
- Prove there is no =shuffle= list so that, for any depth, the program
ends.
-----
This code is mostly the same as the preceding one.
#+BEGIN_SRC haskell
import Debug.Trace (trace)
import Data.List
data BinTree a = Empty
               | Node a (BinTree a) (BinTree a)
               deriving (Eq,Ord)
#+END_SRC
#+BEGIN_SRC haskell
-- declare BinTree a to be an instance of Show
instance (Show a) => Show (BinTree a) where
  -- will start by a '<' before the root
  -- and put a ':' at the beginning of each following line
  show t = "< " ++ replace '\n' "\n: " (treeshow "" t)
    where
      treeshow pref Empty = ""
      treeshow pref (Node x Empty Empty) =
                    (pshow pref x)

      treeshow pref (Node x left Empty) =
                    (pshow pref x) ++ "\n" ++
                    (showSon pref "`--" "   " left)

      treeshow pref (Node x Empty right) =
                    (pshow pref x) ++ "\n" ++
                    (showSon pref "`--" "   " right)

      treeshow pref (Node x left right) =
                    (pshow pref x) ++ "\n" ++
                    (showSon pref "|--" "|  " left) ++ "\n" ++
                    (showSon pref "`--" "   " right)

      -- show a tree using some prefixes to make it nice
      showSon pref before next t =
                    pref ++ before ++ treeshow (pref ++ next) t

      -- pshow replaces "\n" by "\n"++pref
      pshow pref x = replace '\n' ("\n"++pref) (" " ++ show x)

      -- replace one char by another string
      replace c new string =
        concatMap (change c new) string
        where
            change c new x
                | x == c = new
                | otherwise = x:[] -- "x"

treeTakeDepth _ Empty = Empty
treeTakeDepth 0 _     = Empty
treeTakeDepth n (Node x left right) =
  let nl = treeTakeDepth (n-1) left
      nr = treeTakeDepth (n-1) right
  in
      Node x nl nr
#+END_SRC
To resolve these problems, we will slightly modify our
=treeFromList= and =shuffle= functions.
The first problem is the lack of infinitely many different numbers in our
implementation of =shuffle=. We generated only =4331= different numbers.
To resolve this, we make a slightly better =shuffle= function.
#+BEGIN_SRC haskell
shuffle = map rand [1..]
          where
              rand x = ((p x) `mod` (x+c)) - ((x+c) `div` 2)
              p x = m*x^2 + n*x + o -- some polynomial
              m = 3123
              n = 31
              o = 7641
              c = 1237
#+END_SRC
This shuffle function has the property (hopefully) of having neither an upper
nor a lower bound. But having a better shuffle list isn't enough to avoid
entering an infinite loop.
Generally, we cannot decide whether =filter (<x) xs= is empty. To
work around this problem, I'll allow some errors in the creation of our
binary tree. This new version of the code can create a binary tree in which
some nodes violate the following property:
#+BEGIN_QUOTE
Every element of the left (resp. right) branch must be strictly
smaller (resp. greater) than the label of the root.
#+END_QUOTE
Note that it will remain /mostly/ an ordered binary tree. Furthermore, by
construction, each node value is unique in the tree.
Here is our new version of =treeFromList=. We simply have replaced
=filter= by =safefilter=.
#+BEGIN_SRC haskell
treeFromList :: (Ord a, Show a) => [a] -> BinTree a
treeFromList []    = Empty
treeFromList (x:xs) = Node x left right
          where
              left = treeFromList $ safefilter (<x) xs
              right = treeFromList $ safefilter (>x) xs
#+END_SRC
This new function =safefilter= is almost equivalent to =filter= but
doesn't enter an infinite loop if the result is a finite list. If it cannot
find an element for which the test is true after 10000 consecutive
steps, then it considers the search to be over.
#+BEGIN_SRC haskell
safefilter :: (a -> Bool) -> [a] -> [a]
safefilter f l = safefilter' f l nbTry
  where
      nbTry = 10000
      safefilter' _ _ 0 = []
      safefilter' _ [] _ = []
      safefilter' f (x:xs) n =
                  if f x
                     then x : safefilter' f xs nbTry
                     else safefilter' f xs (n-1)
#+END_SRC
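Before plugging it into the tree, it may help to see =safefilter= alone on a few infinite lists. The block repeats the definition so that it can be loaded on its own; the three example names are mine.
#+BEGIN_SRC haskell
safefilter :: (a -> Bool) -> [a] -> [a]
safefilter f l = safefilter' f l nbTry
  where
    nbTry = 10000 :: Int
    safefilter' _ _ 0 = []
    safefilter' _ [] _ = []
    safefilter' g (x:xs) n =
      if g x
        then x : safefilter' g xs nbTry   -- a hit resets the counter
        else safefilter' g xs (n-1)       -- a miss uses up one try

-- filter would loop forever here; safefilter gives up after 10000 misses.
noSmallValues :: [Integer]
noSmallValues = safefilter (< 1) [2..]

-- When matches keep coming, safefilter behaves like filter.
firstEvens :: [Integer]
firstEvens = take 5 (safefilter even [1..])

-- The price to pay: matches hidden behind 10000 misses are lost.
lateMatch :: [Integer]
lateMatch = safefilter (> 20000) [1..]
#+END_SRC
=noSmallValues= and =lateMatch= should both be empty, and =firstEvens= should be =[2,4,6,8,10]=. The last case is exactly the "error" we accepted: =filter= would have found =20001=, but =safefilter= has already given up.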
Now run the program and be happy:
#+BEGIN_SRC haskell
main = do
  putStrLn "take 10 shuffle"
  print $ take 10 shuffle
  putStrLn "\ntreeTakeDepth 8 (treeFromList shuffle)"
  print $ treeTakeDepth 8 (treeFromList $ shuffle)
#+END_SRC
You should notice that the time to print each value differs. This is
because Haskell computes each value when it needs it. And in this case,
this is when asked to print it on the screen.
Impressively enough, try changing the depth from =8= to =100=. It will
work without killing your RAM! The flow and the memory management are
done naturally by Haskell.
Left as an exercise to the reader:
- Even with large constant values for =deep= and =nbTry=, it seems to
work nicely. But in the worst case, it can be exponential. Create a
worst case list to give as parameter to =treeFromList=.\\
/hint/: think about (=[0,-1,-1,....,-1,1,-1,...,-1,1,...]=).
- I first tried to implement =safefilter= as follows:
#+BEGIN_SRC haskell
safefilter' f l = if filter f (take 10000 l) == []
                  then []
                  else filter f l
#+END_SRC
Explain why it doesn't work and can enter an infinite loop.
- Suppose that =shuffle= is a truly random list with growing bounds. If you
study this structure a bit, you'll discover that, with probability 1,
it is finite. Using the following code (suppose we could
use =safefilter'= directly, as if it were not in the =where= of =safefilter=),
find a definition of =f= such that, with probability =1=,
=treeFromList' shuffle= is infinite. And prove it. Disclaimer: this is
only a conjecture.
#+BEGIN_SRC haskell
treeFromList' []  n = Empty
treeFromList' (x:xs) n = Node x left right
    where
        left = treeFromList' (safefilter' (<x) xs (f n))
        right = treeFromList' (safefilter' (>x) xs (f n))
        f = ???
#+END_SRC
** Thanks
:PROPERTIES:
:CUSTOM_ID: thanks
:END:
Thanks to [[http://reddit.com/r/haskell][=/r/haskell=]] and
[[http://reddit.com/r/programming][=/r/programming=]]. Your comments were
more than welcome.
Particularly, I want to thank [[https://github.com/Emm][Emm]] a thousand
times for the time he spent on correcting my English. Thank you man.
[fn:1] Even if most recent languages try to hide them, they are present.
[fn:2] I know I'm cheating. But I will talk about non-strictness later.
[fn:3] For the brave, a more complete explanation of pattern matching
can be found
[[http://www.cs.auckland.ac.nz/references/haskell/haskell-intro-html/patterns.html][here]].
[fn:4] Which is itself very similar to the JavaScript =eval= function,
when it is applied to a string containing JSON.
[fn:5] There are some /unsafe/ exceptions to this rule. But you
shouldn't see such use in a real application except maybe for
debugging purposes.
[fn:6] For the curious ones, the real type is
=newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))=.
All the =#= have to do with optimisation and I swapped the fields
in my example. But this is the basic idea.
[fn:7] Well, you'll certainly need to practice a bit to get used to them
and to understand when you can use them and create your own. But
you already made a big step in this direction.