I have been having fun with the Julia language lately. It is a new programming language for scientific computing. You may wonder, why do we need a new language at all? Don’t we already have MATLAB, Mathematica, Python with NumPy and SciPy, etc., are’t those enough?
Well, it is true that those are awesome softwares for scientific computing, they have all the necessary functionalities and powerful libraries, and they are easy to learn. But there is a common (fatal) issue that prevents them from creating industrial quality codes – speed. We want faster speed! The Julia language is developed specifically for this.
1. The Julia Language
What is the Julia language? According to its official page
Julia a high-level, high-performance dynamic programming language for numerical computing.
There are two keywords here: high-performance and dynamic programming. These two words don’t usually come together! If you have written some program for numerical computations, you probably have noticed:
- Dynamic programming languages like Python and MATLAB are very handy and human-friendly. Each such language has a REPL that allows you to see the effects of your code immediately, so the workflow is interactive. You can easily explore ideas, quickly prototype a new software. The syntax is simple which allows for fast development.
- However, when you finally decide to write some software for practical use, all these dynamic programming languages suffer from slow running speeds. Then you have to switch to a high-performance programming language like C/C++ and Fortran. These languages are static, use a classic edit-compile-run-debug (ECRD) cycle in contrast to REPL. It takes a lot of time to write the code and debug.
This is called the Two-language problem: efficiency for human or efiiciency for the machine, pick one. There have been efforts trying to resolve this problem, for example, by linking the C libraries in a Python code, but such usage of multi-language programming quickly makes the code really complex and hard to maintain.
1.1 Solving the two-language problem
This is exactly where Julia kicks in.
Julia is a dynamic programming language as MATLAB or Python, making it easy to use. But Julia also runs very fast, comparable to the performance of C codes. How could a dynamic language run fast? It does so by using the LLVM-based just-in-time (JIT) compiler. In fact Julia is designed for JIT from the very beginning!
1.2 Want to try Julia?
If you would like to try out the Julia language, I recommend this quick hands-on approach: firstly, you learn the basics by going through some short tutorial found here; after having some familiarity with the syntax, you start some programming with your field of study with the help of some existing packages. For example, I work on numerical analysis and spectral methods, so I started with the ApproxFun package under the JuliaApproximation project; or if you have done some statistics programming with languages like R, there is a collection of statistics packages under the Julia Statistics project; or if you are an algebraist, you may have done some symbolic programming with languages like REDUCE, there is also a Reduce package for you. Check the list of available packages, no matter what you work on, you are likely to find something that suits your need.
Next, I am going to talk about an experience I had when coding Julia. It will be an example with very technical details, so you may stop reading at this point if all you want is just some general information about Julia.
2. Testing my code: why can’t I redefine my test functions?
(Note: the issue mentioned in this note pertains to julia version v0.5
, there will be a change/fix of the behavior in v0.6
; see the P.S. in the end.)
2.1 Redefine test functions that will be called by a high-order function: the failure
In scientific computing, we often need to test a function by feeding it different parameters to see if the behaviors are as expected.
1 2 3 4 5 6 7 8 9 10 11 12 |
|
This is completely fine with a simple function. But if you are debugging a high-order function, i.e. a function whose input (or output) is also a function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
We see that redefining a function is okay if you just want to evaluate it, but redefining for the testing of another high-order function won’t work. Unless you also redefine that high-order function to update the dependence:
1 2 3 4 5 6 7 8 |
|
But redefining every high-order function is cumbersome, and even impractical if there is a chain of dependencies among multiple high-order functions.
2.2 The logic behind such failure: how the JIT compiler works
This issue received a long discussion (started a couple years ago and is still going on) on GitHub issue #265. This goes back to the fundamental question of how Julia’s JIT compiler works (under the hood) in real time.
- The JIT compiler would compile a custom function (high-order or not) when it is executed for the first time. So you see behavior like this
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
|
-
Because
f2
is compiled whenf2(f1)
is called; before the call,f1
is most recently defined to return3
, sof2(f1)
returns3
; the first definition off1
that returned2
was overwritten. -
After the
f2(f1)
call, no matter how you overwrite the definition off1
,f2(f1)
will always return3
since that’s how it was when compiled at its first call.
1 2 3 4 5 6 7 8 |
|
- Unless you also redefine
f2
, thenf2
becomes an uncompiled function again. The next timef2
is called, it will be compiled again and updates its behavior.
1 2 3 4 5 6 7 8 |
|
- This experiment shows that the compilation is down to the lowest-order function, so if a high-order function is called, the JIT will compile all the functions it calls, until it hits a simple function. (Compilation can also occur when you feed a custom function with variables of different types. This is the subject of multiple dispatch, another great advantage of Julia).
2.3 Solution: using anonymous functions
I have found a solution in a comment of issue #508, which is to use the lambda function notation ->
in Julia to define an anonymous function, and assign the anonymous function to a variable f
. Then feed this variable into the high-order function you want to test.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
All works well now! This is because the variable f
is now pointing to an anonymous function. If it is later redefined to point to a different function, this pointer value is updated because every anonymous function receives a unique label. So there are no more confusions!
P.S. Issue fixed in v0.6
I have downloaded the pre-released version julia v0.6.0-rc2
to check if the issue is well handled. Apparently, the issue is well fixed in this new version:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|