Monday, August 30, 2010

Linux: How strace works

Here is a highly instructive article on the innards of strace, a Linux utility for tracing system calls in a process and its children.  The sample code enables ptrace on the child before executing it, so I am a little curious how it works when attaching to a process that is already running.

At a previous job, I saw crashes in vfork() when using strace on make.  I'm told that it should work, so I have to assume that it was caused by yet more memory bugs in the poorly written code at that company. Now I wish I'd saved the crashing example so that I could dive into it further.

Saturday, August 28, 2010

Bug in Blogspot: Spaces after periods

I just noticed a problem with the rendering at Blogspot.  It is traditional to add 2 spaces after a period when typing (for the sake of typesetting, an ancient art form) but if I do that here, line-wrapping may carry a space to the beginning of the next line.  I'll just use a single space from now on.

Friday, August 27, 2010

Python: Comprehension loop variables

Python3 adds the nonlocal keyword, and alters list comprehensions to make their variables behave more like those in generator comprehensions.
"... the loop control variables are no longer leaked into the surrounding scope."
Consider the following:

from __future__ import print_function

global x          # a no-op at module level; x here is global anyway
x = 7

def main():
    x = 5
    def foo():
        #nonlocal x  # Python 3 only; uncomment to rebind main's x
        print("x =", x)
        return x

    foo()
    print([x for x in range(2)])      # does the loop variable leak?
    print([foo() for x in range(2)])  # which x does foo see here?
    print(x)
    foo()

main()
Python 2.7 prints:

x = 5
[0, 1]
x = 0
x = 1
[0, 1]
1
x = 1

Python 3.1 prints:

x = 5
[0, 1]
x = 5
x = 5
[5, 5]
5
x = 5

In other words, in Python 2.7 the list comprehension assigns its loop variable in the current scope, while in Python 3.1 the comprehension body runs in its own implicit scope.  That is why foo, which closes over main's x, still prints 5 even when called from inside the comprehension: its x is resolved lexically in main, and the comprehension's x shadows it only within the comprehension itself.  It's a subtle distinction.

"8 things I wish I knew before starting a business"

Here are lessons for start-ups, from Don Rainey, a general partner at Grotech Ventures.

Summary:

  1. Things take longer than you imagine.
  2. Items that succeed tend to do so quickly.
  3. People will let you down.
  4. Good employees are really hard to find.
  5. Your bad employees rarely quit.
  6. You will be lucky and unlucky.
  7. Avoid the myth and misery of sunk cost.
  8. Fill the pipe, always.
#7 seems to be the same as #2, but you can read the full article yourself.

Thursday, August 26, 2010

Giving F# another look

Here is an interesting comparison among Haskell, Ocaml, and F# hashtable implementations.

F# is starting to look very good.  It used to be only a research project at Microsoft, and much slower than Ocaml.  Now it's part of Visual Studio 2010, and its speed is clearly competitive with Ocaml for some common operations.  The clean syntax might encourage adoption at Fortune 500 companies.

I guess I should learn F# along with the .NET API.  I can use Ocaml for MacOS work.  The languages are quite similar, after all.

Wednesday, August 25, 2010

Ocaml: Improved polymorphism

Ocaml now supports polymorphism!  Well, it always did, somewhat, but with fatal caveats.  "Jumping through hoops," as Jane Street coders say, or relying on macros is not my idea of "high level".

Jane Street has provided an excellent description of these new features of Ocaml.  There is a very subtle distinction between polymorphic type annotations and explicit type parameters where recursive functions are concerned, and I certainly could not explain it better.

This is big news for me.  As soon as the MacPorts installation is ready, I plan to switch to Ocaml as my go-to language, whereas I had been considering Haskell.  I'll still use Python, Ruby, Go, Perl, Bash, etc. for many tasks, of course.

Maximum argument length in Linux

From a 2004 Slashdot interview with Rob Pike:
I didn't use Unix at all, really, from about 1990 until 2002, when I joined Google. (I worked entirely on Plan 9, which I still believe does a pretty good job of solving those fundamental problems.) I was surprised when I came back to Unix how many of even the little things that were annoying in 1990 continue to annoy today. In 1975, when the argument vector had to live in a 512-byte-block, the 6th Edition system would often complain, 'arg list too long'. But today, when machines have gigabytes of memory, I still see that silly message far too often. The argument list is now limited somewhere north of 100K on the Linux machines I use at work, but come on people, dynamic memory allocation is a done deal!
Pike is referring to this problem, most common when a '*' wildcard expands to too many files.  For the examples, I would say that 3a/b might as well be a Perl/Python/Ruby script.  I would also add Example 2b:
% find "$directory1" -mindepth 1 -maxdepth 1 -type f -print0 | xargs -0 mv --target-directory="$directory2"
(if --target-directory is available in mv), since xargs keeps each constructed command line well below ARG_MAX. Or just use the little-known plus sign in find:
% find "$directory1" -mindepth 1 -maxdepth 1 -type f -exec mv -t "$directory2" {} +
(the + form requires {} to come last, hence GNU mv's -t option). That might be the fastest solution.

Another idea, from Lyren Brown:
for f in *foo*; do echo "$f"; done | tar -T/dev/stdin -cf - | tar -C/dest/path -xvf -
Apparently, recent Linux kernels (2.6.23 and later) finally remove the fixed limit, tying it to the stack size instead.