Thursday, December 30, 2010

Ruby: Inconsistencies

This is a growing list:
  • String#concat
    • Should be String#concat!
    • I guess it's meant to resemble #insert, #delete, #fill, #replace, #clear, etc., but I wish those had "!" too, like #compact!, #reject!, #slice!, etc. There should be two versions of each, but because the mutating versions lack the "!", non-mutating versions can never be added under the same names.
  • Hash#update is a synonym for Hash#merge!
    • I have to remember which one has the "!"
    • Hash#invert does *not* mutate. How do I remember all this?
    • Even worse, Set#merge (no "!") mutates, unlike Hash#merge.
  • Set#add? and Set#delete?
    • These mutate.
  • s.chomp!.upcase
    • Can fail, since String#chomp! can return nil
  • Array#fetch(i) (and Hash#fetch(k)) can raise IndexError
    • Should be Array#fetch!(i)
    • Block and multi-arg version could drop the "!"
  • String#each does not exist in Ruby 1.9, while #each_char does not exist before 1.8.7
    • We cannot write forward-compatible code!
    • Soln: s.split("")
  • '0x10'.hex and '0x10'.oct are same, but
    • '010'.hex and '010'.oct are different
  • Given: a = [1,2,3]
    • a[2] == 3
    • a[3] == nil
    • a[4] == nil
    • a[2,1] == [3] (and a[2,2] == [3] as well)
    • but a[3,1] == []
    • while a[4,1] == nil
  • For Arrays
    • a[x,y] = [nil] substitutes the slice
    • a[x,y] = z     substitutes [z], but ...
    • a[x,y] = nil   deletes the slice! (fixed in 1.9)
  • inspect/to_s
    • There used to be a clear distinction, like Python's __repr__/__str__, but 1.9 often (though not in all cases) erases that useful distinction.
  • Array   Float   Integer   String 
    • Those are Kernel functions, not constants.
  • ...
Also see Ruby Best Practices, a great book.
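
Several of the items above can be checked directly. The following is a minimal sketch, assuming Ruby 1.9 or later; the variable names are mine and only illustrative.

```ruby
require 'set'

# Hash#merge returns a new hash; Hash#update (alias of merge!) mutates.
h = { :a => 1 }
merged = h.merge({ :b => 2 })
h_after_merge = h.dup          # snapshot: h is unchanged by merge
h.update({ :c => 3 })          # h itself now has :c

# Set#merge has no "!" but mutates the receiver anyway.
s = Set.new([1])
s.merge([2, 3])

# String#chomp! returns nil when nothing was removed,
# so chaining on it can raise NoMethodError.
str = String.new("hello")
chomp_result = str.chomp!      # nil, because there was no trailing newline
chain_failed = begin
  String.new("hello").chomp!.upcase
  false
rescue NoMethodError
  true
end

# Slicing at and past the end of an array.
a = [1, 2, 3]
slices = [a[2, 1], a[3, 1], a[4, 1]]   # [[3], [], nil]

# String#hex and String#oct agree on '0x10' but not on '010'.
numbers = ['0x10'.hex, '0x10'.oct, '010'.hex, '010'.oct]   # [16, 16, 16, 8]
```

The point is not that any single behavior is wrong, only that the name of a method does not reliably tell you whether it mutates or what it returns at the edges.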

Wednesday, December 29, 2010

Ruby: Introspection

Supposedly, Ruby has introspection, but some things are missing. For example:

# How to get the name of the current method?
# Add this snippet of code to your logic somewhere
 
module Kernel
  private
  # Defined in ruby 1.9
  unless defined?(__method__)
    def __method__
      caller[0] =~ /`([^']*)'/ and $1
    end
  end
end
In Python, this is not much better:

import traceback

def asdf():
    (filename, line_number, function_name, text) = traceback.extract_stack()[-1]
    print(function_name)

asdf()

Update: __method__ is part of Ruby as of 1.8.7.
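
With the built-in available, usage looks like the sketch below. The method names `current_name`, `caller_name` and `outer` are made up for illustration, and `caller_locations` assumes Ruby 2.0 or later. Note that the built-in returns a Symbol, whereas the regex fallback above returns a String.

```ruby
# Name of the method currently executing (a Symbol).
def current_name
  __method__
end

# Name of the method that called us, taken from the call stack.
# Frame 1 is the immediate caller of caller_name.
def caller_name
  caller_locations(1, 1).first.base_label
end

def outer
  caller_name
end

puts current_name.inspect   # :current_name
puts outer                  # outer
```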

Here is something else rather awkward in Ruby:

# Print all modules (excluding classes)
puts Module.constants.sort.select {|x| eval(x.to_s).instance_of? Module}
In Python, we could simply do this:
import sys; print(sys.modules.keys())
Ruby's introspection (also) reveals a lot about the structure of classes.
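
The module listing above can also be written without `eval`, using `const_get`. This is only a sketch: the helper name `pure_modules` is hypothetical, and the rescue is there because touching some constants may trigger a load that fails.

```ruby
# Return the names of top-level constants that are Modules but not Classes.
def pure_modules
  Object.constants.sort.select do |name|
    begin
      Object.const_get(name).instance_of?(Module)
    rescue StandardError, ScriptError
      false   # constant could not be resolved; skip it
    end
  end
end

puts pure_modules
```

`instance_of?(Module)` is what excludes classes here, since every Class is a Module but is an instance of Class.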

Tuesday, December 14, 2010

Web: Better URL navigation

This post made me realize that very few people are aware of a nice way to work with URLs for dynamic content.

Many sites use the hash (#) or shebang (#!) in their URLs for AJAX, and some end up breaking the *Back* button on your browser. The hash is important for letting Google index a link to a self-reloading page, but for pages that should not be indexed, there is a better way.
  •    http://foo.com/current.html
    • Contains a POST-method link to "/link".
  •    http://foo.com/link
    • On the server, the web framework interprets POST, computes new context, then redirects to the template "final.html", along with extra context.
  •    http://foo.com/final.html
    • This would be the result of the template substitution.
There are several benefits to this scheme:
  1. POST interpretation (the "controller") is separated from template substitution (the "view"). Most people instead use an if-clause in their controller.
  2. The web-designer can keep a simple redirect stub for the "link" page. That way, he can continue web-design in his static environment. He does not have to use a server or the intended framework.
  3. final.html is inherently secure, since none of the extra context is ever provided directly to that URL by the user.
The *Back* button works fine, because the `link` page was never rendered.
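
The flow above can be sketched without any particular framework. Everything here is an assumption for illustration: the `handle` function, the `CONTEXT` store (standing in for a session), the `name` parameter, and the choice of a 303 status for the redirect. The return value imitates the Rack-style triple of status, headers and body.

```ruby
require 'erb'

# Hypothetical server-side store; a real framework would use a session.
CONTEXT = {}

# Returns a Rack-style triple: [status, headers, body].
def handle(http_method, path, params = {})
  case [http_method, path]
  when ['POST', '/link']
    # "Controller": interpret the POST, compute context, then redirect.
    name = params.fetch('name', 'stranger')
    CONTEXT[:greeting] = "Hello, #{ERB::Util.html_escape(name)}!"
    [303, { 'Location' => '/final.html' }, '']
  when ['GET', '/final.html']
    # "View": template substitution only; no user input is read here.
    [200, { 'Content-Type' => 'text/html' }, "<p>#{CONTEXT[:greeting]}</p>"]
  else
    [404, {}, 'not found']
  end
end

# Simulate the browser: POST to /link, then follow the redirect.
status, headers, _body = handle('POST', '/link', { 'name' => '<Ann>' })
page = handle('GET', headers['Location'])
```

Because the browser only ever displays the response to the GET, its history holds `current.html` and `final.html`, and the `/link` step never appears as a page.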

I'm not sure why this pattern is not better known. Maybe I am overlooking some advantages of the alternatives.

Thursday, December 9, 2010

Linux: Interesting, obscure commands


# First, the most important place for interesting commands:
http://www.commandlinefu.com/commands/browse/sort-by-votes


# Now, a bunch of cut-and-pasted stuff, from a thread...


# to fix the terminal
reset

# or try Ctrl-v Ctrl-o

Or try:
reset='echo "X[mX(BX)0OX[?5lX7X[rX8" | tr "XO" "\033\017" && /usr/bin/reset'
ESC [m (actually ESC [0m) Character Attributes: Normal (not bold f.i.)
ESC (B Select G0 Character Set: United States (USASCII)
ESC )0 Select G1 Character Set: Special Character and Line Drawing Set
O ( Ctrl-O ) Switch to Standard Character Set
ESC [?5l DEC Private Mode Reset: Normal Video
ESC 7 Save Cursor
ESC [r weird (actually 'ESC [0;0r' ? Set Scrolling Region [top;bottom] )
ESC 8 Restore Cursor

# to turn off display
xset dpms force off


# for virtual terminal
Personally, I think every Linux user should know how to use the virtual terminals. Just hit Ctrl+Alt+F1 and that should take you to a bash prompt. Usually the main one you're on with X running is F7, so you can switch back to that.
If X locks up on me, just a simple:
Ctrl+Alt+F1
login, and run
$ sudo /etc/init.d/gdm restart
Note that it could be gdm, kdm or xdm depending on your distro.
On RedHat or Ubuntu, you could instead:
$ sudo service gdm restart
 Or
$ invoke-rc.d gdm restart # for ubuntu/debian

# Others
GNU screen (or tmux) is excellent (you won't have to use nohup again). If you don't have it, install it and try it.
If you're on a Red Hat based distro, yum and rpm are good to know. If it's Debian based, apt-get and dpkg for installing stuff.
ping, traceroute (or mtr --curses or nstat), and ifconfig are all handy for networking stuff.
Look into htop; it's a much nicer version of top, but you may need to enable additional 3rd party repositories if using yum or apt-get (or aptitude). Or nmon, or atop. And pgrep?

# More on screen:
screen (start a screen session)
screen -dr (detach said screen session and reattach it in current sess)
screen -ls (show active screen sessions)
screen -dr [screen session] (detach and resume a specific session)
# And for pair programming:
screen -S sessionname (start a session with a name)
screen -x sessionname (attach the named session, even if it's attached elsewhere)
Those are essentially the only two I use, with the occasional "screen -ls". I much prefer -x over -r, as you can attach in multiple places. At home I always leave stuff running in screen, and when I log in from work or wherever, I can attach that same screen session without first detaching it from my terminal at home. Plus, two people can work together in one screen session, which is good for pair programming. http://www.ibm.com/developerworks/aix/library/au-gnu_screen/
 # The magic SysRq key
https://secure.wikimedia.org/wikipedia/en/wiki/Magic_SysRq_key

# General stuff

Strings

  • grep
  • awk
  • uniq, sort, sort -n
  • seq
  • cut
  • wc

Files

  • rsync
  • lsof
  • find | xargs
  • locate
  • df -H
  • du -cks | sort -n
  • scp
  • strings
  • file
  • touch
  • z* (zgrep, zcat, etc)
  • tail -f, head

Administration

  • man
  • ps auxf (f only on GNU)
  • kill, -HUP, -9
  • sudo
  • screen
  • /etc/init.d/ scripts
  • id
  • ^Z, fg, jobs, &

Networking

  • nmap
  • dig
  • tcpdump
  • ifconfig

Operators

  • The knowledge that bash is a programming language that provides all your basic constructs (ifs, loops, variables, functions), but instead of having a library of functions, you execute simple programs instead
  • |
  • <, >, >>
  • - as stdin, e.g. "cat somefile.txt | vi -"
  • for i in a b c d; do echo $i; something_else $i; done
  • alias
  • All the goodies at http://samrowe.com/wordpress/advancing-in-the-bash-shell/

# And more

netstat -ano views your open TCP and UDP connections
netstat -tulp # what is listening on which port
# or lsof -i
top -b | grep processname # continuous info about a process, you have to Ctrl+C out of it though
nmap -sS -sV -O localhost # local listening ports and what versions of daemons are running.
# maybe -p 1-65535
xsel --clipboard --input # stdin to clipboard

# OSX
pbcopy # stdin to clipboard


diff this that | vim -
pgrep firef
watch sensors #?

ncdu # to find out where all space is being used
htop > ps # not a redirect

# For bigger programs
mocp, alsamixer, ncdu, htop, emacs, screen, feh, acpi, dpkg, convert

diff -wyW160 this that | less  #compare side-by-side
diff -u this that >other       #write unified diff
patch 

Wednesday, December 8, 2010

JavaScript: Namespaces

http://javascriptweblog.wordpress.com/2010/12/07/namespacing-in-javascript/

The section on using this as a namespace proxy is brilliant. The origin of that idea is here (James Edwards).

Example:
var myApp = {};
(function() {
  var id = 0;

  this.next = function() {
    return id++;
  };

  this.reset = function() {
    id = 0;
  };
}).apply(myApp);

window.console && console.log(
  myApp.next(),
  myApp.next(),
  myApp.reset(),
  myApp.next()
); //0, 1, undefined, 0
or more powerfully

var subsys1 = {}, subsys2 = {};
var nextIdMod = function(startId) {
  var id = startId || 0;

  this.next = function() {
    return id++;
  };

  this.reset = function() {
    id = 0;
  };
};

nextIdMod.call(subsys1);
nextIdMod.call(subsys2, 1000);

window.console && console.log(
  subsys1.next(),
  subsys1.next(),
  subsys2.next(),
  subsys1.reset(),
  subsys2.next(),
  subsys1.next()
); //0, 1, 1000, undefined, 1001, 0

Monday, December 6, 2010

C/flex: reentrant lexical analysis

There is a great deal of confusion on how to use flex/bison (lex/yacc) with reentrancy. The biggest reason, as shown here, is that flex and bison deal with reentrancy differently.

To clear some of that up, I posted an answer on Stack Overflow. With reentrant flex, I recommend using the Lemon parser rather than bison.

Note that flex scanners are not "as reentrant" as lex scanners. However, if bison is avoided, then the flex C++ scanner (also reentrant) is probably a good alternative to %option reentrant, maybe with a small speed penalty.