This guy hates checked exceptions in Java, with good reason. They are also a bad idea in C++. Bruce Eckel has chimed in as well.
A "checked" exception is one that the compiler requires a function either to catch or to name in its "throws" clause.
void foo() throws MyException {}
Sunday, July 24, 2011
Friday, July 15, 2011
Git: Why should I use git instead of Subversion, CVS, etc?
Since the announcement that GoogleCode now supports git, many people are wondering why it's preferable to Subversion or even CVS. Here is my opinion:
I saw part of an interesting [1] video in which a YUI dev claimed that her productivity went up after switching from svn to git. YMMV.
For me, the advantages are:
- Distributed repositories
- At first, a central repo seems more appealing to a Project Manager, but eventually you may prefer the Integration Manager model which a DVCS facilitates. Also, a DVCS allows one to commit while offline.
- Private branches
- Keep your dirty laundry to yourself. With svn, many devs avoid frequent commits for this reason.
- Simpler branch-merging
- When it's easy, people do it.
- Rebasing
- The "killer" feature of git. (Also available in Mercurial.) Lets you consolidate groups of commits and pretend that you did them all after the most recent update.
- The .git directory
- Very unobtrusive, unlike CVS/ and .svn/. Perforce is even worse, requiring a specific directory for the check-out. With git/hg/bzr/etc., you can version-control any sub-directory in your filesystem at any time, very easily, without setting up a central repo. I sometimes run git init inside a working area for Subversion, for a one-day project. Remember: With Subversion you cannot hide your dirty laundry.
- The "stash"
- Unique to git. Syntactic sugar for temporary branching.
- "rerere" (reuse recorded resolution)
- Pure magic. Caches merge-conflict resolution, so you never have to resolve manually the identical conflict again.
The biggest advantage of git over Mercurial is the [3] index, which is the genius of Linus Torvalds (at least to recognize the value). Otherwise, Mercurial is very good and in some ways better.
And what's the biggest disadvantage of git? Large files can make it really slow. With default settings, it's for source-code only. If you want to store big files in git, try git-annex, which even allows the files to be stored on remotes such as rsync, the web (RESTfully), or Amazon S3. Also consider git-media. I wouldn't bother with git-bigfiles.
Wednesday, July 13, 2011
Linux: Memory usage with Exmap
Wow. I can read the output from top just fine, but this little utility is amazing.
Monday, July 11, 2011
C++0x: Does it have closures?
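No, not in the full sense. A lambda that captures by reference works as a downward funarg, but not as an upward one: if it is returned, it must not outlive the local variables it refers to. (Capturing by value copies the variables rather than sharing them.) More discussion is here.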
This means that something in the style of node.js would be a bit more complicated in C++ (or Java, etc.) than in JavaScript/Perl/Python/Ruby/Lua/Go, etc.
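To make the distinction concrete, here is a minimal sketch of an upward funarg in Python, one of the languages listed above. The name make_counter is hypothetical. The inner function is returned out of the scope that created it, yet it keeps that scope's variable alive:

```python
def make_counter():
    """Return a function that remembers how often it has been called."""
    count = 0  # local to make_counter, but captured by the closure below

    def increment():
        nonlocal count  # rebind the captured variable (Python 3 syntax)
        count += 1
        return count

    return increment  # the closure escapes "upward", past the end of this call


counter = make_counter()
first = counter()
second = counter()
other = make_counter()  # an independent closure with its own count
```

A C++0x lambda that captured count by reference would be left with a dangling reference once make_counter returned.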
HTTP: Truly a stateless protocol?
> Is HTTP stateful or stateless? Also, it would be really great if you
> could let me know where I can find more details regarding the HTTP protocol?
Fundamentally, HTTP as a protocol is stateless. In general, though, a
stateless protocol can be made to act as if it were stateful, assuming
you've got help from the client. This happens by arranging for the
server to send the state (or some representative of the state) to the
client, and for the client to send it back again next time.
There are three ways this happens in HTTP. One is cookies, in which
case the state is sent and returned in HTTP headers. The second is URL
rewriting, in which case the state is sent as part of the response and
returned as part of the request URI. The third is hidden form fields,
in which the state is sent to the client as part of the response, and
returned to the server as part of a form's data (which can be in the
request URI or the POST body, depending on the form's method).
To learn more about HTTP as a protocol, see http://www.w3.org/Protocols/
--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.
Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
(I'll elaborate on this later... ~cdunn2001)
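As a first elaboration, here is a minimal sketch of the cookie variant, using Python's standard http.cookies module and no real network. The handler name and the "visits" cookie are hypothetical. The handler keeps nothing between calls; the only "state" is whatever the client sends back:

```python
from http.cookies import SimpleCookie


def handle_request(cookie_header):
    """Stateless handler: everything it knows arrives with the request."""
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)  # parse the client's Cookie header
    visits = int(jar["visits"].value) if "visits" in jar else 0
    visits += 1

    reply = SimpleCookie()
    reply["visits"] = str(visits)
    # Value for the Set-Cookie response header, e.g. "visits=1".
    set_cookie = reply["visits"].OutputString()
    return "visit number %d" % visits, set_cookie


# The client's side of the bargain: send back what the server handed out.
body, cookie = handle_request(None)
body, cookie = handle_request(cookie)
```

URL rewriting and hidden form fields follow the same pattern; only the place where the value travels changes.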
Parameterized tests
There was a reddit discussion on assert vs. UnitTest-style assert_equal etc.
I really have no preference between assert x == y and assert_equal(x, y), given pytest's helpful tracebacks.
My problem with pytest is the awkward support for parameterized tests. nose handles parameterized tests much better. (pytest handles the nose-style too, but that's hard to find in the docs.) This is an open issue for unittest2.
In a nutshell, when I test "purely functional" code (i.e. free of side-effects, referentially transparent) I want this:
- To list inputs and correct outputs.
- To apply each input to a specific function.
- To consider each a separate testcase, so that they will all run even when one fails.
- To learn which inputs failed, along with expected and actual results.
(ToDo: Write an example.)
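Here is a minimal sketch of that wish list, using only the standard unittest module rather than nose or pytest. The function under test (slugify) and every other name are hypothetical. Each input/output pair becomes its own test method, so one failure does not stop the rest, and the failure message carries the input along with the expected and actual results:

```python
import unittest


def slugify(title):
    """Hypothetical pure function under test."""
    return "-".join(title.lower().split())


# Inputs paired with their correct outputs.
CASES = [
    ("Hello World", "hello-world"),
    ("  Git  vs  Svn ", "git-vs-svn"),
    ("one", "one"),
]


class SlugifyTest(unittest.TestCase):
    pass  # test methods are attached below, one per case


def make_test(given, expected):
    def test(self):
        # On failure, unittest prints both values plus the input.
        self.assertEqual(slugify(given), expected, "input was %r" % (given,))
    return test


for index, (given, expected) in enumerate(CASES):
    setattr(SlugifyTest, "test_case_%d" % index, make_test(given, expected))


def run_cases():
    """Run every generated test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_cases()
```

Generating the methods in a loop is more boilerplate than I would like, which is exactly the complaint.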
GoogleTest (C++, not Python) has very good support for parameterized tests via TEST_P. However, GoogleTest does not allow TEST_P to be combined with TEST_F (parameterized within a fixture). That is something pytest allows, with a bit of work.
Sunday, July 10, 2011
Concurrency in node.js: Objects vs. Functions -- or maybe both!
Here is an excellent article on using node.js as a simple web-server. In particular, it talks about dependency injection (with a reference to Martin Fowler's article) to handle routing, and it includes an aside on "nouns vs. verbs". Kiessling's aside refers to Yegge's 2006 article, which shows why Java is so verbose. As Yegge says,
I've really come around to what Perl folks were telling me 8 or 9 years ago: "Dude, not everything is an object."
All the talk of imperative vs. functional code -- and message-passing vs. function-passing -- seems to miss the point: We need both objects and functions!
Functional code facilitates multiprocessing by reducing dependencies. For example:
function upload() {
    console.log("Request handler 'upload' was called.");
    return "Hello Upload";
}
Except for the log message, that is functional code, which might be part of a web-server. Maybe a URL like "http://foo.com/upload" would eventually lead to this function. A more complex version of it could produce a whole web-page.
At first, this seems pleasantly scalable, but looks are deceiving. Consider how it might be called:
var url = require("url");  // provides url.parse()

function route(pathname) {
    console.log("About to route a request for " + pathname);
    if (pathname == "/upload") {
        return upload();
    } else {
        console.log("No request handler found for " + pathname);
        return "404 Not found";
    }
}

function onRequest(request, response) {
    var pathname = url.parse(request.url).pathname;
    console.log("Request for " + pathname + " received.");
    response.writeHead(200, {"Content-Type": "text/plain"});
    var content = route(pathname);
    response.write(content);
    response.end();
}
The problem is that the entire stack -- from onRequest() to route() to upload() -- may block the server. The author of node.js, Ryan Dahl, has given many talks in which he discusses the importance of non-blocking calls for the sake of concurrency. Here is an example that can be non-blocking:
function upload(response) {
    console.log("Request handler 'upload' was called.");
    response.writeHead(200, {"Content-Type": "text/plain"});
    response.write("Hello Upload");
    response.end();
}

function route(pathname, response) {
    console.log("About to route a request for " + pathname);
    if (pathname == '/upload') {
        upload(response);
    } else {
        console.log("No request handler found for " + pathname);
        response.writeHead(404, {"Content-Type": "text/plain"});
        response.write("404 Not found");
        response.end();
    }
}

function onRequest(request, response) {
    var pathname = url.parse(request.url).pathname;
    console.log("Request for " + pathname + " received.");
    route(pathname, response);
}
Notice that we are passing the response object from function to function. That is message-passing. The handler eventually writes directly into that object, rather than returning a string. Thus, the handler has side-effects. It is no longer functional code. But because it will be passed everything it needs, we can forget the call stack. Why is this an advantage? Because it allows us to use a cheap event loop. Let's suppose that upload() is a time-consuming operation:
var exec = require("child_process").exec;

function upload(response) {
    console.log("Request handler 'upload' was called.");
    // Illustration only: the real exec() callback receives (error, stdout, stderr).
    exec("slow-operation", function (response, error, stdout, stderr) {
        response.writeHead(200, {"Content-Type": "text/plain"});
        response.write(stdout);
        response.end();
    });
}
The command run by exec() is slow, but exec() itself is non-blocking. When called, a sub-process starts, the function goes into the event queue, and upload() returns immediately. Thus, the system-call is executed concurrently with other operations. That's the essence of node.js.
To be clear, message-passing is not the main point. The response object could be stored temporarily in a closure, which is typical in JavaScript. (In fact, exec() in node.js does not actually allow response to be passed to its function.) We do not need a mutex lock on the response object because this is a single-threaded program, but even that's not the point. We could have multiple threads and lock the response object. It's in-memory, so response.write() is very fast. We create new events (with threads, processes, or whatever) only for slow (aka blocking) operations, and we assign call-backs to those events so that processing can be deferred. This is a paradigm which makes concurrency simple.
25 Most Dangerous Software Errors
FWIW, these are the 25 most dangerous software errors, according to CWE/SANS.
- 93.8% Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')
- 83.3% Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection')
- 79.0% Buffer Copy without Checking Size of Input ('Classic Buffer Overflow')
- 77.7% Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
- 76.9% Missing Authentication for Critical Function
- 76.8% Missing Authorization
- 75.0% Use of Hard-coded Credentials
- 75.0% Missing Encryption of Sensitive Data
- 74.0% Unrestricted Upload of File with Dangerous Type
- 73.8% Reliance on Untrusted Inputs in a Security Decision
- 73.1% Execution with Unnecessary Privileges
- 70.1% Cross-Site Request Forgery (CSRF)
- 69.3% Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
- 68.5% Download of Code Without Integrity Check
- 67.8% Incorrect Authorization
- 66.0% Inclusion of Functionality from Untrusted Control Sphere
- 65.5% Incorrect Permission Assignment for Critical Resource
- 64.6% Use of Potentially Dangerous Function
- 64.1% Use of a Broken or Risky Cryptographic Algorithm
- 62.4% Incorrect Calculation of Buffer Size
- 61.5% Improper Restriction of Excessive Authentication Attempts
- 61.1% URL Redirection to Untrusted Site ('Open Redirect')
- 61.0% Uncontrolled Format String
- 60.3% Integer Overflow or Wraparound
- 59.9% Use of a One-Way Hash without a Salt
SQL: The value of NOT NULL
Most SQL resources teach only the language. It's hard to find blogs with useful advice, and even harder to find bloggers who back up their claims. Here is an excellent article:
"The NOT IN took over 5 times longer to execute and did thousands of times more reads."
That site has lots of interesting comparisons for various SQL queries.
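Speed aside, NOT IN also behaves surprisingly when the subquery can return NULL, which is one concrete reason to care about NOT NULL. The following is a minimal sketch using Python's sqlite3 module with hypothetical tables (customers, orders, strict_orders). It demonstrates the semantics only, not the timing quoted above:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER NOT NULL);
    CREATE TABLE orders (customer_id INTEGER);  -- nullable on purpose
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO orders VALUES (1), (NULL);
""")

# "id NOT IN (1, NULL)" is never true: comparing with NULL yields NULL.
not_in = [row[0] for row in db.execute(
    "SELECT id FROM customers "
    "WHERE id NOT IN (SELECT customer_id FROM orders) ORDER BY id")]

# NOT EXISTS only asks whether a matching row exists, so NULL is harmless.
not_exists = [row[0] for row in db.execute(
    "SELECT c.id FROM customers c "
    "WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id) "
    "ORDER BY c.id")]

# With NOT NULL on the column, the troublesome row can never be stored.
db.execute("CREATE TABLE strict_orders (customer_id INTEGER NOT NULL)")
try:
    db.execute("INSERT INTO strict_orders VALUES (NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Here not_in comes back empty while not_exists holds customers 2 and 3.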
Here are some other useful links:
Friday, July 8, 2011
GoLang: I've created a new 'brush' for SyntaxHighlighter
I created a SyntaxHighlighter file for Go. (To set up SyntaxHighlighter, refer to this.)
This example is from the Go website.
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*
Write and http server to present pages in the file system,
but transformed somehow. Substitutable? With Fibonacci program?
*/
package main

import (
    "bytes"
    "expvar"
    "flag"
    "fmt"
    "http"
    "io"
    "log"
    "os"
    "strconv"
)

// hello world, the web server
var helloRequests = expvar.NewInt("hello-requests")

func HelloServer(w http.ResponseWriter, req *http.Request) {
    helloRequests.Add(1)
    io.WriteString(w, "hello, world!\n")
}

// Simple counter server. POSTing to it will set the value.
type Counter struct {
    n int
}

// This makes Counter satisfy the expvar.Var interface, so we can export
// it directly.
func (ctr *Counter) String() string { return fmt.Sprintf("%d", ctr.n) }

func (ctr *Counter) ServeHTTP(w http.ResponseWriter, req *http.Request) {
    switch req.Method {
    case "GET":
        ctr.n++
    case "POST":
        buf := new(bytes.Buffer)
        io.Copy(buf, req.Body)
        body := buf.String()
        if n, err := strconv.Atoi(body); err != nil {
            fmt.Fprintf(w, "bad POST: %v\nbody: [%v]\n", err, body)
        } else {
            ctr.n = n
            fmt.Fprint(w, "counter reset\n")
        }
    }
    fmt.Fprintf(w, "counter = %d\n", ctr.n)
}

// simple flag server
var booleanflag = flag.Bool("boolean", true, "another flag for testing")

func FlagServer(w http.ResponseWriter, req *http.Request) {
    w.Header().Set("Content-Type", "text/plain; charset=utf-8")
    fmt.Fprint(w, "Flags:\n")
    flag.VisitAll(func(f *flag.Flag) {
        if f.Value.String() != f.DefValue {
            fmt.Fprintf(w, "%s = %s [default = %s]\n", f.Name, f.Value.String(), f.DefValue)
        } else {
            fmt.Fprintf(w, "%s = %s\n", f.Name, f.Value.String())
        }
    })
}

// simple argument server
func ArgServer(w http.ResponseWriter, req *http.Request) {
    for _, s := range os.Args {
        fmt.Fprint(w, s, " ")
    }
}

// a channel (just for the fun of it)
type Chan chan int

func ChanCreate() Chan {
    c := make(Chan)
    go func(c Chan) {
        for x := 0; ; x++ {
            c <- x
        }
    }(c)
    return c
}

func (ch Chan) ServeHTTP(w http.ResponseWriter, req *http.Request) {
    io.WriteString(w, fmt.Sprintf("channel send #%d\n", <-ch))
}

// exec a program, redirecting output
func DateServer(rw http.ResponseWriter, req *http.Request) {
    rw.Header().Set("Content-Type", "text/plain; charset=utf-8")
    r, w, err := os.Pipe()
    if err != nil {
        fmt.Fprintf(rw, "pipe: %s\n", err)
        return
    }

    p, err := os.StartProcess("/bin/date", []string{"date"}, &os.ProcAttr{Files: []*os.File{nil, w, w}})
    defer r.Close()
    w.Close()
    if err != nil {
        fmt.Fprintf(rw, "fork/exec: %s\n", err)
        return
    }
    defer p.Release()
    io.Copy(rw, r)
    wait, err := p.Wait(0)
    if err != nil {
        fmt.Fprintf(rw, "wait: %s\n", err)
        return
    }
    if !wait.Exited() || wait.ExitStatus() != 0 {
        fmt.Fprintf(rw, "date: %v\n", wait)
        return
    }
}

func Logger(w http.ResponseWriter, req *http.Request) {
    log.Print(req.URL.Raw)
    w.WriteHeader(404)
    w.Write([]byte("oops"))
}

var webroot = flag.String("root", "/home/rsc", "web root directory")

func main() {
    flag.Parse()

    // The counter is published as a variable directly.
    ctr := new(Counter)
    http.Handle("/counter", ctr)
    expvar.Publish("counter", ctr)

    http.Handle("/", http.HandlerFunc(Logger))
    http.Handle("/go/", http.StripPrefix("/go/", http.FileServer(http.Dir(*webroot))))
    http.Handle("/flags", http.HandlerFunc(FlagServer))
    http.Handle("/args", http.HandlerFunc(ArgServer))
    http.Handle("/go/hello", http.HandlerFunc(HelloServer))
    http.Handle("/chan", ChanCreate())
    http.Handle("/date", http.HandlerFunc(DateServer))
    err := http.ListenAndServe(":12345", nil)
    if err != nil {
        log.Panicln("ListenAndServe:", err)
    }
}
Embedding code with SyntaxHighlighter
Alex Gorbatchev's SyntaxHighlighter is the best thing around for adding syntax highlighting to code embedded into your blogs. Readers see line numbers but can easily cut-and-paste with or without those often pesky line numbers.
After you have all the necessary JavaScript and CSS files loaded into your web-page (see below for details on that) you have two choices for wrapping your source code. The "pre" method is necessary for RSS feeds, but the "script/CDATA" method handles embedded HTML tags without escaping them. The title is optional for both.
<script type="syntaxhighlighter" class="brush: js"><![CDATA[
  /**
   * SyntaxHighlighter
   */
  function foo()
  {
      convert("<body>Hello</body>");
      if (counter <= 10)
          return;
      // it works!
  }
]]></script>
Examples:
print "Hallo"
If you are using Google's Blogger (aka 'blogspot') as I am, you can prepare for syntax highlighting all blog posts via your Design template. Here are the steps:
- Click 'Design' at the top of your blogspot page.
- Click 'Edit HTML' at the top of that page.
- Find the tag which ends the HEAD section, and before the end insert the following:
<head>
  ...
  <link href='http://alexgorbatchev.com/pub/sh/current/styles/shThemeDefault.css' rel='stylesheet' type='text/css'/>
  <link href='http://alexgorbatchev.com/pub/sh/current/styles/shCore.css' rel='stylesheet' type='text/css'/>
  <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shCore.js' type='text/javascript'/>
  <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJScript.js' type='text/javascript'/>
  <script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPython.js' type='text/javascript'/>
  <script type='text/javascript'>
    SyntaxHighlighter.config.bloggerMode = true;
    SyntaxHighlighter.all();
  </script>
</head>
Add brushes for everything you might use, or try the new autoload feature of version 3.0. Then, you can use the pre or script/CDATA blocks as I've shown at the top of this page. If you need more help, follow these directions. Good luck!
Thursday, July 7, 2011
MVC Framework: Template-inversion
In many MVC frameworks, a template is used to generate web pages. For example, here is a template for Erb (the default template language for Ruby on Rails):
When I parse this using erb from the command-line, I get this:
That is usually done at runtime, within a Rails server.
Instead, I propose inverting the template prior to deployment, so that it becomes a Ruby file, like this:
For Ruby, there is no benefit. It's an extra step, and it's harder to debug. However, for a pre-compiled language like Go, the benefit is that the inverted template can be compiled and linked into the web application. All the code inside a template is fully type-checked before the app is deployed, which is both faster and safer than the current paradigm. It also means that the templates are compiled into the executable, rather than separate files, so deployment becomes simpler (assuming that static content is served by "the cloud", not by the application server).
Note that Go lacks an "eval" function, and rightly so. With template-inversion, "eval" is not needed.
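To show what "inverting" means mechanically, here is a minimal hand-written sketch in Python (not Ruby or Go, and not the output of any real tool). The template in the comment and the names render_items and write are hypothetical. Literal text becomes write() calls, and embedded code becomes ordinary statements:

```python
# Hypothetical template source, in ERB-like notation:
#
#   <ul>
#   <% for item in items %>
#     <li><%= item %></li>
#   <% end %>
#   </ul>
#
# The same template after inversion: plain code that emits the page.
# (No HTML escaping is done in this sketch.)

def render_items(items, write):
    """Emit the page by calling write() once per chunk of output."""
    write("<ul>\n")
    for item in items:
        write("  <li>")
        write(str(item))
        write("</li>\n")
    write("</ul>\n")


chunks = []
render_items(["a", "b"], chunks.append)
page = "".join(chunks)
```

In a compiled language, a function like this is checked by the compiler along with the rest of the application, which is the whole benefit.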
A more realistic example in Go might result in something more like this: