CFML 101 - The importance of proper variable scoping
One thing that always surprises me is the number of CFML developers who either don't use, or don't know about, proper variable scoping. I see it almost every day, whether it be at work, on side projects that I help out with, or even many of the open source CFML applications that I download and use. The lack of proper variable scoping seems to be rampant throughout the CFML world, most especially when it comes to the VARIABLES scope. I will admit that most developers seem to have a pretty good grasp on using the FORM and URL scopes, and required persistent scopes such as SESSION or APPLICATION, but that's about as far as it goes.
This bugs me. This really, really bugs me. I don't know why, but unscoped variables are like fingernails on a chalkboard to me sometimes. Maybe it's because I've inherited so many really (REALLY!) bad applications in my day and have experienced firsthand the problems caused by unscoped variables. Maybe it's a touch of OCD with me knowing that variable scoping is the "recommended" and "proper" way of doing things, so that's how I have to do it. Or, maybe it's just some of the gamma rays from the alien satellites getting through my foil beanie. Whatever the reason, I felt compelled to blog about it.
Not too long ago I was talking with another developer and we got on the topic of unscoped variables. I made the comment that it was a bad practice, and his response was that it really wasn't that big of a deal. His thoughts were that it was easier not to bother scoping anything because ColdFusion can figure it out for you. I didn't know what to say. Here was a developer who had been using CFML for several years, and he had no real understanding of one of the most basic CFML principles there is.
I guess I really shouldn't be surprised, though. We're told to scope our variables, but never really told in detail why we should do it, and when we don't do it ColdFusion just figures everything out for us anyway. Many developers treat ColdFusion's ability to track down and find their variables as a feature (i.e., I don't have to specify a scope and CF will still know what I mean), but it shouldn't be viewed that way. Rather, it should be looked at as more of a built-in safeguard to try to keep bad code from throwing errors. That's right, it's there to try to figure out your screw-ups, not magically interpret your lazy coding.
DISCLAIMER: A lot of what I'm about to say comes from 1) reading documentation and blog posts from those smarter than myself, and 2) my own personal understanding of how things work. You've been warned!
For those that don't know, "scoping" a variable means providing a prefix that designates where the variable belongs (e.g., "form", "url", "request"). To really understand the importance of variable scoping, we need to take a look at what ColdFusion has to do to process unscoped variables. You see, when you don't specify where your variables are, ColdFusion has to go on a hunt to try to find them. This hunt follows what we call the Order of Precedence. This is basically the order of the scopes that ColdFusion searches to try to find your variable.
In CF 8, the order of precedence is:
- Function local (UDFs and CFCs only)
- Thread local (inside threads only)
- Arguments (like when you are inside of a function)
- Variables
- Thread (as in thread.this or thread.that)
- CGI
- Cffile
- URL
- Form
- Cookie
- Client
This differs slightly from the CF9 order of precedence, which was changed to:
- Local (function-local, UDFs and CFCs only)
- Arguments
- Thread local (inside threads only)
- Query (not a true scope; variables in query loops)
- Thread
- Variables
- CGI
- Cffile
- URL
- Form
- Cookie
- Client
So, every time you reference a variable without specifying its scope, ColdFusion has to start at the top of that list and go through each one of those scopes until it finds your variable, or reaches the end of the list and throws an error. Not only does it have to search each of those scopes, but it's my understanding that it also has to take a look at every single variable inside each of those scopes. Take the following example.
<cfoutput>#myName#</cfoutput>
Whenever you don't specify a scope when setting a variable, ColdFusion automatically puts it in the VARIABLES scope. As you can see from our list above, the VARIABLES scope comes in at number 6 in the CF 9 order of precedence. So that means that ColdFusion has to search 5 other scopes (and all of the variables in those scopes) before it even gets to the scope it needs. Not only that, but I don't think ColdFusion remembers where your variable is stored once it finds it (although, I may be wrong on that), so that would mean that it has to run through these same checks each and every time you reference that variable.
So, how many scope checks does ColdFusion have to go through to process the following everyday code sample?
<cfif IsDefined("myName") and myName neq "">
<cfoutput>#myName#</cfoutput>
</cfif>
12 if you're on CF 8, and 18 if you're on CF 9: the snippet references myName three times, and VARIABLES is fourth in the CF 8 list and sixth in the CF 9 list. And that's not even counting the number of variables CF had to look at inside each of those scopes. While there's a good chance some of those scopes may be completely empty, it's still unnecessary processing that ColdFusion is having to go through. You could easily drop the number of checks to zero simply by properly scoping those variables like so:
<cfif IsDefined("variables.myName") and variables.myName neq "">
<cfoutput>#variables.myName#</cfoutput>
</cfif>
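A related variation, which is not one of the original examples and is only a minimal sketch: structKeyExists() is given the scope to look in as its first argument, so the existence check itself names the scope explicitly instead of passing a dotted string to IsDefined().

```cfml
<!--- structKeyExists() checks only the struct it is handed, here the VARIABLES scope --->
<cfif structKeyExists(variables, "myName") and variables.myName neq "">
<cfoutput>#variables.myName#</cfoutput>
</cfif>
```

Either form keeps every reference to myName fully scoped; which one reads better is largely a matter of taste.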
Now, if you take a look back at our order of precedence you'll see that FORM and URL scopes are way down on the list. Which means that every time you reference a form or url variable without prefixing it with a scope, ColdFusion has to run through all of the previous scopes (and all of the variables in those scopes) to try to find your variable. That's a lot of searching on ColdFusion's part. Also, if a variable with the same name exists in one of those other scopes, ColdFusion will just grab it and use it instead and not even bother looking in the FORM or URL scope for your variable. When searching through its Order of Precedence, ColdFusion will always choose the variables on a first-come, first-served basis.
Using our previous example, let's say the "myName" variable is in the FORM scope instead of the VARIABLES scope.
<cfif IsDefined("myName") and myName neq "">
<cfoutput>#myName#</cfoutput>
</cfif>
The above example would run 27 scope checks on CF 8, and 30 scope checks on CF 9, because FORM is ninth in the CF 8 list and tenth in the CF 9 list, and the previous example references myName three times. Now, what if you've got another 15 form variables to process on that page, and you haven't specified a scope for any of them? It starts to add up quickly, doesn't it?
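The "first match wins" behavior is easy to see for yourself. Here is a minimal sketch, assuming the lookup order listed earlier (URL is searched before FORM) and that no myName exists in any earlier scope. The values are made up purely for illustration.

```cfml
<!--- Illustrative values only: the same name is placed in two scopes --->
<cfset url.myName = "value from URL">
<cfset form.myName = "value from FORM">

<cfoutput>
<!--- Unscoped: under the assumed order, the search stops at URL and never reaches FORM --->
#myName#
<!--- Scoped: no search is needed, and the intent is explicit --->
#form.myName#
</cfoutput>
```

If the code was meant to display the submitted form value, the unscoped reference quietly shows something else, while the scoped reference cannot be hijacked by a same-named variable in another scope.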
So as you can see, not scoping your variables can cause a lot of unnecessary overhead on ColdFusion's part. Allaire/Macromedia/Adobe have always recommended that you fully scope all variables (although for some reason they rarely do it correctly in the documentation). It's stated in the Fast Track To ColdFusion course, Adobe's Coding Best Practices for ColdFusion Performance, the ColdFusion curriculums, LiveDocs, and hopefully it's stated in the WACK books as well (I don't have a copy handy to look it up).
In the Coding Best Practices link mentioned above, Adobe says:
LiveDocs states:
It doesn't get much clearer than that.
Now, in all honesty, despite the work ColdFusion has to do to process unscoped variables, it only takes a fraction of a second to do it. But is that really an excuse not to follow good coding practices? In my opinion, no, it's not. There's never a good excuse to write bad code. I can't help but wonder: would other programming languages allow something this sloppy?
Am I borderline fanatical about scoping? Probably. Am I going overboard in my little rant here? Probably. Will properly scoping your variables make your CFML applications more efficient? Absolutely.
This post is part of a continuing series of CFML 101 articles. My intent is to produce a blog series aimed at the beginning CFML developer, one which helps to explain basic techniques and concepts to those new to the CFML world. The topics and examples covered in this series focus on the CFML programming language in general, not a specific application server. So whether you're using ColdFusion, Railo, or Blue Dragon (referred to as CF/R/BD) to run your CFML applications, these concepts still apply.


VARIABLES.x[1] = (-VARIABLES.b + sqrt(VARIABLES.b^2 - 4 * VARIABLES.a * VARIABLES.c)) / (2 * VARIABLES.a);
That's a win, is it?
Do you consider it a win to intentionally write bad code just because it's easier to read?
That right there is one of the two roots returned by the quadratic formula -- but I couldn't tell that from reading it, or not easily. All that screaming about VARIABLES obscures what the code DOES. I don't do write-only code; readability is hugely important.
Bad code that's easy to read isn't bad code. Good code you can't read isn't good code.
It all boils down to personal preference as to whether it's "good" or "bad," or easier to read or not. But I can't help but wonder: if all CFML developers had been taught from day 1 to always prefix with the variables scope (just like form, url, session, etc.), and that was the accepted standard way of doing things, would it still appear "unreadable"?
I think one of the reasons it looks so odd is that most developers don't use it, so when you do see it, it tends to stand out. Reading properly scoped form or url variables, by contrast, feels perfectly fine because that's a widely used and accepted practice.
I think maybe it really comes down to a matter of perception. The word "variables" looks odd tacked on the front of your variable, but "arguments" or "application" look fine even though they have just as many, or more, letters than the word "variables".
I did a quick test, and the non-scoped version consistently outperforms the scoped version!
I don't know if this blog will take the code format, but here it is:
<cftimer label="Non Scoped" type="outline">
<cfset myName = structNew()>
<cfloop from='1' to='10000' index="i">
<cfset myName[i] = "Eric#i#">
<cfif structKeyExists(myName,i) and myName[i] neq "">
<cfoutput>#myName[i]# </cfoutput>
</cfif>
</cfloop>
</cftimer>
<br >
<cftimer label="Scoped" type="outline">
<cfset variables.myName = structNew()>
<cfloop from='1' to='10000' index="variables.j">
<cfset variables.myName[variables.j] = "Eric#variables.j#">
<cfif structKeyExists(variables.myName,variables.j) and variables.myName[variables.j] neq "">
<cfoutput>#variables.myName[variables.j]# </cfoutput>
</cfif>
</cfloop>
</cftimer>
cftimer must be enabled to see the results.
Yeah, and it's the best argument for changing the scoping rules I can think of. ARGUMENTS and APPLICATION and FORM are *foreign*. They are *other*. I don't have a problem referring to them explicitly, and it frankly baffles me that they're *allowed* to pollute the local namespace. If I want to talk about a URL variable I'll bleeding well tell you to go look in the URL scope.
I'd settle for a compromise, you know. If I have to preface a local variable with a single symbol, like $ or something, I could go with that. I survived doing it in BASIC, after all. But all this "VARIABLES." business savages clarity -- a line with three references gets thirty characters longer!
<cfquery name = "qSomeQuery">
SELECT someValue
FROM someTable
</cfquery>
<cfset someValue = "x">
<cfset form.someValue = "y">
<cfset url.someValue = "z">
<cfoutput query = "qSomeQuery">#someValue#</cfoutput>
To me it is not clear what the programmer intended, and now I have to maintain it. I try to write code so that anyone who follows knows what my intentions are.
Thanks