Using a Java Object to replace blanks in a list.

{ Posted By : Eric Cobb on March 26, 2010 }
Related Categories: Tips 'n Tricks, CFML, Java

Today I was working on parsing through a CSV file, when I came across the all too familiar "ColdFusion ignores blank list elements" situation. This is nothing new; it has been around forever, and there's even an excellent UDF on CFLib that does a great job of handling it. But I was feeling a little creative and decided to see if I could tap into Java and accomplish the same thing.

As it turns out, it was actually pretty easy to do. First, let's start out with a basic list:

<cfset variables.myList = "CFML, ColdFusion, ColdFusion is da bomb,, CFgears">

Notice that we're missing an element between "da bomb" and "CFgears". If you count the items in the list, you'll see there are 5, but ColdFusion only sees 4 since it ignores empty list elements. For the sake of this example, let's replace the blank list element with two single quotes (''). At first glance, it seems like an easy fix. We can just use Java's replace method to make the change.

<!--- create a java object, initialize it with our list, then replace the empty list elements. --->
<cfset variables.myList = CreateObject("java","java.lang.String").init(trim(variables.myList)).replace(",,",",'',")>
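For anyone who wants to try the same call outside ColdFusion, here is a minimal plain-Java sketch of that literal replacement. The class and method names (ReplaceDemo, fillLiteral) are made up for illustration and are not part of the original code.

```java
// Hypothetical demo class; the names are illustrative only.
final class ReplaceDemo {
    // String.replace treats both arguments as literal text
    // and replaces every occurrence of the first with the second.
    static String fillLiteral(String list) {
        return list.trim().replace(",,", ",'',");
    }

    public static void main(String[] args) {
        String myList = "CFML, ColdFusion, ColdFusion is da bomb,, CFgears";
        // Prints: CFML, ColdFusion, ColdFusion is da bomb,'', CFgears
        System.out.println(fillLiteral(myList));
    }
}
```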

There, that should get it. But wait, what happens if there is a space between the two commas? Or two spaces? Or two dozen spaces? In my particular situation, it was highly likely that I could have a list like this in my CSV file:

<cfset variables.myList = "CFML, ColdFusion, ColdFusion is da bomb,         , CFgears">

Since we have no idea how many spaces may be in our list, a simple replace isn't going to cut it. After a little trial and error, I came up with this regular expression to replace the blank list elements, regardless of how many spaces are between the two commas.

<!--- try our java object again, this time using replaceAll and a regular expression to replace the empty list elements. --->
<cfset variables.newList = CreateObject("java","java.lang.String").init(trim(variables.myList)).replaceAll(",[ ]*,",",'',")>

It's important to note that, while in the first example we used the replace method, it treats its arguments as literal text. When working with regular expressions we have to use the replaceAll or replaceFirst methods.
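The difference between the two methods is easy to see in a small plain-Java sketch. The class and method names here (RegexReplaceDemo, viaReplace, viaReplaceAll) are assumptions for illustration; only the pattern and the replacement text come from the code above.

```java
// Hypothetical demo class comparing literal and regex replacement.
final class RegexReplaceDemo {
    // Same pattern as in the post: a comma, any number of spaces, a comma.
    static final String BLANK_ELEMENT = ",[ ]*,";

    // replace() looks for the literal characters ",[ ]*," in the list.
    static String viaReplace(String list) {
        return list.trim().replace(BLANK_ELEMENT, ",'',");
    }

    // replaceAll() compiles its first argument as a regular expression.
    static String viaReplaceAll(String list) {
        return list.trim().replaceAll(BLANK_ELEMENT, ",'',");
    }

    public static void main(String[] args) {
        String myList = "CFML, ColdFusion, ColdFusion is da bomb,         , CFgears";
        System.out.println(viaReplace(myList));    // unchanged: no literal match
        System.out.println(viaReplaceAll(myList)); // blank element becomes ''
    }
}
```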

This works nicely and will convert all of our empty list elements to two single quotes (''), regardless of how many spaces are in the list element. That is, unless the first or last list elements happen to be missing, like so:

<cfset variables.myList = ", ColdFusion, ColdFusion is da bomb,         , CFgears">

Now what? Never fear! Java has built-in startsWith and endsWith methods. So, we can easily check the beginning and end of our list for empty elements:

<!--- check the beginning of the list. --->
<cfif variables.newList.startsWith(',')>
    <cfset variables.newList = "''" & variables.newList>
</cfif>
<!--- check the end of the list. --->
<cfif variables.newList.endsWith(',')>
    <cfset variables.newList = variables.newList & "''">
</cfif>

(I would love to have played around with the string concatenation in Java, but ran out of time.)

There's one thing that I want to point out. If you'll notice, whenever I pass our list into the Java init() method, I'm trimming it. This is important. Trimming the list removes all excess spaces at the beginning and end of the list. Without the Trim() function, our startsWith and endsWith checks would not work on lists that have excess spaces at the beginning or end, because the first or last character would be a space instead of a comma. Java's picky about stuff like that.
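Here is a rough plain-Java sketch of why the trim matters, with the concatenation done in Java as well. EdgeCheckDemo and padEnds are hypothetical names chosen for this example; the logic mirrors the two cfif checks above.

```java
// Hypothetical demo class; mirrors the startsWith/endsWith checks above.
final class EdgeCheckDemo {
    // Trim first, then add '' wherever the list begins or ends with a comma.
    static String padEnds(String list) {
        String result = list.trim();
        if (result.startsWith(",")) {
            result = "''" + result;   // blank first element
        }
        if (result.endsWith(",")) {
            result = result + "''";   // blank last element
        }
        return result;
    }

    public static void main(String[] args) {
        String padded = "  ,a,b,  ";
        // Without trim the first character is a space, so this prints false.
        System.out.println(padded.startsWith(","));
        // With trim both ends are detected; prints: '',a,b,''
        System.out.println(padEnds(padded));
    }
}
```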

So, here's the completed code sample:

<!--- create our list. --->
<cfset variables.myList = ", ColdFusion, ColdFusion is da bomb,         , CFgears">
<!--- create a java object, initialize it with our list, then use replaceAll and a regular expression to replace the empty list elements. --->
<cfset variables.newList = CreateObject("java","java.lang.String").init(trim(variables.myList)).replaceAll(",[ ]*,",",'',")>
<!--- check the beginning of the list. --->
<cfif variables.newList.startsWith(',')>
    <cfset variables.newList = "''" & variables.newList>
</cfif>
<!--- check the end of the list. --->
<cfif variables.newList.endsWith(',')>
    <cfset variables.newList = variables.newList & "''">
</cfif>
<!--- show me whatcha got! --->
<cfoutput>#variables.newList#</cfoutput>
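One caveat with the pattern above: it consumes both commas, so in a run of several blank elements in a row (such as "a,,,b") the second blank is skipped. The sketch below is a plain-Java variation, not the code from this post, that uses a lookahead so the closing comma is checked but not consumed. BlankFiller and fillBlanks are hypothetical names.

```java
// Hypothetical variation on the completed sample; not the original method.
final class BlankFiller {
    static String fillBlanks(String list) {
        String result = list.trim();
        // (?=,) requires a following comma without consuming it,
        // so consecutive blank elements are each matched in turn.
        result = result.replaceAll(",[ ]*(?=,)", ",''");
        if (result.startsWith(",")) {
            result = "''" + result;   // blank first element
        }
        if (result.endsWith(",")) {
            result = result + "''";   // blank last element
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints: '', ColdFusion, ColdFusion is da bomb,'', CFgears
        System.out.println(fillBlanks(", ColdFusion, ColdFusion is da bomb,         , CFgears"));
        // Prints: a,'','',b
        System.out.println(fillBlanks("a,,,b"));
    }
}
```

For the sample list in this post, both versions produce the same output; they only differ when two or more blank elements sit next to each other.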

Now, in all honesty, there's nothing special about using Java for this. You could easily accomplish the same thing using CFML's REReplace() function. Simply replace the Java object call with:

<cfset variables.newList = REReplace(trim(variables.myList),",[ ]*,",",'',","all")>

So why did I use a Java object instead of plain ol' CFML? Well, the main reason is, I just wanted to. One of my goals for this year is to learn Java, so little experiments like this help me out. Some would say that tapping into Java directly might give a performance boost, and maybe it does, but in a case like this any gain is so minuscule that it would surprise me if it's even measurable.

It's not so much that I was looking for a better way, just a different way to do the same thing.

Comments
All seems overly complicated when you can just do one Java call...

Variables.MyList.split(',')

:)
# Posted By Peter Boughton | 3/26/10 10:10 AM
@Peter - I think I'm missing something...how does splitting the list into an array replace the blank elements with a specified value? The Java Docs also state that "Trailing empty strings are therefore not included in the resulting array" (http://java.sun.com/j2se/1.4.2/docs/api/java/lang/...), so a blank list element at the end would get dropped from the array, right?
# Posted By Eric Cobb | 3/26/10 10:19 AM
Ah, now that's annoying! Not encountered that and have been assuming all empties were included when splitting.

As you say, ones at the end are ignored, so ",,a,," produces an array of ["","","a"] and excludes the last two parts.

I guess since split is deliberately broken, you'd have to do the inverse...
rematch( '[^,]*' , ',,a,,' )

Just checked and that one does give the five expected items.
# Posted By Peter Boughton | 3/26/10 10:32 AM
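
For reference, the split behavior discussed in this thread can be checked with a short plain-Java sketch. The two-argument form of split takes a limit, and a negative limit keeps trailing empty strings. SplitDemo and splitKeepingBlanks are hypothetical names used only for this example.

```java
// Hypothetical demo class for the split discussion above.
final class SplitDemo {
    // A negative limit tells split to keep trailing empty strings.
    static String[] splitKeepingBlanks(String list) {
        return list.split(",", -1);
    }

    public static void main(String[] args) {
        String list = ",,a,,";
        // One-argument split drops trailing empties: 3 items.
        System.out.println(list.split(",").length);
        // Negative limit keeps them: 5 items.
        System.out.println(splitKeepingBlanks(list).length);
        // Show the elements with a visible separator; prints: ||a||
        System.out.println(String.join("|", splitKeepingBlanks(list)));
    }
}
```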
I like the exploration; the one thing I'm not sure about, however, is why you are considering spaces to be empty list items? If you execute:

listToArray( "a, ,c")

... you get an array with 3 items. ColdFusion does not consider spaces to be empty. Perhaps this was something specific to your CSV parsing?
# Posted By Ben Nadel | 3/30/10 9:23 AM
@Ben - You're exactly right. I got so caught up in the specifics of my CSV file that I totally forgot that spaces can be considered legitimate list items in other cases. In my particular situation, any list item that contains only spaces or blanks is to be treated as empty. Thanks for pointing that out!
# Posted By Eric Cobb | 3/30/10 9:56 AM
@Eric,

No problem - I figured it was a CSV-specific thing :)
# Posted By Ben Nadel | 3/30/10 12:44 PM