You might have caught on that I’m a front-end developer – while I *have* done some ASP, the occasional PHP, minor SQL and even the rather obscure MIVAScript (DBF’s with index files) I definitely don’t consider myself a back-end guy. So, recently I was given the task of updating an e-commerce site, a global change across all products where the range of colors for certain items have all changed. As I originally setup the site (its a CoreCommerce site) I am the most familiar within the organization as to how it works and what such a change entails. My process would be to export the Personalization Table (this describes product-specific options), modify as needed and then import/apply the updates. Simple enough, right?
Most people I imagine would import the csv into Access or import into a temporary database on a server, do some SQL, export a new CSV, import into the system and be done with it. Well, as I’ve said, thats not quite my bag.
Easiest way to accomplish the above is in Notepad++. In a few minutes I had what I wanted. I did a search and replace on the single quote character, replacing it with the escaped version (\’) – I probably should have just replaced all single quotes with the entity equivalent but, oh well, either way works. Next, I did a search/replace on the newline character utilizing the “Extended” feature of Notepadd++’s Replace dialogue (a regular expression would have worked here as well) and replaced it with a comma so as to have each line ending with a comma. Finally, I removed all carriage returns with another Extended Search operation – the search looked for \r and replaced with nothing resulted with one single line. All that’s needed then is to assign the result to variable by placing “var data = ” before the string, surrounding the string with single quotes and tacking a semi-colon on the end. The finished string had a length of 1,939,099 characters.
Continuing in Notepad++ I created a simple HTML document and “imported” the data string via a Script tag, which looked like this:
Lets load the page in Chrome and consult the console (F12):
Note the message – the file has been loaded. No need to worry about the mime type. Lets verify that its there – lets check for the length of the string. Recall that I declared the variable as “data”, so lets do the following: data.length:
Ok then, lets move on. Lets convert this thing to an array. We know that the comma is the value delimiter so lets use the split array method using the comma as the required argument:
As you can see my function is called “rip” so I’ll reload the page and fire it manually by typing “rip()”:
A few milliseconds later its done. Note the “undefined” – there’s nothing wrong here – the rip() function doesn’t return anything so the “undefined” as the result of my function call is Chrome’s way of telling me as much. Its not that *nothing* has happened, its just that nothing was returned. If you like you could add a return statement to the end of the function but its totaly unecesary as we know why “undefined” appears.
So anyway now that we’ve fired rip() lets see if it worked. Lets check the length of the new array that was assigned to the “dataStepOne” variable:
You can see that to check the array’s length I typed the following into the console: “dataStepOne.length” – the length of the array is 467,728.
At this point I know that the data had 41 columns. What I want to do now is to recreate each row in some way – I decide to create a multidimensional array where each index of the 1st dimension represents a row and where the array at that dimension reflects the values for the row. This next bit accomplishes that task:
Again, within Chrome's console I execute the rip() function. This time I check the length of the dataStepTwo array by typing "dataStepTwo.length" and then hitting the enter key - the length is 11,408 - which is the number of rows in the original spreadsheet. To verify that the resulting data construct is good I decide to take a look at a "random" row by typing dataStepTwo. Chrome displays the array at index 200 for me. See the following image:
I cross reference the above array values against the 200th row in the spreadsheet and the values are identical. For kicks I also check the first and second rows of the array - the first row should contain all the column names, which it does, and the second row shouldn't have any columns names in it - a sanity to check to ensure that the column count is correct and that no row data is being erroneously placed in other rows. Yes, given the first test I already knew this to be the case but I wanted to look anyway. Everything is correct! Moving on....
So now that I have a representation of the data its time to start manipulating it. The way CoreCommerce's table is setup is a bit odd - you can see that one of the values has a pipe in it - this is a delimiter as there are actually two values in that field. I've no idea why they designed it this way - whatever, it is what it is. The thing to note here is that this reflects only a single color for the given product. There are 21 colors, so there are 21 rows representing the color options per drop-down selection list per product with the only difference being the name of the color in each row. See, the left value before the pipe represents the name of the drop-down list and the right side of the pipe represents the value that goes in that list. Yeah, I know, kinda crazy but that's what it is.
So knowing how things are structured I need to loop through the dataStepTwo array and remove/modify as needed.
First thing though is to create the new data that is to replace the old data. Here is the new data to replace the old:
Note that newData[x] is just the value that I'm concerned with - in the actual table its the pipe delimited value that I mentioned previously. As you continue reading you will see in my code that I am splitting the originally pipe-delimited value, replacing the  with a new value then joining and inserting the updated string.
After putting the above together and thinking about what I needed to do I realized that the only things that need to change are the values in index 4 and index 15 of the arrays representing each individual table row. So, most of the newData array as seen above is not necessary as I will only be using newData[x] and newData[x]. At this point I let it be as I wasn't sure if I would need it for some reason yet to be discovered. Anyway, no harm done, lets continue...
So, now about the logic - there are 14 new colors. Any color name that contains the words "Powder Coat" are ones that need to have their values changed. Caveat is that some existing products have more colors than what we will end up with. So I will need to edit the first 14 colors to reflect the new values and then delete any left over colors. Here's what I came up with:
To test this I capped the for loop at 100 and then as before took a look at specific dataStepTwo indexes via the console. I discovered that the script was working fine, modifying/deleting rows as needed. Next I wanted to do the full monty - rip through everything, rebuild the CSV and output to the screen via a textarea tag. Here then is the entire script:
And a pic of the final product - 10,752 rows, which is quite a few less than the original but that's expected as I have deleted unnecessary rows. All that is left is to copy and paste into a text file:
The best part is how little code was necessary to do this and there was no need to setup a temporary database and mess with SQL, it was more straightforward than what I thought it would be.