Summarize a huge amount of files that have line based columns and data in to a single file with the first line the headers found in all files and the actual data as per row for each file, while the headers might change throughout the source files and need to be added dynamically.
This is a special script I wrote for someone else that had about 45k files to process. It is crazy enough to be worth posting here 🙂 and can be found on Spiceworks as well.
Situation:
- many .CSV files
- all have the columns per line instead of in the first line
- the data looks like
- column,data
- column,data
- he needs all files transferred in to one file in this format
- header,header,header
- data,data,data
- data,data,data
- from per line to one line as a header and the data in each line per file
- additional challenge
- the headers might change throughout the files and add more headers
What the script does:
- cycle through all files
- detect all headers
- cycle a second time through all files
- detect all the data
- write the data in the right column per line per file
Flaws:
- The script does not obey if there is data with a comma “,” – it would ignore what is behind that comma
Output:
- Output file is a single .CSV file, comma separated columns
Execute this way:
- Source Directory – where the .csv files reside
- Target Directory – where the new output .csv will be created
- open CMD / command prompt
- go to the script-directory (where you saved it)
- CSCRIPT scriptname.vbs “c:\sourcedirectory” “c:\targetdirectory”
CSCRIPT will avoid that you see a million message boxes – it will output directory on your CMD / command prompt window…
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 | Dim strPath, strOutputPath If WScript.Arguments.Count = 2 Then strPath = WScript.Arguments.Item(0) strOutputPath = WScript.Arguments.Item(1) Else Wscript.Echo "Usage: cscript AddContent.vbs ""C:\path"" ""C:\outputpath""" Wscript.Quit End If Const OutPutFileName = "OutputFile.csv" Dim objFS Set objFS = CreateObject("Scripting.FileSystemObject") If objFS.FolderExists(strPath) And objFS.FolderExists(strOutputPath) Then WScript.Echo "Working in directory: " & strPath Dim objFolder Set objFolder = objFS.GetFolder(strPath) Set outputFile = objFS.OpenTextFile(strOutputPath & "\" & OutPutFileName, 2, True) 'write/replace - don't append - create Dim strHeader, strLine For Each objFile in objFolder.Files If LCase(Right(objFile.Name, 4)) = LCase(".csv") Then 'only for .CSV files WScript.Echo "Processing file for headers: " & objFile.Name Set inputFile = objFS.OpenTextFile(strPath & "\" & objFile.Name, 1) 'reading Do While Not inputFile.AtEndOfStream strLine = Split(inputFile.ReadLine, ",", -1, 1) If Ubound(strLine) > 0 Then 'no content in this line If Not Instr(1, (strHeader & ","), (strLine(0) & ","), 1) > 0 Then 'find if we have this header If Len(strHeader) > 0 Then strHeader = strHeader & "," strHeader = strHeader & strLine(0) End If End If Loop inputFile.Close Set inputFile = Nothing End If Next outputFile.WriteLine strHeader 'let's write our first line - the headers we have WScript.Echo "..Finsihed processing the headers - wrote to file: " & strOutputPath & "\" & OutPutFileName Dim arrSplitHeader, iHeaders, cntHeaders arrSplitHeader = Split(strHeader, ",", -1, 1) iHeaders = Ubound(arrSplitHeader) Dim arrLines() ReDim arrLines((iHeaders)) Dim strOutLine For Each objFile in objFolder.Files For cntHeaders = 0 To iHeaders 'cleanup arrLines(cntHeaders) = "" Next If LCase(Right(objFile.Name, 4)) = LCase(".csv") Then 'only for .CSV files WScript.Echo "Processing file for data: " & objFile.Name Set inputFile = objFS.OpenTextFile(strPath & "\" & objFile.Name, 1) 'reading Do While Not inputFile.AtEndOfStream strLine = Split(inputFile.ReadLine, ",", -1, 1) If Ubound(strLine) > 0 Then 'we do have data For cntHeaders = 0 To iHeaders If arrSplitHeader(cntHeaders) = strLine(0) Then arrLines(cntHeaders) = strLine(1) Exit For End If Next End If Loop inputFile.Close Set inputFile = Nothing strOutLine = "" For cntHeaders = 0 To iHeaders 'writeout If Len(strOutLine) > 0 Then strOutLine = strOutLine & "," strOutLine = strOutLine & arrLines(cntHeaders) Next outputFile.WriteLine strOutLine End If Next outputFile.Close Set outputFile = Nothing WScript.Echo "..Finsihed - wrote to file: " & strOutputPath & "\" & OutPutFileName Set objFolder = Nothing End If Set objFS = Nothing |