Author Topic: CSV Line Formatting Function  (Read 14646 times)

Support

  • Administrator
  • *****
  • Posts: 1
    • View Profile
CSV Line Formatting Function
« on: November 10, 2014, 08:27:42 pm »
This Script BASIC function makes it easy to quickly format raw CSV (comma-separated values) files that has been a standard way of normalizing data for export / import. While searching for sample CSV data to play with I came across a site that posted the Sacramento crime log for January 1st, 2006 (24 hour period)

FYI: You can use either of the numeric formatting masks Script BASIC offers (BASIC or C style) and if you leave out the format info for a column, it will be ignored. This function will be added to the T.bas (Tools) extension module include file for the 2.2 release.



' result = FormatLine(in_str, fmt_str, quo_char, num_spc) Note: num_spc = -1 uses TAB

FUNCTION FormatLine(ln,fmtstr,qc,nsp)
  SPLITAQ ln BY "," QUOTE qc TO col
  SPLITA fmtstr BY "|" TO fmtcmd
  rs = ""
  FOR x = 0 to UBOUND(col)
    SPLITA fmtcmd[x] BY ":" TO fmt
    IF fmt[0] = "L" THEN
      tmp = LEFT(col[x] & STRING(fmt[1]," "),fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "R" THEN
      IF fmt[2] <> undef THEN
        tmp = FORMAT(fmt[2],col[x])
      ELSE
        tmp = col[x]
      END IF
      tmp = RIGHT(STRING(fmt[1]," ") & tmp, fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "C" THEN
      pad = fmt[1] - LEN(col[x])
      pboth = pad \ 2
      prt = pad % 2
      tmp = STRING(pboth," ") & col[x] & STRING(pboth," ") & STRING(prt," ")
      GOSUB Margin
    END IF
  NEXT
  GOTO Done

  Margin:
  IF nsp = -1 THEN
    tmp &= "\t"
  ELSE
    tmp &= STRING(nsp," ")
  END IF
  rs &= tmp  
  RETURN
 
  Done:
  FormatLine = rs
END FUNCTION

OPEN "SacramentocrimeJanuary2006.csv" FOR INPUT AS #1
OPEN "sac.fmt" FOR OUTPUT AS #2
fmtstr = "L:15|L:30|R:4|L:4|R:6|L:35|L:6|R:10:%~-##0.0000~|R:10:%~-##0.0000~"
LINE INPUT #1, hdr
WHILE NOT EOF(1)
  LINE INPUT #1, csvln
  csvln = CHOMP(csvln)
  PRINT #2, FormatLine(csvln,fmtstr,"",2),"\n"
WEND  

CLOSE(1)
CLOSE(2)
 
Output (7584 rows)

jrs@laptop:~/sb/sb22/test$ time scriba fmtline.sb

real   0m0.454s
user   0m0.415s
sys    0m0.036s
jrs@laptop:~/sb/sb22/test$

Code: [Select]
1/1/06 0:00      3108 OCCIDENTAL DR                 3  3C      1115  10851(A)VC TAKE VEH W/O OWNER        2404       38.5504   -121.3914 
1/1/06 0:00      2082 EXPEDITION WAY                5  5A      1512  459 PC  BURGLARY RESIDENCE           2204       38.4735   -121.4902 
1/1/06 0:00      4 PALEN CT                         2  2A       212  10851(A)VC TAKE VEH W/O OWNER        2404       38.6578   -121.4621 
1/1/06 0:00      22 BECKFORD CT                     6  6C      1443  476 PC PASS FICTICIOUS CHECK         2501       38.5068   -121.4270 
1/1/06 0:00      3421 AUBURN BLVD                   2  2A       508  459 PC  BURGLARY-UNSPECIFIED         2299       38.6374   -121.3846 
« Last Edit: November 11, 2014, 04:03:55 am by support »

Support

  • Administrator
  • *****
  • Posts: 1
    • View Profile
CSV to SQLite3 function
« Reply #1 on: November 12, 2014, 06:56:27 pm »
I thought I would take this another step and create a CSV2SQL function for the T.bas (Tools) extension module include file. This is just my proof of concept attempt and I'll make a function call out of the following example for the 2.2 release.

I attached an example of the final CSV2SQL SUB running on Android Linux native.

IMPORT sqlite.bas

OPEN "SacramentocrimeJanuary2006.csv" FOR INPUT AS #1
db = sqlite::open("sac116.db")
fmtstr = "SSISISIRR"
LINE INPUT #1, hdr
hdr = CHOMP(hdr)
SPLITA hdr BY "," TO col
SPLITA fmtstr BY "" TO typ
lastcol = UBOUND(col)
sql = "CREATE TABLE crime ("
FOR x = 0 TO lastcol
  tmp = ""
  IF typ[x] = "S" THEN
    tstr = " TEXT"
  ELSE IF typ[x] = "I" THEN
    tstr = " INTEGER"
  ELSE IF typ[x] = "R" THEN
    tstr = " REAL"
  END IF
  tmp &= col[x] & tstr
  IF x <> lastcol THEN tmp &= ", "
  sql &= tmp
NEXT
sql &= ");"
sqlite::execute(db, sql)
sqlite::execute(db, "BEGIN TRANSACTION")
WHILE NOT EOF(1)
  sql = "INSERT INTO crime VALUES ("
  LINE INPUT #1, csvln
  csvln = CHOMP(csvln)
  SPLITAQ csvln BY "," QUOTE "" TO col
  FOR x = 0 TO lastcol
    IF typ[x] = "S" THEN
      tmp = "'" & col[x] & "'"
    ELSE
      tmp = col[x]
    END IF
    IF x <> lastcol THEN tmp &= ", "
    sql &= tmp
  NEXT
  sql &= ");"
  sqlite::execute(db, sql)
WEND
sqlite::execute(db, "END TRANSACTION")
sqlite::close(db)
CLOSE(1)
 
Output

jrs@laptop:~/sb/sb22/test$ time scriba csv2sql.sb

real   0m0.763s
user   0m0.457s
sys    0m0.016s
jrs@laptop:~/sb/sb22/test$ sqlite3
SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .open sac116.db
sqlite> SELECT COUNT(*) FROM crime;
7584
sqlite> .q
jrs@laptop:~/sb/sb22/test$

« Last Edit: November 14, 2014, 01:50:01 am by support »

Support

  • Administrator
  • *****
  • Posts: 1
    • View Profile
SQLite/CSV Line Formatting Function
« Reply #2 on: November 14, 2014, 05:39:05 am »
I modified the fmtline() function to support either CSV or SQLite3 rows.

' result = FormatLine(in_str/array, fmt_str, quo_char, num_spc) Note: num_spc = -1 uses TAB

IMPORT sqlite.bas

FUNCTION FormatLine(ln,fmtstr,qc,nsp)
  IF ISARRAY(ln) THEN
    col = ln
  ELSE
    SPLITAQ ln BY "," QUOTE qc TO col
  END IF
     
  SPLITA fmtstr BY "|" TO fmtcmd
  rs = ""
  FOR x = 0 to UBOUND(col)
    SPLITA fmtcmd[x] BY ":" TO fmt
    IF fmt[0] = "L" THEN
      tmp = LEFT(col[x] & STRING(fmt[1]," "),fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "R" THEN
      IF fmt[2] <> undef THEN
        tmp = FORMAT(fmt[2],col[x])
      ELSE
        tmp = col[x]
      END IF
      tmp = RIGHT(STRING(fmt[1]," ") & tmp, fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "C" THEN
      pad = fmt[1] - LEN(col[x])
      pboth = pad \ 2
      prt = pad % 2
      tmp = STRING(pboth," ") & col[x] & STRING(pboth," ") & STRING(prt," ")
      GOSUB Margin
    END IF
  NEXT
  GOTO Done

  Margin:
  IF nsp = -1 THEN
    tmp &= "\t"
  ELSE
    tmp &= STRING(nsp," ")
  END IF
  rs &= tmp  
  RETURN
 
  Done:
  FormatLine = rs
END FUNCTION

db = sqlite::open("sac16.db")
stmt = sqlite::query(db,"SELECT * FROM crime LIMIT 1")
sqlite::row(stmt)
sqlite::FetchArray(stmt,columns)
fmtstr = "L:15|L:30|R:4|L:4|R:6|L:35|L:6|R:10:%~-##0.0000~|R:10:%~-##0.0000~"
PRINT FormatLine(columns,fmtstr,"",2),"\n"
sqlite::close(db)
 
Output
Code: [Select]
jrs@laptop:~/sb/sb22/test$ scriba fmtsqlrow.sb
1/1/06 0:00      3108 OCCIDENTAL DR                 3  3C      1115  10851(A)VC TAKE VEH W/O OWNER        2404       38.5504   -121.3914 
jrs@laptop:~/sb/sb22/test$