Extending UNIX grep to databases

Figure 943. UNIX grep sample Slide presentation
> grep --color  -i  fraction App.java
package de.hdm_stuttgart.mi.sd1.fraction;
 * Playing with fraction objects.
    final Fraction threeSeven = 
        new Fraction(3, 7);
    final Fraction

Colorize matched text in output

Ignore case when matching text.


The grep command among with its myriad of options allows searching text files for the occurrence of strings or patterns. This can be extended both to RDBMS and non-SQL databases like Mongodb.

The database query problem however is more complex than simply query lines of text files. A practically useful CLI must address:

Figure 944. Database grep requirements Slide presentation
  • Connection profile handling

  • Search level specification:

    • Whole database

    • Table(s)

    • Table column(s)

    • Recordsets

  • Output formatting, limiting/filtering and paging


Figure 945. Connection profile handling Slide presentation
Command File ~/.dbgrep/Profiles/postgresTest.cfg
dbgrep --profile postgresTest ...
host=myserver.companynet.com
port=5432
database=main_test_data
user=testuser
password=secret
driver=Driver/postgresql-42.6.0.jar

Figure 946. Search level specification Slide presentation
dbgrep ... 
dbgrep ... --table User --table Stocks ...
dbgrep ... --column User.userId  ...
dbgrep ... --table Stocks --column User.userId ... 

Search whole database i.e. all tables (and views?).

Search all compatible columns from tables User and Stocks.

Search column userId from table User.

Combined search of and .


Figure 947. Property based searches Slide presentation
dbgrep ... --equal 237 
dbgrep ... --greater 4.43 
dbgrep ... --like 'Smit%' 
dbgrep ... --like 'Smit%' --and --greater 4 
dbgrep ... --range [-3:17] 

Search for integer values equal to 237. Depending on the corresponding search level we distinguish:

Database

Search all tables for compatible integer columns. Return either a list of all tables containing at least one occurrence of 237 or the data records themselves depending on an output formatting rule,


A tool might be implemented either as a single command or as a terminal based application keeping state information.